back to index

Flattening Arrays: From Recursion to Generators

Three takes on the array flattening problem: recursion, hand-rolled iterators, generators (yield*).

published Mar 14, 2019 tags #javascript #generators #iterators

~/posts/array-flatten-recursion-to-generators $ cat post.md

/ LANG EN / 中文
/ THEME / /

The problem

Array.prototype.flat is available now. It isn’t novel runtime machinery — the spec just made the implementation standard. The problem itself: an array whose elements are either T or T[], and the T[] can be nested arbitrarily —

type SomeArray<T> = Array<T | SomeArray<T>>;

The type above won’t typecheck — TS doesn’t allow that kind of recursive type.

The job is to flatten the whole structure down to a plain T[].

Recursion

The most direct solution:

const flat = <T>(arr: SomeArray<T>): T[] => {
    return arr.reduce((previous: T[], current: T | SomeArray<T>) => {
        if (!Array.isArray(current)) {
            return previous.concat(current);
        }
        return previous.concat(flat(current));
    }, [] as T[]);
};

SomeArray is used here just as a notation. The TS type isn’t legal, but the runtime logic is fine.

Nothing clever: if the current element is an array, recurse; if not, concat it. The shape — reduce + concat — sets up the next step: iteration.

Iterables

Arrays in TS/ES implement Iterable<T>. That sits on top of the iteration protocols: any value that implements the protocol can be consumed by for...of or by [...value]. The protocol asks for a next() method that returns { done, value }.

This pairs naturally with flattening — make next() recurse into sub-arrays.

The doc example, roughly:

var myIterator = {
    next: function () {
        // ...
    },
    [Symbol.iterator]: function () {
        return this;
    },
};

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Iteration_protocols

A simplified iterator-based flat:

const flat = (arr) => {
    const rest = [...arr]; // clone
    const iterator = {
        [Symbol.iterator]: () => ({
            next: () => {
                if (rest.length === 0) {
                    return { done: true };
                }
                const current = rest.shift();
                if (Array.isArray(current)) {
                    rest.unshift(...current);
                } else {
                    rest.unshift(current);
                }
                return { value: rest.shift(), done: false };
            },
        }),
    };
    return [...iterator];
};

It’s uglier than the recursive version. TS also pushes you off it, insisting on class extends Iterator<T>. There’s a better answer: generators.

Generators

Generator syntax is function*. With yield you can describe an infinite stream of integers:

function* infiniteNumber(): IterableIterator<number> {
    let index: number = 0;
    while (1) {
        yield index++;
    }
}
const iterator: IterableIterator<number> = infiniteNumber();

The mental model: the function pauses at yield; the next next() resumes; when the function body ends, the generator returns done: true.

Delegating to another generator

yield* hands the next several next() calls off to another iterator:

function* gen() {
    yield* ["a", "b", "c"];
}

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Iterators_and_Generators

At the yield* line, the next three iterations come from the array’s iterator. With this you can express recursion-shaped logic in a generator.

Generator-based flat

The shape isn’t really different from the recursive version, but it’s short. When consuming, remember to destructure properly — careless spread on a generator can loop forever, so clone the input into a closure first.

function* flat<T>(arr: Array<T | T[]>): IterableIterator<T> {
    const rest: Array<T | T[]> = [...arr]; // clone
    while (rest.length > 0) {
        const current: T | T[] = rest.shift();
        if (Array.isArray(current)) {
            yield* flat(current);
        } else {
            yield current;
        }
    }
}

console.log([...flat([1, [2, [4, 5]], 3])]);

The generator + yield* combination ends up the cleanest expression of the three.

back to index