Generators + Promises

In our previous discussion, we showed how generators can be iterated asynchronously, which is a huge step forward in sequential reason-ability over the spaghetti mess of callbacks. But we lost something very important: the trustability and composability of Promises (see Chapter 3)!

Don’t worry — we can get that back. The best of all worlds in ES6 is to combine generators (synchronous-looking async code) with Promises (trustable and composable).

But how?

Recall from Chapter 3 the Promise-based approach to our running Ajax example:

    function foo(x,y) {
        return request(
            "http://some.url.1/?x=" + x + "&y=" + y
        );
    }

    foo( 11, 31 )
    .then(
        function(text){
            console.log( text );
        },
        function(err){
            console.error( err );
        }
    );

In our earlier generator code for the running Ajax example, foo(..) returned nothing (undefined), and our iterator control code didn’t care about that yielded value.

But here the Promise-aware foo(..) returns a promise after making the Ajax call. That suggests we could construct a promise with foo(..), yield it from the generator, and let the iterator control code receive that promise.

But what should the iterator do with the promise?

It should listen for the promise to resolve (fulfillment or rejection), and then either resume the generator with the fulfillment message or throw an error into the generator with the rejection reason.

Let me repeat that, because it’s so important. The natural way to get the most out of Promises and generators is to yield a Promise, and wire that Promise to control the generator’s iterator.

Let’s give it a try! First, we’ll put the Promise-aware foo(..) together with the generator *main():

    function foo(x,y) {
        return request(
            "http://some.url.1/?x=" + x + "&y=" + y
        );
    }

    function *main() {
        try {
            var text = yield foo( 11, 31 );
            console.log( text );
        }
        catch (err) {
            console.error( err );
        }
    }

The most powerful revelation in this refactor is that the code inside *main() did not have to change at all! Inside the generator, whatever value is yielded out is just an opaque implementation detail, so we’re not even aware it’s happening, nor do we need to worry about it.

But how are we going to run *main() now? We still have some of the implementation plumbing work to do, to receive and wire up the yielded promise so that it resumes the generator upon resolution. We’ll start by trying that manually:

    var it = main();

    var p = it.next().value;

    // wait for the `p` promise to resolve
    p.then(
        function(text){
            it.next( text );
        },
        function(err){
            it.throw( err );
        }
    );

Actually, that wasn’t so painful at all, was it?

This snippet should look very similar to what we did earlier with the manually wired generator controlled by the error-first callback. Instead of an if (err) { it.throw.., the promise already splits fulfillment (success) and rejection (failure) for us, but otherwise the iterator control is identical.

Now, we’ve glossed over some important details.

Most importantly, we took advantage of the fact that we knew that *main() only had one Promise-aware step in it. What if we wanted to be able to Promise-drive a generator no matter how many steps it has? We certainly don’t want to manually write out the Promise chain differently for each generator! What would be much nicer is if there was a way to repeat (aka “loop” over) the iteration control, and each time a Promise comes out, wait on its resolution before continuing.

Also, what if the generator throws out an error (intentionally or accidentally) during the it.next(..) call? Should we quit, or should we catch it and send it right back in? Similarly, what if we it.throw(..) a Promise rejection into the generator, but it’s not handled, and comes right back out?

Promise-Aware Generator Runner

The more you start to explore this path, the more you realize, “wow, it’d be great if there was just some utility to do it for me.” And you’re absolutely correct. This is such an important pattern, and you don’t want to get it wrong (or exhaust yourself repeating it over and over), so your best bet is to use a utility that is specifically designed to run Promise-yielding generators in the manner we’ve illustrated.

Several Promise abstraction libraries provide just such a utility, including my asynquence library and its runner(..), which will be discussed in Appendix A of this book.

But for the sake of learning and illustration, let’s just define our own standalone utility that we’ll call run(..):

    // thanks to Benjamin Gruenbaum (@benjamingr on GitHub) for
    // big improvements here!
    function run(gen) {
        var args = [].slice.call( arguments, 1 ), it;

        // initialize the generator in the current context
        it = gen.apply( this, args );

        // return a promise for the generator completing
        return Promise.resolve()
            .then( function handleNext(value){
                // run to the next yielded value
                var next = it.next( value );

                return (function handleResult(next){
                    // generator has completed running?
                    if (next.done) {
                        return next.value;
                    }
                    // otherwise keep going
                    else {
                        return Promise.resolve( next.value )
                            .then(
                                // resume the async loop on
                                // success, sending the resolved
                                // value back into the generator
                                handleNext,

                                // if `value` is a rejected
                                // promise, propagate error back
                                // into the generator for its own
                                // error handling
                                function handleErr(err) {
                                    return Promise.resolve(
                                        it.throw( err )
                                    )
                                    .then( handleResult );
                                }
                            );
                    }
                })(next);
            } );
    }

As you can see, it’s quite a bit more complex than you’d probably want to author yourself, and you especially wouldn’t want to repeat this code for each generator you use. So, a utility/library helper is definitely the way to go. Nevertheless, I encourage you to spend a few minutes studying that code listing to get a better sense of how to manage the generator+Promise negotiation.

How would you use run(..) with *main() in our running Ajax example?

    function *main() {
        // ..
    }

    run( main );

That’s it! The way we wired run(..), it will automatically advance the generator you pass to it, asynchronously until completion.

Note: The run(..) we defined returns a promise which is wired to resolve once the generator is complete, or receive an uncaught exception if the generator doesn’t handle it. We don’t show that capability here, but we’ll come back to it later in the chapter.
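
As a quick preview, consuming that returned promise might look something like this sketch (the fulfilled and rejected handler names are just for illustration):

    run( main )
    .then(
        // `*main()` ran to completion
        function fulfilled(){
            console.log( "*main() finished" );
        },
        // an exception escaped `*main()` unhandled
        function rejected(err){
            console.error( err );
        }
    );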

ES7: async and await?

The preceding pattern — generators yielding Promises that then control the generator’s iterator to advance it to completion — is such a powerful and useful approach, it would be nicer if we could do it without the clutter of the library utility helper (aka run(..)).

There’s probably good news on that front. At the time of this writing, there’s early but strong support for a proposal for more syntactic addition in this realm for the post-ES6, ES7-ish timeframe. Obviously, it’s too early to guarantee the details, but there’s a pretty decent chance it will shake out similar to the following:

    function foo(x,y) {
        return request(
            "http://some.url.1/?x=" + x + "&y=" + y
        );
    }

    async function main() {
        try {
            var text = await foo( 11, 31 );
            console.log( text );
        }
        catch (err) {
            console.error( err );
        }
    }

    main();

As you can see, there’s no run(..) call (meaning no need for a library utility!) to invoke and drive main() — it’s just called as a normal function. Also, main() isn’t declared as a generator function anymore; it’s a new kind of function: async function. And finally, instead of yielding a Promise, we await for it to resolve.

The async function automatically knows what to do if you await a Promise — it will pause the function (just like with generators) until the Promise resolves. We didn’t illustrate it in this snippet, but calling an async function like main() automatically returns a promise that’s resolved whenever the function finishes completely.
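
To make that last point concrete, here’s a minimal sketch (assuming the main() defined above) of observing the promise that calling the async function returns:

    main()
    .then( function(){
        // `main()` has fully finished
        console.log( "all done" );
    } );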

Tip: The async / await syntax should look very familiar to readers with experience in C#, because it’s basically identical.

The proposal essentially codifies support for the pattern we’ve already derived, into a syntactic mechanism: combining Promises with sync-looking flow control code. That’s the best of both worlds combined, to effectively address practically all of the major concerns we outlined with callbacks.

The mere fact that such an ES7-ish proposal already exists and has early support and enthusiasm is a major vote of confidence in the future importance of this async pattern.

Promise Concurrency in Generators

So far, all we’ve demonstrated is a single-step async flow with Promises+generators. But real-world code will often have many async steps.

If you’re not careful, the sync-looking style of generators may lull you into complacency with how you structure your async concurrency, leading to suboptimal performance patterns. So we want to spend a little time exploring the options.

Imagine a scenario where you need to fetch data from two different sources, then combine those responses to make a third request, and finally print out the last response. We explored a similar scenario with Promises in Chapter 3, but let’s reconsider it in the context of generators.

Your first instinct might be something like:

    function *foo() {
        var r1 = yield request( "http://some.url.1" );
        var r2 = yield request( "http://some.url.2" );

        var r3 = yield request(
            "http://some.url.3/?v=" + r1 + "," + r2
        );

        console.log( r3 );
    }

    // use previously defined `run(..)` utility
    run( foo );

This code will work, but in the specifics of our scenario, it’s not optimal. Can you spot why?

Because the r1 and r2 requests can — and for performance reasons, should — run concurrently, but in this code they will run sequentially; the "http://some.url.2" URL isn’t Ajax fetched until after the "http://some.url.1" request is finished. These two requests are independent, so the better performance approach would likely be to have them run at the same time.

But how exactly would you do that with a generator and yield? We know that yield is only a single pause point in the code, so you can’t really do two pauses at the same time.

The most natural and effective answer is to base the async flow on Promises, specifically on their capability to manage state in a time-independent fashion (see “Future Value” in Chapter 3).

The simplest approach:

    function *foo() {
        // make both requests "in parallel"
        var p1 = request( "http://some.url.1" );
        var p2 = request( "http://some.url.2" );

        // wait until both promises resolve
        var r1 = yield p1;
        var r2 = yield p2;

        var r3 = yield request(
            "http://some.url.3/?v=" + r1 + "," + r2
        );

        console.log( r3 );
    }

    // use previously defined `run(..)` utility
    run( foo );

Why is this different from the previous snippet? Look at where the yield is and is not. p1 and p2 are promises for Ajax requests made concurrently (aka “in parallel”). It doesn’t matter which one finishes first, because promises will hold onto their resolved state for as long as necessary.

Then we use two subsequent yield statements to wait for and retrieve the resolutions from the promises (into r1 and r2, respectively). If p1 resolves first, the yield p1 resumes first then waits on the yield p2 to resume. If p2 resolves first, it will just patiently hold onto that resolution value until asked, but the yield p1 will hold on first, until p1 resolves.

Either way, both p1 and p2 will run concurrently, and both have to finish, in either order, before the r3 = yield request.. Ajax request will be made.

If that flow control processing model sounds familiar, it’s basically the same as what we identified in Chapter 3 as the “gate” pattern, enabled by the Promise.all([ .. ]) utility. So, we could also express the flow control like this:

    function *foo() {
        // make both requests "in parallel," and
        // wait until both promises resolve
        var results = yield Promise.all( [
            request( "http://some.url.1" ),
            request( "http://some.url.2" )
        ] );

        var r1 = results[0];
        var r2 = results[1];

        var r3 = yield request(
            "http://some.url.3/?v=" + r1 + "," + r2
        );

        console.log( r3 );
    }

    // use previously defined `run(..)` utility
    run( foo );

Note: As we discussed in Chapter 3, we can even use ES6 destructuring assignment to simplify the var r1 = .. var r2 = .. assignments, with var [r1,r2] = results.
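
For illustration only, that destructured form of the same step would look like:

    var [r1,r2] = yield Promise.all( [
        request( "http://some.url.1" ),
        request( "http://some.url.2" )
    ] );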

In other words, all of the concurrency capabilities of Promises are available to us in the generator+Promise approach. So in any place where you need more than sequential this-then-that async flow control steps, Promises are likely your best bet.

Promises, Hidden

As a word of stylistic caution, be careful about how much Promise logic you include inside your generators. The whole point of using generators for asynchrony in the way we’ve described is to create simple, sequential, sync-looking code, and to hide as much of the details of asynchrony away from that code as possible.

For example, this might be a cleaner approach:

    // note: normal function, not generator
    function bar(url1,url2) {
        return Promise.all( [
            request( url1 ),
            request( url2 )
        ] );
    }

    function *foo() {
        // hide the Promise-based concurrency details
        // inside `bar(..)`
        var results = yield bar(
            "http://some.url.1",
            "http://some.url.2"
        );

        var r1 = results[0];
        var r2 = results[1];

        var r3 = yield request(
            "http://some.url.3/?v=" + r1 + "," + r2
        );

        console.log( r3 );
    }

    // use previously defined `run(..)` utility
    run( foo );

Inside *foo(), it’s cleaner and clearer that all we’re doing is just asking bar(..) to get us some results, and we’ll yield-wait on that to happen. We don’t have to care that under the covers a Promise.all([ .. ]) Promise composition will be used to make that happen.

We treat asynchrony, and indeed Promises, as an implementation detail.

Hiding your Promise logic inside a function that you merely call from your generator is especially useful if you’re going to do a sophisticated series of flow-control steps. For example:

    function bar() {
        return Promise.all( [
            baz( .. )
            .then( .. ),
            Promise.race( [ .. ] )
        ] )
        .then( .. );
    }

That kind of logic is sometimes required, and if you dump it directly inside your generator(s), you’ve defeated most of the reason why you would want to use generators in the first place. We should intentionally abstract such details away from our generator code so that they don’t clutter up the higher level task expression.

Beyond creating code that is both functional and performant, you should also strive to make code that is as reason-able and maintainable as possible.

Note: Abstraction is not always a healthy thing for programming — many times it can increase complexity in exchange for terseness. But in this case, I believe it’s much healthier for your generator+Promise async code than the alternatives. As with all such advice, though, pay attention to your specific situations and make proper decisions for you and your team.