Trust Issues

The mismatch between sequential brain planning and callback-driven async JS code is only part of the problem with callbacks. There’s something much deeper to be concerned about.

Let’s once again revisit the notion of a callback function as the continuation (aka the second half) of our program:

  // A
  ajax( "..", function(..){
      // C
  } );
  // B

// A and // B happen now, under the direct control of the main JS program. But // C gets deferred to happen later, and under the control of another party — in this case, the ajax(..) function. In a basic sense, that sort of hand-off of control doesn’t regularly cause lots of problems for programs.

But don’t let that infrequency fool you into thinking this control switch isn’t a big deal. In fact, it’s one of the worst (and yet most subtle) problems with callback-driven design. It revolves around the idea that sometimes ajax(..) (i.e., the “party” you hand your callback continuation to) is not a function that you wrote, or that you directly control. Many times, it’s a utility provided by some third party.

We call this “inversion of control”: you take part of your program and hand control of its execution over to a third party. There’s an unspoken “contract” between your code and the third-party utility, a set of things you expect to be maintained.

Tale of Five Callbacks

It might not be terribly obvious why this is such a big deal. Let me construct an exaggerated scenario to illustrate the hazards of trust at play.

Imagine you’re a developer tasked with building out an ecommerce checkout system for a site that sells expensive TVs. You already have all the various pages of the checkout system built out just fine. On the last page, when the user clicks “confirm” to buy the TV, you need to call a third-party function (provided say by some analytics tracking company) so that the sale can be tracked.

You notice that they’ve provided what looks like an async tracking utility, probably for the sake of performance best practices, which means you need to pass in a callback function. In this continuation that you pass in, you will have the final code that charges the customer’s credit card and displays the thank you page.

This code might look like:

  analytics.trackPurchase( purchaseData, function(){
      chargeCreditCard();
      displayThankyouPage();
  } );

Easy enough, right? You write the code, test it, everything works, and you deploy to production. Everyone’s happy!

Six months go by and no issues. You’ve almost forgotten you even wrote that code. One morning, you’re at a coffee shop before work, casually enjoying your latte, when you get a panicked call from your boss insisting you drop the coffee and rush into work right away.

When you arrive, you find out that a high-profile customer has had his credit card charged five times for the same TV, and he’s understandably upset. Customer service has already issued an apology and processed a refund. But your boss demands to know how this could possibly have happened. “Don’t we have tests for stuff like this!?”

You don’t even remember the code you wrote. But you dig back in and start trying to find out what could have gone awry.

After digging through some logs, you come to the conclusion that the only explanation is that the analytics utility somehow, for some reason, called your callback five times instead of once. Nothing in their documentation mentions anything about this.

Frustrated, you contact customer support, who of course is as astonished as you are. They agree to escalate it to their developers, and promise to get back to you. The next day, you receive a lengthy email explaining what they found, which you promptly forward to your boss.

Apparently, the developers at the analytics company had been working on some experimental code that, under certain conditions, would retry the provided callback once per second, for five seconds, before failing with a timeout. They had never intended to push that into production, but somehow they did, and they’re totally embarrassed and apologetic. They go into plenty of detail about how they’ve identified the breakdown and what they’ll do to ensure it never happens again. Yadda, yadda.

What’s next?

You talk it over with your boss, but he’s not feeling particularly comfortable with the state of things. He insists, and you reluctantly agree, that you can’t trust them anymore (that’s what bit you), and that you’ll need to figure out how to protect the checkout code from such a vulnerability again.

After some tinkering, you implement some simple ad hoc code like the following, which the team seems happy with:

  var tracked = false;

  analytics.trackPurchase( purchaseData, function(){
      if (!tracked) {
          tracked = true;
          chargeCreditCard();
          displayThankyouPage();
      }
  } );

Note: This should look familiar to you from Chapter 1, because we’re essentially creating a latch to guard against multiple concurrent invocations of our callback.
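To see the latch doing its job, here is a minimal sketch that swaps the real utility for a made-up stand-in. The flakyTrackPurchase(..) function and the charges counter are assumptions for illustration only; they simply mimic a utility that fires the callback five times.

```javascript
// Hypothetical stand-in for a misbehaving utility: it invokes
// the callback five times instead of once.
function flakyTrackPurchase(data,cb) {
	for (var i = 0; i < 5; i++) {
		cb();
	}
}

var charges = 0;	// stands in for calls to chargeCreditCard()
var tracked = false;

flakyTrackPurchase( {}, function(){
	// the latch: only the first invocation gets through
	if (!tracked) {
		tracked = true;
		charges++;
	}
} );

console.log( charges );	// 1
```

Remove the if (!tracked) check and the counter ends up at 5, which is the bug from the story.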

But then one of your QA engineers asks, “what happens if they never call the callback?” Oops. Neither of you had thought about that.

You start down the rabbit hole, thinking of all the possible things that could go wrong with them calling your callback. Here’s roughly the list you come up with of ways the analytics utility could misbehave:

  • Call the callback too early (before it’s been tracked)
  • Call the callback too late (or never)
  • Call the callback too few or too many times (like the problem you encountered!)
  • Fail to pass along any necessary environment/parameters to your callback
  • Swallow any errors/exceptions that may happen

That should feel like a troubling list, because it is. You’re probably slowly starting to realize that you’re going to have to invent an awful lot of ad hoc logic in each and every single callback that’s passed to a utility you’re not positive you can trust.
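To get a feel for how much ad hoc logic that means, here is one possible sketch covering just two items from the list: “too many times” and “never.” The guardCallback(..) helper and its parameters are hypothetical names invented for this illustration, not part of any library, and the sketch does nothing about the other items.

```javascript
// Hypothetical helper: wraps `cb` so it runs at most once, and
// calls `onTimeout` if `cb` hasn't been invoked within `ms`
// milliseconds.
function guardCallback(cb,ms,onTimeout) {
	var settled = false;

	var timer = setTimeout( function(){
		if (!settled) {
			settled = true;
			onTimeout( new Error( "Callback timed out" ) );
		}
	}, ms );

	return function guarded(){
		// ignore repeat calls, and calls that arrive after the timeout
		if (settled) return;
		settled = true;
		clearTimeout( timer );
		return cb.apply( this, arguments );
	};
}

// Simulate a utility that calls back three times.
var calls = 0;
var guarded = guardCallback(
	function(){ calls++; },
	1000,
	function(err){ console.error( err.message ); }
);

guarded();
guarded();
guarded();

console.log( calls );	// 1
```

Notice the design choice buried in there: once the timeout fires, a late but legitimate call is silently dropped. Whether that’s the right behavior for a checkout page is exactly the kind of decision you’d have to make, and remake, for every callback you guard.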

Now you realize a bit more completely just how hellish “callback hell” is.

Not Just Others’ Code

Some of you may be skeptical at this point about whether this is as big a deal as I’m making it out to be. Perhaps you don’t interact with truly third-party utilities much if at all. Perhaps you use versioned APIs or self-host such libraries, so that their behavior can’t be changed out from underneath you.

So, contemplate this: can you even really trust utilities that you do theoretically control (in your own code base)?

Think of it this way: most of us agree that at least to some extent we should build our own internal functions with some defensive checks on the input parameters, to reduce/prevent unexpected issues.

Overly trusting of input:

  function addNumbers(x,y) {
      // + is overloaded with coercion to also be
      // string concatenation, so this operation
      // isn't strictly safe depending on what's
      // passed in.
      return x + y;
  }

  addNumbers( 21, 21 );    // 42
  addNumbers( 21, "21" );  // "2121"

Defensive against untrusted input:

  function addNumbers(x,y) {
      // ensure numerical input
      if (typeof x != "number" || typeof y != "number") {
          throw Error( "Bad parameters" );
      }

      // if we get here, + will safely do numeric addition
      return x + y;
  }

  addNumbers( 21, 21 );    // 42
  addNumbers( 21, "21" );  // Error: "Bad parameters"

Or perhaps still safe but friendlier:

  function addNumbers(x,y) {
      // ensure numerical input
      x = Number( x );
      y = Number( y );

      // + will safely do numeric addition
      return x + y;
  }

  addNumbers( 21, 21 );    // 42
  addNumbers( 21, "21" );  // 42

However you go about it, these sorts of checks/normalizations are fairly common on function inputs, even with code we theoretically entirely trust. In a crude sort of way, it’s like the programming equivalent of the geopolitical principle of “Trust But Verify.”

So, doesn’t it stand to reason that we should do the same thing about composition of async function callbacks, not just with truly external code but even with code we know is generally “under our own control”? Of course we should.

But callbacks don’t really offer anything to assist us. We have to construct all that machinery ourselves, and it often ends up being a lot of boilerplate/overhead that we repeat for every single async callback.
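As a rough sketch of what that machinery can look like, here is the same defensive style as addNumbers(..), applied to the arguments a callback receives. The (err, receipt) signature, the receipt.id property, and the log array are all assumptions made up for this example; a real utility defines its own contract.

```javascript
var log = [];

// Assumption: the utility promises to call back with
// (err, receipt), where `receipt.id` is a string.
function onTracked(err,receipt) {
	// the utility reported a failure
	if (err) {
		log.push( "failed: " + err.message );
		return;
	}

	// the utility "succeeded" but didn't pass what it promised
	if (!receipt || typeof receipt.id != "string") {
		log.push( "failed: bad callback arguments" );
		return;
	}

	// if we get here, it's safe to use `receipt`
	log.push( "tracked: " + receipt.id );
}

onTracked( null, { id: "abc123" } );
onTracked( null );
onTracked( new Error( "network down" ) );

console.log( log );
// [ "tracked: abc123",
//   "failed: bad callback arguments",
//   "failed: network down" ]
```

Most of the function is checking rather than doing, and these checks cover only one kind of broken promise.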

The most troublesome problem with callbacks is inversion of control leading to a complete breakdown along all those trust lines.

If you have code that uses callbacks, especially but not exclusively with third-party utilities, and you’re not already applying some sort of mitigation logic for all these inversion of control trust issues, your code has bugs in it right now even though they may not have bitten you yet. Latent bugs are still bugs.

Hell indeed.