Chapter 5: Program Performance - How to Optimize with asm.js - 《You Don't Know JS: Async & Performance（1st edition）》

asm.js

“asm.js” (http://asmjs.org/) is a label for a highly optimizable subset of the JavaScript language. By carefully avoiding certain mechanisms and patterns that are hard to optimize (garbage collection, coercion, etc.), asm.js-styled code can be recognized by the JS engine and given special attention with aggressive low-level optimizations.

Unlike the other program performance mechanisms discussed in this chapter, asm.js isn’t necessarily something that needs to be adopted into the JS language specification. There is an asm.js specification (http://asmjs.org/spec/latest/), but it mostly tracks an agreed-upon set of candidate inferences for optimization rather than a set of requirements for JS engines.

There’s not currently any new syntax being proposed. Instead, asm.js suggests ways to recognize existing standard JS syntax that conforms to the rules of asm.js and let engines implement their own optimizations accordingly.

There’s been some disagreement between browser vendors over exactly how asm.js should be activated in a program. Early versions of the asm.js experiment required a "use asm"; pragma (similar to strict mode’s "use strict";) to cue the JS engine to look for asm.js optimization opportunities and hints. Others have asserted that asm.js should just be a set of heuristics that engines recognize automatically, meaning that existing programs could theoretically benefit from asm.js-style optimizations without the author doing anything special.

How to Optimize with asm.js

The first thing to understand about asm.js optimizations concerns types and coercion (see the Types & Grammar title of this series). If the JS engine has to track multiple different types of values in a variable through various operations, so that it can handle coercions between types as necessary, that’s a lot of extra work that prevents the program from being fully optimized.

Note: We’re going to use asm.js-style code here for illustration purposes, but be aware that you’re not generally expected to author such code by hand. asm.js is intended more as a compilation target for other tools, such as Emscripten (https://github.com/kripken/emscripten/wiki). It’s of course possible to write your own asm.js code, but that’s usually a bad idea because the code is very low level, and managing it can be very time consuming and error prone. Nevertheless, there may be cases where you’d want to hand tweak your code for asm.js optimization purposes.

There are some “tricks” you can use to hint to an asm.js-aware JS engine what the intended type is for variables/operations, so that it can skip these coercion tracking steps.

For example:

var a = 42;
// ..
var b = a;

In that program, the b = a assignment leaves the door open for type divergence in variables. However, it could instead be written as:

var a = 42;
// ..
var b = a | 0;

Here, we’ve used the | (“bitwise OR”) operator with the value 0, which has no effect on a value that is already a 32-bit integer; its only purpose is to make sure the result is one. That code works just fine in a normal JS engine, but in an asm.js-aware JS engine it can signal that b should always be treated as a 32-bit integer, so the coercion tracking can be skipped.
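
To make the effect of | 0 concrete, here is a small sketch of how the operator conforms different inputs to a 32-bit integer. The variable names are illustrative only and do not come from the original example; the results follow from the standard 32-bit integer conversion that the bitwise operators apply to their operands.

```javascript
// What `| 0` yields for a few different kinds of input.
var fromInt    = 42 | 0;           // 42: already a 32-bit integer, unchanged
var fromFloat  = 3.7 | 0;          // 3: the fractional part is dropped
var fromNeg    = -3.7 | 0;         // -3: truncation is toward zero
var fromString = "42" | 0;         // 42: the string is coerced to a number first
var fromNaN    = NaN | 0;          // 0
var wrapped    = 2147483648 | 0;   // -2147483648: 2^31 wraps into the signed range
```

The first line is the case the text relies on: an integer such as 42 passes through untouched, so adding the hint does not change what the program computes.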

Similarly, the addition operation between two variables can be restricted to a more performant integer addition (instead of floating point):

(a + b) | 0

Again, the asm.js-aware JS engine can see that hint and infer that the + operation should be 32-bit integer addition because the end result of the whole expression would automatically be 32-bit integer conformed anyway.
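
As a rough illustration of what "32-bit integer conformed" means for addition, the following sketch wraps the (a + b) | 0 pattern in a function. The name addInt is hypothetical, chosen only for this example.

```javascript
// Hypothetical helper: the parameters and the result are all hinted as
// 32-bit integers using the `| 0` pattern.
function addInt(a, b) {
    a = a | 0;
    b = b | 0;
    return (a + b) | 0;
}

var small      = addInt(2, 3);            // 5
var wrappedSum = addInt(2147483647, 1);   // -2147483648: wraps like a 32-bit int
var plainSum   = 2147483647 + 1;          // 2147483648: ordinary number addition
```

Because the result is always conformed back to 32 bits, an engine never needs to represent the larger intermediate value for this expression, which is what allows it to choose plain integer addition.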

asm.js Modules

One of the biggest drags on performance in JS comes from memory allocation, garbage collection, and scope access. asm.js suggests that one way around these issues is to declare a more formalized asm.js “module” (do not confuse these with ES6 modules; see the ES6 & Beyond title of this series).

For an asm.js module, you need to explicitly pass in a tightly conformed namespace — this is referred to in the spec as stdlib, as it should represent standard libraries needed — to import necessary symbols, rather than just using globals via lexical scope. In the base case, the window object is an acceptable stdlib object for asm.js module purposes, but you could and perhaps should construct an even more restricted one.
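
A more restricted stdlib could be as simple as an object that exposes only the built-ins a module actually uses. The following is a minimal sketch, assuming the module needs nothing beyond Math, two typed array constructors, and the Infinity and NaN constants; the name restrictedStdlib is illustrative, and the asm.js specification is the authority on which symbols a module may import.

```javascript
// A hand-built namespace to pass in place of `window`.
// Assumption: the module only needs these five symbols.
var restrictedStdlib = {
    Math: Math,
    Int32Array: Int32Array,
    Float64Array: Float64Array,
    Infinity: Infinity,
    NaN: NaN
};
```

An object like this can be passed wherever window is passed as the first argument, and it also works in environments that have no window object at all.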

You also must declare a “heap” — which is just a fancy term for a reserved spot in memory where variables can already be used without asking for more memory or releasing previously used memory — and pass that in, so that the asm.js module won’t need to do anything that would cause memory churn; it can just use the pre-reserved space.

A “heap” is typically an ArrayBuffer, such as:

var heap = new ArrayBuffer( 0x10000 );    // 64k heap

Using that pre-reserved 64k of binary space, an asm.js module can store and retrieve values in that buffer without any memory allocation or garbage collection penalties. For example, the heap buffer could be used inside the module to back an array of 64-bit float values like this:

var arr = new Float64Array( heap );
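
To show how views relate to the underlying buffer, here is a short sketch using standard typed arrays. The names demoHeap, floatView, and intView are made up for this example.

```javascript
var demoHeap = new ArrayBuffer(0x10000);     // 64k, the same size as above

// Views do not copy the buffer; they only interpret its bytes.
var floatView = new Float64Array(demoHeap);
var intView = new Int32Array(demoHeap);

var floatSlots = floatView.length;   // 8192  (65536 bytes / 8 bytes per value)
var intSlots = intView.length;       // 16384 (65536 bytes / 4 bytes per value)

// A write through one view is visible through any other view of the buffer.
intView[0] = 7;
var readBack = new Int32Array(demoHeap)[0];  // 7
```

Creating a view reserves no new storage for the values themselves, which is why a module can hand out as many views as it needs over the one pre-reserved heap.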

OK, so let’s make a quick, silly example of an asm.js-styled module to illustrate how these pieces fit together. We’ll define a foo(..) that takes a start (x) and end (y) integer for a range, and calculates all the inner adjacent multiplications of the values in the range, and then finally averages those values together:

function fooASM(stdlib,foreign,heap) {
    "use asm";
    var arr = new stdlib.Int32Array( heap );
    function foo(x,y) {
        x = x | 0;
        y = y | 0;
        var i = 0;
        var p = 0;
        var sum = 0;
        var count = ((y|0) - (x|0)) | 0;
        // calculate all the inner adjacent multiplications
        for (i = x | 0;
            (i | 0) < (y | 0);
            p = (p + 8) | 0, i = (i + 1) | 0
        ) {
            // store result
            arr[ p >> 3 ] = (i * (i + 1)) | 0;
        }
        // calculate average of all intermediate values
        for (i = 0, p = 0;
            (i | 0) < (count | 0);
            p = (p + 8) | 0, i = (i + 1) | 0
        ) {
            sum = (sum + arr[ p >> 3 ]) | 0;
        }
        return +(sum / count);
    }
    return {
        foo: foo
    };
}
var heap = new ArrayBuffer( 0x1000 );
var foo = fooASM( window, null, heap ).foo;
foo( 10, 20 );        // 233

Note: This asm.js example is hand authored for illustration purposes, so it doesn’t represent the same code that would be produced from a compilation tool targeting asm.js. But it does show the typical nature of asm.js code, especially the type hinting and use of the heap buffer for temporary variable storage.

The first call to fooASM(..) is what sets up our asm.js module with its heap allocation. The result is a foo(..) function we can call as many times as necessary. Those foo(..) calls should be specially optimized by an asm.js-aware JS engine. Importantly, the preceding code is completely standard JS and would run just fine (without special optimization) in a non-asm.js engine.
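
The set-up-once, call-many-times pattern can be sketched with an even smaller module. This is a hand-written illustration, not something taken from the text above: counterASM, add, and the other names are hypothetical, the code has not been checked against an asm.js validator, and it passes a plain object as stdlib so that it also runs where window does not exist.

```javascript
// Hypothetical asm.js-styled module that keeps a running total in the heap.
function counterASM(stdlib, foreign, heap) {
    "use asm";
    var arr = new stdlib.Int32Array(heap);

    // Add `n` to the total stored in heap slot 0 and return the new total.
    function add(n) {
        n = n | 0;
        arr[0] = ((arr[0] | 0) + n) | 0;
        return arr[0] | 0;
    }

    return {
        add: add
    };
}

var counterHeap = new ArrayBuffer(0x10000);
var add = counterASM({ Int32Array: Int32Array }, null, counterHeap).add;

var first = add(5);     // 5
var second = add(10);   // 15: the state persisted in the heap between calls

// The total is also visible from outside, through another view of the heap.
var stored = new Int32Array(counterHeap)[0];   // 15
```

All of the state lives in the pre-reserved buffer, so repeated add(..) calls create no new objects for the garbage collector to clean up.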

Obviously, the nature of the restrictions that make asm.js code so optimizable significantly reduces the possible uses for such code. asm.js won’t necessarily be a general optimization set for any given JS program. Instead, it’s intended to provide an optimized way of handling specialized tasks such as intensive math operations (e.g., those used in graphics processing for games).