Chapter 3: Natives - Natives as Constructors - 《You Don't Know JS: Types & Grammar（1st edition）》

Natives as Constructors

For array, object, function, and regular-expression values, the literal form is almost universally preferred for creating the values. The literal form creates the same sort of object as the constructor form does (that is, there is no nonwrapped value).

Just as we’ve seen above with the other natives, these constructor forms should generally be avoided, unless you really know you need them, mostly because they introduce exceptions and gotchas that you probably don’t really want to deal with.

`Array(..)`

var a = new Array( 1, 2, 3 );
a; // [1, 2, 3]
var b = [1, 2, 3];
b; // [1, 2, 3]

Note: The Array(..) constructor does not require the new keyword in front of it. If you omit it, it will behave as if you have used it anyway. So Array(1,2,3) is the same outcome as new Array(1,2,3).

The Array constructor has a special form where if only one number argument is passed, instead of providing that value as contents of the array, it’s taken as a length to “presize the array” (well, sorta).

This is a terrible idea. Firstly, you can trip over that form accidentally, as it’s easy to forget.

But more importantly, there’s no such thing as actually presizing the array. Instead, what you’re creating is an otherwise empty array, but setting the length property of the array to the numeric value specified.
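To make the single-number special case concrete, here is a minimal sketch contrasting it with the literal form and with a single non-number argument. The variable names are illustrative only.

```javascript
var sized = new Array( 3 );        // one number argument: treated as a length
var literal = [ 3 ];               // literal form: always a one-element array
var stringArg = new Array( "3" );  // one non-number argument: treated as content

sized.length;       // 3
0 in sized;         // false -- no slot was actually created
literal.length;     // 1
literal[0];         // 3
stringArg.length;   // 1
stringArg[0];       // "3"
```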

An array that has no explicit values in its slots, but has a length property that implies the slots exist, is a weird exotic type of data structure in JS with some very strange and confusing behavior. The capability to create such a value comes purely from old, deprecated, historical functionalities (“array-like objects” like the arguments object).

Note: An array with at least one “empty slot” in it is often called a “sparse array.”

It doesn’t help matters that this is yet another example where browser developer consoles vary on how they represent such an object, which breeds more confusion.

For example:

var a = new Array( 3 );
a.length; // 3
a;

The serialization of a in Chrome is (at the time of writing): [ undefined x 3 ]. This is really unfortunate. It implies that there are three undefined values in the slots of this array, when in fact the slots do not exist (so-called “empty slots” — also a bad name!).

To visualize the difference, try this:

var a = new Array( 3 );
var b = [ undefined, undefined, undefined ];
var c = [];
c.length = 3;
a;
b;
c;

Note: As you can see with c in this example, empty slots in an array can happen after creation of the array. By changing the length of an array to go beyond its number of actually-defined slot values, you implicitly introduce empty slots. In fact, you could even call delete b[1] in the above snippet, and it would introduce an empty slot into the middle of b.

For b (in Chrome, currently), you’ll find [ undefined, undefined, undefined ] as the serialization, as opposed to [ undefined x 3 ] for a and c. Confused? Yeah, so is everyone else.

Worse than that, at the time of writing, Firefox reports [ , , , ] for a and c. Did you catch why that’s so confusing? Look closely. Three commas implies four slots, not three slots like we’d expect.

What!? Firefox puts an extra , on the end of their serialization here because as of ES5, trailing commas in lists (array values, property lists, etc.) are allowed (and thus dropped and ignored). So if you were to type in a [ , , , ] value into your program or the console, you’d actually get the underlying value that’s like [ , , ] (that is, an array with three empty slots). This choice, while confusing if reading the developer console, is defended as instead making copy-n-paste behavior accurate.

If you’re shaking your head or rolling your eyes about now, you’re not alone! Shrugs.
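Since console serializations differ, a more dependable way to tell an empty slot from a real undefined value is to ask whether the index exists at all. The sketch below reuses a, b, and c from the earlier snippet and relies only on the in operator and Object.keys(..).

```javascript
var a = new Array( 3 );
var b = [ undefined, undefined, undefined ];
var c = [];
c.length = 3;

// `in` reports whether the index exists as a property of the array
0 in a;                    // false
0 in b;                    // true
0 in c;                    // false

// Object.keys(..) lists only the indexes that actually exist
Object.keys( a ).length;   // 0
Object.keys( b ).length;   // 3

// deleting an index leaves `length` alone but removes the slot
delete b[1];
1 in b;                    // false
b.length;                  // 3
```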

Unfortunately, it gets worse. More than just confusing console output, a and b from the above code snippet actually behave the same in some cases but differently in others:

a.join( "-" ); // "--"
b.join( "-" ); // "--"
a.map(function(v,i){ return i; }); // [ undefined x 3 ]
b.map(function(v,i){ return i; }); // [ 0, 1, 2 ]

Ugh.

The a.map(..) call fails to produce the expected result because the slots don’t actually exist, so map(..) has nothing to iterate over. join(..) works differently. Basically, we can think of it implemented sort of like this:

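function fakeJoin(arr,connector) {
    var str = "";
    // loop up to `length`, whether or not each slot exists
    for (var i = 0; i < arr.length; i++) {
        if (i > 0) {
            str += connector;
        }
        // empty slots read as `undefined`; like the real join(..),
        // `undefined` and `null` contribute nothing to the string
        if (arr[i] != null) {
            str += arr[i];
        }
    }
    return str;
}
var a = new Array( 3 );
fakeJoin( a, "-" ); // "--"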

As you can see, join(..) works by just assuming the slots exist and looping up to the length value. Whatever map(..) does internally, it (apparently) doesn’t make such an assumption, so the result from the strange “empty slots” array is unexpected and likely to cause failure.
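For contrast with fakeJoin(..), here is a rough sketch of how a map(..)-like utility could skip indexes that don't exist. fakeMap is a hypothetical name, and this is an approximation of the observable behavior, not any engine's actual implementation.

```javascript
// Hypothetical sketch: visit only the indexes that are really present
function fakeMap(arr,fn) {
    var result = new Array( arr.length );
    for (var i = 0; i < arr.length; i++) {
        // an empty slot fails this check, so `fn` is never called for it
        if (i in arr) {
            result[i] = fn( arr[i], i, arr );
        }
    }
    return result;
}

var a = new Array( 3 );
var b = [ undefined, undefined, undefined ];

var mappedA = fakeMap( a, function(v,i){ return i; } );
var mappedB = fakeMap( b, function(v,i){ return i; } );

mappedA.length;   // 3
0 in mappedA;     // false -- still nothing but empty slots
mappedB;          // [ 0, 1, 2 ]
```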

So, if you wanted to actually create an array of actual undefined values (not just “empty slots”), how could you do it (besides manually)?

var a = Array.apply( null, { length: 3 } );
a; // [ undefined, undefined, undefined ]

Confused? Yeah. Here’s roughly how it works.

apply(..) is a utility available to all functions, which calls the function it’s used with but in a special way.

The first argument is a this object binding (covered in the this & Object Prototypes title of this series), which we don’t care about here, so we set it to null. The second argument is supposed to be an array (or something like an array — aka an “array-like object”). The contents of this “array” are “spread” out as arguments to the function in question.

So, Array.apply(..) is calling the Array(..) function and spreading out the values (of the { length: 3 } object value) as its arguments.

Inside of apply(..), we can envision there’s another for loop (kinda like join(..) from above) that goes from 0 up to, but not including, length (3 in our case).

For each index, it retrieves that key from the object. So if the array-object parameter was named arr internally inside of the apply(..) function, the property access would effectively be arr[0], arr[1], and arr[2]. Of course, none of those properties exist on the { length: 3 } object value, so all three of those property accesses would return the value undefined.

In other words, it ends up calling Array(..) basically like this: Array(undefined,undefined,undefined), which is how we end up with an array filled with undefined values, and not just those (crazy) empty slots.
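The loop described above can be sketched as follows. collectArgs is a hypothetical stand-in for the argument-gathering step inside apply(..), written only to show why the missing properties turn into real undefined values.

```javascript
// Hypothetical stand-in for how apply(..) reads its array-like argument
function collectArgs(arrayLike) {
    var args = [];
    for (var i = 0; i < arrayLike.length; i++) {
        // a missing property reads as `undefined`, which is then
        // pushed as a real value into a real slot
        args.push( arrayLike[i] );
    }
    return args;
}

var args = collectArgs( { length: 3 } );
args.length;   // 3
0 in args;     // true

var a = Array.apply( null, { length: 3 } );
0 in a;                                   // true
a.map(function(v,i){ return i; });        // [ 0, 1, 2 ]
```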

While Array.apply( null, { length: 3 } ) is a strange and verbose way to create an array filled with undefined values, it’s vastly better and more reliable than what you get with the footgun’ish Array(3) empty slots.

Bottom line: never ever, under any circumstances, should you intentionally create and use these exotic empty-slot arrays. Just don’t do it. They’re nuts.

`Object(..)`, `Function(..)`, and `RegExp(..)`

The Object(..)/Function(..)/RegExp(..) constructors are also generally optional (and thus should usually be avoided unless specifically called for):

var c = new Object();
c.foo = "bar";
c; // { foo: "bar" }
var d = { foo: "bar" };
d; // { foo: "bar" }
var e = new Function( "a", "return a * 2;" );
var f = function(a) { return a * 2; };
function g(a) { return a * 2; }
var h = new RegExp( "^a*b+", "g" );
var i = /^a*b+/g;

There’s practically no reason to ever use the new Object() constructor form, especially since it forces you to add properties one-by-one instead of many at once in the object literal form.

The Function constructor is helpful only in the rarest of cases, where you need to dynamically define a function’s parameters and/or its function body. Do not just treat Function(..) as an alternate form of eval(..). You will almost never need to dynamically define a function in this way.
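As one illustration of such a rare case, the sketch below builds a function body from an operator chosen at runtime. makeBinaryOp and its list of allowed operators are hypothetical, and the check is there so that arbitrary text never reaches Function(..).

```javascript
// Hypothetical helper: build a two-argument function for a given operator
function makeBinaryOp(operator) {
    // only accept a small, known set of operators
    var allowed = [ "+", "-", "*", "/" ];
    if (allowed.indexOf( operator ) === -1) {
        throw new Error( "Unsupported operator: " + operator );
    }
    return new Function( "a", "b", "return a " + operator + " b;" );
}

var multiply = makeBinaryOp( "*" );
multiply( 6, 7 );    // 42

var subtract = makeBinaryOp( "-" );
subtract( 10, 4 );   // 6
```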

Regular expressions defined in the literal form (/^a*b+/g) are strongly preferred, not just for ease of syntax but for performance reasons — the JS engine precompiles and caches them before code execution. Unlike the other constructor forms we’ve seen so far, RegExp(..) has some reasonable utility: to dynamically define the pattern for a regular expression.

var name = "Kyle";
var namePattern = new RegExp( "\\b(?:" + name + ")+\\b", "ig" );
var matches = someText.match( namePattern );

This kind of scenario legitimately occurs in JS programs from time to time, so you’d need to use the new RegExp("pattern","flags") form.
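Here is a self-contained variation of that snippet. The sample text and the escapeForRegExp helper are assumptions added for illustration; escaping matters whenever the dynamic portion could contain characters that are special in a pattern.

```javascript
// Hypothetical helper: backslash-escape characters with special meaning
function escapeForRegExp(str) {
    return str.replace( /[.*+?^${}()|[\]\\]/g, "\\$&" );
}

var name = "Kyle";
var someText = "Hi kyle, this is Kyle's copy. Kylie is not a match.";

var namePattern = new RegExp(
    "\\b(?:" + escapeForRegExp( name ) + ")+\\b",
    "ig"
);

var matches = someText.match( namePattern );
matches;                     // [ "kyle", "Kyle" ]

escapeForRegExp( "a.b" );    // "a\\.b" (a backslash before the dot)
```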

`Date(..)` and `Error(..)`

The Date(..) and Error(..) native constructors are much more useful than the other natives, because there is no literal form for either.

To create a date object value, you must use new Date(). The Date(..) constructor accepts optional arguments to specify the date/time to use, but if omitted, the current date/time is assumed.

By far the most common reason you construct a date object is to get the current timestamp value (a signed integer number of milliseconds since Jan 1, 1970 UTC). You can do this by calling getTime() on a date object instance.

But an even easier way is to just call the static helper function defined as of ES5: Date.now(). And to polyfill that for pre-ES5 is pretty easy:

if (!Date.now) {
    Date.now = function(){
        return (new Date()).getTime();
    };
}

Note: If you call Date() without new, you’ll get back a string representation of the date/time at that moment. The exact form of this representation is not specified in the language spec, though browsers tend to agree on something close to: "Fri Jul 18 2014 00:31:02 GMT-0500 (CDT)".
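A quick way to see the difference between the two call forms, along with the two ways of getting a timestamp, is to inspect the types involved. The exact string and number values depend on when and where the code runs, so none are shown.

```javascript
var asString = Date();        // called without `new`
var asObject = new Date();    // called with `new`

typeof asString;              // "string"
typeof asObject;              // "object"
asObject instanceof Date;     // true

typeof asObject.getTime();    // "number"
typeof Date.now();            // "number"
```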

The Error(..) constructor (much like Array() above) behaves the same with the new keyword present or omitted.

The main reason you’d want to create an error object is that it captures the current execution stack context into the object (in most JS engines, revealed as a read-only .stack property once constructed). This stack context includes the function call-stack and the line-number where the error object was created, which makes debugging that error much easier.

You would typically use such an error object with the throw operator:

function foo(x) {
    if (!x) {
        throw new Error( "x wasn't provided" );
    }
    // ..
}

Error object instances generally have at least a message property, and sometimes other properties (which you should treat as read-only), like type. However, other than inspecting the above-mentioned stack property, it’s usually best to just call toString() on the error object (either explicitly, or implicitly through coercion — see Chapter 4) to get a friendly-formatted error message.
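As a small sketch of inspecting a thrown error, the following catches the error from a function like foo(..) above and reads its message and string form. The caught variable is just for illustration, and the contents of stack vary by engine, so they are not shown.

```javascript
function foo(x) {
    if (!x) {
        throw new Error( "x wasn't provided" );
    }
    return x;
}

var caught;
try {
    foo();
}
catch (err) {
    caught = err;
}

caught.message;      // "x wasn't provided"
caught.toString();   // "Error: x wasn't provided"
String( caught );    // "Error: x wasn't provided"
```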

Tip: Technically, in addition to the general Error(..) native, there are several other specific-error-type natives: EvalError(..), RangeError(..), ReferenceError(..), SyntaxError(..), TypeError(..), and URIError(..). But it’s very rare to manually use these specific error natives. They are automatically used if your program actually suffers from a real exception (such as referencing an undeclared variable and getting a ReferenceError error).
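To see one of these specific error natives appear automatically, the sketch below references a deliberately undeclared identifier (a hypothetical name) and inspects what was thrown.

```javascript
var caught;
try {
    // `neverDeclaredAnywhere` is intentionally not declared
    neverDeclaredAnywhere;
}
catch (err) {
    caught = err;
}

caught instanceof ReferenceError;   // true
caught instanceof Error;            // true
caught.name;                        // "ReferenceError"
```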

`Symbol(..)`

New as of ES6, an additional primitive value type has been added, called “Symbol”. Symbols are special “unique” (not strictly guaranteed!) values that can be used as properties on objects with little fear of any collision. They’re primarily designed for special built-in behaviors of ES6 constructs, but you can also define your own symbols.

Symbols can be used as property names, but you cannot see or access the actual value of a symbol from your program, nor from the developer console. If you evaluate a symbol in the developer console, what’s shown looks like Symbol(Symbol.iterator), for example.

There are several predefined symbols in ES6, accessed as static properties of the Symbol function object, like Symbol.iterator, Symbol.hasInstance, etc. To use them, do something like:

obj[Symbol.iterator] = function(){ /*..*/ };

To define your own custom symbols, use the Symbol(..) native. The Symbol(..) native “constructor” is unique in that you’re not allowed to use new with it, as doing so will throw an error.

var mysym = Symbol( "my own symbol" );
mysym;                // Symbol(my own symbol)
mysym.toString();    // "Symbol(my own symbol)"
typeof mysym;         // "symbol"
var a = { };
a[mysym] = "foobar";
Object.getOwnPropertySymbols( a );
// [ Symbol(my own symbol) ]
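Two behaviors mentioned above are easy to confirm: each Symbol(..) call produces a distinct value even when the description is the same, and using new throws. The variable names here are illustrative.

```javascript
var first = Symbol( "my own symbol" );
var second = Symbol( "my own symbol" );

first === second;    // false -- same description, different symbols

var threw = false;
try {
    new Symbol( "not allowed" );
}
catch (err) {
    threw = err instanceof TypeError;
}
threw;               // true
```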

While symbols are not actually private (Object.getOwnPropertySymbols(..) reflects on the object and reveals the symbols quite publicly), using them for private or special properties is likely their primary use-case. For most developers, they may take the place of property names with _ underscore prefixes, which are almost always by convention signals to say, “hey, this is a private/special/internal property, so leave it alone!”
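A short sketch of that "special property" use: a symbol-keyed property stays out of the usual string-key enumeration, yet remains reachable for anyone who looks. The internalId symbol and obj are hypothetical.

```javascript
var internalId = Symbol( "internal id" );

var obj = { name: "widget" };
obj[internalId] = 42;

Object.keys( obj );                            // [ "name" ]
JSON.stringify( obj );                         // '{"name":"widget"}'

// not private: reflection still finds it
Object.getOwnPropertySymbols( obj ).length;    // 1
obj[internalId];                               // 42
```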

Note: Symbols are not objects, they are simple scalar primitives.

Native Prototypes

Each of the built-in native constructors has its own .prototype object — Array.prototype, String.prototype, etc.

These objects contain behavior unique to their particular object subtype.

For example, all string objects, and by extension (via boxing) string primitives, have access to default behavior as methods defined on the String.prototype object.

Note: By documentation convention, String.prototype.XYZ is shortened to String#XYZ, and likewise for all the other .prototypes.

String#indexOf(..): find the position in the string of another substring
String#charAt(..): access the character at a position in the string
String#substr(..), String#substring(..), and String#slice(..): extract a portion of the string as a new string
String#toUpperCase() and String#toLowerCase(): create a new string that’s converted to either uppercase or lowercase
String#trim(): create a new string that’s stripped of any trailing or leading whitespace

None of the methods modify the string in place. Modifications (like case conversion or trimming) create a new value from the existing value.

By virtue of prototype delegation (see the this & Object Prototypes title in this series), any string value can access these methods:

var a = " abc ";
a.indexOf( "c" ); // 3
a.toUpperCase(); // " ABC "
a.trim(); // "abc"
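To confirm that these methods leave the original string untouched, capture their results and then look at the original value again. The variable names are illustrative.

```javascript
var a = " abc ";

var upper = a.toUpperCase();
var trimmed = a.trim();

upper;      // " ABC "
trimmed;    // "abc"
a;          // " abc " -- unchanged
```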

The other constructor prototypes contain behaviors appropriate to their types, such as Number#toFixed(..) (stringifying a number with a fixed number of decimal digits) and Array#concat(..) (merging arrays). All functions have access to apply(..), call(..), and bind(..) because Function.prototype defines them.
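A few quick illustrations of those prototype-provided behaviors follow. The greet function and person object are hypothetical.

```javascript
(3.14159).toFixed( 2 );          // "3.14"
[ 1, 2 ].concat( [ 3, 4 ] );     // [ 1, 2, 3, 4 ]

function greet(greeting) {
    return greeting + ", " + this.name;
}
var person = { name: "Kyle" };

greet.call( person, "Hello" );       // "Hello, Kyle"
greet.apply( person, [ "Hi" ] );     // "Hi, Kyle"

var boundGreet = greet.bind( person );
boundGreet( "Hey" );                 // "Hey, Kyle"
```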

But, some of the native prototypes aren’t just plain objects:

typeof Function.prototype;            // "function"
Function.prototype();                // it's an empty function!
RegExp.prototype.toString();        // "/(?:)/" -- empty regex
"abc".match( RegExp.prototype );    // [""] in ES5; throws a TypeError as of ES6

You can even modify these native prototypes (not just by adding properties, as you’re probably familiar with), though doing so is a particularly bad idea:

Array.isArray( Array.prototype );    // true
Array.prototype.push( 1, 2, 3 );    // 3
Array.prototype;                    // [1,2,3]
// don't leave it that way, though, or expect weirdness!
// reset the `Array.prototype` to empty
Array.prototype.length = 0;

As you can see, Function.prototype is a function, RegExp.prototype is a regular expression (in ES5; as of ES6 it is a plain object instead), and Array.prototype is an array. Interesting and cool, huh?

Prototypes As Defaults

Function.prototype being an empty function, RegExp.prototype being an “empty” regex (one that matches the empty string, and thus any input), and Array.prototype being an empty array, make them all nice “default” values to assign to variables if those variables wouldn’t already have had a value of the proper type.

For example:

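function isThisCool(vals,fn,rx) {
    vals = vals || Array.prototype;
    fn = fn || Function.prototype;
    // ES5 behavior: as of ES6, `RegExp.prototype` is a plain object,
    // so `rx.test(..)` throws a TypeError when this default is used
    rx = rx || RegExp.prototype;
    return rx.test(
        vals.map( fn ).join( "" )
    );
}
isThisCool();        // true (in ES5 engines)
isThisCool(
    ["a","b","c"],
    function(v){ return v.toUpperCase(); },
    /D/
);                    // false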

Note: As of ES6, we don’t need to use the vals = vals || .. default value syntax trick (see Chapter 4) anymore, because default values can be set for parameters via native syntax in the function declaration (see Chapter 5).
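For comparison, here is a sketch of the same function using ES6 default parameter values. isThisCoolES6 is a hypothetical name, and this version uses literal defaults rather than the native prototypes, so those default values are created anew on each call that needs them.

```javascript
function isThisCoolES6(
    vals = [],
    fn = function(v){ return v; },
    rx = /(?:)/
) {
    return rx.test(
        vals.map( fn ).join( "" )
    );
}

isThisCoolES6();     // true -- the empty regex matches the empty string

isThisCoolES6(
    ["a","b","c"],
    function(v){ return v.toUpperCase(); },
    /D/
);                   // false -- "ABC" contains no "D"
```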

One minor side-benefit of this approach is that the .prototypes are already created and built-in, thus created only once. By contrast, using [], function(){}, and /(?:)/ values themselves for those defaults would (likely, depending on engine implementations) be recreating those values (and probably garbage-collecting them later) for each call of isThisCool(..). That could be memory/CPU wasteful.

Also, be very careful not to use Array.prototype as a default value that will subsequently be modified. In this example, vals is used read-only, but if you were to instead make in-place changes to vals, you would actually be modifying Array.prototype itself, which would lead to the gotchas mentioned earlier!
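The hazard is easy to reproduce. In this sketch, collect is a hypothetical function that makes an in-place change to its defaulted argument; the final line undoes the damage, as in the earlier snippet.

```javascript
function collect(vals) {
    vals = vals || Array.prototype;    // risky default
    vals.push( "oops" );               // in-place change
    return vals;
}

var returned = collect();
var sameObject = (returned === Array.prototype);   // true

// every array now appears to have a value at index 0,
// because the lookup delegates to `Array.prototype`
var leaked = [][0];                                // "oops"

// reset `Array.prototype` to empty
Array.prototype.length = 0;
var afterReset = [][0];                            // undefined
```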

Note: While we’re pointing out these native prototypes and some usefulness, be cautious of relying on them and even more wary of modifying them in any way. See Appendix A “Native Prototypes” for more discussion.
