Long ago I came to the conclusion that most of the difficulty in writing good software is artificially created. We spend much of our time wrestling with the languages, libraries, and tools that we are confronted with in a given environment. Very little of our time is actually spent thinking about or expressing the problem at hand.
This document presents JavaScript as an example of this phenomenon. While many of JavaScript's flaws and shortcomings receive their fair share of attention, others are there for JavaScript programmers to learn the hard way. Some are widely understood, some widely discussed but widely misunderstood, and others rather obscure. Most JavaScript programmers proceed in a state of ignorance, running the risk of introducing subtle bugs into their programs.
When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong. — R. Buckminster Fuller
The var keyword declares a lexically scoped variable, but only when it appears inside of a function body. When it appears in a script outside of a function body, it adds an entry into the global object, making the variable visible to other scripts (even those that have already been executed). In order to make a variable local to an individual script, we can use this trick, which immediately executes an anonymous function:
(function() { var x; ... }());
In fact, this pattern has become idiomatic.
The language of the future, indeed.
JavaScript does not have block-level scoping; it only has function-level scoping. This shortcoming combines with other design decisions to produce behavior that may confound developers.
Declarations may appear inside a block, even though the scope extends outside the block.
Declarations may appear after references to that variable. In other words, the scope of a variable extends backwards from the declaration to cover code that precedes it.
Multiple declarations of the same variable name are allowed (and they declare only one variable).
Declarations other than the first will not assign undefined to the variable.

(function(){
    var a;     // a is now 'undefined'
    a = 1;     // a is 1
    var a;     // a is still 1
    return a;
}());
“Spooky action at a distance” is the enemy of the programmer. Ideally, a language will allow a programmer to inspect and analyze a piece of code in isolation. Block scoping helps the programmer by bounding the effects of changes.
For an example of how JavaScript's scoping allows non-local effects, consider the following code, in which var n = ... could have been added for debugging purposes:

n = 100;
(function(){
    var a = n;   // n is 'undefined' here, because...

    ... other intervening code ...
    ... (perhaps quite a bit) ...

    for (x = 1; x < a; ++x) {
        var n = x * 2;   // ... it is declared as a local variable here.
        console.log(n);
    }
})();
Here are two ways to define a function in JavaScript:
A) function f() { ... }
B) var f = function () { ... };

Many JavaScript programmers (and tutorials) consider these to be equivalent, and one might innocently jump to that conclusion, but they differ in a couple of ways, one quite subtle!
In the case of a function declaration (A), the function object is instantiated and assigned to the variable at the top of the scope (e.g. function or script).
With a variable declaration (B), the assignment is performed in the usual order (after the code above and before the code below).
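A minimal sketch of that difference (the names f and g are just for illustration):

console.log(typeof f);   // "function" — the declaration (A) is hoisted along with its body
console.log(typeof g);   // "undefined" — only the variable g is hoisted; the assignment happens below, in order

function f() { return 1; }
var g = function () { return 2; };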
Function declarations are not allowed in certain syntactic contexts. They are allowed at the top level of a program or the top level of a function body, but not within other blocks. Within other blocks, these may be interpreted as “function expressions”, which neither declare nor assign a variable in the parent scope.
As indicated in the 5th edition specification — and as stated at mozilla.org — “A function declaration is very easily (and often unintentionally) turned into a function expression.”
Consider this code excerpt:
function f() { return 1; }

if (true) {
    function f() { return 2; }
} else {
    function f() { return 3; }
}

alert( f() );   // care to guess?
What does it do? Well, that depends...
If the ECMA standard were your only guide, you would expect “1” to be displayed. The second and third occurrences of function f()... are treated as expressions, since the ECMA grammar does not allow function declarations inside if or else clauses.

Firefox displays “2”. Mozilla.org explains that this is due to a Firefox-specific language extension allowed by ECMA-262 v3.
Safari (using JavaScript Core) and Chrome (using V8) display “3”. Presumably this would be an equally valid “extension” to the standard.
Consider these alternative code snippets for assigning window.onload:

A) window.onload = function() { alert("x") }
B) onload = function() { alert("x") }
C) var onload = function() { alert("x") }
D) function onload() { alert("x") }
Given the knowledge that window is the global object in browsers, one might conclude that all of these would be valid ways of assigning the global variable onload. In Firefox, all of these snippets indeed set the onload handler and cause the alert to be displayed.

In WebKit-based browsers, however, all of them work except D. In case D, the onload handler will not fire, even though window.onload does in fact get assigned. Perhaps even stranger, the following code (E) will also fail to set the handler:

E) function onload() { alert("x") }
   window.onload = function() { alert("x") }
One might dismiss this as a bug in WebKit, but WebKit developers argue that it is mandated by ECMAScript due to an interaction with the “shadow” global object, and it is Firefox that is in error: http://markmail.org/message/fraws5bthlrpt2m7
It boggles the mind that such a commonplace issue, easily encountered by novice developers, raises questions so mysterious or contentious that the widely deployed implementations remain incompatible.
this Defaults to the Global Object

When a JavaScript function is called — not as a method, as in obj.f(...), but simply as a function, as in f(...) — its implicit this parameter will be set to the global object.
This is more insidious than it might at first appear, because member functions that get called without the dot syntax will likely end up polluting the global object. It is particularly easy to make that mistake with constructors (and they will almost always succeed without an exception).
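For example (a sketch; Point is a hypothetical constructor), forgetting new makes this refer to the global object, so the “constructor” quietly creates globals:

function Point(x, y) {
    this.x = x;          // without new, "this" is the global object (window in browsers)
    this.y = y;
}

var p = Point(1, 2);     // the missing "new" goes unnoticed: no exception is thrown
console.log(p);          // undefined — Point returns nothing
console.log(x, y);       // 1 2 — x and y are now global variables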
One could think of a number of more intuitive or safer defaults for this, any one of which would have been a better choice:

The caller's this value.
undefined
null
The string "JavaScript is the language of the future!".
Edition 5 introduces “strict mode”, in which this defaults to undefined. Once edition 5 is widely implemented, developers might be able to migrate their code to strict mode. However, one problem with strict mode is that it leaves the language with no standard way to obtain a reference to the global object. When not in strict mode, one can use var global = (function () {return this;}());, but when in strict mode one must rely on global variables that vary with the environment — window in browsers, global in Node.js.
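One workaround — a sketch assuming only browsers and Node.js need to be covered; the name getGlobal is ours — is to probe the environment-specific names and fall back to the non-strict trick:

function getGlobal() {
    if (typeof window !== "undefined") { return window; }   // browsers
    if (typeof global !== "undefined") { return global; }   // Node.js
    return (function () { return this; }());                // non-strict fallback; undefined if this code is itself strict
}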
eval
Believe it or not, calling eval can inject variables into the scope (the “activation object” in JavaScript parlance) of the call site. If you are not already familiar with this behavior, you may not be prepared to believe it, and may be convinced that there is some miscommunication here ... so consider this concrete example:

x = 0;
(function () {
    x = 1;
    eval("var x = 2;");
    x = 3;
}());
alert(x);   // displays "1"

What eval does depends upon how you call it. It might evaluate the code in the global scope or the current scope.

x = "global";
(function () { var x="local"; return eval('x'); }())          // returns "local"
(function () { var x="local"; return window.eval('x'); }())   // returns "global"
Comprehensible documentation on this difference is hard to come by. I have found the following examples useful in attempting to understand what rules are employed:
eval(...) // local
window.eval(...) // global
var eval = window.eval
eval(...)              // local

var eval = window.eval.bind(window)
eval(...)              // global

var Eval = window.eval
Eval(...)              // global
(function(eval) { eval(...); })(eval); // local
(function(eval) { eval(...); })(eval.bind(null)); // global
eval.call(null, ...); // global
(eval)(...) // local
(0,eval)(...) // global
[eval][0](...) // global
eval
The ability to construct and execute source code at run-time is a powerful feature of many dynamic languages, enabling (among many other things) package managers to be written. A good form of eval would be a desirable thing.
“If the string represents an expression, eval evaluates the expression.” Not so...
eval('{a:1}')         // evaluates to 1
eval('{a:1}["a"]')    // evaluates to ["a"]
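The leading { is parsed as the start of a block rather than an object literal; wrapping the source in parentheses is a common workaround (a convention, not something the specification calls out):

eval('({a:1})')        // evaluates to the object {a: 1}
eval('({a:1})["a"]')   // evaluates to 1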
Neither eval nor the Function constructor accept a file name or URL or other description of where the source code came from. This means that debug facilities cannot identify the origin of the code when reporting errors.
delete
The eval function has other “special” behavior that (in conjunction with the notion of delete-ability) makes the delete statement deliciously esoteric. It can be used to delete variables, but in some cases that does not work, so you are better off using it only for deleting object properties (although that may not work either). You can try things out in an interactive session in Firebug or Web Inspector, but the results might not be consistent with what happens when you put the same code in a script.

This is all explained in Understanding Delete, a thoroughly researched examination of the matter which unintentionally serves as a more thorough indictment of JavaScript than a detractor could ever write. It shows how a concept as seemingly simple as delete manifests in JavaScript as a feature that requires many, many pages to fully explain and results in confusion or disagreement on the part of programmers, authors of books on JavaScript, and implementers of JavaScript engines.
Automatic Semicolon Insertion (ASI) refers to a specific part of the ECMAScript standard that is written as a kind of addendum to the grammar. This works in combination with the JavaScript grammar's restricted productions to create a phenomenon that would more accurately be called capricious semicolon insertion. Sometimes two lines will be treated as one statement where two were intended, and sometimes two lines will be treated as two statements where one was intended.
Holy wars have been waged on differing philosophies for dealing with semicolons in JavaScript. Some people recommend habitually terminating statements with semicolons, and using “lint” tools to verify this usage. Unfortunately, this does not succeed completely in protecting the programmer from unintentional errors, and it might make a different class of errors more likely, so others argue vehemently against this practice.
The “semicolonista” sect appears most concerned about cases like the following:

a = b
[c,d].forEach(...)
Which JavaScript treats as:
a = b[c,d].forEach(...)
Lines beginning with /, (, +, or - can also result in similar unexpected behavior, so one must be careful to use a semicolon to delimit the statements. The antisemicolonistas consider these special cases, and recommend a semicolon at the start of any such syntactically problematic statement. The semicolonistas argue that habitually using semicolons is more fail-safe. This can be easier said than done, however.

var f = function () {
    return console.log.bind(console);
}

(function (n) {
    console.log(n + 1);
})(1);

The above code logs 1 to the console, not 2.
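Adding the semicolon that was presumably intended after the function expression keeps the two statements separate:

var f = function () {
    return console.log.bind(console);
};   // this semicolon prevents the next lines from being parsed as a call to the function above

(function (n) {
    console.log(n + 1);
})(1);   // now logs 2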
Antisemicolonistas might point out that when humans get accustomed to reading (and mentally parsing) code with semicolon-terminated statements, they might naturally place too much significance on semicolons. When writing semicolon-decorated code a user might be more likely to write the following, and innocently read it as one statement:
return
    expression;
JavaScript instead treats this as equivalent to:
return;
expression;
One's only defense against this is remembering the cases where line terminators are disallowed, and avoiding line breaks there:
between return and any value being returned
between throw and its subsequent expression
between break or continue and the subsequent label
between an expression and a postfix ++ or -- operator
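For example, a line break after return triggers semicolon insertion, so the object literal below is never returned (makeConfig is just an illustrative name):

function makeConfig() {
    return            // a semicolon is inserted here, returning undefined...
    {
        ok: true      // ...and this line parses as a block containing a labeled statement
    };
}
console.log(makeConfig());   // undefined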
Regardless of the cult to which you subscribe, when programming in JavaScript you can look forward to the following:
Time spent learning syntactic special cases.
Time spent learning about recommended practices and the pitfalls of each.
When writing code or auditing code, thinking about whether code runs afoul of the line terminator exception cases.
Time spent chasing bugs.
There are a number of automatic conversions that are applied by == but not elsewhere in the language. For example, all of the following expressions evaluate to true:

null == undefined
false != undefined
false != null
false == 0
false == " \r\t\n "
true == 1
true == "1"
true != "true"
You might think of equality as transitive, but you would be mistaken.
"1" == 1 1 == "1.0" "1" != "1.0"
A programmer might be tempted to draw other conclusions that are logical and valid in other languages, but invalid in JavaScript. For example, in JavaScript:
If a==b and typeof a == "number", can we assume a==Number(b)? No.
If a==b, can we assume String(a)==String(b)? No.
If a==b, can we assume array[a] == array[b]? No.
These subtleties make it harder to reason about the correctness of programs. Most practicing JavaScript programmers do not know exactly what == does. For those who do and who care about correctness, their reward is the tedium of thinking through all of the special cases of == when they analyze each line of code.

Instead of trying to remember what == does, we can instead treat it as a mistake and completely avoid it — an approach that JSLint and CoffeeScript embrace. JavaScript provides the === operator, which is much simpler. For example:

“if (a === b) ...” can be read as “If a and b hold the same value ...”

“if (a == b) ...” can be read as “If a and b hold the same value after converting undefined values to null, or if one holds a number (after conversion) and the other holds a string or boolean or Boolean object whose numeric value equals that number, ...”
Alas, the above description is incomplete. It does not explain the following:
new Boolean(false) == "0"
new Boolean(false) != new Boolean(false)
new Boolean(false) == 0

Suffice it to say that a complete description of == would be much more lengthy than a complete description of ===.
[Note: the === operator has its own pitfalls that stem from the IEEE-754 floating point standard. The special value NaN tests as not equal to itself. Also, there is a -0 value that tests as equal to 0, even though it is a distinct value and can in fact be distinguished from +0 using other means. Given the state of computer hardware, it is reasonable for JavaScript to inherit this problematic behavior from IEEE-754, as does just about every other modern programming language.]

One might expect that the subtleties of == might also express themselves in the way conditional expressions are evaluated. In JavaScript, however, “truthiness” has its own rules, as arbitrary as — but not consistent with — the rules for equality. As a result, values that are considered “equal” (with the == operator) do not necessarily have the same “truthiness”:
if (0 == "0") isExecuted(); if (0) isNotExecuted(); if ("0") isExecuted(); if (0 == []) isExecuted(); if ([]) isExecuted();
The following seven values are considered falsy by JavaScript:
false null undefined '' 0 -0 NaN
All other values are truthy.
JavaScript has not one, but two values that represent no value. (One wonders why, if two is better than one, three is not better than two. Perhaps future ECMAScript standards will address this.)
If you choose to write in a restricted subset of JavaScript that avoids the pitfalls of equalishness (by using only ===) and truthiness (by performing explicit comparisons), you will find yourself frequently writing the following ...
if (a !== null && a !== undefined) { ... }
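Ironically, the one conversion that == performs predictably is the comparison with null: a == null is true for exactly null and undefined and for no other value, so the test above can also be written as:

if (a != null) { ... }   // true exactly when a is neither null nor undefined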
undefined
The ECMAScript standard uses the word “define” to refer to creating a binding for a property or variable. Logically, then, an undefined variable or property is one for which a binding does not exist.
This terminology, combined with the undefined value, makes it difficult to speak clearly about variable or property bindings. A defined variable may be undefined, because “defined” means it is associated with a value, and that value may be undefined. Likewise, an undefined variable can never be undefined.

This confusion would not exist if the value had been called null or nil.
In strict mode, referencing an undefined variable will generate an exception ... except when it doesn't.
u               // ReferenceError: u is not defined
typeof u        // undefined
typeof (u)      // undefined
typeof (u,u)    // ReferenceError: u is not defined

There is more to typeof than first meets the eye. You might at first think of it as a function that accepts a value (as in other dynamic languages). After all, we are told that in JavaScript, values have types and variables do not. Yet typeof sometimes seems to operate on the variable itself, not the value of the variable.
Here is another example of JavaScript complicating the mental model of the language that programmers need to keep in mind.
JavaScript is infested with implicit references to global objects. Every array, object, or function you create brings along this baggage. As a result, creating a sandbox for JavaScript code within a JavaScript VM requires writing native (non-JavaScript) code calling implementation-specific APIs.
Other languages, including Scheme and Lua, make sandboxing possible because they allow a programmer to load and execute code while maintaining complete control over scoping. The concepts are discussed in the paper A Security Kernel Based on the Lambda-Calculus.
What is bad for security is often bad for robustness and reliability. Security is just one aspect of correctness, and the main problem in isolation — enabling access to data that should be accessed while restricting access to data that should not be accessed — is very similar to the problem of data encapsulation, which is fundamental to building reliable and maintainable programs. Globals, to which JavaScript is so attached, are toxic to security and correctness in general.
“Maps”, “hashes”, or “tables” are universal among modern dynamic languages, and are typically a workhorse data type in those languages.
To fully exploit the power of a dynamic language, maps should be fully polymorphic. That is, the values and keys should be of whatever type the program populates them with.
JavaScript allows only strings as keys. If you use any other type of value as a key it will be converted to a string.
JavaScript arrays can contain arbitrary values (just not as keys), so one could implement an object that would maintain a mapping from, say, function objects to strings, but it would execute in O(N) time.
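A sketch of such a map — SlowMap is an illustrative name, not a standard API — in which every operation is a linear scan (using ES5's Array indexOf):

function SlowMap() {
    this.keys = [];
    this.values = [];
}
SlowMap.prototype.set = function (key, value) {
    var i = this.keys.indexOf(key);      // O(N): scan for an existing key
    if (i < 0) { i = this.keys.length; }
    this.keys[i] = key;
    this.values[i] = value;
};
SlowMap.prototype.get = function (key) {
    var i = this.keys.indexOf(key);      // O(N) again
    return i < 0 ? undefined : this.values[i];
};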
It gets worse. JavaScript objects are not even appropriate for use as hashes indexed by strings. Ironically, these limitations make JavaScript less suitable than many other languages for dealing with JSON data.
The first fly in the ointment is that all objects inherit properties. Even empty object literals inherit from Object, which has properties such as toString and hasOwnProperty.
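For example, an empty object used as a “hash” already answers to keys it was never given:

var map = {};
"toString" in map                 // true — inherited from Object.prototype
map.hasOwnProperty("toString")    // false — so every lookup needs a hasOwnProperty guard
map["constructor"]                // a function, not undefined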
Attempting to create an object with a null prototype (by declaring a constructor and assigning its prototype property to null) will result in an object with Object as its prototype. (Counter to what is implied by documentation.)

Then there are magic properties which cannot even be assigned arbitrary values. Try, for example, a = []; a["__proto__"] = 2; alert(a["__proto__"]); in Firefox, Chrome, or Safari.
The ECMA standard does not specify or even mention magic properties, but it includes loopholes large enough to allow implementations to do just about anything. The standard is not on your side if you are looking for a theoretical basis for correctness in your programs.
ES5 (ECMAScript 262 release 5) introduces Object.create, which does allow creation of objects with null prototypes. However, this is of little use because magic properties manifest even in these objects.

Update: Recent versions of Firefox, Safari, and Chrome appear to no longer support the __proto__ magic property on objects that have been created with Object.create(null). This seems encouraging, but it is a bit troubling that this appeared without fanfare, and what is preventing __proto__ from reappearing on the scene?
One can only imagine how many bugs lurk in JavaScript programs as a result of programmers thinking that JavaScript objects are similar to the Hashes, HashMaps, or Dictionaries in Perl, Ruby, Python, and Java (as many JavaScript books state).

JavaScript does not allow programmers to create weak references. Weak references enable a number of useful programming techniques. For example, memoization can employ weak references to keep cached values around as long as (and no longer than) the input value is still held in the VM.
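A sketch of why this matters: a memoization helper (memoize here is illustrative, using the linear-scan approach described earlier) must hold strong references to every argument it has ever seen, so those values can never be garbage-collected while the cache is alive:

function memoize(f) {
    var keys = [], results = [];
    return function (arg) {
        var i = keys.indexOf(arg);
        if (i < 0) {
            i = keys.length;
            keys[i] = arg;          // strong reference: arg stays pinned in memory
            results[i] = f(arg);    //   for as long as the memoized function exists
        }
        return results[i];
    };
}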
JavaScript lacks coroutines, which provide a natural way to deal with multiple independent network connections. Coroutines also allow pre-existing synchronous code to be composed with other synchronous code in more flexible ways, without rewriting everything to be asynchronous.
JavaScript does not define a JavaScript-accessible API that allows JavaScript code to be debugged. For example, Web Inspector Remote (weinre), which is written entirely in JavaScript, cannot support debugging. With a full-fledged debug API in the language (and with coroutines) this would be achievable.

JavaScript's for (... in ...) construct has several shortcomings.
There is only one loop variable and it is the key, not the value, so almost every loop must include another line of code to look up the value given the key and the array (which in turn may require an additional statement to initialize a variable with the array itself).
for in does not guarantee ordering, making it unsuitable for common array iteration use cases. What is worse, current browsers have traditionally provided some guarantee of ordering that is not guaranteed by the standard, so within any sizable JavaScript code base there lurks some number of bugs waiting to happen.
Inherited members are enumerated (sometimes), and the specification of ECMAScript allows implementations to extend this set arbitrarily, so code using this construct cannot distinguish which keys have been deliberately placed in the set. For correct behavior, additional code is required.
The iteration variable's scope extends beyond the for statement. This is perhaps to be expected since JavaScript has no block scoping, but the alternative of implicit block scoping makes for cleaner and less error-prone code.

The upshot is that a simple concept cannot be simply expressed. Take, for example, the following Lua code:

for key, value in ipairs(<array_expression>) do
    <body>
end

... which represented hygienically in JavaScript becomes:

(function () {
    var _array = <array_expression>;
    for (var key = 0; key < _array.length; ++key) {
        if (Object.prototype.hasOwnProperty.call(_array, key)) {
            var value = _array[key];
            <body>
        }
    }
}());

The call to hasOwnProperty could be omitted if one assumes that the array inherits no enumerable keys. Such keys could have been added to the Array object by other scripts or even by the JavaScript implementation itself.

JavaScript's for in construct applies only to the properties of objects and arrays. It cannot be employed for user-defined enumerations.
JavaScript does not support operator overloading.
While overloading can be abused, it is quite valuable where appropriate.
In JavaScript, host objects have many special powers and peculiarities that are not implementable in JavaScript alone. These were artifacts of the mechanism that bound C++ implementations to JavaScript.
The ECMAScript Edition 5 specification adds a fair amount of complexity to allow JavaScript code to implement some of the peculiarities of host objects, including non-enumerable and “non-configurable” properties, getters, and setters. (But not interceptors...)
A practical implication of all this is that browser APIs present functionality that cannot be accurately simulated for the purpose of unit testing, nor sub-classed in order to provide extended functionality.
The ECMAScript specifications are nearly impenetrable by the ordinary developer interested in understanding the language. Unlike other language specifications that read as sets of requirements, the ECMAScript standards have devolved into a type of pseudo-code for the language implementation — no more readable than actual code, yet less formal.
Implementations, in turn, are cagey about what level of standard support they provide. If your task is to write portable JavaScript, you must experiment with actual implementations or rely on secondhand information about them.
Even where implementations do conform to the standard, you are left with the problem that the standard is intentionally loose in many areas. It can be easy for implementation dependencies to creep into your code. For example, “Whether or not a native object can have a host object as its prototype depends on the implementation.”
Regarding unintended inheritance, the committee members could not arrive at a consensus on any limitation on differences between implementations. This makes it impossible to write a non-trivial program that would be guaranteed to work on all standards-conformant implementations.
JavaScript does not support multiple return values. For example, one cannot write the following:
x, y = polarToCartesian(r, theta)
scheme, auth, path = splitURL(u)
In order to return multiple values a JavaScript function would have to return an object. This is less efficient and more verbose.
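The workaround looks something like the following (assuming a polarToCartesian like the one above):

function polarToCartesian(r, theta) {
    return { x: r * Math.cos(theta), y: r * Math.sin(theta) };   // bundle the results into an object
}

var p = polarToCartesian(2, Math.PI / 2);   // allocate an object just to carry two numbers
var x = p.x, y = p.y;                       // then unpack it by hand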
with
The problems with with have been thoroughly covered elsewhere, as in Douglas Crockford's With Statement Considered Harmful.
In JavaScript, + can mean either addition or concatenation, depending on the types of the arguments.

JavaScript will also automatically convert numbers to strings or strings to numbers. For example, a - b will subtract the numeric value of b from the numeric value of a, returning a number. But + presents a quandary. Whether a programmer wants to add or concatenate, explicit type conversion must be used. Failing to do so can result in the infamous “1 + 1 yields 11” syndrome that JavaScript programmers encounter at one point or another.
Some dynamic languages provide for automatic conversion between number and string types, and some do not. Supporting coercion is often a controversial choice. In its defense, however, one can point out that certain problem domains — databases, utilities, networking software, to name a few — often deal with character data and with moving numeric data in and out of strings. In these domains automatic type coercion, specifically to and from strings, can offer value to programmers. For that to work effectively, however, the programmer needs to know what coercion to expect, so it is important to provide different operators for numeric and string operations (namely, add versus concatenate). Examples of this approach go back to at least the 1960s in TRAC and Pick/BASIC.
Based on these two language design choices we can classify languages into one of four quadrants.
|                  | Overloaded + | No overloaded + |
|------------------|--------------|-----------------|
| No type coercion | ok           | ok              |
| Type coercion    | evil         | ok              |
JavaScript lies in the lower left quadrant.
JavaScript strings are sequences of 16-bit unsigned integers. This makes them ill-suited to containing binary data. No other type in the language is suited for that, either. For dealing with binary data we need yet more complexity.
In other languages, objects or libraries that deal with reading and writing of raw data would be decoupled from the complexity of text encoding and decoding. In JavaScript, however, this separation of concerns could be too expensive, so we tend to see more complicated APIs, such as the W3C File API.
JavaScript VMs in some browsers implement an ArrayBuffer type that works with Typed Arrays. This feature is surprisingly complicated, but it does allow scripts to efficiently manipulate binary data as well as convey it.
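A small sketch of the typed-array approach (availability varies by browser):

var buf = new ArrayBuffer(4);       // 4 bytes of raw storage
var bytes = new Uint8Array(buf);    // one view: unsigned bytes
var words = new Uint32Array(buf);   // another view over the same bytes

bytes[0] = 0xFF;
console.log(words[0]);              // 255 on little-endian hardware; byte order leaks through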
Taking a similar but incompatible approach, Node.js (following CommonJS) provides a native Buffer object that helps on this front.

JavaScript modules that manipulate binary data will therefore fall into one of three competing camps that do not interoperate well — strings vs. ArrayBuffer vs. Buffer.
There is a reason for Java to have boxed and unboxed types. It is a strongly typed language that does not support polymorphism, except via object-oriented single inheritance. A variable whose type is a class can hold an instance of any derived class. Boxed types all inherit from Object, which allows them to be used interchangeably. Unboxed types cannot.
But JavaScript is a dynamic language, so it is a mystery why boxed types exist.
They don't behave at all like their corresponding primitive types. Some functions — eval(), for example — treat “strings” and “Strings” differently.

And Booleans do not behave like booleans:
if (new Boolean(false)) isExecuted();
They don't serve any apparent purpose, in fact, except to lure novice programmers into the mistake of actually using them.
The expression typeof null evaluates to 'object'.

This is a cruel joke on the part of JavaScript, because actually, aside from undefined, null is the only value in JavaScript that cannot be treated as an object. The expressions true.x, "abc".x, (1).x, and (function (){}).x do not throw an exception, but null.x does. In fact, in earlier versions of Chrome, typing null.x in the JavaScript console yielded the following error message:
'null' is not an object (evaluating 'null.x')
There are 59 reserved words in JavaScript, many of which are not used in the language. By comparison, Lua has 21 and C has 32, and their keywords are actually used in the language so the programmer would know about them anyway. When writing JavaScript, be sure to remember all of these and not use one as a variable name, label, or property name.
abstract boolean break byte case catch char class const continue debugger default delete do double else enum export extends false final finally float function goto if implements import in instanceof int interface long native new null package private protected public return short static super switch synchronized this throw throws transient true try typeof var volatile void while with
To motivate yourself to memorize these, just remember that JavaScript is the language of the future.
JavaScript has built-in facilities for object-oriented programming that are awkward and confusing. Evidence of this can be found in the popularity of frameworks that provide alternative models for dealing with inheritance and construction of objects.
Many JavaScript adherents point to the advantages of “prototype-based” inheritance over class-based inheritance, and in my opinion they are correct. They also complain that programmers fail to see this because of their previous experience with class-based languages, but that is where they are incorrect. In truth, JavaScript itself is the biggest obstacle to understanding JavaScript's object model.
Consider the following “native JavaScript” approach to creating specialized objects:
function B() { }
B.prototype = new A;                      // inherit from A
B.prototype.m = function (...) { ... };   // define new method
o = new B();                              // create instance

This code constructs an inheritance chain of three objects. One might think of the inheritance chain as taking the following form:

o → B → A

And in fact this would be the case in a language that naturally and simply expresses “prototype-based” inheritance. However, this is not the case in JavaScript. Instead, o inherits from its prototype, which is the same as B.prototype, which inherits from its prototype A.prototype.
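One can confirm the actual chain for the code above with ES5's Object.getPrototypeOf:

Object.getPrototypeOf(o) === B.prototype             // true — o does not inherit from B itself
Object.getPrototypeOf(B.prototype) === A.prototype   // true — B.prototype is an instance of A
o instanceof A                                        // true — the chain is o → B.prototype → A.prototype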
It is difficult to accurately talk about what is going on. One might say “B inherits from A”, which is concise ... but incorrect and insidiously misleading. It is not hard to see why so many JavaScript programmers remain mystified and use boilerplate like magical incantations when it comes to defining prototypes.
Once developers comprehend what is going on, they can see how the prototype-based model is actually simpler than class-based inheritance, and they can begin to enjoy its benefits.
Here again is a situation where JavaScript's built-in facilities are best hidden under libraries or avoided entirely.
toString

A function called toString has special powers of type detection, but it inhabits another region of the language that is not fully defined.
Chrome v17:

> toString === window.toString
false
> toString()
"[object Object]"
> window.toString()
"[object DOMWindow]"
> toString.call(window)
"[object global]"
> toString.call(new Date())
"[object Date]"

Firefox v7.01:

> toString === window.toString
true
> toString()
"[object Window]"
> window.toString()
"[object Window]"
> toString.call(window)
"[object Window]"
> toString.call(new Date())
"[xpconnect wrapped native prototype]"
After some Googling, I found someone else has puzzled over the same thing.
++x is not the same as x += 1 or x = x + 1. It is, however, equivalent to x = +x + 1.
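For example:

var x = "1";
x += 1;    // x is now "11" — the + operator concatenates because x holds a string
var y = "1";
++y;       // y is now 2 — ++ converts its operand to a number first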
The standard library is polluted with silly features like String.sub, String.big, etc.
Seventeen levels of operator precedence, slavishly adapted from C syntax. Can you recite them?
JavaScript regular expression objects contain state (the match result status).
JavaScript regular expressions are not implemented consistently. You may run afoul of this problem even if you want to do something as simple as match any character.
The . character in a regular expression matches all characters except the newline character, which Mozilla apparently understands to mean “\n”, while IE treats either “\n” or “\r” as a newline character. JavaScript regular expressions provide no special character for matching all characters (!), but we might construct one.
[^], which would read as “match any character not in the empty set”, apparently does not work in IE.

[.\n] will not work because . is treated as a literal when inside of [].

We can use (.|\n|\r), but that is much slower than . in most regular expression engines.

We can use two sets that complement each other: [\D\d], [\S\s], or [\W\w]. I recommend [\D\d] because the sets described by \s and \w are more complicated than \d, so you might expect them to be slower on some implementations (which seems to be the case, at least with [\w\W] on Safari).
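For example, a sketch of the workaround:

var s = "line one\nline two";
/one.line/.test(s)        // false — "." does not match the newline
/one[\D\d]line/.test(s)   // true — the complementary classes match any character, including "\n"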
The void keyword is very rarely useful and never necessary. One could use (<expr>, undefined) instead of void <expr>.
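For example:

void 0 === undefined   // true — the usual idiom for producing undefined
(0, undefined)         // undefined — the comma-expression alternative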
Getter/setter misuse
document.cookie = "name=oeschger";
document.cookie = "favorite_food=tripe";
alert(document.cookie);
// displays: name=oeschger;favorite_food=tripe
[1,2,3,] throws an error in some browsers, yields a four-element array in others, and a three-element array in still others.
Rules for inheritance are different for Array objects than for other objects.

arguments is an array-ish object, but not an Array, so it does not inherit methods like shift (which would be useful here).

Stack inspection: functionObj.caller and functionObj.arguments.

What exactly are the core types in JavaScript? Lua documentation lists a finite number of types and describes how they behave. Once you learn that, you know the rules. In JavaScript, is an Array really an Object? Or is it a separate core type?
String, Number, Boolean serve curious double-duty:

new String(x) returns a String (not a string).
String(x) returns a string (not a String).

Date does not fit this pattern. new Date(x) returns a Date, while Date(x) returns a string.

Function behaves the same whether it is called as a constructor or an ordinary function.

Scope of 'x' in try { ... } catch (x) { ... }?
Criticism is something we can avoid easily by saying nothing, doing nothing, and being nothing. —Aristotle
As evidenced by the examples above, JavaScript's complexity is multi-faceted and much deeper than initial impressions might indicate. Writing correct, non-trivial software without understanding the language you are writing in is a dicey proposition, but it is the only realistic option for most JavaScript programmers.
While writing any code, one must keep in mind the “every day” issues. For example:
Special cases for truthiness
Special cases for equalishness
Imperfect hashes
Function syntax ambiguity
Exceptions to the model for evaluation, as with eval and typeof
arguments is an array-like object, not an array.
But you cannot ignore the dark corners.
Programmers will be working on code written by others or calling into libraries written by others. This code may venture into the darker realms (via eval, for example) or simply into more complicated areas (such as tweaking enumerability or configurability of properties). Code you write might not actually be correct unless you are aware of what exactly is going on.
One pattern that emerges is that JavaScript is not a “complete” language. Systems built on JavaScript end up relying more on features that must be implemented in native code because they cannot be implemented purely in JavaScript.
Debugging
Sandboxing
Binary data
Ironically, while Lua is famous for being tiny and designed for embedding and interfacing easily with native code, it does not need to rely on native code and implementation-specific APIs for these features.
A common excuse for JavaScript's flaws is that they were design trade-offs beneficial for novices, but this theory does not stand up to close examination.
Could it be that == treats the number 1 and the string “1” the same for the sake of novices? Is the theory that novices can get by without knowing the difference between strings and numbers? If so, what do these novices do when 1 + 1 yields 11? Is that a novice-friendly result? There could have been a novice-friendly language design choice made here — complete unification of number and string types — but JavaScript did not take that path.
Why build on C and Java syntax? Because most novices are expert C programmers and can recite the seventeen levels of operator precedence?
The C language treatment of 0 as false made complete sense to all of the seasoned assembly language programmers who were the primary audience for C when it was introduced, but that was solely because of their familiarity with CPU hardware. In a modern language, neither aficionados nor novices expect 0 == false. Nor do they expect [] == false ... especially when [] is not falsy.
In JavaScript, conditionals recognize seven values as “falsy” (versus one or two in Perl, Ruby, Clojure, Lisp, Scheme, and Lua). This cannot make things easier for novices.
All these things are bad for expert programmers and bad for novices. John Resig lays out some ideas along these lines in an article on planning for teaching JavaScript as a First Language.
At the same time, much of what is good for novices is also good for experts: keep things as simple as possible, but no simpler, and keep less-often-used features out of the way.
Many of the problems in JavaScript are things that could be fixed. That is, a new language could be defined, preserving the important features of the language while whittling down ill-advised features. However, after 15 years, it is remarkable how little things have changed. And the ECMAScript committee seems dedicated to never fixing anything, only adding new features.
ES6 in particular is a tragic lost opportunity, because it includes a lot of valuable and well-thought-out features, many of which are not compatible with existing interpreters. Programmers writing in ES6 will have to carefully segregate that code from ES5 code, and use cross-compilers to generate ES5 from ES6 — until one day far in the future when all the browsers support ES6. As a result, there is no need for ES6 to retain, for example, Capricious Semicolon Insertion, intransitive equality, or absurd truthiness rules, or to continue providing no distinction between addition and concatenation. None of these properties are important for interoperability with scripts written in older dialects of JavaScript. The ECMAScript committee apparently just likes it that way.
It has been claimed that one can limit oneself to a subset of JavaScript — the “good parts” — and simply sidestep the bad parts. The problem is that one is still left with a flawed language due to the “ugly parts” ... the simply unavoidable things that threaten each line of code that you write. And then there are the absent parts: the missing features that would have allowed for more readable and performant programs and a better programming experience.
Perhaps we should not expect much from a language that was written in 10 days. Nevertheless, we hear that JavaScript is the language of the future.
That might be true after all ... in a dystopian future. It might be that, in the future, all code will be throw-away code. Programmers will not really understand what they write, but after considerable trial and error they will produce scripts that amuse themselves and demonstrate their sk1llage to other 31337 haxxorz. The more obscure and ugly, the better. Fortunately, they will not need to rely on the code they write, because all the code that really matters — running the food fabricators and other machines that support humanity — will have been written long ago in long-forgotten languages.
Then again, if our future really is so bleak, maybe a more likely candidate for the language of the future is PHP.
MDN: Documentation, as it were, from the source.
JavaScript: The World's Most Misunderstood Programming Language
ECMA-262: I dare you to read this.
Please send corrections, objections, or further insights in email to b -hk@bhk.com.
An expert is a person who has made all the mistakes that can be made in a very narrow field. —Niels Bohr