JavaScript for C# programmers: magic semicolons

Another fenced-off area on the road to being a JavaScript master dev when you know C#.

With this episode we look at JavaScript's charming ability to add semicolons when you forget them.

When you write C# code and forget a semicolon at the end of a statement, Visual Studio gives you a gentle hint with a red squiggly where it thinks the semicolon should be, and the compiler throws a build-stopping (heart-stopping?) error if you miss the delicate hint. Basically, you have got used to adding those semicolons. Well, don't stop doing it.

You see, JavaScript has a pretty terrifying ability to add them for you, sometimes where you want them and, equally, sometimes where you don't. Let's explain.

There are certain statements in JavaScript that must be terminated with a semicolon, just as they are in C#. These include variable statements (essentially statements that declare and initialize variables), expression statements (for example, calls to functions as procedures), do-while statements, and the continue, break, return, and throw statements. They must be terminated, but if you forget, JavaScript will automatically insert a semicolon into the stream of tokens wherever its rules say one belongs.

The parser reads the code as a stream of tokens. If it hits a token that's not allowed by the grammar, it'll insert a semicolon before the "bad" token and try again, provided one of two conditions holds: there is at least one line break before the "bad" token, or the "bad" token is a closing brace. If it reaches the end of the code and the code is incomplete as it stands, it'll add a semicolon at the end and try again. Those are what you might call the benign cases.
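
As a rough sketch of those benign cases, here is a snippet with every semicolon deliberately left out. The names total and addTwo are made up for the illustration; the comments mark where a semicolon should end up being inserted under the rules just described.

```javascript
// Every semicolon below is deliberately missing.

// A semicolon is inserted after 0: the `var` on the next line cannot
// continue this statement, and a line break separates the two.
var total = 0

// Inside the function, a semicolon is inserted before the closing brace
// (no line break needed). Another goes after the brace, because `total`
// on the next line cannot continue the statement.
var addTwo = function (n) { return n + 2 }

// The code ends here, so a final semicolon is added at the end.
total = addTwo(total)
```

Run as written, this leaves total equal to 2, exactly as if the semicolons had been typed.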

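The non-benign cases involve things called "restricted productions": grammar rules that forbid a line break at one particular spot. JavaScript won't allow a line break immediately after the continue, break, return, and throw keywords when something is meant to follow them (a label, a return value, the thing being thrown). If it does find one, it'll insert a semicolon right after the keyword. (There's a similar rule for the post-increment and post-decrement operators, which may not be separated from their operand by a line break, but no one I know is crazy enough to write something like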

counter
++;

so we'll ignore those — just don't do it, OK?)
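
Purely for the curious, here is a small sketch of what that post-increment rule does when a second variable follows. The names counter and other are invented for the example.

```javascript
var counter = 1;
var other = 5;

// The line break before ++ means it cannot be a post-increment of counter.
// A semicolon is inserted after counter, and ++ becomes a pre-increment
// of whatever comes next.
counter
++
other;

// In effect the three lines above are read as:  counter; ++other;
console.log(counter); // 1
console.log(other);   // 6
```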

Let's look at an example:

if (!wotsit.isValid) 
    throw 
    {
        message: "wotsit is not valid"
    };

Looks innocuous enough: if the wotsit object isn't valid, we throw an exception object that has the property and value shown.

Except that's not what it means.

The problem is that the keyword throw cannot be separated from what it's going to throw by a line break. The rule is that the object being thrown must be defined (or you must start to define it) on the same line as the keyword. Since this code doesn't do that, JavaScript will add a semicolon like this:

if (!wotsit.isValid) 
    throw ;
    {
        message: "wotsit is not valid"
    };

Yikes is, I think, the least you can exclaim. The whole intent of the code has been changed. Luckily, as it happens, the automatic addition of the semicolon here also causes a syntax error, because throw must be followed by something to throw. So maybe all is not lost (although the syntax error could be puzzling if you didn't know what was happening).
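
If you want to see that syntax error for yourself without crashing a whole script, one way (a sketch only) is to hand the source text to the Function constructor, which parses it without running it. The helper name parses and the two string variables are made up for this example.

```javascript
// Hypothetical helper: ask the engine to parse a snippet (without running
// it) and report whether it was accepted.
var parses = function (source) {
    try {
        new Function(source); // compiles the text as a function body
        return true;
    } catch (e) {
        if (e instanceof SyntaxError) {
            return false;
        }
        throw e; // anything else is unexpected, so pass it on
    }
};

// The same throw statement, with and without the line break after throw.
var brokenThrow = 'throw\n{\n    message: "wotsit is not valid"\n};';
var fixedThrow = 'throw {\n    message: "wotsit is not valid"\n};';

console.log(parses(brokenThrow)); // false
console.log(parses(fixedThrow));  // true
```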

However, do the same with return inside a function:

var doSomething = function(wotsit) {
    if (!wotsit.isValid) 
        return
        {
            message: "wotsit is not valid"
        };
    // more code
};        

And you'll get no error, even though the automatic insertion of a semicolon actually produces this:

var doSomething = function(wotsit) {
    if (!wotsit.isValid) 
        return; // <-- automatic semicolon
        {
            message: "wotsit is not valid"
        };
    // more code
};        

Good luck with finding that error. Especially since JavaScript doesn't show you a nice little comment like I have.
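
To make the damage visible, here is a minimal runnable sketch of that function. The final return line is my own stand-in for "more code", and the result variable is just there for the demonstration.

```javascript
var doSomething = function (wotsit) {
    if (!wotsit.isValid)
        return
        {
            message: "wotsit is not valid"
        };
    return "wotsit is fine"; // stand-in for "more code"
};

// The caller expects an object with a message property...
var result = doSomething({ isValid: false });

// ...but the function returned at the bare return keyword.
console.log(result); // undefined
```

Any caller that goes on to read result.message will fail, and it will fail somewhere else entirely, far from the line that is actually at fault.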

You should have written it this way, essentially:

var doSomething = function(wotsit) {
    if (!wotsit.isValid) 
        return {
            message: "wotsit is not valid"
        };
    // more code
};        

In other words, provided you have been following along properly, the style choice of putting the opening brace at the end of the previous line, and not on a new line of its own as many C# developers prefer, may one day save your coding life and what remains of your hair in JavaScript.

So, my advice is to continue to put in semicolons where you know they're expected. Do not rely on JavaScript's "convenience" feature of inserting them automatically, and do not count on the interpreter catching any resulting mistakes as syntax errors. (Besides which, if you leave them out, you make the parser do a little extra work, since it has to stop and back up one token.)

And please put opening braces on the previous line.

(For those who are trying to work out what the interpreter understands by the code once it inserts the semicolon after return, here goes. The return returns undefined. The opening brace denotes the beginning of a block, not the start of an object literal. The identifier message is taken to be a label. The string is tokenized fine, but the interpreter determines that another automatic semicolon should be inserted after it, since the next token is a closing brace. The string on its own is therefore an expression statement, whose effect is that the string is created and thrown away. The closing brace denotes the end of the block. The final semicolon, explicit this time, denotes an empty statement, which is allowed in JavaScript just as it is in C#, and essentially does nothing. Voilà. It all disappears in a puff of smoke. No syntax error.)
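
One way to check that reading for yourself is to pass the leftover text to eval, which evaluates it as statements and hands back the value of the last expression statement it ran. This is only a sketch: eval is used here purely as a probe, and the variable names are made up.

```javascript
// The text left over after "return;", evaluated as statements:
// a block containing the label message and a string expression statement,
// followed by an empty statement.
var asStatements = eval('{\n    message: "wotsit is not valid"\n};');

// The same braces wrapped in parentheses, which forces an expression,
// so this time they form an object literal.
var asExpression = eval('({\n    message: "wotsit is not valid"\n})');

console.log(typeof asStatements);  // "string"
console.log(typeof asExpression);  // "object"
console.log(asExpression.message); // "wotsit is not valid"
```

The first result is just the string, with no object in sight, which is what you would expect if the braces are a block and message is a label.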

 
