Posts tagged with 'closure'

JavaScript for C# developers: currying

There’s a concept in functional programming called currying, which gives you the ability to create one function from another and have the created function call the original with some arguments already filled in. To be ultra-rigorous, we assume the original function has n arguments, and currying produces a chain of n functions, each taking one argument, that produces the same result as the original. We, however, will skip that rigorous definition since it’s too constricting for our discussion here. Instead we’ll talk about partial function application, but call it currying anyway. In essence: we want to call function A that takes lots of parameters, but since some or most of them are preset to default values, we’d prefer having another function B that calls A, filling in those presets to save time and effort.
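To make the distinction concrete, here is a minimal sketch of both forms applied to the same function. The describe function and its three parameters are made up purely for illustration; nothing here comes from a library.

```javascript
// Hypothetical three-argument function, used only for illustration.
var describe = function (level, source, text) {
  return "[" + level + "] " + source + ": " + text;
};

// Strict currying: a chain of three functions, each taking one argument.
var curriedDescribe = function (level) {
  return function (source) {
    return function (text) {
      return describe(level, source, text);
    };
  };
};

// Partial application: preset one argument, leave the other two open.
var warnFrom = function (source, text) {
  return describe("WARN", source, text);
};

console.log(curriedDescribe("WARN")("parser")("unexpected token"));
console.log(warnFrom("parser", "unexpected token"));
```

Both calls produce the same string; the difference is only in how the arguments are supplied.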

Now, ordinarily, people explain currying with reference to mystical functions that add two numbers. I’m not happy with those types of examples because all I’m left with is a big feeling of “So What?” when I run through one of those.

Instead let’s illustrate what currying is about by looking at the setTimeout function. This, as you should know, takes a function and an expiry time in milliseconds and, after the time has expired, executes the function. Let’s suppose you’re writing some code and you notice that all your timeouts tend to be 10 milliseconds long (using a small timeout like this is a technique to split up long-running code and allow the execution engine time to “breathe”). If you are like me, you’d write a function that called setTimeout with a default of 10 milliseconds:

var delay10 = function(f) {
  setTimeout(f, 10);
};

Nothing too difficult. After a while, you notice that there are some other calls to setTimeout that use a default of 5 seconds: for example, your code displays some element in the browser and if the user does not respond to it within 5 seconds you remove the element. Your first thought might be to write this next function:

var delay5secs = function(f) {
  setTimeout(f, 5000);
};

But then there’s part of you that recognizes that this is code duplication. Tiny, but still there. You wrote it (just like I just did) using copy-and-paste and altered a single value. Because we understand that functions are objects in JavaScript, we should be able to write a function that takes an expiry time and that returns a function that calls setTimeout with that delay:

var makeDelayFunction = function(time) {
  return function (f) { 
    setTimeout(f, time); 
  };
};

So here the makeDelayFunction takes a time value and then returns a function that takes a function to execute and calls setTimeout with that function and the original time value. Through the wonder of closures we encapsulate the original time argument and then use it in the function that’s returned. Our two delay functions can now get created like this:

var delay10 = makeDelayFunction(10);
var delay5secs = makeDelayFunction(5000);

And, bingo, we’ve removed the code duplication. What we’ve just done is to curry the setTimeout function (again, I’m using a looser definition of curry here). We’ve created a function that returns functions that call setTimeout with a preset time value, and the preset value is stored in a closure. Now, admittedly, with my setTimeout example the benefits aren’t that earth-shattering, but at least it shows the principle.
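As a rough illustration of the “let the engine breathe” technique mentioned earlier, here is a sketch that uses delay10 to process an array in small slices. The processInChunks helper and its parameters are hypothetical names of mine, and makeDelayFunction is repeated so that the snippet runs on its own.

```javascript
var makeDelayFunction = function (time) {
  return function (f) {
    setTimeout(f, time);
  };
};

var delay10 = makeDelayFunction(10);

// Hypothetical helper: handle `items` in slices of `chunkSize`,
// yielding to the engine for 10 ms between slices.
var processInChunks = function (items, chunkSize, handleItem, done) {
  var index = 0;
  var runChunk = function () {
    var end = Math.min(index + chunkSize, items.length);
    for (; index < end; index++) {
      handleItem(items[index]);
    }
    if (index < items.length) {
      delay10(runChunk); // schedule the next slice
    } else if (done) {
      done();
    }
  };
  runChunk(); // the first slice runs immediately
};

var seen = [];
processInChunks([1, 2, 3, 4, 5], 2, function (n) {
  seen.push(n * n);
}, function () {
  console.log(seen);
});
```

The first slice is handled synchronously; the remaining slices arrive roughly 10 milliseconds apart.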

Let’s take it a little further now. The addEventListener function takes three parameters: the event type (a string with various predefined values), the listener (a function that will be called when the event happens), and the capture flag (which for our purposes can be false). The function is called on some element of the DOM. Let’s suppose we want to add mouse click listeners for a whole bunch of elements. Of the four parameters to the function (counting the object we’re calling it on as a parameter) we have two fixed parameters and two “floating”. Our first thought, given what we’ve learned above, is this:

var makeListener = function (type) {
  return function (element, func) {
    element.addEventListener(type, func, false);
  };
};

var addClick = makeListener("click");
addClick(myDiv, function () { alert("myDiv was clicked!"); });

Looks pretty good, but note the code duplication. It’s a little harder to spot perhaps, but it’s there. (In essence, we’re returning a function that calls another. The function being called has parameters that come from the closure, and others from the call itself.) Can we extract out this commonality?

It turns out that yes we can. The idea was first shown by Oliver Steele, and it needs us to use the arguments array and a little trick. The little trick goes like this: we pass in a complete set of arguments we need for the wrapped function to the maker function. For those arguments that are fixed, we supply the actual values. For those arguments that are floating, we pass undefined. The function we create then slots in the arguments it’s given into the undefined spaces that were passed to the maker function. Perhaps looking at the call to the maker function for our first example might make this clearer:

var delay10 = makePartialFunction(setTimeout, undefined, 10);
var delay5secs = makePartialFunction(setTimeout, undefined, 5000);

The new maker function, makePartialFunction, takes the function to be wrapped as the first parameter, and then the remaining arguments are the arguments to that function when it’s to be called, with any unknowns passed as undefined.

What might that maker function look like?

var makePartialFunction = function () {
  // the first argument is the function we're wrapping
  var func = arguments[0],
  // copy the arguments pseudo-array into a real array;
  // index 0 is func, so the presets start at index 1
  defaultArgs = Array.prototype.slice.call(arguments);

  return function () {
    var
      actualArgs = [], // the arguments we'll call with
      arg = 0,         // index of the next call-time argument to use
      i;
    // create the arguments for the call; walk every preset so that
    // fixed values after the last floating slot are still passed on
    for (i = 1; i < defaultArgs.length; i++) {
      if (defaultArgs[i] !== undefined) {
        actualArgs.push(defaultArgs[i]);
      }
      else {
        actualArgs.push(arguments[arg]);
        arg += 1;
      }
    }
    // call the wrapped function
    return func.apply(this, actualArgs);
  };
};

First of all we make a copy of the data passed to the maker function: the function to be wrapped and the arguments passed. Then we return a function. This function, when called, constructs an arguments array for the wrapped function, replacing the undefined parts with arguments to its own call. It then calls the wrapped function using the apply invocation.
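The slotting is easier to observe with a pure function than with a timer. The sketch below restates the maker function in a compact form that walks every preset slot, and applies it to clamp, a hypothetical helper I made up for the purpose.

```javascript
// Compact variant of the maker function; it visits every preset slot.
var makePartialFunction = function () {
  var func = arguments[0],
      presets = Array.prototype.slice.call(arguments, 1);

  return function () {
    var actualArgs = [],
        next = 0, // index of the next call-time argument
        i;
    for (i = 0; i < presets.length; i++) {
      if (presets[i] !== undefined) {
        actualArgs.push(presets[i]);      // fixed value
      } else {
        actualArgs.push(arguments[next]); // floating slot
        next += 1;
      }
    }
    return func.apply(this, actualArgs);
  };
};

// Hypothetical pure function: value floats, min and max are preset.
var clamp = function (value, min, max) {
  return Math.min(Math.max(value, min), max);
};

var clampPercent = makePartialFunction(clamp, undefined, 0, 100);

console.log(clampPercent(150)); // 100
console.log(clampPercent(-5));  // 0
console.log(clampPercent(42));  // 42
```

Note that a preset of 0 works because the test is against undefined rather than against any falsy value.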

Sounds groovy, except it won’t work for my second example. Next time we’ll fix that problem.

Now playing:
Chemical Brothers - Setting Sun [Instrumental]
(from The Saint)


JavaScript for C# developers: the Module pattern (part 4) – debugging

(For background, please check out parts 1, 2, and 3 before reading this post.)

One of the problems with private local variables inside closures is that you can’t see them. Well, duh, you might say, that’s what private means after all. Point taken, but there’s one scenario where it would be really nice to be able to check those enclosed (enclosured?) variables: debugging.

Now, I’m not going to get into the whole argument here about debugging versus testing. I’m of the firm belief that you should be writing unit tests as you go along (and I’m only not doing it here because the code I write for these posts is mainly pedagogical in nature). Once you have properly built a supporting unit test structure around your code, it shouldn’t matter what your local variables contain: the unit tests will test your external interfaces and if they work as expected then you’re gold as far as the behavior of your code is concerned. Anyway…

Back to our stopwatch example – the initial simple version. Now, I will admit that, as written, it’s ruddy hard to unit test this little object. That’s because I’m embedding, essentially, a random process in the middle: the current time. To properly test this I should extract out the code snippet that gets the current time (the weird-looking +new Date() statement) and build it in as an external interface that I can then mock. But let’s say I want to debug this code after I’ve called start() and before I call stop(). As written, there’s no way I can inspect the local variable startTime if I’m in a debugger. The stopwatch object is opaque.
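For what it’s worth, here is a minimal sketch of what extracting the clock might look like. The createTestableStopwatch name and its now parameter are my own assumptions for illustration, not code from the earlier posts.

```javascript
// Hypothetical factory: the clock is passed in so that a test can control it.
var createTestableStopwatch = function (now) {
  var startTime = -1;

  // fall back to the real clock when no replacement is supplied
  now = now || function () {
    return +new Date();
  };

  return {
    start: function () {
      startTime = now();
    },
    stop: function () {
      if (startTime === -1) {
        return 0;
      }
      var elapsedTime = now() - startTime;
      startTime = -1;
      return elapsedTime;
    }
  };
};

// A fake clock that only moves when the test says so.
var fakeTime = 1000;
var testWatch = createTestableStopwatch(function () {
  return fakeTime;
});

testWatch.start();
fakeTime += 250;
var measured = testWatch.stop();
console.log(measured); // 250
```

With the clock mocked, the elapsed time is fully predictable, which is exactly what a unit test needs.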

Enter the debugger keyword. This is a reserved word in JavaScript and is designed to trigger a program breakpoint when executed. If there is no JavaScript debugger present, it’s just ignored, but if you are running under Firebug or whatever developer tools your browser provides (I use Firefox myself) the debugger will gain control and hand control over to you.

So what, you may say. Whether you have a manual breakpoint or use this fancy debugger statement, you still have the same problem: you can’t see inside the opaque stopwatch object. True. . . unless you execute the debugger statement inside the context of the closure itself.

Let’s add an inspect() method to the stopwatch object:

var stopwatch = (function () {
  var startTime = -1,

  now = function () {
    return +new Date();
  },

  start = function () {
    startTime = now();
  },

  stop = function () {
    if (startTime === -1) {
      return 0;
    }
    var elapsedTime = now() - startTime;
    startTime = -1;
    return elapsedTime;
  },

  inspect = function () {
    debugger;
  };
  
  return {
    start: start,
    stop: stop,
    inspect: inspect
  };
}());

Notice that this new function is inside the closure. If we call stopwatch.inspect() from the Firebug console, for example, the breakpoint will be triggered, and will be triggered inside the context of the closure. We will be able to see the value of startTime (and modify it if required).

In Firebug, you get something that looks like this:

[Screenshot: debugging a closure with the debugger statement in Firebug]

Now that program execution has stopped inside the closure, you can inspect the value of startTime to your heart’s content.

[Screenshot: inspecting a local variable in a closure]

Now, I’ll be the first to recognize that this technique is a little invasive – after all, you have to write this special function and expose it publicly – but it does nicely solve the “I can’t inspect the values of the locals inside my closure/module” problem.

Now playing:
Stewart, Rod - These Foolish Things
(from It Had to Be You... The Great American Songbook)


JavaScript for C# developers: the Module Pattern (part 3)

Now that we’ve seen the simple module pattern as well as ways to augment it, we should take a look at one final piece of the puzzle.

Private fields, as we saw in the last installment, can be a real issue. Sometimes, dammit, we’d just like to refer to that private local variable when we’re augmenting a module object. Just a peek you understand, and we’d make it private again immediately we’re done augmenting, so that the code using the module object doesn’t see this private variable. Unfortunately, given the way we’ve implemented privacy through closures this remains a bit of a problem.

Let’s see what we could do given a blank slate. Let’s create an object called secretData and make its properties the internal data we want to save and to share amongst our augmentation code. Obviously, we’d have to make this a normal property so that other code could use it with the augmentation pattern. Here’s the changed stopwatch:

var stopwatch = (function () {
  var $sd = {},

  start = function () {
    $sd.startTime = $sd.now();
  },

  stop = function () {
    var elapsedTime = $sd.peekElapsedTime();
    $sd.startTime = -1;
    return elapsedTime;
  };

  $sd.startTime = -1;
  $sd.now = function () {
    return +new Date();
  };
  $sd.peekElapsedTime = function () {
    if ($sd.startTime === -1) {
      return 0;
    }
    return $sd.now() - $sd.startTime;
  };

  return {
    start: start,
    stop: stop,
    secretData: $sd
  };
}());

Nothing to it so far: for the first change I declare a local variable called $sd and then add startTime and now() to that object. I also refactored the code a little bit so that there’s a new function that can just calculate the elapsed time. Since we’re going to share this behavior, we might as well isolate the complicated calculation into its own function. The returned object now has a new property called secretData (which is anything but, at the moment) which is a reference to this local variable. The local variable is of course hidden by the closure.

On to the tight augmentation code:

// Augmented stopwatch
var stopwatch = (function (sw) {
  var $sd = sw.secretData;

  $sd.laptimes = [];
  $sd.oldStop = sw.stop;

  sw.stop = function () {
    sw.lap();
    // pass the elapsed time back, as the original stop() did
    return $sd.oldStop();
  };

  sw.lap = function () {
    $sd.laptimes.push($sd.peekElapsedTime());
  };

  sw.reportLaps = function () {
    var laps = $sd.laptimes;
    $sd.laptimes = [];
    return laps;
  };

  return sw;
}(stopwatch));

I first make a copy of the secretData property and then use that copy throughout for any local variables I need in the closure. Other than that, the code for lap() is much simpler (since I’ve made use of the method that returns the elapsed time), as is the overridden stop() method. The returned object still has the very public secretData property of course.

Now the fun bit. After the module object has been finally augmented, we call:

delete stopwatch.secretData;

Whoa. What this does is to delete the secretData property completely. But, notice something else: both closures created by the module pattern have made copies of the secret object already (they both called this copy $sd). The code in the closures still functions just as before since it makes no reference to the secretData property (except when executing the anonymous function to create the closure in the first place). Admittedly with some shenanigans, we’ve created some shared secret data across several closures.
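The reason this works is that delete removes the property from the object, not the object the property referred to. A tiny sketch, using a made-up counterModule rather than the stopwatch, shows the effect in isolation:

```javascript
// Hypothetical module that temporarily exposes its internal data.
var counterModule = (function () {
  var $sd = { count: 0 };

  return {
    increment: function () {
      $sd.count += 1;
      return $sd.count;
    },
    secretData: $sd
  };
}());

// This is what an augmenting closure would capture.
var borrowed = counterModule.secretData;

// Remove the public door to the data.
delete counterModule.secretData;

console.log(counterModule.secretData); // undefined
counterModule.increment();
console.log(borrowed.count);           // 1
```

The module and the borrowed reference still share the same object; only the public property has gone.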

If you don’t like the call to delete there, just create a method in the original unaugmented code (hideSecretData) that “cleans up” the public references to the secret data and call that instead. This makes it a little more readable:

var stopwatch = (function () {
  var $sd = {},

  start = function () {
    $sd.startTime = $sd.now();
  },

  stop = function () {
    var elapsedTime = $sd.peekElapsedTime();
    $sd.startTime = -1;
    return elapsedTime;
  },
  
  hideSecretData = function () {
    delete this.secretData;
    delete this.hideSecretData;
  };

  $sd.startTime = -1;
  $sd.now = function () {
    return +new Date();
  };
  $sd.peekElapsedTime = function () {
    if ($sd.startTime === -1) {
      return 0;
    }
    return $sd.now() - $sd.startTime;
  };

  return {
    start: start,
    stop: stop,
    secretData: $sd,
    hideSecretData: hideSecretData
  };
}());

Then, to make the intent of the code clearer, all you need to call is this:

stopwatch.hideSecretData();

After this, not only will the secret data have been hidden, but also the method that hid it will have vanished.

(Part 1, Part 2 of this series.)

Now playing:
Massive Attack - Teardrop
(from Mezzanine)


JavaScript for C# developers: the Module Pattern (part 2)

Last time I talked about the simple module pattern. This is where you create a function that returns an object with behavior and state and that behavior and state is implemented (and made private) by using a closure. We showed this by using the module pattern to create a stopwatch object.

Let’s now see how we can extend this stopwatch object by adding the facility to have lap times. This gives us the ability to use the same stopwatch to time a sequence of time-consuming actions, rather than creating a new stopwatch every time. The only caveat is that we can’t modify any of the code we’ve already written.

We’ll assume then that we have a new stopwatch object created from the previous code. We need to add a lap() method to record the time since the start or since the last lap. We also need a reportLaps() method that will return an array of lap times. One way to do this would be to create a brand new wrapper object (call it, say, a lapwatch) that uses the already created stopwatch internally as a delegate. In other words, the new lapwatch object would delegate all timings to the internal stopwatch but keep its own lap times. Perfectly doable but it’s hardly extending the original stopwatch object.
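For comparison, a wrapper of that kind might look something like the sketch below. The makeLapwatch and makeStopwatch names are hypothetical, and the stand-in stopwatch takes an injectable clock (my own addition) purely so that the numbers are predictable.

```javascript
// Stand-in for the stopwatch from part 1, with an injectable clock.
var makeStopwatch = function (now) {
  var startTime = -1;
  return {
    start: function () {
      startTime = now();
    },
    stop: function () {
      if (startTime === -1) {
        return 0;
      }
      var elapsedTime = now() - startTime;
      startTime = -1;
      return elapsedTime;
    }
  };
};

// Hypothetical wrapper: delegates all timing, keeps its own lap times.
var makeLapwatch = function (sw) {
  var laptimes = [];
  return {
    start: function () {
      sw.start();
    },
    stop: function () {
      return sw.stop();
    },
    lap: function () {
      laptimes.push(sw.stop());
      sw.start();
    },
    reportLaps: function () {
      var laps = laptimes;
      laptimes = [];
      return laps;
    }
  };
};

var fakeTime = 0;
var lapwatch = makeLapwatch(makeStopwatch(function () {
  return fakeTime;
}));

lapwatch.start();
fakeTime += 100;
lapwatch.lap();
fakeTime += 200;
lapwatch.lap();
var laps = lapwatch.reportLaps();
console.log(laps); // [ 100, 200 ]
```

Notice that the wrapper has to re-expose start() and stop() by hand, which is part of why it doesn’t feel like an extension of the original object.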

What we’ll do is to apply the module pattern again, but this time not to create a new object. Instead we shall augment the existing object. So the first change is to pass in the existing object to our anonymous function.

(function (sw) {

  // some code that operates on sw

}(stopwatch));

We have another auto-executing anonymous function, but this time it takes a single parameter: the stopwatch object. Inside the function, the parameter is known as sw for convenience’ sake. We assume that the as-yet-unwritten code will be modifying sw (and hence the external stopwatch). You could also return sw and re-assign it to stopwatch if you wanted, much like we did previously.

Now we can write the code that provides and reports the lap times.

(function (sw) {
  var laptimes = [];

  sw.lap = function () {
    laptimes.push(sw.stop());
    sw.start();
  };

  sw.reportLaps = function () {
    var laps = laptimes;
    laptimes = [];
    return laps;
  };
}(stopwatch));

As you can see, we have a local array to store the lap times and we add the two new methods to the existing object. The function forms another closure and provides a new private variable. The lap() method is a bit of a hack: since I stipulated that we couldn’t change the original object and since we have no access to the startTime variable, I had to stop the stopwatch to get the elapsed time and then immediately start it again. A better solution perhaps would have been to add a Peek() method to the original code, just so we can see the current elapsed time without stopping the stopwatch.

And here’s some dummy code that exercises this augmented stopwatch:

var i;
stopwatch.start();
for (i = 0; i < 100000; i++);
stopwatch.lap();
for (i = 0; i < 200000; i++);
stopwatch.lap();
for (i = 0; i < 300000; i++);
stopwatch.lap();
console.log(stopwatch.reportLaps());

This pattern is typically known as the Tightly Augmented Module Pattern. We pass in the current object and the function forms another closure over it to modify it. For this code to work we *must* declare the original code first and then this code. If they are in different source code files (the usual case), we must declare the stopwatch code file first, and then this augmented stopwatch code second. That way, the JavaScript interpreter will execute the code in the proper order; the augmentation code won’t get run on an undefined variable.

A slight modification that we can’t really show with our stopwatch example is the Loosely Augmented Module Pattern. With this pattern we’re usually building some kind of utility object or a namespace that contains a whole bunch of other objects that do some work. These other objects don’t require or interact with each other. For example:

var jmbNamespace = (function ($j) {
  $j.date = { ... };
  return $j;
}(jmbNamespace || {}));

---

var jmbNamespace = (function ($j) {
  $j.regex = { ... };
  return $j;
}(jmbNamespace || {}));

---

var jmbNamespace = (function ($j) {
  $j.url = { ... };
  return $j;
}(jmbNamespace || {}));

Each of these three code segments could be run before any of the others, and each could be omitted if not needed. If they were all in different source files, those source files could be loaded in any order, and only those needed would have to be loaded. The magic is in the expression jmbNamespace || {}. What this says is “evaluate to jmbNamespace if it already holds an object, otherwise evaluate to a new empty object”. (Redeclaring the variable with var in each segment is harmless: the declaration is hoisted, so the first segment to run simply sees undefined.)
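Here is a small runnable sketch of the same idea. The appUtils namespace and its two members are invented for the illustration; the point is only that the two segments can be swapped without breaking anything.

```javascript
// Segment 1 (imagine it lives in its own file).
var appUtils = (function ($u) {
  $u.text = {
    shout: function (s) {
      return s.toUpperCase() + "!";
    }
  };
  return $u; // hand the namespace back so the assignment keeps it
}(appUtils || {}));

// Segment 2 (another file; could just as well be loaded first).
var appUtils = (function ($u) {
  $u.math = {
    double: function (n) {
      return n * 2;
    }
  };
  return $u;
}(appUtils || {}));

console.log(appUtils.text.shout("hi")); // HI!
console.log(appUtils.math.double(21));  // 42
```

Whichever segment runs first creates the empty object; the other one merely adds to it.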

Going back to our stopwatch example, notice that there’s something buggy about it. If I call stop() on the augmented stopwatch at the end of the last “lap”, the time for it won’t be recorded in the array of lap times. We should allow for this possibility. The way to do this is through an override: we have to override the behavior of stop() if we are using the stopwatch to time laps. Here’s how to do that (and note I’ve changed the definition of the anonymous function to return the augmented stopwatch object to show that this is an equally valid application of the Tightly Augmented Module Pattern):

var stopwatch = (function (sw) {
  var laptimes = [];
  var oldStop = sw.stop;

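  sw.stop = function () {
    // record the final lap, and still return the elapsed time
    var elapsedTime = oldStop();
    laptimes.push(elapsedTime);
    return elapsedTime;
  };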

  sw.lap = function () {
    sw.stop();
    sw.start();
  };

  sw.reportLaps = function () {
    var laps = laptimes;
    laptimes = [];
    return laps;
  };

  return sw;
}(stopwatch));

The first new thing that happens is that we copy the function object referenced by the stopwatch’s stop() method and save it in the local variable oldStop. (The joys of first-class functions or “functions are objects”!) We then replace the stop() method with a new one that pushes the final lap time onto the array and we call the old stop function to get that final lap time. The lap() method also has to change: we can now just call the stopwatch object’s stop() method to do our work since it’s now rerouted through our override. All these shenanigans are of course hidden from view inside the closure.

Next time we’ll close off this mini-series on the module pattern with some final features.

(Part 1, Part 3 of this series.)

Now playing:
Art of Noise - Peter Gunn [The Twang Mix]
(from The Best of the Art of Noise)


JavaScript for C# developers: the Module Pattern (part 1)

If you recall, JavaScript closures are an extremely powerful concept in the language. By using JavaScript’s rather peculiar scoping rules, closures are a way of creating private variables and functionality for an object. The module pattern builds upon this feature.

A quick recap on scope in JavaScript may be in order. Scope is the means by which programmers limit the visibility and the lifetime of their variables and parameters. Without scope, all variables would be global and visible everywhere. In C#, scope is introduced by braces: you can declare a new variable inside some code in braces (a block) and that variable would not be visible outside the block, and in fact would be destroyed once execution reaches the closing brace. These braces can be the braces surrounding the code in a method, or they can be the braces delineating a block for an if or a for statement, and so on. The essence of scope in C# is: declare a variable inside a block and it won't be visible outside the braces enclosing the block. (For full details about scope in C# you should read section 3.7 of The C# Programming Language (Fourth Edition) by Hejlsberg et al. There’s roughly eight pages of discussion on scope.)

In JavaScript, there is one basic rule: functions define scope. If you declare a variable in a function, it is visible anywhere inside that function, including inside any nested functions, even before it was declared. It is not visible outside the function. Since nested functions are also variables, they will also be visible anywhere in the enclosing function, even before they are declared. In essence, JavaScript hoists all variables to the top of the enclosing function so that they are declared before they are set with a value. This is like the lexical structure of Pascal with its var blocks, except that this hoisting occurs automatically in JavaScript.
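A small sketch of hoisting in action (hoistingDemo and the names inside it are invented for this illustration). Note one subtlety: a var that holds a function expression is hoisted as a name only, whereas a function declaration is hoisted together with its body.

```javascript
var hoistingDemo = function () {
  var results = [];

  // All three names already exist here, before their declarations.
  results.push(typeof counter);   // "undefined": declared, but no value yet
  results.push(typeof declared);  // "function": declarations hoist whole
  results.push(typeof expressed); // "undefined": only the var is hoisted

  var counter = 42;
  function declared() {
    return counter;
  }
  var expressed = function () {
    return counter;
  };

  results.push(typeof counter);   // "number": the assignment has now run
  results.push(typeof expressed); // "function"
  return results;
};

console.log(hoistingDemo());
```

None of the early typeof checks throws, which is the visible evidence that the names were hoisted to the top of the function.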

Note that by nested functions, I mean functions nested in a lexical sense not in an execution sense. In other words, if function A calls function B, it doesn’t mean that B can suddenly ‘see’ the variables declared in A. For that to happen, B must be coded within the source code for A. It’s the lexical nesting that provides scope.
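To see the difference between lexical nesting and merely being called, consider this sketch (all names are made up for the purpose):

```javascript
// Defined outside caller(), so it cannot see caller's locals,
// even though caller() is the one that invokes it.
var calledOnly = function () {
  return typeof secret;
};

var caller = function () {
  var secret = "hidden";

  // Defined inside caller(), so secret is in scope.
  var nested = function () {
    return typeof secret;
  };

  return [calledOnly(), nested()];
};

console.log(caller()); // [ 'undefined', 'string' ]
```

Both functions are executed from within caller(), but only the one written inside it can see secret.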

Let’s see this in code:

var outerFunc = function (paramOuter) {
  // Visible: paramOuter, varOuter
  //          nestedFunc1, nestedFunc2
  var varOuter = 42;

  var nestedFunc1 = function (paramNested1) {
    // Visible: paramOuter, varOuter
    //          nestedFunc1, nestedFunc2
    //          paramNested1, varNested1,
    //          nestedFunc1inner
    var varNested1 = false;

    var nestedFunc1inner = function () {
      // Visible: paramOuter, varOuter
      //          nestedFunc1, nestedFunc2
      //          paramNested1, varNested1,
      //          nestedFunc1inner, inner1
      var inner1 = "hello";

    };
  };

  var nestedFunc2 = function (paramNested2) {
    // Visible: paramOuter, varOuter
    //          nestedFunc1, nestedFunc2
    //          paramNested2, varNested2
    var varNested2 = false;

  };
};

It’s not exactly an example of some edifying code, but it gets the point across. If you look at the nestedFunc1inner function, you can see that all of the variables from the outer function are in scope, as are all the variables from the immediate parent function.

The nifty thing about these scoping rules is how well they play along with closures. A closure encapsulates a function’s environment so that it can continue to exist as an object even after the function terminates. In C#, you see this most of all with lambda expressions and anonymous delegates, but with JavaScript you see it all over the place. jQuery is, in essence, one gigantic closure.

Let’s suppose we have to write a stopwatch object, which we’re going to use to time sections of our JavaScript code. The object should have a start() method that starts the stopwatch ticking and a stop() method that stops the watch and returns the number of milliseconds elapsed. Without closures we’d probably write something like this:

var stopwatch = {
  startTime: -1,

  now: function () {
    return +new Date();
  },

  start: function () {
    this.startTime = this.now();
  },

  stop: function () {
    if (this.startTime === -1) {
      return 0;
    }
    var elapsedTime = this.now() - this.startTime;
    this.startTime = -1;
    return elapsedTime;
  }
};

Notice that I have a helper method called now() (that construction of +new Date() looks too weird in actual code – what does it do again? It coerces the new Date to a number, the milliseconds since the epoch) as well as an internal field that records the start time of the stopwatch. Unfortunately, although this object nicely encapsulates these two members, they are public and fully visible. We have not hidden them. A user of the stopwatch object could access them, change the variable, replace them, whatever. Also, I hate to say it, but all those required references to this don’t half obfuscate the code and make it long-winded.
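To show how exposed those members are, here is a sketch in which outside code replaces the now() helper mid-measurement. The tampering itself is contrived, but nothing in the object prevents it.

```javascript
var stopwatch = {
  startTime: -1,

  now: function () {
    return +new Date();
  },

  start: function () {
    this.startTime = this.now();
  },

  stop: function () {
    if (this.startTime === -1) {
      return 0;
    }
    var elapsedTime = this.now() - this.startTime;
    this.startTime = -1;
    return elapsedTime;
  }
};

stopwatch.start();

// Outside code swaps the helper for one that pretends
// exactly five seconds have passed since the start.
stopwatch.now = function () {
  return stopwatch.startTime + 5000;
};

var fakedElapsed = stopwatch.stop();
console.log(fakedElapsed); // 5000
```

The stopwatch dutifully reports five seconds, whatever the real elapsed time was.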

Enter the module pattern. With the module pattern, we create an anonymous function and execute it automatically. The function will return an object that will be our stopwatch object. Let’s look at this step by step. First we create the outer anonymous function and auto-execute it. For now we’ll have the function return an empty object.

var stopwatch = (function () {
  // other code to be added here
  return {
    // define some fields
  };
}());

Nothing too drastic so far, I’m sure you’ll agree. Now the fun bit starts. Let’s add back the original stopwatch code, altering it so that it makes sense as local variables and nested functions.

var stopwatch = (function () {
  var startTime = -1,

  now = function () {
    return +new Date();
  },

  start = function () {
    startTime = now();
  },

  stop = function () {
    if (startTime === -1) {
      return 0;
    }
    var elapsedTime = now() - startTime;
    startTime = -1;
    return elapsedTime;
  };

  return {
    // define some fields
  };
}());

As you can see all of those annoying this references have gone since there is no enclosing object any more. Instead we can rely on function scope to resolve references to variables. For example, in the start() function we can reference the outer anonymous function’s startTime local variable. For that matter, the same applies to the stop() function. The final step is to make sure that the returned object has the required two methods, start() and stop().

var stopwatch = (function () {
  var startTime = -1,

  now = function () {
    return +new Date();
  },

  start = function () {
    startTime = now();
  },

  stop = function () {
    if (startTime === -1) {
      return 0;
    }
    var elapsedTime = now() - startTime;
    startTime = -1;
    return elapsedTime;
  };

  return {
    start: start,
    stop: stop
  };
}());

The code that creates the new returned object looks a little weird until you read it as “this new object has a stop property whose value is the internal stop() function object, etc.”

The returned object makes use of the closure formed by the anonymous function. The object’s two properties (methods) will call the nested functions inside the closure and those, in turn, will make use of the outer function’s local variable startTime. We’ve essentially created some private members: a local variable and a now() function. These two members are inaccessible outside of the closure.

It must be admitted that the code as written assumes that our stopwatch object is going to be a singleton; after all, we’re auto-executing an anonymous function and it’s hard to do that twice in a row. If you have need of several stopwatches, then the best thing is to assign that anonymous function to a variable called createStopwatch, and then you can call it ad nauseam to create as many stopwatches as you’d like.

var createStopwatch = function () {
  // same code as before
  return {
    start: start,
    stop: stop
  };
};

var stopwatch1 = createStopwatch();
var stopwatch2 = createStopwatch();

Next time, we’ll take a look at how to augment an object created with the module pattern.

(Part 2, Part 3 of this series.)

Now playing:
Boo, Betty - Doin' the do (7" radio mix)
(from Boomania)


JavaScript for C# programmers: getting caught out with closures

Another stop on the road to becoming a JavaScript developer when you know C#. Fire up Firebug in Firefox and follow along.

In this episode we look at some problems we might encounter when using closures.

Recall that, just like anonymous methods in C#, a closure is a binding between a function and the 'environment' in which it's declared. I've been using them a lot in this series, but here's a simple 'counter' example:

var makeCounter = function(start) {
    return {
        next: function() { start++; },
        value: function() { return start; }
    };
};

var counter = makeCounter(42);
counter.next();
console.log(counter.value()); // outputs 43

Nothing too difficult: we've seen many examples just like this before. The makeCounter function takes a single parameter, start, and then returns an object with two methods, next and value. next advances the internal value of the counter and value merely returns its current value. The closure happens because the two functions are referencing the start parameter (that is, a local variable) of the outer function, even after the outer function has terminated. They have both captured start: this is the closure.

You can see this working in the test code: we create a new counter with start value 42, increment it, and then display the current value.

Let's change it so that we return an array of counters:

var makeCounters = function(start, count) {
    var counters = [];
    for (var i = 0; i < count; i++) {
        counters[i] = {
            next: function() { start++; },
            value: function() { return start; }
        };
    }
    return counters;
};

var counters = makeCounters(42, 2);
counters[0].next();
console.log(counters[0].value()); // outputs 43

Not too much has changed, apart from rearranging the code to create the array of counters. The counter objects still have the same form as before (the two methods); all we're doing is defining a new array and then creating as many counter objects in that array as were requested.

Underneath that function, you can see from the test code that it works as before.

Or does it? Add the following code after the existing test code, to test counters[1]:

counters[1].next();
console.log(counters[1].value()); // outputs 44 ???

Something is wrong: the two counter objects in the array are supposed to be independent, and yet they don't seem to be. They seem to be sharing the same captured value.

That is exactly the problem: the closures are not capturing the "current" value of start, they are capturing the actual variable. If one of them makes a change to that captured variable, then the other closures will see the changed value. (In fact, if you look at the original makeCounter, you'll see that we're implicitly assuming this is how it works: both next and value are acting on the same captured variable.)
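To see the "variable, not value" rule in isolation, here is a minimal sketch. The names demo and readN are made up for this illustration. The closure is created while n is 1, but it reports whatever n holds when it is finally called:

```javascript
// Hypothetical demo: a closure sees the variable itself, not a snapshot of its value.
var demo = function() {
    var n = 1;
    var readN = function() { return n; }; // captures the variable n
    n = 2;                                // changed after readN was created
    return readN();
};

console.log(demo()); // outputs 2
```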

Now, with the start variable, it's pretty obvious. Let's make it slightly harder to spot the problem by adding an id method to the returned counter objects:

var makeCounters = function(start, count) {
    var counters = [];
    for (var i = 0; i < count; i++) {
        counters[i] = {
            id: function() { return i;},
            next: function() { start++; },
            value: function() { return start; }
        };
    }
    return counters;
};

The id of a counter object is just its position in the array. At least that's what we want it to be. Can you determine by inspection what the following lines will produce?

var counters = makeCounters(42, 2);
console.log(counters[0].id()); 
console.log(counters[1].id()); 

From the discussion we've just had, the answer is obviously not 0, 1. You're doing well if you recognize that they'll both output the same value, and very well if you work out that the value is 2. (Hint: the loop stops when i reaches 2.)
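If you'd like to check that without the counter code getting in the way, here is a cut-down sketch. makeIdGetters is a hypothetical helper, not part of the counter example. Every function in the array closes over the same i, and by the time any of them runs the loop has finished:

```javascript
// Hypothetical cut-down version of the id problem.
var makeIdGetters = function(count) {
    var getters = [];
    for (var i = 0; i < count; i++) {
        // Each function captures the single variable i, not its current value.
        getters[i] = function() { return i; };
    }
    return getters;
};

var getters = makeIdGetters(3);
console.log(getters[0](), getters[1](), getters[2]()); // outputs 3 3 3
```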

So what to do? We have to isolate the two local variables so that we can capture them separately for each counter object we create. The easiest way to do that is to use another anonymous function and pass the two values in as parameters.

var makeCounters = function(start, count) {
    var counters = [];
    for (var i = 0; i < count; i++) {
        counters[i] = function (start, id) {
            return {
                id: function() { return id;},
                next: function() { start++; },
                value: function() { return start; }
            };
        }(start, i);
    }
    return counters;
};

This is starting to get a little complicated, but bear with me. We're setting the element in the array, not to a function, but to the result of a function we're immediately going to execute. Here's the code with the noisy bits taken out so that it's easier to see:

counters[i] = function (start, id) {
    // some code
}(start, i);

In other words, we have an anonymous function that takes two parameters called start and id. We immediately call it passing the current value of start for the start parameter, and the current value of i for the id parameter.

Inside this anonymous function, we merely return a new object. The id method returns the value of id passed in, and the other two methods do their stuff on the passed in value of start. Notice that the scoping rules for these methods say that they get their values from the immediate outer function, not from the outer outer function. Their closure is over the nested inner function. They don't "need" any local variables from the outer function and so don't form a closure over it.
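Here is a quick check that the counters are now independent. It repeats the fixed function so that it can be run on its own; the only liberty I've taken is to wrap the immediately executed function in an extra pair of parentheses, which is optional in this position but makes the "call it right now" intent easier to spot.

```javascript
var makeCounters = function(start, count) {
    var counters = [];
    for (var i = 0; i < count; i++) {
        // The inner function gets its own start and id on every call.
        counters[i] = (function(start, id) {
            return {
                id: function() { return id; },
                next: function() { start++; },
                value: function() { return start; }
            };
        }(start, i));
    }
    return counters;
};

var counters = makeCounters(42, 2);
counters[0].next();
console.log(counters[0].value()); // outputs 43
console.log(counters[1].value()); // outputs 42
console.log(counters[0].id());    // outputs 0
console.log(counters[1].id());    // outputs 1
```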

The lesson to take away from this is that closures are over local variables, not the current value of those variables. Sometimes it's hard to see that in the thicket of braces and function keywords.

Having assimilated all that, you'll be in a great position (even knowing nothing about jQuery) to say well, duh! to this post.

Now playing:
Sade - Why Can't We Live Together
(from Diamond Life)


JavaScript for C# programmers: refactoring the expression evaluator

Another in the series in learning JavaScript from the viewpoint of a C# programmer, using Firebug as our test engine.

In this episode, we take the functioning expression evaluator from the last post and clean it up JavaScript style.

Wrap in an object

The first step is to wrap this global-properties-all-over-the-place code tidily in a single global object. This happens to be quite easy, and I used an auto-executing function to create a single object called expressionEvaluator that has a single method called exec. At the top was this:

var expressionEvaluator = function() {

And then at the bottom of the original code, I replaced the old evaluateExpression method with this code:

    return {
        exec: function(expression) {
            var state = makeParserState(expression);
            var result = parseExpression(state);
            if (result.failed) { return NaN; }
            return result.expression.evaluate();
        }
    };
}();

As you can see, the function is automatically executed (the execution () operator on the last line), and it returns an object (using the object literal syntax) that has one property, a method called exec. Everything else inside the function is private, hidden from view by the closure created by the auto-executing function; it is a black box, and we are free to modify it completely so long as we retain the exec method.
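Stripped of the parser, the shape of the pattern looks something like the sketch below. The names tally, count, bump and hit are invented for the illustration and have nothing to do with the evaluator. The point is that only what's in the returned object literal is reachable from outside:

```javascript
// Hypothetical miniature of the auto-executing function pattern.
var tally = function() {
    var count = 0;                      // private: only visible inside the closure
    var bump = function() { count++; }; // private helper

    return {
        hit: function() { bump(); return count; } // the one public method
    };
}();

console.log(typeof tally); // outputs object
console.log(tally.hit());  // outputs 1
console.log(tally.hit());  // outputs 2
console.log(tally.count);  // outputs undefined
```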

I can't stress this enough: these patterns (closures and auto-executing functions to create other objects/functions) are very common in JavaScript. You might call these patterns idiomatic JavaScript. I'll agree that they can be hard to spot (heck, it's not until 200 lines later that you realize that expressionEvaluator is not a function after all but an object returned by a function that's executed immediately), but it's worth learning the pattern and how the natives speak.

So with this change we can check off that particular bug: we have one global object and no longer a dozen or so. The code to call it hasn't changed that much, and it's easy to see that everything still works properly:

console.log(expressionEvaluator.exec("1+2")); // outputs 3
console.log(expressionEvaluator.exec("1+2*3")); // outputs 7
console.log(expressionEvaluator.exec("((1+2)*(3-4))+((5*6)-(7/8))")); // outputs 26.125

State machine

Now that we have a single object (since the auto-executing function is only run once, there can only ever be a single object, so it's essentially a singleton), we can get rid of the state object (there was only ever one of those too), and hold all its data in private variables inside the closure of our evaluator object. This means we can get rid of the getCurrent method and generally clean things up.

    // State maintenance
    var expr, // the original expression string
        ch,   // the current character
        at,   // where we're at in the expression string
        advance = function() {
            ch = expr.charAt(++at);
        },
        skipWhite = function() {
            while (ch && (ch <= ' ')) { advance(); }
        },
        initialize = function(expression) {
            expr = expression;
            at = -1;
            advance();
        };

I've written this as a connected set of variables, connected in the sense of a single var declaration with the individual variables separated by commas; it's another JavaScript idiom that you may come across. The reason is that JavaScript is often minified before use, that is, all comments and unnecessary white space are removed to make the script smaller and quicker to download. Since a comma is one character, it's often used like this instead of having a set of separate var declarations, which is more verbose. Notice I've added a skipWhite method to skip over any white space.
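To get a feel for how these state functions cooperate, here is a small sketch that wraps them in a throwaway object. The scanner object and its significantChars method are hypothetical, added purely so there is something to call; the state code inside is the same as above.

```javascript
// Hypothetical wrapper around the state-maintenance code, for experimenting.
var scanner = function() {
    var expr, // the original expression string
        ch,   // the current character
        at,   // where we're at in the expression string
        advance = function() {
            ch = expr.charAt(++at); // charAt returns '' past the end
        },
        skipWhite = function() {
            while (ch && (ch <= ' ')) { advance(); }
        },
        initialize = function(expression) {
            expr = expression;
            at = -1;
            advance();
        };

    return {
        // Returns the non-blank characters of the expression, in order.
        significantChars: function(expression) {
            var found = [];
            initialize(expression);
            skipWhite();
            while (ch) {
                found.push(ch);
                advance();
                skipWhite();
            }
            return found;
        }
    };
}();

console.log(scanner.significantChars(" 1 + 2 ").join("")); // outputs 1+2
```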

The first statement in exec now becomes this:

initialize(expression);

...but we're in a load of hurt because everything expects a state object. Nothing works. Let's forge on.

The tokens

Next up is the whole rpnToken thing, with its encapsulation-breaking isToken and isOperator. It's just nasty, people. It's a poor man's way of creating something like the is operator in C#. I hang my head in shame, but at least I could say I did it on purpose so I could show how to refactor it. Yeah, that's the reason. Anyway...

The thing to do here is to push down into the rpnToken object hierarchy the functionality that's being exposed at a higher level. For example, the isOperator function is only used in one place: when an RPN expression is being evaluated:

if (token.isOperator()) {
    var rhs = stack.pop();
    var lhs = stack.pop();
    stack.push(token.evaluate(lhs, rhs));
}
else {
    stack.push(token.value);
}

Better would be to get rid of this altogether, by declaring an evaluate method on the number token, and then making both evaluate functions accept a stack parameter. This way the operator token would pop, pop, calculate, push, and the number token would just push. But, at this higher level, all we'd do is call token.evaluate(stack); and be done with it. Nice one. Even better, we can now create a unaryOperator token as well that will pop, negate, push for unary minus, and nothing much at all for unary plus.

Having done that, let's look at the whole RPN expression thing. I wrote it originally to be an exact representation of how you'd write down an RPN expression: a string of simple tokens, some of them numbers, some of them operators. But we don't have to be that literal; I could rephrase the definition to be recursive: an RPN expression is one or more operands, followed by an operator. The operator will determine how many operands there are. An operand could either be a number or another RPN expression.

Whoa. An RPN expression is then merely a token in another RPN expression. To evaluate an RPN expression, we evaluate its operands, and then its operator. This is great: we now have four types of tokens: number, unary operator, binary operator, RPN expression. They will all have an evaluate method.

    // Token hierarchy
    // ...base object
    var rpnToken = {
        evaluate: function(stack) {
            throw {
                type: "Abstract",
                message: "evaluate is an abstract method"
            };
        }
    };
    // ...number object
    var makeNumber = function(value) {
        var token = Object.create(rpnToken);
        token.evaluate = function(stack) { stack.push(value); };
        return token;
    };
    // ...unary operator object
    // (the parameter is named calc rather than eval: a parameter called
    // eval shadows the built-in and is a syntax error in strict mode)
    var makeUnaryOp = function(calc) {
        var token = Object.create(rpnToken);
        token.evaluate = function(stack) {
            stack.push(calc(stack.pop()));
        };
        return token;
    };
    // ...binary operator object
    var makeBinaryOp = function(calc) {
        var token = Object.create(rpnToken);
        token.evaluate = function(stack) {
            var rhs = stack.pop();
            var lhs = stack.pop();
            stack.push(calc(lhs, rhs));
        };
        return token;
    };

    // ...the pre-built operators
    var operator = {
        "unary-": makeUnaryOp(function(value) { return -value; }),
        "unary+": makeUnaryOp(function(value) { return +value; }),
        "+": makeBinaryOp(function(lhs, rhs) { return lhs + rhs; }),
        "-": makeBinaryOp(function(lhs, rhs) { return lhs - rhs; }),
        "*": makeBinaryOp(function(lhs, rhs) { return lhs * rhs; }),
        "/": makeBinaryOp(function(lhs, rhs) { return lhs / rhs; })
    };

I've left out the one for the RPN expression for a moment, but look at how all of these functions create objects that, first, descend from rpnToken and, second, depend implicitly on closures. Taking the number token as an example, the evaluate method uses the value parameter. The only way it gets that is through the closure since the execution of makeNumber will have long been completed by the time evaluate will run. Notice I've also explicitly made rpnToken.evaluate an abstract method by throwing an exception if it's ever called.
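Here is a sketch of these tokens being driven by hand, without any parser. To keep it short I've left out the rpnToken base object and used plain object literals, and runRpn is a hypothetical helper of mine that just walks a list of tokens. Note that the right-hand operand is popped first, which matters for subtraction and division:

```javascript
// Simplified token makers (no base object), for illustration only.
var makeNumber = function(value) {
    return { evaluate: function(stack) { stack.push(value); } };
};
var makeBinaryOp = function(calc) {
    return {
        evaluate: function(stack) {
            var rhs = stack.pop(); // the right operand was pushed last
            var lhs = stack.pop();
            stack.push(calc(lhs, rhs));
        }
    };
};
var minus = makeBinaryOp(function(lhs, rhs) { return lhs - rhs; });

// Hypothetical helper: evaluate a flat list of tokens and return the result.
var runRpn = function(tokens) {
    var stack = [];
    for (var i = 0; i < tokens.length; i++) {
        tokens[i].evaluate(stack);
    }
    return stack.pop();
};

// "7 2 -" in RPN is 7 - 2
console.log(runRpn([makeNumber(7), makeNumber(2), minus])); // outputs 5
```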

The RPN expression creation function is a little special:

    // ...RPN expression object
    var makeExpression = function() {
        var expr = arguments;

        var token = Object.create(rpnToken);
        token.evaluate = function(stack) {
            for (var i = 0; i < expr.length; i++) {
                expr[i].evaluate(stack);
            }
        };
        return token;
    };

It looks like it's been declared so that it accepts no parameters. Not quite. It's going to be called with either two parameters (operand, operator) for a unary operator, or three parameters (operand1, operand2, operator) for a binary operator. To get around this in C#, we'd have to either write overloaded methods or use parameter arrays, but in JavaScript we merely use the array-like arguments object and save a reference to it in a local variable called expr. Later on, in the evaluate method, we're going to evaluate each argument in sequence as I discussed above, and we'll use the saved arguments object provided by the closure.
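As a sketch of the recursive definition at work, here is 1 + 2 * 3 built by hand as an expression inside an expression. Again the token makers are simplified (no base object), and the variable names product and sum are mine:

```javascript
// Simplified token makers (no base object), for illustration only.
var makeNumber = function(value) {
    return { evaluate: function(stack) { stack.push(value); } };
};
var makeBinaryOp = function(calc) {
    return {
        evaluate: function(stack) {
            var rhs = stack.pop();
            var lhs = stack.pop();
            stack.push(calc(lhs, rhs));
        }
    };
};
var makeExpression = function() {
    var parts = arguments; // array-like object, kept alive by the closure
    return {
        evaluate: function(stack) {
            for (var i = 0; i < parts.length; i++) {
                parts[i].evaluate(stack);
            }
        }
    };
};

var add = makeBinaryOp(function(lhs, rhs) { return lhs + rhs; });
var mul = makeBinaryOp(function(lhs, rhs) { return lhs * rhs; });

// 1 + 2 * 3 becomes "1 (2 3 *) +": the product is an operand of the sum.
var product = makeExpression(makeNumber(2), makeNumber(3), mul);
var sum = makeExpression(makeNumber(1), product, add);

var stack = [];
sum.evaluate(stack);
console.log(stack.pop()); // outputs 7
```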

Parsing functions

We're now ready for some parsing action.

    // parse a binary operator (either the adds or the multiplys)
    var parseOperator = function(ops) {
        skipWhite();
        if ((ch === ops[0]) || (ch === ops[1])) {
            var op = operator[ch];
            advance();
            return op;
        }
        return null;
    };

This method parses a binary operator. The operators come in pairs (plus/minus and multiply/divide) so I wrote a generic method that gets passed a two-element array containing the operator characters in question. Notice that the parsing functions now return a token or an RPN expression and not one of the success/fail result objects (they're gone, toast). If there's a failure we just pass back null.

Next up is parsing a parenthesized expression:

    // parse a parenthesized expression
    var parseParens = function() {
        advance();

        var result = parseExpression();
        if (!result) { return null; }

        skipWhite();
        if (ch !== ')') { return null; } // missing right paren
        advance();

        return result;
    };

It is assumed here that the function will be called with the state machine already positioned on the opening parenthesis (which is how it's done in the only place it's called), so we can just move past it. This function also has an identifiable error: the missing right parenthesis. Later on, we can hook up an error function here to report that back to the caller of the expression evaluator, but for now we'll just note it as a comment.

Parsing a number:

    // parse a number
    var parseNumber = function() {
        var value = '';
        while (('0' <= ch) && (ch <= '9')) {
            value += ch;
            advance();
        }

        if (ch === '.') {
            value += '.';
            advance();
            while (('0' <= ch) && (ch <= '9')) {
                value += ch;
                advance();
            }
        }

        if (!value) { return null; } // number missing
        var number = +value; // force conversion to number type
        if (isNaN(number)) { return null; } // invalid number
        return makeNumber(number);
    };

Again we have a couple of identifiable errors (a number is expected but is missing, a number couldn't be converted from the string); again points at which we can add an error function.

Parsing a factor:

    // parse a factor (either expression in parentheses or number)
    var parseFactor = function() {
        skipWhite();
        if (ch === '(') { return parseParens(); }
        return parseNumber();
    };

As you can see, when parseParens is called, the state machine is pointing at the open parenthesis.

Now parsing a unary expression (yes, I added them):

    // parse a unary expression (factor or +/- factor)
    var parseUnaryExpr = function() {
        skipWhite();
        if ((ch === '-') || (ch === '+')) {
            var op = operator["unary" + ch];
            advance();
            var operand = parseFactor();
            if (!operand) { return null; }
            return makeExpression(operand, op);
        }
        return parseFactor();
    };

Notice how I index into the operator object to differentiate the unary minus and plus from their binary brethren. I could also "throw away" the unary plus here, if I wanted: it has no effect when evaluating an expression. Also note that this is where I call makeExpression with only two parameters for the unary operators.

And finally the remaining parser functions.

    // parse a binary expression (operand operator operand)
    var parseBinaryExpr = function(parseOperand, operators) {
        var operand = parseOperand();
        if (!operand) { return null; }
        var rpn = operand;

        var operator = parseOperator(operators);
        while (operator) {
            operand = parseOperand();
            if (!operand) { return null; }
            rpn = makeExpression(rpn, operand, operator);
            operator = parseOperator(operators);
        }

        return rpn;
    };
    // parse a term (unaryexpression multop unaryexpression)
    var parseTerm = function() {
        return parseBinaryExpr(parseUnaryExpr, ['*', '/']);
    };
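    // parse an expression (term addop term)
    // (declared with var so it stays private instead of leaking onto the
    // global object; hoisting lets parseParens refer to it)
    var parseExpression = function() {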
        return parseBinaryExpr(parseTerm, ['+', '-']);
    };

parseBinaryExpr does most of the work: it's called with a function that parses an operand, and with an array containing the binary operators to look out for. makeExpression is called with three parameters here.

After all these changes, we can now look at the refactored exec function:

        exec: function(expression) {
            initialize(expression);

            var result = parseExpression();
            if ((!result) || (ch !== '')) { return NaN; } // badly terminated expression

            var stack = [];
            result.evaluate(stack);
            return stack.pop();
        }

Here we see the final recognizable error where we've parsed the expression but there's still more left (an example would be "(1+2)3"). We also see where the stack gets created, the RPN expression evaluated, leaving the result on the stack, which can then be popped and returned.

Summary

Now that we've seen a non-trivial conversion of a C# project to JavaScript, what conclusions can we derive?

The first thing is closures are important. Really important. You have to understand closures and how they work because you'll see them all over the place. In this small example, we have the large closure that creates the expressionEvaluator object (where only one property is public and everything else, the vast majority of the code in fact, is private and hidden); we have the smaller closures that create the token objects.

Second, classical class hierarchies are not as important as in a classical class language like C#. I had to work hard to even get one: the tokens hierarchy. In fact, we could remove the base token object altogether and the evaluator would work just as well. The reason, of course, is that JavaScript only looks up a property at run-time; there's no need to worry about it at coding time. If every token object has an evaluate method, it doesn't matter whether they're descended from a base object with that method or whether they all independently define one: JavaScript will just call it. In fact, although this example didn't show it particularly, creating an object with some behavior (such as the expressionEvaluator object) is virtually a zero-energy exercise compared with C#, where we have to write a full-blown class first.

Third, functions are objects. Create them when you need them, pass them around like candy. In this example, the operator tokens are created with function objects that know how to evaluate the operator.

Fourth, learn the idiom. This example used only a couple. The first, and biggest, is the auto-executing function that creates the whole evaluator object in the first place (those two parentheses at the end are easy to miss when you're scanning code). The second is knowing which values evaluate to false in a conditional expression. In this code, I was using the fact that null evaluates to false (for example, if (!operand) instead of the more long-winded if (operand === null)), as does the empty string (the code in skipWhite checks for an empty character this way).
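For reference, here is a small sketch of the values that count as false in a conditional. The variable names are mine. It's also worth noticing what is not false: any string with content, including '0' and a single space, which is why skipWhite needs the second comparison as well.

```javascript
// The classic "falsy" values: each of these fails an if (...) test.
var falsyValues = [false, null, undefined, 0, NaN, ''];
var allFalsy = true;
for (var i = 0; i < falsyValues.length; i++) {
    if (falsyValues[i]) { allFalsy = false; }
}
console.log(allFalsy); // outputs true

// Strings with any content are truthy, even these two.
console.log(!'0'); // outputs false
console.log(!' '); // outputs false
console.log(!'');  // outputs true
```

In the evaluator the if (!operand) shortcut is safe because a successful parse always returns an object; it would be riskier if a parse function could legitimately return 0 or an empty string.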

Fifth, avoid global object pollution. My original code had umpteen methods created on the global object; the code in this post has just one. As in any language, global variables and methods are bad.

Now playing:
New Order - Confusion [Pump Panel Reconstruction Mix]
(from (the rest of) New Order)


JavaScript for C# programmers: object inheritance (part 2)

Continuing to learn JavaScript from the viewpoint of a die-hard C# programmer, using Firebug as our test engine.

In this episode, we continue writing the expression evaluator. (Please review part 1 before continuing.) You might want to have an extra browser open at the C# code, so you can follow along.

Next on the agenda are the result objects. In reality, the way I wrote the original code, there is only one failed result, whereas the successful result contains the RPN expression (or token, come to that) of the bit of the algebraic expression we got to. So let's code that up:

var failedResult = {
    failed: true,
    expression: null
};

var makeSuccessfulResult = function(expression) {
    var result = Object.create(failedResult);
    result.failed = false;
    result.expression = expression;
    return result;
};

Having coded it up, I'm not that happy with it. It seems as if I'm trying way too hard to force a class inheritance model approach to what are, after all, very simple objects. It would work equally well without the call to Object.create in there. I'll ignore my doubts for the moment and forge on.
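To make that doubt concrete, here is a sketch comparing the two approaches. The result functions are repeated so the snippet runs on its own; makePlainResult is a hypothetical alternative using a plain object literal, and the string passed in is just a placeholder for a real RPN expression.

```javascript
var failedResult = {
    failed: true,
    expression: null
};

var makeSuccessfulResult = function(expression) {
    var result = Object.create(failedResult);
    result.failed = false;          // own property, shadows the prototype's
    result.expression = expression; // ditto
    return result;
};

// Hypothetical alternative: no prototype link at all.
var makePlainResult = function(expression) {
    return { failed: false, expression: expression };
};

var ok = makeSuccessfulResult("placeholder");
console.log(ok.failed);                                  // outputs false
console.log(failedResult.failed);                        // outputs true
console.log(Object.getPrototypeOf(ok) === failedResult); // outputs true

var plain = makePlainResult("placeholder");
console.log(plain.failed);                               // outputs false
```

Both kinds of object look identical to the code that checks result.failed, which is really all the parser cares about.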

Next is the parserState object. There's only one of these, so no inheritance semantics to worry about.

var makeParserState = function(expression) {
    var expr = expression;
    var position = -1;
    var current;
    var advancePosition = function() {
        position++;
        if (position >= expression.length) {
            current = '';
        }
        else {
            current = expression.charAt(position);
        }
        return current;
    };

    advancePosition();

    return {
        getCurrent: function() { return current; },
        advance: function() { return advancePosition(); }
    };
};

I've coded this as a standard closure type function to give the state some privacy. The object that's returned has two public methods, getCurrent and advance, but a whole set of private members from the closure. There's the original expression string that we're going to read through, the current position, the current character, and a method to advance the string pointer and read the next character. (Remember that JavaScript has no character type; a character is represented as a one-character string.)

After I wrote it like this I discovered that charAt will return the empty string if the index is out of bounds; I had my C# hat on, obviously: I was assuming that it would throw an exception. So the private advancePosition method could be rewritten as

    var advancePosition = function() {
        position++;
        current = expression.charAt(position);
        return current;
    };
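The charAt behavior is easy to confirm for yourself. A quick sketch follows (the text variable is just a sample string); notice that indexing the string with square brackets behaves differently from charAt when you run off the end:

```javascript
var text = "abc";
console.log(text.charAt(1) === "b");  // outputs true
console.log(text.charAt(3) === "");   // outputs true: past the end gives ''
console.log(text.charAt(-1) === "");  // outputs true: before the start too
console.log(text[3] === undefined);   // outputs true: indexing gives undefined
```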

Now we get to the fun stuff, the actual parsing. In the original C# code, I wrote it all as a static class with static methods, but for now I wrote it in JavaScript as a set of methods. This is certainly not the best way, but I will get to that later.

First, parsing the operators:

var parseAdd = function(state) {
    var current = state.getCurrent();
    if ((current === '+') || (current === '-')) {
        state.advance();
        return makeSuccessfulResult(operator[current]);
    }
    return failedResult;
};

var parseMultiply = function(state) {
    var current = state.getCurrent();
    if ((current === '*') || (current === '/')) {
        state.advance();
        return makeSuccessfulResult(operator[current]);
    }
    return failedResult;
};

Notice something subtle going on. The call to makeSuccessfulResult is being passed a token and not an expression, as the implementation of that method would indicate. Keep that thought at the back of your mind for now.

Parsing a parenthesized expression:

var parseParenthesizedExpression = function(state) {
    if (state.getCurrent() !== '(') { return failedResult; }
    state.advance();

    var result = parseExpression(state);
    if (result.failed) { return result; }

    if (state.getCurrent() !== ')') { return failedResult; }
    state.advance();

    return result;
};

This makes a call to parseExpression that we haven't written yet, to parse the bits in between the parentheses.

Parsing a number:

var parseNumber = function(state) {
    var current = state.getCurrent();
    var value = '';
    
    while (('0' <= current) && (current <= '9')) {
        value += current;
        current = state.advance();
    }
    
    if (current === '.') {
        value += '.';
        current = state.advance();
        while (('0' <= current) && (current <= '9')) {
            value += current;
            current = state.advance();
        }
    }

    var number = +value; // force conversion to number type
    if (isNaN(number) || !value) { return failedResult; }
    return makeSuccessfulResult(makeNumber(number));
};

This involves a couple of tricky bits, so follow along as I describe them. The value variable is going to be a string that we'll grow with the number-like characters we find in the expression. (Number-like in this respect means the digits and the decimal point — sorry, no internationalization yet). So we first gather all the digits we can. If there's then a decimal point, we add that, and then gather all the digits after the decimal point. Fun bit next: we force the string to be converted into a number type. We do this by using JavaScript's interpreter: we start the expression off with a plus sign. JavaScript will decide that the expression is going to be a number since this plus sign could only be a unary plus operator. We then give it the value string. Since the interpreter is in "making a number" mode, it will convert the string to a number which is what we want. (The alternative is to use parseFloat, or to use something like "1 * value".)

The big problem with this is if we start this method off with the current character not being a digit (say, a letter), then the string-to-number conversion will produce 0 without error. So we fail the parse if either the conversion failed (the value of number will then be NaN) or if the string is empty (remember an empty string is equivalent to false, so !value would evaluate to true).
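Here are the conversions in question, as a sketch you can paste into the console. It shows why both checks are needed, and also how the unary plus differs from parseFloat, which stops at the first character it doesn't like instead of failing:

```javascript
console.log(+"12.5");             // outputs 12.5
console.log(+"7.");               // outputs 7
console.log(+".5");               // outputs 0.5
console.log(+"");                 // outputs 0: the empty string converts silently
console.log(isNaN(+"."));         // outputs true: a lone decimal point is not a number
console.log(isNaN(+"12abc"));     // outputs true: unary plus wants the whole string
console.log(parseFloat("12abc")); // outputs 12: parseFloat ignores the trailing junk
```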

Parsing a factor is easy, it's either a parenthesized expression or a number:

var parseFactor = function(state) {
    if (state.getCurrent() === '(') {
        return parseParenthesizedExpression(state);
    }
    else {
        return parseNumber(state);
    }
};

Parsing a term and parsing the expression are roughly the same (and I haven't yet extracted out the commonality, like I did with the C# code).

var parseTerm = function(state) {
    var operand = parseFactor(state);
    if (operand.failed) { return operand; }
    var rpn = operand.expression;

    var operator = parseMultiply(state);
    while (!operator.failed) {
        operand = parseFactor(state);
        if (operand.failed) { return operand; }
        rpn = joinRpnParts(rpn, operand.expression, operator.expression); 
        operator = parseMultiply(state);
    }

    return makeSuccessfulResult(rpn);
};

var parseExpression = function(state) {
    var operand = parseTerm(state);
    if (operand.failed) { return operand; }
    var rpn = operand.expression;

    var operator = parseAdd(state);
    while (!operator.failed) {
        operand = parseTerm(state);
        if (operand.failed) { return operand; }
        rpn = joinRpnParts(rpn, operand.expression, operator.expression);
        operator = parseAdd(state);
    }

    return makeSuccessfulResult(rpn);
};

Both make use of a routine called joinRpnParts to stitch together the operands and operator postfix style.

var joinRpnParts = function(first, second, operator) {
    var result = makeExpression();

    var addTokens = function(operand) {
        if (operand.isToken) {
            result.add(operand);
        }
        else {
            operand.forEach(function(token) { result.add(token); });
        }
    };

    addTokens(first);
    addTokens(second);
    addTokens(operator);

    return result;
};

makeExpression is a slightly changed version of rpnExpression from last time. I added an isToken field to both the token ancestor and the RPN expression objects so that I could tell them apart. This is where I tried to resolve the token versus expression problem I alluded to before: I thought I was being clever here in using the lack of type safety to help me (and feeling all JavaScripty about it), and then I ended up having to hack this "well, is it a token or not" boolean in two different places (which to me says I'm not being JavaScripty enough). We'll sort it out later.

addTokens is just a helper function to add a part onto the RPN expression. Before you get excited by the forEach call there, it's a function I wrote for the RPN expression object:

var makeExpression = function() {
    var expr = [];
    return {
        isToken: false,

        add: function(node) {
            expr.push(node);
        },

        clear: function() {
            expr = [];
        },

        evaluate: function() {
            var stack = [];
            for (var i = 0; i < expr.length; i++) {
                var token = expr[i];
                if (token.isOperator()) {
                    var rhs = stack.pop();
                    var lhs = stack.pop();
                    stack.push(token.evaluate(lhs, rhs));
                }
                else {
                    stack.push(token.value);
                }
            }
            return stack.pop();
        },

        forEach: function(action) {
            for (var i = 0; i < expr.length; i++) {
                action(expr[i]);
            }
        }
    };
};

As you can see, the forEach method takes an action function to call for each token in the expression array, and that for joinRpnParts just adds it to the expression being stitched together.

Finally we need a function to tie it all together:

var evaluateExpression = function(expression) {
    var state = makeParserState(expression);
    var result = parseExpression(state);
    if (result.failed) { return NaN; }
    return result.expression.evaluate();
};

console.log(evaluateExpression("1+2")); // outputs 3
console.log(evaluateExpression("1+2*3")); // outputs 7
console.log(evaluateExpression("((1+2)*(3-4))+((5*6)-(7/8))")); // outputs 26.125

Now that I've shown you all the code, I can reveal that I'm just not happy about it. Yes, it's a pretty close translation of the C# code to JavaScript (and I did copy/paste code from the C# implementation to help move things along — just a case of removing all the type identifiers, mostly), but it's just not good JavaScript to me.

Let me enumerate the problems as I see them:

  • Global properties and methods everywhere. This should make you worried; it certainly does me.
  • It's too wordy. JavaScript is interpreted at run-time: shorter identifiers will help speed it up. However, shorter identifiers mean it's less legible to us humans, but it should still be possible to reduce it all a bit.
  • There's way too much code duplication. I seem to have some parse methods in pairs; it should be possible to extract a common method for each pair. The evaluate method for the RPN expression should probably use the new forEach method, and so on.
  • I have a code smell between RPN tokens and expressions: I'm having to work out which is which in a higher method; surely that should be pushed down into the objects themselves.
  • The parser state object seems way overkill (it's a remnant from a back-tracking parser implementation I once did).
  • Ditto the result objects. Plus the failed result object doesn't tell us where the problem occurred.
  • Whitespace removal? Surely since it's easier for us to read "3 + 4" than "3+4", the evaluator should discard white space when it needs to.
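To make the forEach point in the list above concrete, here is a minimal sketch of evaluate rewritten on top of forEach. The names makeNumber, makeOperator and makeRpnExpression are hypothetical stand-ins invented for this illustration (the real token and expression makers are defined earlier in the post); the only thing assumed about tokens is the isOperator/evaluate/value shape used by the code above.

```javascript
// Hypothetical token makers, for illustration only.
var makeNumber = function(value) {
    return {
        value: value,
        isOperator: function() { return false; }
    };
};
var makeOperator = function(fn) {
    return {
        isOperator: function() { return true; },
        evaluate: fn
    };
};

// Hypothetical expression maker: evaluate reuses forEach instead of
// duplicating the loop over the token array.
var makeRpnExpression = function(expr) {
    var self = {
        forEach: function(action) {
            for (var i = 0; i < expr.length; i++) {
                action(expr[i]);
            }
        },
        evaluate: function() {
            var stack = [];
            self.forEach(function(token) {
                if (token.isOperator()) {
                    var rhs = stack.pop();
                    var lhs = stack.pop();
                    stack.push(token.evaluate(lhs, rhs));
                }
                else {
                    stack.push(token.value);
                }
            });
            return stack.pop();
        }
    };
    return self;
};

// "1+2*3" in RPN is: 1 2 3 * +
var rpn = makeRpnExpression([
    makeNumber(1), makeNumber(2), makeNumber(3),
    makeOperator(function(a, b) { return a * b; }),
    makeOperator(function(a, b) { return a + b; })
]);
var rpnResult = rpn.evaluate();
console.log(rpnResult); // outputs 7
```

The stack lives in the closure formed by the action function, so forEach itself needs to know nothing about evaluation.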

So, next time, some major refactoring. We're going to be JavaScripty if it kills us.



JavaScript for C# programmers: prototypes and privacy

Continuing my series about learning JavaScript when you're a C# programmer, using Firebug in Firefox as our testing ground.

In this episode, overriding, privacy, and class models.

Last time we saw how to create inheritance from JavaScript's constructors and prototypes, the so-called prototypal inheritance. In our example, we ended up with this:

var Point = function(x, y) {
    this.x = x;
    this.y = y;
    return this;
};
Point.prototype.move = function(x, y) {
    this.x += x;
    this.y += y;
};

This code snippet shows a couple of things I'd like to reinforce. First of all, because move is defined on the prototype for Point, it is visible to all objects we create from that constructor. If you like, since the prototype is the template for Point objects, all Point objects will have a move function. Second, the x and y fields are not shared between Point objects, but nevertheless all Point objects will have their own copies of these fields.

The concept of a class as we know it in C# is essentially split between the constructor and the prototype. If some variable (including a function) for an object is defined in the constructor, all objects created from it will have their own copy of that variable. If some variable for an object is declared on the prototype, all objects created using it will share the one copy from the prototype.
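Here's a small sketch that checks this split, using the same Point as above. It relies only on the standard hasOwnProperty method; the variable names a and b are just for illustration.

```javascript
var Point = function(x, y) {
    this.x = x;
    this.y = y;
};
Point.prototype.move = function(x, y) {
    this.x += x;
    this.y += y;
};

var a = new Point(1, 3);
var b = new Point(4, 2);

// x was set in the constructor, so each object owns its own copy
console.log(a.hasOwnProperty("x"));    // outputs true
// move lives on the prototype, so the object itself doesn't own it...
console.log(a.hasOwnProperty("move")); // outputs false
// ...and both objects see the very same function
console.log(a.move === b.move);        // outputs true

// moving one point leaves the other's fields alone
a.move(1, 1);
console.log(a.x + "," + b.x);          // outputs 2,4
```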

Overriding fields

Note, however, that there is a gotcha with that last sentence: it is only true when you read from the object. Let's investigate using my overworked Point example. First I'll declare a new field for the prototype called color, and I'll set it to "Red":

Point.prototype.color = "Red";

Now let's create a couple of points:

var first = new Point(1, 3);
var second = new Point(4, 2);
console.log(first.color);  // outputs Red
console.log(second.color); // outputs Red

When I read the value of first.color, JavaScript will first go to the object and see if it has a property called color. It does not. The interpreter then goes to the object's prototype object to see if it has a property called color (remember how it does this: the object has a hidden internal link to its prototype, set up when the object was constructed to point to whatever Point.prototype was at that time). It does, so the value of color, "Red", is returned. The same exact process happens when I read second.color. All is good; it makes sense.

What happens if I now set first.color to "Blue"? What does second.color return now?

first.color = "Blue";
console.log(second.color);

There are two possible answers, "Red" or "Blue", and it hinges on what happens when first.color is set. Pat yourself on the back if you said "Red" and here's why. The chaining to the prototype only happens on a read operation. If you are writing a value, you will be modifying the object itself. So, since there is no property called color in the first object, JavaScript will create one and set it to "Blue". The common prototype.color property is not changed at all. Hence, when you read second.color, you get the chaining to the prototype operation, and "Red" gets returned.

You are, in effect, overriding a field from the object's prototype. The same thing happens when you set a function with the same name as a function in the prototype: you will override the prototype's function.
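A quick sketch of both kinds of override follows. The describe method is a hypothetical addition of mine, there purely so we have a prototype function to override; everything else is the Point from above. It also shows that deleting the object's own property removes the override, so reads chain to the prototype again.

```javascript
var Point = function(x, y) {
    this.x = x;
    this.y = y;
};
Point.prototype.color = "Red";
// describe is a made-up method, used only for this illustration
Point.prototype.describe = function() {
    return "(" + this.x + "," + this.y + ")";
};

var first = new Point(1, 3);
var second = new Point(4, 2);

// writing creates properties on first itself; the prototype is untouched
first.color = "Blue";
first.describe = function() { return "first point"; };

console.log(first.hasOwnProperty("color"));  // outputs true
console.log(second.hasOwnProperty("color")); // outputs false
console.log(first.describe());               // outputs first point
console.log(second.describe());              // outputs (4,2)

// remove the override and the prototype's value shows through again
delete first.color;
console.log(first.color);                    // outputs Red
```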

Constructing objects with private fields

In playing around with this Point example over the past few articles, I've gone from an object that had private fields but that didn't use a constructor/prototype, to an object that's lost its privacy but does have this notion of classical inheritance. Can we get the privacy back?

Remember that privacy comes from closure. Which is the only function we have that can supply privacy? The constructor. Here's a version that implements x and y as private variables:

var Point = function(x, y) {
    this.getX = function() { return x; };
    this.getY = function() { return y; };
    this.setX = function(value) { x = value; };
    this.setY = function(value) { y = value; };
    return this;
};
Point.prototype.move = function(x, y) {
    this.setX(this.getX() + x);
    this.setY(this.getY() + y);
};

Oh wow, it's suddenly a lot more complicated. The first thing to realize is that x and y are parameters to the constructor function, so they are automatically local variables and therefore private. Even more restrictive, since they are local to the constructor, they can't be seen outside the constructor, in particular by the common properties and methods of the prototype. So, the prototype's move method can't reference x and y (at least not without getting an undefined error). We have to therefore write some functions that are public and that can reference the private variables. Those functions by necessity must be defined in the constructor. And so I defined a set of getters and setters.

Note that these getter and setter functions are defined in each object, and not on the prototype. This means we're duplicating the code, but there's no way round it. It also means they can't participate in inheritance: they're defined on individual objects.

Notice also that the getters and setters are public. We are declaring them on the newly created object, and so the move method can make use of them. Douglas Crockford (who popularized this technique for private members in JavaScript) calls these kinds of functions privileged. A privileged function is a public method on an object that can get at the private data of the object.
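To see the trade-off in action, here's a minimal sketch that exercises the privatized Point. I've renamed move's parameters to dx and dy purely for readability, and the variables p and q are just for illustration.

```javascript
var Point = function(x, y) {
    // privileged functions: public, but closed over the private x and y
    this.getX = function() { return x; };
    this.getY = function() { return y; };
    this.setX = function(value) { x = value; };
    this.setY = function(value) { y = value; };
};
Point.prototype.move = function(dx, dy) {
    // the prototype can only reach x and y through the privileged functions
    this.setX(this.getX() + dx);
    this.setY(this.getY() + dy);
};

var p = new Point(1, 3);
var q = new Point(1, 3);
p.move(2, 2);

console.log(p.getX() + "," + p.getY()); // outputs 3,5
console.log(p.x);                       // outputs undefined: no public field
console.log(p.getX === q.getX);         // outputs false: one copy per object
console.log(p.move === q.move);         // outputs true: shared via the prototype
```

So the price of privacy is one set of getter and setter functions per object, whereas move is still shared.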

Defining descendants

We've now gone into some depth about how to "do" classes by defining a constructor and by defining properties and methods on the constructor's prototype. We can define, in essence, a template for stamping out a whole set of similar objects and we know how to override properties and methods in our newly created objects. But what about further inheritance, building up a class model?

Suppose, now that we have a Point "class", we need a descendant, a ColoredPoint class, which knows its color. How is that done? We obviously need a new constructor, but that doesn't define the inheritance pattern, it's the prototype of that constructor that does. (We'll go back here to our non-privatized version of the Point "class" to avoid the noisy getters and setters.)

var ColoredPoint = function(x, y, color) {
    this.x = x;
    this.y = y;
    this.color = color;
    return this;
};
ColoredPoint.prototype = new Point(0, 0);

Notice that I've thrown away the automatic prototype object that's created with the constructor and replaced it with a fresh new Point object. I'm not particularly bothered about what values I pass when constructing this Point object, I won't be using them.

var coloredPoint = new ColoredPoint(1, 3, "Red");
coloredPoint.move(2, 2);

console.log("Point = (" +
        coloredPoint.x + "," +
        coloredPoint.y + "," +
        coloredPoint.color + ")");

Here I'm constructing a new ColoredPoint object and then I call move on it. What happens here? First, the coloredPoint object has no method called move. So JavaScript goes to the prototype object. The prototype object doesn't have a move method either. So JavaScript goes to its prototype, finally, which does have a move method. Notice how the process continues down the prototype chain, and that at the end, the function is called on the original object (which defines the value of this). The output therefore is Point = (3,5,Red) which is what we wanted and expected.
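If you'd like to watch that lookup happen, the following sketch walks the chain by hand. It assumes an engine that supports Object.getPrototypeOf (part of ECMAScript 5); the names cp, proto1 and proto2 are mine.

```javascript
var Point = function(x, y) {
    this.x = x;
    this.y = y;
};
Point.prototype.move = function(x, y) {
    this.x += x;
    this.y += y;
};
var ColoredPoint = function(x, y, color) {
    this.x = x;
    this.y = y;
    this.color = color;
};
ColoredPoint.prototype = new Point(0, 0);

var cp = new ColoredPoint(1, 3, "Red");
var proto1 = Object.getPrototypeOf(cp);     // the object made by new Point(0, 0)
var proto2 = Object.getPrototypeOf(proto1); // Point.prototype

console.log(cp.hasOwnProperty("move"));     // outputs false
console.log(proto1.hasOwnProperty("move")); // outputs false
console.log(proto2.hasOwnProperty("move")); // outputs true
console.log(proto2 === Point.prototype);    // outputs true
console.log(cp instanceof Point);           // outputs true
```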

Next time, we'll throw all that away. We don't need this lookalike classical class model. Let's embrace objects!



JavaScript for C# programmers: closure basics

Continuing the series on JavaScript for C# developers, in this episode we look at closure.

First the definition. A closure is a function that encloses its local environment when it is created.

Now, in C#, the function talked about in the definition above is a lambda expression or an anonymous method. The language definition for C# talks about the outer variables of an anonymous method: these are the local variables and parameters that are in the same scope as the anonymous method (note that this is such a variable). The anonymous method is said to capture a local variable if its code makes reference to that variable. The variable, if it is captured, may have a lifetime that is longer than you'd expect from the outer enclosing scope. The function is a closure that captures these local variables.

That all sounds terribly dry, so let's take a look at an example. We'll create a counter function.

    // definition of a method that returns an int
    delegate int Counter(); 

    // method that creates a counter function
    static Counter CreateCounter(int startValue) {
      return () => startValue++;
    }

The first statement defines a delegate that returns an int. The CreateCounter() method creates such a delegate by using a lambda expression. The expression returns the current value of startValue, and post-increments it. The CreateCounter() method then returns this lambda expression. If you're puzzled by the lambda expression, here's the equivalent using a "proper" anonymous method:

    // method that creates a counter function
    static Counter CreateCounter(int startValue) {
      return delegate {
        return startValue++;
      };
    }

The important thing to remember here is that startValue, because it has been captured by the anonymous method, does not get destroyed when the CreateCounter() method terminates, as would usually happen. The anonymous method has captured the parameter, and so it lives on.

Let's use it now:

  Counter counter = CreateCounter(42);
  Console.WriteLine(counter());
  Console.WriteLine(counter());
  Console.WriteLine(counter());

And, when run, as you'd expect this produces:

42
43
44

So a closure in C# is nothing more than an anonymous method or lambda expression that captures one or more outer local variables or parameters.

And it's the same in JavaScript. Well, apart from the different rules of scope, that is. In fact, closures are the way you implement private variables for objects in JavaScript, but we'll get to that later. First of all, since you're familiar with the C# version of the counter, let's see how it looks in JavaScript. First, there's no point in defining the delegate type, so we'll get rid of that. Here's the createCounter function:

var createCounter = function(startValue) {
    return function() {
        return startValue++;
    };
};

So we're defining a variable called createCounter. This is a function taking a single parameter, startValue, and that returns another function. This second function returns the current value of startValue and post-increments it. Notice that, under the scope rules of JavaScript, the inner function has access to the startValue parameter from the outer function, so this works as you'd expect. (And funnily enough it's very similar to the anonymous delegate version above.)

The code that executes this is virtually the same, and produces the same output in Firebug:

var counter = createCounter(42);
console.log(counter());
console.log(counter());
console.log(counter());

In other words the inner anonymous function captured the parameter from the outer function and thereby formed a closure.
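One consequence worth checking: each call to createCounter produces a fresh startValue, so two counters don't interfere with each other. A minimal sketch (the names c1 and c2 are just for illustration):

```javascript
var createCounter = function(startValue) {
    return function() {
        return startValue++;
    };
};

var c1 = createCounter(42);
var c2 = createCounter(100);

console.log(c1()); // outputs 42
console.log(c1()); // outputs 43
console.log(c2()); // outputs 100: c2 captured its own startValue
console.log(c1()); // outputs 44: c1 is unaffected by c2
```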

Let's go a little further now and create a point object, one that has two properties, X and Y. If we were doing this naively we'd write this:

var point = {X: 0, Y: 0};

Notice that we have full access to the internal fields of the object. We'd like to change that so that we can only read/write these values through property accessors. How is this done?

The answer involves a closure. Before looking at this answer, think about how you'd implement it for a moment: we want to declare a function (let's call it createPoint) that returns an object with four methods: getX, setX, getY, setY. These methods will do what you'd expect: they're getters and setters for some internal private fields.

Let's build it up, step by step.

var createPoint = function() {
    return {
        getX: function() { },
        setX: function(value) { },
        getY: function() { },
        setY: function(value) { }
    };
};

This is the bare minimum. The return statement creates a new anonymous object with the four methods, and these four methods do absolutely nothing at the moment. But note that the setters at least take a value parameter whereas the getters do not.

What we'd like to happen is that there's a private field called x and the getter for the X coordinate returns the value of that field, and the setter sets it. Ditto for the Y coordinate. Without worrying about the "private" bit for now, since we don't know how to implement it (there's no private keyword), let's complete the createPoint function in the most obvious — that is, C#-ish — way:

var createPoint = function() {
    return {
        x: 0,
        y: 0,
        getX: function() { return x; },
        setX: function(value) { x = value; },
        getY: function() { return y; },
        setY: function(value) { y = value; }
    };
};

This returns an object with two fields and four methods. Unfortunately the fields are publicly visible. I can write this:

var point = createPoint();
point.x = 1;
point.y = 2;

Even worse, the methods are broken. As they stand, the x and y they refer to are not the fields of the object, but global variables. (Remember: JavaScript scope is by function, not by block or by object. The getX method above doesn't have an x local variable, so JavaScript goes and takes a look at the outer function. This doesn't have an x variable either. Since there is no outer outer function, x is assumed to be a global variable, that is, a property of window. But window doesn't have an x property either, and so getX() will fail with a ReferenceError.) To fix the methods, I'd have to prefix the references to the variables with this, something I haven't discussed yet. All in all, not a good solution.

However, it's pretty close. If I changed it to this:

var createPoint = function() {
    var x = 0;
    var y = 0;
    return {
        getX: function() { return x; },
        setX: function(value) { x = value; },
        getY: function() { return y; },
        setY: function(value) { y = value; }
    };
};

Everything would suddenly work, and both x and y are invisible to the outside world; they are private. Let's see why.

First of all, notice that x and y are local variables in the createPoint function. That automatically means they're not visible outside the function (that's what a local variable means, after all), and also, being JavaScript, that they're visible to all other functions declared inside the function as well. We have four such functions, the getters and setters — it doesn't matter if they're declared in another object like these are: scope in JavaScript is by function, remember. So, for example, getX will return the value of x. Which x? Well, the one it can find in the scope chain, which is the one declared in the outer function.

Each of these getters and setters is a closure over the two local variables. Each closure shares the same captured local variables: if you set the X coordinate with setX, you'll retrieve the same value with getX. The code below tests this important point (pun intended). First of all it creates a new point variable, sets both the X and Y coordinates, and then prints the value of the point variable, that is, the coordinate pair.

var point = createPoint();
point.setX(1);
point.setY(2);
console.log("Point = (" + point.getX() + "," + point.getY() + ")");
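It's also worth poking at the privacy itself. In this small sketch (the names p1 and p2 are mine), the fields can't be reached from outside, every call to createPoint gets its own pair of variables, and assigning to p1.x merely creates an unrelated public property.

```javascript
var createPoint = function() {
    var x = 0;
    var y = 0;
    return {
        getX: function() { return x; },
        setX: function(value) { x = value; },
        getY: function() { return y; },
        setY: function(value) { y = value; }
    };
};

var p1 = createPoint();
var p2 = createPoint();
p1.setX(1);
p1.setY(2);

console.log(p1.x);      // outputs undefined: there's no public x field
console.log(p2.getX()); // outputs 0: p2 has its own private x

p1.x = 99;              // creates a brand new public property on p1...
console.log(p1.getX()); // outputs 1: ...but the private x is untouched
```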

As you can see from this, providing you declare an object through calling a function, you can get private variables. Encapsulation, to give it its real name. And the usual way of creating an object by calling a function? Construct it; that is, use a constructor. Next time, then, we'll look at classes. Er, sorry, prototypes.



About Me

I'm Julian M Bucknall, the M because it's my middle initial and because I and the other Julian Bucknall (the movie guy) would like to differentiate ourselves.

I'm a programmer by trade, an actor by ambition, and an algorithms guy by osmosis. I write articles for PCPlus in my spare time, not that there's much of that.

Apart from that, an ex-pat Brit, atheist, microbrew enthusiast, Pet Shop Boys fanboy, slide rule and HP calculator collector, amateur photographer, Altoids muncher.

DevExpress

I'm Chief Technology Officer at Developer Express, a software company that writes some great controls and tools for .NET and Delphi. I'm responsible for the technology oversight and vision of the company.
