When I looked at some of the functions in the core parser of Esprima, I usually had one or two ideas on how to improve the readability. A week ago my concern was a pair of functions important for parsing objects in the literal format. It’s heavily branched code using a switch statement, as illustrated in the code for one of them:
function parseObjectPropertyKey() {
    var token = lex(),
        key;
    switch (token.type) {
    case Token.StringLiteral:
    case Token.NumericLiteral:
        if (strict && token.octal) {
            throwError(token, Messages.StrictOctalLiteral);
        }
        key = createLiteral(token);
        break;
    case Token.Identifier:
    case Token.Keyword:
    case Token.BooleanLiteral:
    case Token.NullLiteral:
        key = {
            type: Syntax.Identifier,
            name: token.value
        };
        break;
    default:
    }
    return key;
}
Somehow my poor brain cells believed that the above construct could be simplified. I immediately thought of using an if statement, but many JavaScript optimization guides claim that the switch statement is a better approach than branching via if statements. My gut feeling said that for this particular case, if vs switch does not matter much. The question in my mind was, how can I know this for sure? That simple question turned my evening into a long one.
My journey started after I got myself a fresh build of V8 bleeding-edge, version 3.10. I also crafted a minimalistic script which loads every test fixture in the existing benchmark corpus and passes the content to the parser. After a few false starts, running the debugger shell with its various tracing options, the most important of which in this context is --trace-bailout, suddenly gave me the answer (check also the post from Florian Loitsch about bail-out and other related V8 flags):
Bailout in HGraphBuilder: @"parseObjectPropertyKey": SwitchStatement: non-literal switch label
In plain English, this is what happens. The high-level optimizer, the part of V8 called Hydrogen (H stands for high-level; its low-level counterpart is called Lithium), stops trying to “understand” the parseObjectPropertyKey() function because it trips and bails out on one particular condition: the switch has one or more case labels which are not literal small integers (smi) or strings. If you look back at the code of that function, the labels for each case are Token.StringLiteral, Token.NumericLiteral, etc. Their values are actually just integer numbers; here I abuse a JavaScript object as a form of enumeration.
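To make the distinction concrete, here is a minimal sketch of the two kinds of labels. Only the values 8 and 6 for Token.StringLiteral and Token.NumericLiteral come from this post; the value for Token.Identifier and the helper names classifyWithEnum and classifyWithLiterals are assumptions made up for illustration.

```javascript
// An object used as a makeshift enumeration: each property holds a small integer.
// The value of Identifier is an illustrative assumption.
var Token = {
    Identifier: 3,
    NumericLiteral: 6,
    StringLiteral: 8
};

// Every label is a property lookup (Token.X), not a literal.
// This is the shape the "non-literal switch label" message refers to.
function classifyWithEnum(type) {
    switch (type) {
    case Token.StringLiteral:
    case Token.NumericLiteral:
        return 'literal';
    default:
        return 'identifier';
    }
}

// Same behavior, but every label is an integer literal.
function classifyWithLiterals(type) {
    switch (type) {
    case 8: // Token.StringLiteral
    case 6: // Token.NumericLiteral
        return 'literal';
    default:
        return 'identifier';
    }
}
```

Both functions return the same result for every input; they differ only in how the labels are written, which is what matters to the optimizer in this particular V8 version.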
Once the problem is known, at least two possible solutions are available. The first option is to use actual integer constants instead of fake enums, e.g.
switch (token.type) {
case 8: // Token.StringLiteral
case 6: // Token.NumericLiteral
....
}
I personally don’t like how it looks. Another alternative is to use an if statement, something that I originally wanted to do anyway (for the sake of readability). The function now looks like:
function parseObjectPropertyKey() {
    var token = lex();
    if (token.type === Token.StringLiteral || token.type === Token.NumericLiteral) {
        if (strict && token.octal) {
            throwError(token, Messages.StrictOctalLiteral);
        }
        return createLiteral(token);
    }
    return {
        type: Syntax.Identifier,
        name: token.value
    };
}
After this change, when I traced for possible bailouts again, V8 did not complain. I also modified the companion parseObjectProperty() function to follow a similar pattern. All is good!
But how about the overall speed? Running the full benchmark suite with the updated version of the code did not show any noticeable speed-up or slow-down. This is to be expected. After all, my original intention was to improve the code clarity without affecting the performance. The reason for no radical speed difference is very simple. First, the function is not executed very often. When the parser consumes the jQuery source, it hits the parseObjectPropertyKey() function only about 600 times. Compare this to parsePrimaryExpression(), which gets executed more than 11 thousand times. Second, the task carried out by this function is extremely simple; there is no heavy computation or back-breaking work.
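How such call counts are gathered is not spelled out here. One simple way, shown only as a sketch, is to wrap the function of interest in a counter. The countCalls helper and the stub body of parseObjectPropertyKey below are hypothetical; a real measurement would wrap the actual parser function.

```javascript
// Wrap a function so that every call increments a counter on the wrapper.
// countCalls is a hypothetical helper, not part of Esprima or V8.
function countCalls(fn) {
    function wrapped() {
        wrapped.calls += 1;
        return fn.apply(this, arguments);
    }
    wrapped.calls = 0;
    return wrapped;
}

// Stand-in for the real parser function, used only to keep the sketch runnable.
function parseObjectPropertyKey() {
    return { type: 'Identifier', name: 'x' };
}

var counted = countCalls(parseObjectPropertyKey);
counted();
counted();
counted();
console.log(counted.calls); // prints 3
```

The wrapper passes arguments and the return value through unchanged, so it can stand in for the original function while a source file is being parsed.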
For completeness, and also emphasized by the last point, the standard optimization disclaimer follows. Unless you are extremely sure that the switch statement is quite complex and being hit a gazillion times in a typical circumstance, do not hope that changing it into another way of branching will magically make it faster. Take into account that different JavaScript engines might behave differently and therefore verify your theory with various JavaScript environments. Even two different versions of the same engine can show two different results, e.g. the above bail out condition I stumbled upon may just become obsolete in the future.
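For anyone who wants to check a particular engine, a tiny timing harness along these lines can serve as a starting point. It is a rough sketch, not the benchmark suite used for Esprima: the names viaSwitch, viaIf and time, the input mix and the iteration count are all assumptions chosen for illustration.

```javascript
// Token values as in the integer-constant example above.
var Token = { NumericLiteral: 6, StringLiteral: 8 };

function viaSwitch(type) {
    switch (type) {
    case Token.StringLiteral:
    case Token.NumericLiteral:
        return 1;
    default:
        return 0;
    }
}

function viaIf(type) {
    if (type === Token.StringLiteral || type === Token.NumericLiteral) {
        return 1;
    }
    return 0;
}

// Call fn repeatedly over a fixed mix of inputs.
// Returns the elapsed time in milliseconds and a checksum of the results.
function time(fn, iterations) {
    var types = [3, 6, 8, 4],
        sum = 0,
        start = Date.now(),
        i;
    for (i = 0; i < iterations; i += 1) {
        sum += fn(types[i % types.length]);
    }
    return { ms: Date.now() - start, sum: sum };
}

var a = time(viaSwitch, 1e6),
    b = time(viaIf, 1e6);
console.log('switch: ' + a.ms + ' ms, if: ' + b.ms + ' ms');
```

The checksum is there so that the two variants can be confirmed to do the same work. Timings from such a small loop are noisy, so they are only meaningful when repeated several times and across several engines.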
As a closing trivia, you might notice that V8 has two different optimizers: high-level and low-level, referred to in the source code as Hydrogen and Lithium (notice the initial letters, H and L). There can be many reasons why these names were picked. My own fictionalized backstory is simple: the compound lithium hydride (LiH) … was once tested as a fuel component in a model rocket.
In any case, that long evening I had was educational and entertaining. That matters!