Introducing Esprima: Blazing-fast JavaScript Parser

In a nutshell, Esprima (esprima.org) is a JavaScript parser written in pure JavaScript. In the near future, it will grow into something even cooler, but as of now it’s just a parser. It uses the common recursive descent approach. The main parsing routine is not machine generated; everything is written by hand. The output of the parser is a syntax tree in JSON, in a format compatible with the Mozilla Parser API.
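To give a feel for the recursive descent approach and the shape of the resulting tree, here is a minimal sketch. It is not Esprima's code: it only handles decimal numbers, the four arithmetic operators, and brackets, and the function name parseExpression is made up for this illustration. The node layout (Literal with a value, BinaryExpression with operator, left, and right) follows the Mozilla Parser API.

```javascript
// Tiny hand-written recursive descent parser for arithmetic expressions.
// Illustration only: this is NOT Esprima's implementation.
function parseExpression(source) {
    var index = 0;

    function skipWhitespace() {
        while (/\s/.test(source.charAt(index))) {
            index += 1;
        }
    }

    // Primary: a decimal integer or a bracketed expression.
    function parsePrimary() {
        var start, expr;
        skipWhitespace();
        if (source.charAt(index) === '(') {
            index += 1;
            expr = parseAdditive();
            skipWhitespace();
            if (source.charAt(index) !== ')') {
                throw new Error('Expected ) at position ' + index);
            }
            index += 1;
            return expr;
        }
        start = index;
        while (/[0-9]/.test(source.charAt(index))) {
            index += 1;
        }
        if (start === index) {
            throw new Error('Unexpected token at position ' + index);
        }
        return { type: 'Literal', value: Number(source.slice(start, index)) };
    }

    // One left-associative precedence level, e.g. "a * b / c".
    function parseBinary(operators, parseOperand) {
        var left = parseOperand(), operator, right;
        skipWhitespace();
        while (operators.indexOf(source.charAt(index)) >= 0) {
            operator = source.charAt(index);
            index += 1;
            right = parseOperand();
            left = { type: 'BinaryExpression', operator: operator, left: left, right: right };
            skipWhitespace();
        }
        return left;
    }

    // Each grammar rule gets its own function; lower precedence calls higher.
    function parseMultiplicative() {
        return parseBinary(['*', '/'], parsePrimary);
    }

    function parseAdditive() {
        return parseBinary(['+', '-'], parseMultiplicative);
    }

    var tree = parseAdditive();
    skipWhitespace();
    if (index < source.length) {
        throw new Error('Unexpected token at position ' + index);
    }
    return tree;
}

// The multiplication ends up nested inside the addition.
console.log(JSON.stringify(parseExpression('1 + 2 * 3'), null, 2));
```

Each precedence level maps to one function, which is what keeps this style of parser easy to read and to step through in a debugger.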

The code is designed to be educational (no funky obfuscated tricks only a JavaScript ninja can decipher), self-explanatory (the terminology matches the official 258-page specification), and fast (it can tear apart the jQuery source code, and not the minified version, in less than 0.1 sec). It’s always challenging to find the sweet spot that nails all three objectives, though I hope Esprima hits an optimal compromise.

As with any complex parser, unit testing is an integral part of the development. To ensure faithful compatibility with the Mozilla Parser API, hundreds of its tests have been imported as well. All in all, there are over a thousand tests. In addition, there is a benchmark suite consisting of the most common JavaScript libraries out there. The performance of various web browsers running the benchmark suite is depicted in the following chart (shorter is better). The test machine is a late 2010 iMac with a 3 GHz Intel Core i3.

If you think it’s not fast enough, wait for the improvements being made to the major JavaScript engines out there. Preliminary tests showed that the V8 engine in Chrome 17 (dev channel) executes the benchmark suite 1.7x faster than Chrome 15. Similarly, JavaScriptCore in WebKit nightly improves the benchmark running time by 25% (and it keeps getting faster). In addition, Firefox 9 will feature type inference, which shows a 65% performance win when running the same benchmark suite.

What about mobile devices? As expected, they are rather slower at this kind of job, limited pretty much by CPU power. Some running times for the benchmark suite: 5.8 sec for Amazon Kindle Fire, 7.9 sec for Apple iPad 2, 12.8 sec for Nexus S, and 17.9 sec for Nokia N9.

Since Esprima is written in JavaScript, it runs wherever there is a decent implementation of JavaScript. Supported browsers are (among others) IE 8+, Firefox 3.5+, Safari 4+, Chrome 7+, and Opera 10.5+. As expected, Esprima can also be used in Node.js applications by installing the esprima package using npm.
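For Node.js, a minimal sketch might look like the following. It assumes the package has been installed with "npm install esprima" and that the module exposes a parse function taking the source text. The helper parseIfAvailable is made up for this example so that the script still runs when the package is missing.

```javascript
// Assumes the package was installed first with: npm install esprima
var esprima = null;
try {
    esprima = require('esprima');
} catch (e) {
    // Package not installed (or require is unavailable); handled below.
}

// Hypothetical helper: returns the syntax tree, or null when Esprima is missing.
function parseIfAvailable(source) {
    if (!esprima) {
        return null;
    }
    return esprima.parse(source);
}

var tree = parseIfAvailable('var answer = 42;');
console.log(tree ? JSON.stringify(tree, null, 4) : 'esprima is not installed');
```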

The best way to try Esprima is right in the browser via the online syntax parser demo. Type in your code, and voilà! Esprima will show you the corresponding syntax tree almost right away. There is also the operator precedence demo, inspired by a similar earlier demo. Besides checking whether one expression is equivalent to another, the example also rewrites your expression as if you had written it using brackets to enforce the intended precedence, as illustrated in the following screenshot:
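The idea behind the precedence demo can be sketched in a few lines. This is not the demo's actual code: the function names bracketize and isEquivalent are made up, the trees are built by hand in Mozilla Parser API style rather than produced by a parser, and only binary expressions, literals, and identifiers are handled. Because precedence is already encoded in the nesting of the tree, printing every binary node with its own brackets makes it explicit.

```javascript
// Render an expression tree with explicit brackets around every binary node.
function bracketize(node) {
    if (node.type === 'BinaryExpression') {
        return '(' + bracketize(node.left) + ' ' + node.operator + ' ' +
            bracketize(node.right) + ')';
    }
    if (node.type === 'Literal') {
        return String(node.value);
    }
    if (node.type === 'Identifier') {
        return node.name;
    }
    throw new Error('Unsupported node type: ' + node.type);
}

// Two expressions are treated as equivalent when their trees print the same.
function isEquivalent(a, b) {
    return bracketize(a) === bracketize(b);
}

// Hand-built tree for: 1 + 2 * 3 (also the tree for: 1 + (2 * 3))
var sumFirst = {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'Literal', value: 1 },
    right: {
        type: 'BinaryExpression',
        operator: '*',
        left: { type: 'Literal', value: 2 },
        right: { type: 'Literal', value: 3 }
    }
};

// Hand-built tree for: (1 + 2) * 3
var productFirst = {
    type: 'BinaryExpression',
    operator: '*',
    left: {
        type: 'BinaryExpression',
        operator: '+',
        left: { type: 'Literal', value: 1 },
        right: { type: 'Literal', value: 2 }
    },
    right: { type: 'Literal', value: 3 }
};

console.log(bracketize(sumFirst));                 // prints "(1 + (2 * 3))"
console.log(isEquivalent(sumFirst, productFirst)); // prints "false"
```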

Compared to other parsers, Esprima is one of the fastest. There is a whole speed comparison page which puts Esprima head-to-head against parse-js (famously known as part of UglifyJS), ZeParser, and Narcissus. Since Esprima does not output location information yet (see issue #6), like ZeParser and Narcissus, a pure speed benchmark is only fair between Esprima and parse-js. Here is the result, tested with different browsers (stable versions). Still not impressed? With the upcoming Chrome 17, Esprima will actually be 2x faster than parse-js.

So which parser should you pick? Narcissus has been around for a while, so its stability and correctness are well tested. It also supports various JavaScript extensions, as well as features from ES.next. Neither ZeParser nor parse-js is exactly new anymore, so they are more battle-hardened than Esprima. Since the excellent minifier UglifyJS is based on parse-js, I would not be surprised if there are tons of peculiar JavaScript constructs which parse-js handles really well. At the end of the day, I still hope that as the new kid on the block, Esprima is attractive enough since it’s readable, easy to follow, heavily unit tested, and yet carries out the parsing task at blazing speed. Thus, if you feel adventurous, give Esprima a try!

Besides parsing code, Esprima can also optionally collect the comments (see issue #71). Since this involves some extra steps, expect a minor performance penalty if you enable it. Once the comments are extracted, a bit of additional cross-referencing will allow you to associate certain comment blocks with parts of the code. This is extremely valuable for an automatic documentation tool.
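As a rough sketch of that cross-referencing step, suppose each comment and each syntax node carries a range of character offsets. That data shape is an assumption made for this illustration, not a description of what Esprima outputs, and the function attachComments as well as the offsets in the sample data are made up. Each comment is paired with the nearest node that starts after the comment ends.

```javascript
// Pair each comment with the closest node that begins after the comment.
// Assumed shapes: comment = { value, range: [start, end] },
//                 node    = { type, name, range: [start, end] }.
function attachComments(comments, nodes) {
    return comments.map(function (comment) {
        var target = null;
        nodes.forEach(function (node) {
            var startsAfter = node.range[0] >= comment.range[1];
            if (startsAfter && (target === null || node.range[0] < target.range[0])) {
                target = node;
            }
        });
        return { comment: comment.value, node: target };
    });
}

// Made-up sample data: two comment blocks, each followed by a function.
var comments = [
    { value: ' Adds two numbers ', range: [0, 22] },
    { value: ' Does nothing ', range: [60, 78] }
];
var nodes = [
    { type: 'FunctionDeclaration', name: 'add', range: [23, 58] },
    { type: 'FunctionDeclaration', name: 'noop', range: [79, 98] }
];

attachComments(comments, nodes).forEach(function (pair) {
    console.log(pair.node.name + ':' + pair.comment);
});
```

A documentation tool would likely want stricter rules (for example, only pairing a comment with a declaration that follows it immediately), but the nearest-following-node rule is enough to show the idea.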

To keep an eye on Esprima development, go to its project page, watch the issue tracker for future plans, and join the discussion on the mailing list.

Get the code and express yourself!

P.S.: Special thanks to Thomas Aylott, Yusuke Suzuki, and Axel Rauschmayer for the useful initial discussion, suggestions, and feedback.
