
On-the-fly JavaScript Syntax Node Inspection


A common approach to analyzing JavaScript source statically is to parse the source into an abstract syntax tree (AST) and then traverse the AST. An alternative approach that might work in a few cases is to inspect each syntax node as it is constructed.

A powerful feature available in Esprima since version 3.0 is the ability to invoke a callback function every time a syntax node in the abstract syntax tree is created. This callback is often referred to as the syntax delegate. If you are familiar with map for arrays, then this concept will not be strange to you. In the case of Esprima, the argument passed to the callback function is the syntax node that has just been created.

The simplest illustration of this feature is the following script for Node.js:

var esprima = require('esprima');
esprima.parse('answer = 42', {}, function(n) {
  console.log(n.type);
});

After installing the Esprima module (npm install esprima), running the script gives the output:

Identifier
Literal
AssignmentExpression
ExpressionStatement
Program

Our callback function, which does nothing but print the node’s type, was called five times. The first two calls were with the deepest nodes (which were also leaves): an Identifier node representing answer and a Literal node for 42. The next was an AssignmentExpression node that combines the previous two nodes (answer = 42). It was followed by the only statement in the code fragment, an ExpressionStatement node. The last one was (as always) the top-level Program node.

If you pay attention, this is essentially very similar to a depth-first traversal in which every child is visited before its parent (post-order). In this case, however, we do not wait until the AST is completely constructed. Heck, we do not even save the result of parse at all.
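To make the comparison concrete, here is a minimal sketch of a post-order walk over an AST that is already fully built. It does not use Esprima at all: the ast object is written by hand in the shape Esprima would produce for answer = 42, and visitPostOrder is a hypothetical helper name chosen for this illustration only.

```javascript
// Hand-written AST for `answer = 42` (assumed to mirror Esprima's output shape).
var ast = {
  type: 'Program',
  body: [{
    type: 'ExpressionStatement',
    expression: {
      type: 'AssignmentExpression',
      operator: '=',
      left: { type: 'Identifier', name: 'answer' },
      right: { type: 'Literal', value: 42 }
    }
  }]
};

// Hypothetical helper: visit all children first, then the node itself.
function visitPostOrder(node, callback) {
  Object.keys(node).forEach(function (key) {
    var child = node[key];
    if (Array.isArray(child)) {
      // A list of child nodes, e.g. Program.body
      child.forEach(function (item) {
        if (item && typeof item.type === 'string') {
          visitPostOrder(item, callback);
        }
      });
    } else if (child && typeof child.type === 'string') {
      // A single child node, e.g. AssignmentExpression.left
      visitPostOrder(child, callback);
    }
  });
  callback(node);
}

var visited = [];
visitPostOrder(ast, function (n) {
  visited.push(n.type);
});
console.log(visited.join('\n'));
```

The walk reports the node types in the same order as the delegate did: leaves first, Program last. The difference is that this version needs the complete tree in memory before the first callback runs.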

A practical use of the syntax delegate is to detect the presence of a particular node. Let us assume you want to ban the use of the ternary operator. In other words, given a file bad.js such as the following:

var hour = 9;
var msg = (hour < 12) ? 'morning' : 'afternoon';
console.log('Good ' + msg + '!');

you want to be warned because the second line there utilizes the ternary operator. In the AST produced by Esprima, a ternary operator is represented by a node of the type ConditionalExpression (try it yourself with the online parser demo). Thus, all we have to do is to construct a callback function that looks for such a node. Here is a possible implementation (yes, it is only 8 lines of code!):

var fs = require('fs');
var esprima = require('esprima');
var contents = fs.readFileSync(process.argv[2], 'utf-8');
esprima.parse(contents, {}, function (node, meta) {
  if (node.type === 'ConditionalExpression') {
    console.log('Ternary at line', meta.start.line);
  }
});


If the above script is named warn-ternary.js, it can be invoked with Node.js as follows:

$ node warn-ternary.js bad.js
Ternary at line 2

What is new here is the second argument to the callback function. It contains the metadata of the node. The most important metadata is the start and end locations of the node. Hence, it is possible to find the line number that contains the ternary operator by using meta.start.line.
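If you would rather collect the findings than print them right away, the delegate can be produced by a small factory function. The following is only a sketch: createNodeCollector is a hypothetical helper, not part of Esprima, and the use of meta.end.line assumes that the end location has the same shape as the start location. So that the snippet runs without Esprima installed, the delegate is called by hand with made-up node and metadata objects.

```javascript
// Hypothetical helper (not part of Esprima): builds a delegate that
// records every node of the given type along with its line range.
function createNodeCollector(targetType) {
  var found = [];
  function delegate(node, meta) {
    if (node.type === targetType) {
      found.push({
        type: node.type,
        startLine: meta.start.line,
        endLine: meta.end.line // assumption: end mirrors the shape of start
      });
    }
  }
  return { delegate: delegate, found: found };
}

var collector = createNodeCollector('ConditionalExpression');

// Simulated delegate calls with made-up nodes and metadata,
// standing in for what the parser would pass.
collector.delegate({ type: 'Identifier' },
                   { start: { line: 1 }, end: { line: 1 } });
collector.delegate({ type: 'ConditionalExpression' },
                   { start: { line: 2 }, end: { line: 2 } });

console.log(collector.found);
```

With Esprima installed, collector.delegate can be passed as the third argument to esprima.parse, exactly like the inline callback in warn-ternary.js, and collector.found can be inspected once parsing has finished.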

These are two simple examples to get you started. If you want to explore this powerful feature of Esprima further, I highly recommend launching your favorite Node.js debugger, following the code execution, and inspecting the way the callback function is invoked. Who knows, perhaps you will come up with a fantastic tool for our beloved JavaScript!
