
JavaScript Identifier Length Distribution


After the fun distribution charts of statements and keywords in popular JavaScript libraries, it is time for another metrics analysis. For a while, I have been wondering how JavaScript developers come up with variable names, function names, and other identifiers. Is it just a few characters? Is it not that short? Is it always descriptive? The following script, idlen.js (to be executed with Node.js), uses the parser from Esprima to dump all the identifiers, excluding duplicates, of each file in its corpus of libraries (the one used for the benchmark suite).

var fs = require('fs'),
    esprima = require('esprima'),
    files = process.argv.splice(2); // file names passed on the command line

files.forEach(function (filename) {
    var identifiers = {},
        content = fs.readFileSync(filename, 'utf-8'),
        syntax = esprima.parse(content);

    // Use the JSON.stringify replacer to visit every node of the syntax tree
    JSON.stringify(syntax, function (key, value) {
        if (key === 'name' && typeof identifiers[value] === 'undefined') {
            identifiers[value] = value.length;
        }
        return value;
    });

    // Print one length per line so that sort and uniq can tally them
    for (var key in identifiers) {
        if (identifiers.hasOwnProperty(key)) {
            console.log(identifiers[key]);
        }
    }
});
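
The script relies on the replacer callback of JSON.stringify as a cheap tree walker: the callback is invoked for every key/value pair in the tree, so every `name` property gets visited. Here is a minimal sketch of that trick on its own. The `sampleTree` object is a hand-written, hypothetical stand-in rather than real Esprima output, and `collectNames` is a helper name made up for this illustration.

```javascript
// Collect the unique values of every "name" property in a nested object,
// using the JSON.stringify replacer to walk the tree.
function collectNames(tree) {
    // No prototype, so names such as "constructor" are not mistaken for seen ones
    var seen = Object.create(null);
    JSON.stringify(tree, function (key, value) {
        if (key === 'name' && typeof value === 'string') {
            seen[value] = value.length;
        }
        return value; // returning the value keeps the traversal going
    });
    return seen;
}

// Hypothetical stand-in for a parsed syntax tree
var sampleTree = {
    type: 'Program',
    body: [
        { type: 'Identifier', name: 'total' },
        { type: 'Identifier', name: 'i' },
        { type: 'Identifier', name: 'total' }
    ]
};

var lengths = collectNames(sampleTree);
console.log(lengths);
```

The duplicate `total` is recorded only once, which mirrors the "excluding duplicates" behavior described above.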

With the help of Unix tools:

node idlen.js /path/to/some/*.js | sort -n | uniq -c

the distribution will look like the following diagram:

There is a long tail from 15 characters and above, which makes sense since identifiers that long are likely to be special cases. Excluding this long-tail region, the data roughly follows the expected normal distribution. The actual mean identifier length is 8.27 characters.
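
For those without the Unix tools at hand, the tally and the mean can also be computed in JavaScript. This is only a sketch: the helper names `histogram` and `mean` are invented for this example, and `sampleLengths` is made-up data, not the measurements behind the numbers above.

```javascript
// Count how many identifiers have each length
// (the same tally that `sort -n | uniq -c` produces).
function histogram(lengths) {
    var counts = {};
    lengths.forEach(function (len) {
        counts[len] = (counts[len] || 0) + 1;
    });
    return counts;
}

// Arithmetic mean of the lengths; NaN for an empty list.
function mean(lengths) {
    if (lengths.length === 0) {
        return NaN;
    }
    var sum = lengths.reduce(function (acc, len) {
        return acc + len;
    }, 0);
    return sum / lengths.length;
}

// Made-up sample lengths, purely for illustration
var sampleLengths = [1, 3, 3, 5, 8, 8, 8, 12];
var sampleHistogram = histogram(sampleLengths);
var sampleMean = mean(sampleLengths);

console.log(sampleHistogram);
console.log(sampleMean);
```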

For the fun of it, the top 5 longest identifiers found among the libraries, with over 34 characters, are:

     jquery-1.7.1.js   subtractsBorderForOverflowNotVisible   getClosestElementWithVirtualBinding
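
Finding the longest identifiers is a matter of sorting by length. A small sketch follows, assuming the identifiers have already been collected into an array of strings. The helper name `longest` and the sample array are illustrative only.

```javascript
// Return the n longest identifiers, longest first.
// Ties are broken alphabetically so the result is deterministic.
function longest(identifiers, n) {
    return identifiers.slice().sort(function (a, b) {
        return (b.length - a.length) || a.localeCompare(b);
    }).slice(0, n);
}

// Illustrative input, mixing two identifiers mentioned above with short ones
var sampleIdentifiers = [
    'i',
    'getClosestElementWithVirtualBinding',
    'len',
    'subtractsBorderForOverflowNotVisible',
    'callback'
];

var top2 = longest(sampleIdentifiers, 2);
console.log(top2);
```

The input array is copied with `slice()` before sorting, so the caller's list keeps its original order.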

What kind of distribution do you get for your own JavaScript project?
