ariya.io About Talks Articles

PhantomJS: Minimalistic Headless WebKit-based JavaScript-driven Tool

4 min read

PhantomJS is a headless WebKit packaged as a JavaScript-driven tool. It can be used in command-line utilities which requires web stack, or even as the basis for testing rich web application. It uses WebKit in a headless mode, so you get access to the real native and fast implementation (not a simulated environment) of various standards such as DOM, CSS selector, Canvas, SVG, and many others.

The project page contains a bunch of examples, from easy ones to some more complicated uses. Feel free to contribute more examples!

Let’s look at one of the examples, the page rasterizer (yes, it’s only 16 lines!):

if (phantom.state.length === ) {
    if (phantom.args.length !== 2) {
        console.log('Usage: rasterize.js URL filename');
        phantom.exit();
    } else {
        var address = phantom.args[];
        phantom.state = 'rasterize';
        phantom.viewportSize = { width: 600, height: 600 };
        phantom.open(address);
    }
} else {
    var output = phantom.args[1];
    phantom.sleep(200);
    phantom.render(output);
    phantom.exit();
}

If I want to have the famous PostScript tiger from its SVG source, all I have to do is to run:

phantomjs rasterize.js http://ariya.github.io/svg/tiger.svg tiger.png

But static vector graphic is boring. Replacing the above with

phantomjs rasterize.js http://raphaeljs.com/polar-clock.html clock.png

gives me Polar Clock, one notable example from RaphaelJS.

Polar Clock

Should you need to deal with JSONP, process XML, and integrate with YQL, that’s all easily done. Again, refer to the various service integration examples. Let me show one example, which is actually my favorite:

if (phantom.state.length === ) {
    var origin, dest;
    if (phantom.args.length < 2) {
        console.log('Usage: direction.js origin destination');
        console.log('Example: direction.js "San Diego" "Palo Alto"');
        phantom.exit(1);
    }
    origin = phantom.args[];
    dest = phantom.args[1];
    phantom.state = origin + ' to ' + dest;
    phantom.open(encodeURI('http://maps.googleapis.com/maps/api/directions/xml?origin='
        + origin +  '&destination;=' + dest +
        '&units;=imperial&mode;=driving&sensor;=false'));
} else {
    if (phantom.loadStatus === 'fail') {
        console.log('Unable to access network');
    } else {
        var steps;
        steps = phantom.content.match(/<html_instructions>(.*)/ig);
        if (steps == null) {
            console.log('No data available for ' + phantom.state);
        } else {
            steps.forEach(function (ins) {
                ins = ins.replace(/</ig, '< ').replace(/>/ig, '>');
                ins = ins.replace(/<div>ig, 'n<div ');
                ins = ins.replace(</div>< .*?>/g, '');
                console.log(ins);
            });
        }
    }
    phantom.exit();
}</div></div>

If I run it like the following:

phantomjs direction.js 'Redwood City' 'Sunnyvale'

what I got is the complete driving direction:

Head east on Broadway toward El Camino Real
Take the 1st left onto El Camino Real
Turn right at Whipple Ave
Slight right to merge onto US-101 S toward San Jose
Take exit 398B to merge onto CA-85 S toward Santa Cruz/Cupertino
Take exit 22A to merge onto CA-82 S/E El Camino Real toward Sunnyvale
Destination will be on the right

Map data ©2011 Google

Make sure you check out other examples, such as getting weather forecast conditions, finding pizza in New York, looking up approximate location based on IP address, pulling the list of seasonal food, displaying tweets, and many others.

Headless execution of any web content also enables fast unit testing. Obviously, the goal is not to replace comprehensive, cross-browser framework such as Selenium or Squish for Web. Rather, it serves a quick sanity check just before you check in some changes.

Since this can happen automatically and does not need to launch any browser, even better, you can hook the test so that it executes right before a commit and actually prevents the commit if any of the test fails. It is easily done using git via its hook support. This is something I have written at Sencha blog. It demonstrated precommit hook with Jasmine, but technically it can work with any test framework.

I have been working on and off on PhantomJS for the past few years. You may be already familiar with some of its inspiration (also involving headless WebKit): SVG rasterizer, page capture, visual Google, etc.

Finally I managed to overcome my laziness, cleaned up the code, and published it for your pleasure. Obviously it’s not a surprise if you find out that PhantomJS uses QtWebKit.

I got a few tasks for next PhantomJS version 1.1. You are encouraged to file bugs and feature requests in the said issue tracker.

Get it while it is hot!

Related posts:

♡ this article? Explore more articles and follow me Twitter.

Share this on Twitter Facebook