
Cloud PhantomJS with IronWorker


PhantomJS is cloud-friendly as it runs well on Amazon EC2, Heroku, and other server platforms. But what if you want to use PhantomJS without the hassle of setting up and maintaining a server? The solution is to use IronWorker, an elastic task queue service.

As a service from Iron.io, IronWorker runs background tasks with minimal setup and reasonable pricing. The team's latest blog post, Serverless PhantomJS with IronWorker, already gives step-by-step instructions on how to run PhantomJS-based tasks with IronWorker. In this post, I present a complementary quick tutorial which shows the (deadly) combination of PhantomJS and IronWorker. The basic code is taken from the IronWorker examples repository.

First of all, we need the command-line tool to easily manage any IronWorker tasks. This is available as a Ruby gem called iron_worker_ng, thus it’s rather easy to install:

sudo gem install iron_worker_ng

Next, sign up for an account. The free plan gives you 5 hours of IronWorker service per month. Once you are signed in, create a new project and give it a name. Then find the project ID so you can retrieve the authentication info. Look for the link Download json file because this simplifies the step (no need for manual copy and paste). Save that JSON file to a working directory, rename it to .iron.json, and now we are ready to rock.
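
Before uploading anything, it can be worth confirming that the downloaded file carries the fields the command-line tool looks for. The following is only a minimal sketch: it assumes Node.js is available locally (this tutorial does not otherwise need it) and that the file contains the keys token and project_id. The helper name checkIronConfig is hypothetical, and the sample values are placeholders.

```javascript
// Hypothetical helper: returns the required keys that are missing from
// the JSON text of an .iron.json file (an empty list means it looks usable).
function checkIronConfig(jsonText) {
  var required = ['token', 'project_id']; // assumed key names
  var config = JSON.parse(jsonText) || {}; // throws on malformed JSON
  return required.filter(function (key) {
    return typeof config[key] !== 'string' || config[key].length === 0;
  });
}

// Example with placeholder values (not real credentials).
var sample = '{"token": "MY_TOKEN", "project_id": "MY_PROJECT_ID"}';
console.log(checkIronConfig(sample)); // prints []
```

To check the real file, read its contents (for example with fs.readFileSync('.iron.json', 'utf8') in Node.js) and pass the text to the helper.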

The worker itself is rather simple. Create a file phantom.worker with the following contents:

runtime "binary"
exec "run.sh"
file "task.js"
remote_build_command 'curl http://phantomjs.googlecode.com/files/phantomjs-1.6.1-linux-x86_64-dynamic.tar.bz2 -o p.tar.bz2 && tar xf p.tar.bz2 && rm p.tar.bz2'

The actual task is task.js (this is the PhantomJS script containing the task you want to carry out) which will get executed by run.sh:

phantomjs-1.6.1-linux-x86_64-dynamic/bin/phantomjs task.js

Let’s have a simple task to run in task.js:

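// Twitter account whose follower count we want to report.
var account = 'PhantomJS';
var page = require('webpage').create();

// Skip image downloads; only the text of the page is needed.
page.settings.loadImages = false;

page.open('http://mobile.twitter.com/' + account, function (status) {
  if (status !== 'success') {
    console.log('Error');
  } else {
    // Runs inside the page; returns null if the element is missing.
    var num = page.evaluate(function () {
      var selector = 'div.profile td.stat.stat-last div.statnum';
      var element = document.querySelector(selector);
      return element ? element.innerText : null;
    });
    if (num === null) {
      console.log('Could not find the follower count for @' + account);
    } else {
      console.log('@' + account, 'has', num, 'followers');
    }
  }
  phantom.exit();
});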

This is a simple script which tells you the number of Twitter followers for a specific account, in this case @PhantomJS, the official Twitter account for PhantomJS. Run it locally first with your installation of PhantomJS to make sure that the script works.
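
The value that comes back from page.evaluate is display text rather than a number, so if you plan to store or compare the count it needs converting first. Here is a minimal sketch, assuming the page renders counts as plain digits with optional thousands separators such as 1,234 (an assumption about the markup, not something verified here). The helper name parseCount is hypothetical.

```javascript
// Hypothetical helper: turn display text such as "1,234" into a number.
// Returns null when the text holds no digits at all.
// Abbreviated forms such as "12.3K" are not handled by this sketch.
function parseCount(text) {
  var digits = String(text).replace(/[^0-9]/g, '');
  return digits.length > 0 ? parseInt(digits, 10) : null;
}

console.log(parseCount('1,234')); // prints 1234
console.log(parseCount(' 987 ')); // prints 987
console.log(parseCount('n/a'));   // prints null
```

The function uses nothing beyond plain JavaScript, so it could be pasted into task.js and applied to num before logging.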

Once you have these three files (phantom.worker, run.sh, and task.js), setting up the task from the console is as easy as:

iron_worker upload phantom
iron_worker queue phantom

Give it a few seconds (or minutes) and then check your project in the Iron.io dashboard (look in the Tasks tab). You should see a report that the task has been executed, along with the result (log). Very simple, isn’t it?

Note that you can also take advantage of the scheduling power of IronWorker. For example, you can create a time-based repeating task, or queue a task to run at a specific time. The sky’s the limit.

In the meantime, I’m back to work on my project proposal to Stark Industries…
