
Web Page Clipping with PhantomJS


One of the major uses of PhantomJS is capturing web pages and rendering them as images. Many aspects of the rendering can be tweaked; the most popular is the zoom factor, which is particularly useful for creating thumbnails. A lesser-known parameter is the clipping rectangle, which is very handy for limiting the capture to a selected area.

A common use of the clipping rectangle is rendering a single element: you find the position and size of that element (likely via getBoundingClientRect) and render only that area. This is also what CasperJS offers in its convenient captureSelector function.
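As a minimal sketch of that element-clipping idea, the script below measures an element inside the page and assigns the result to page.clipRect. The helper name toClipRect, the '#content' selector, the example.com URL, and the output file name are all assumptions made for illustration, and the sketch assumes the page has not been scrolled, so viewport coordinates match page coordinates.

```javascript
// Convert a DOMRect-like object (as returned by getBoundingClientRect)
// into the { left, top, width, height } shape used by page.clipRect.
// Edges are rounded outwards so fractional pixels are not cut off.
function toClipRect(rect) {
    var left = Math.floor(rect.left);
    var top = Math.floor(rect.top);
    return {
        left: left,
        top: top,
        width: Math.ceil(rect.left + rect.width) - left,
        height: Math.ceil(rect.top + rect.height) - top
    };
}

// This part only runs inside PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var selector = '#content'; // hypothetical selector, adjust to your page
    page.open('http://example.com/', function (status) {
        if (status !== 'success') {
            console.log('Unable to load the page!');
            phantom.exit(1);
            return;
        }
        // The function passed to page.evaluate runs in the page context.
        // Its return value must be serializable, so copy the fields by hand.
        var rect = page.evaluate(function (sel) {
            var el = document.querySelector(sel);
            if (!el) { return null; }
            var r = el.getBoundingClientRect();
            return { left: r.left, top: r.top, width: r.width, height: r.height };
        }, selector);
        if (rect) {
            page.clipRect = toClipRect(rect);
            page.render('element.png');
        } else {
            console.log('No element matches ' + selector);
        }
        phantom.exit();
    });
}
```

Keeping the rounding in a separate function makes it easy to check on its own, outside of PhantomJS.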

Another variant is to produce a semi-paginated version of a web page. On mobile in particular, the user typically needs to scroll up and down, so a feeling of continuity in the content is quite important. For example, the very first “page” needs to show a complete intro. An important central image that is cut in half does not give a good impression.

The following images show the approximate rendering of the BBC web site on mobile, which looks really good: a full intro followed by a bunch of story summaries.

[Image: pageclip]

The script used to generate the two images above is shown here.

var page = require('webpage').create();
page.settings.userAgent = 'WebKit/534.46 Mobile/9A405 Safari/7534.48.3';
page.viewportSize = { width: 400, height: 600 };
page.open('http://m.bbc.co.uk/news/business', function (status) {
    if (status !== 'success') {
        console.log('Unable to load BBC!');
        phantom.exit();
    } else {
        window.setTimeout(function () {
            page.clipRect = { left: 0, top: 0, width: 400, height: 600 };
            page.render('bbc-page1.png');
            page.clipRect = { left: 0, top: 600, width: 400, height: 600 };
            page.render('bbc-page2.png');
            phantom.exit();
        }, 2000);
    }
});

It is rather simple, especially if you understand the screen capture use case of PhantomJS. The user agent tweak is necessary to ensure that the mobile version of the page is delivered. Two images are produced, corresponding to the first and second “page” the user is going to see. This is achieved by adjusting the top value of the clipping rectangle to the offset of each page. Pretty straightforward!
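If more than two pages are needed, the top offsets can be computed rather than typed by hand. The following is a rough sketch under that assumption; pageClipRects is a hypothetical helper, not part of PhantomJS.

```javascript
// Build one clipping rectangle per "page": every rectangle has the same
// size, and only the top offset advances by one page height each time.
function pageClipRects(width, pageHeight, pageCount) {
    var rects = [];
    for (var i = 0; i < pageCount; i += 1) {
        rects.push({
            left: 0,
            top: i * pageHeight,
            width: width,
            height: pageHeight
        });
    }
    return rects;
}

// Inside the page.open callback, the render calls could then become a loop:
//
//     pageClipRects(400, 600, 2).forEach(function (rect, i) {
//         page.clipRect = rect;
//         page.render('bbc-page' + (i + 1) + '.png');
//     });
```

With a width of 400, a page height of 600 and a count of 2, this yields the same two rectangles as the script above.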

What do you want to webclip today?
