Friday, December 21, 2018

Puppeteer, the future of Automation

Most of us have used the Chrome browser for years, and it is widely regarded as one of the best browsers available.

Coming to the main topic: to test the functionality of a website, people have typically chosen Selenium for automation, and there is no doubt that its automation scripts have helped a lot to this day.

But if you have used Selenium, you know the pain: once the test cases start, you can't keep working, because it opens a browser window, runs through the website, and automatically clicks and traverses the site.

This is a waste of resources, since the developer can't work on anything else while the test cases are running.

To overcome this, Google added a headless mode to Chrome: think of a browser running in its own context in memory, without the UI, that can be controlled via an API provided by Google.

So if you want to test the functionality of a website, you can run it in a headless browser, that is, a browser without a window, and control the flow via the API. That API is Puppeteer.

For now, Google provides Puppeteer as a Node.js API. It drives Chrome / Chromium in headless mode by default, but it can be configured to run the full (non-headless) browser.
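As a rough sketch of that configuration switch: `headless` and `slowMo` are real Puppeteer launch options, while the `launchOptions` helper, its `debug` flag, and the 100 ms delay are made up here for illustration.

```javascript
// Hypothetical helper: picks Puppeteer launch options for a run.
function launchOptions({ debug = false } = {}) {
    if (debug) {
        // Show the browser window and slow each operation down by 100 ms
        // so the run can be followed by eye.
        return { headless: false, slowMo: 100 };
    }
    // Default: no visible window.
    return { headless: true };
}

async function openBrowser(options) {
    // Loaded lazily so launchOptions can be used without Puppeteer installed.
    const puppeteer = require("puppeteer");
    return puppeteer.launch(launchOptions(options));
}
```

Calling `openBrowser({ debug: true })` should open a visible Chrome window, while `openBrowser()` keeps everything headless.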

You can try a demo of Puppeteer here: https://try-puppeteer.appspot.com/

Here are some of the advantages of Puppeteer:
  1. Generate screenshots and PDFs of pages.
  2. Write automation scripts that test the flow of web pages by mimicking user behavior.
  3. Render client-side pages on the server via Node.js and serve the resulting HTML to the client.
You can find further information about Puppeteer at https://github.com/GoogleChrome/puppeteer.
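To give a feel for the first two points, here is a minimal sketch of a user-like flow followed by a screenshot. The `page` methods used are part of Puppeteer's Page API, but the function name, the URL passed in, and the `#search`, `#submit` and `#results` selectors are assumptions for illustration only.

```javascript
// Sketch: drive a page like a user would, then capture the result.
// `page` is a Puppeteer Page; the selectors below are hypothetical.
async function searchAndCapture(page, { url, query, screenshotPath }) {
    // Open the page and wait until no network requests are pending.
    await page.goto(url, { waitUntil: "networkidle0" });
    // Type into a search box and submit, as a user would.
    await page.type("#search", query);
    await page.click("#submit");
    // Wait for the results container to appear.
    await page.waitForSelector("#results");
    // Save a full-page screenshot to disk.
    await page.screenshot({ path: screenshotPath, fullPage: true });
    return screenshotPath;
}
```

The function takes the page as a parameter, so the same flow can run against a headless or a visible browser.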

In this article, we will take the example of creating a PDF from HTML rendered in Chrome.

I have created a demo with a Handlebars template that contains a static message and a dynamic message.

template.hbs
<html>
    <head>
        <style type="text/css" media="all">
            @import url(https://fonts.googleapis.com/css?family=Lato:300,400,700);
        </style>
    </head>
    <body>
        This is generated by Puppeteer using the Handlebar template and this message is passed as a variable to the
        template
        <br />
        <hr>
        <b>{{message}}</b>
    </body>
</html>

In this file, {{message}} inside the <b> tag is the dynamic variable.

We are going to compile this template and pass it a model to generate an HTML string using the Handlebars API, and then create a PDF from that generated HTML.
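To make the compile step concrete, the sketch below is a toy stand-in for a template engine. It is not Handlebars and not how Handlebars is implemented; the names `toyCompile` and `escapeHtml` are invented here. It only mirrors the shape of the real call, `hbs.compile(template)(data)`: compiling returns a function, and calling that function with a model returns an HTML string in which the `{{...}}` values have been HTML-escaped.

```javascript
// Toy stand-in for a template engine; NOT how Handlebars is implemented.
const ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// Escape characters that have a special meaning in HTML.
function escapeHtml(value) {
    return String(value).replace(/[&<>"']/g, (ch) => ESCAPES[ch]);
}

// "Compile" a template into a reusable function of the model.
// Placeholders with no matching key in the model render as an empty string.
function toyCompile(template) {
    return (model) =>
        template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, key) =>
            key in model ? escapeHtml(model[key]) : ""
        );
}

const render = toyCompile("<b>{{message}}</b>");
console.log(render({ message: "Fish & chips" })); // <b>Fish &amp; chips</b>
```

The real library handles far more (helpers, blocks, partials), so treat this only as a picture of the template-plus-model idea.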

The whole code is in index.js

const puppeteer = require("puppeteer");
const hbs = require("handlebars");
const fs = require("fs-extra");
const path = require("path");

/**
 * Reads the template file and compiles it with the values we provide.
 * Returns the final HTML string.
 */
const compile = async function (templateName, data) {
    const filePath = path.join(process.cwd(), templateName);
    const template = await fs.readFile(filePath, "utf-8");
    return hbs.compile(template)(data);
};

const generatePDFByteArray = async ({
    templateName = "./template.hbs",
    data
}) => {
    /**
     * Get the final HTML string by compiling the template with 'data'.
     * The template uses a variable named 'message', so the caller
     * (at the bottom of this file) passes 'data' with a key named 'message'.
     */
    const content = await compile(templateName, data);

    /**
     * Launch headless Chrome in memory.
     */
    const browser = await puppeteer.launch();

    try {
        /**
         * Create a new page (tab).
         */
        const page = await browser.newPage();

        /**
         * Tell Chrome to emulate the 'screen' media type, i.e. how the page
         * would look in a normal browser rather than in print.
         * Note: newer Puppeteer releases name this method page.emulateMediaType.
         */
        await page.emulateMedia('screen');

        /**
         * Set the content of the new page. 'networkidle0' waits until there
         * are no pending network requests, so fonts or other resources loaded
         * from the internet are in place before the PDF is captured.
         * (This relies on setContent accepting a waitUntil option;
         * older Puppeteer releases may not support it.)
         */
        await page.setContent(content, {
            waitUntil: 'networkidle0'
        });

        /**
         * Create a PDF snapshot of the page and get its bytes.
         */
        const byteArray = await page.pdf({
            format: "A4",
            landscape: true,
            scale: 1.29,
            printBackground: true
        });

        return Buffer.from(byteArray);
    } finally {
        /**
         * Always close the browser, even if something above failed,
         * so that no Chrome process is left running.
         */
        await browser.close();
    }
};

(async () => {
    /**
     * The value passed to the template, which Handlebars uses to
     * compile the template into the HTML string.
     */
    const data = { message: "This is a test message" };
    const fileName = 'temp.pdf';

    const buffer = await generatePDFByteArray({ data });
    console.log('got the byte buffer');

    console.log('Writing the buffer to the file');
    await fs.writeFile(fileName, buffer);
    console.log('writing done');

    console.log('Please check the ', fileName);
})().catch((e) => {
    // Report the failure instead of continuing with a missing buffer.
    console.error('PDF generation failed', e);
    process.exitCode = 1;
});

The code is pretty self-explanatory and easy to understand. Please leave a comment in case you need any information or have any questions.

Here is the working code of the above example: https://github.com/ankur20us/demo-puppeteer
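One cheap sanity check before trusting the output file: a PDF starts with the ASCII signature `%PDF-`. The helper names below (`looksLikePdf`, `savePdf`) are hypothetical and are not part of the demo repository; this is only a sketch of how the buffer could be checked before it is written.

```javascript
const fs = require("fs");

// A PDF file starts with the ASCII signature "%PDF-".
function looksLikePdf(buffer) {
    if (!buffer || buffer.length < 5) {
        return false;
    }
    // Buffer.from also accepts a plain Uint8Array, so both types work here.
    return Buffer.from(buffer.subarray(0, 5)).toString("latin1") === "%PDF-";
}

// Hypothetical helper: refuse to save anything that is not a PDF.
function savePdf(fileName, buffer) {
    if (!looksLikePdf(buffer)) {
        throw new Error("Buffer does not start with the %PDF- signature");
    }
    fs.writeFileSync(fileName, buffer);
}
```

This only checks the first five bytes, so it catches an empty or obviously wrong buffer but does not validate the document itself.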



Update #1: 

If the colors in the rendered HTML are correct but come out wrong in the generated PDF, add the following CSS property to fix it:


html {
   -webkit-print-color-adjust: exact;
}

Further details on the bug: https://github.com/GoogleChrome/puppeteer/issues/2685
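If editing the template's stylesheet is not convenient, the same rule can probably be injected from the script instead. This is a sketch that assumes it is called after the page content is set and before `page.pdf()`; `page.addStyleTag` is part of Puppeteer's Page API, while `applyExactColors` and `EXACT_COLORS_CSS` are names invented for this example.

```javascript
// The CSS rule from the update above, kept as a string so it can be injected.
const EXACT_COLORS_CSS = "html { -webkit-print-color-adjust: exact; }";

// Hypothetical helper: call after the content is set and before page.pdf().
async function applyExactColors(page) {
    // page.addStyleTag adds a <style> element with the given CSS to the page.
    await page.addStyleTag({ content: EXACT_COLORS_CSS });
}
```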



Happy coding.
:)

