Download online web pages as PDF with Percollate


Have you ever wondered how you can download web pages as PDF files from your Linux terminal? This guide shows how to use the Percollate command-line tool to download web pages as beautifully formatted PDF files.

How Percollate works

Percollate processes each page in five steps:

  1. Fetch the page(s) using got
  2. Enhance the DOM using jsdom
  3. Pass the DOM through Mozilla/readability to strip unnecessary elements
  4. Apply the HTML template and the print stylesheet to the resulting HTML
  5. Use puppeteer to generate a PDF from the page

How to install Percollate on Linux

Percollate needs Node.js version 8 or later installed on your local system, as it uses new(ish) JavaScript syntax. If Node.js is not yet installed, install it first.
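
Before installing, it can help to confirm that the Node.js version on your system meets this requirement. The snippet below is a minimal sketch, not part of Percollate: node_version_ok is a hypothetical helper name, and it assumes node --version prints a string of the form v18.17.1.

```shell
# Hypothetical helper: succeed if the given Node.js version string
# (for example "v18.17.1") has a major version of at least 8.
node_version_ok() {
  version="${1#v}"        # strip the leading "v"
  major="${version%%.*}"  # keep everything before the first dot
  [ "$major" -ge 8 ] 2>/dev/null
}

if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node --version)"; then
    echo "Node.js $(node --version) is recent enough for percollate"
  else
    echo "Node.js $(node --version) is too old; version 8 or later is needed"
  fi
else
  echo "Node.js is not installed"
fi
```

If the check reports a version that is too old, upgrade Node.js before continuing with the installation.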

Once Node.js is installed, install percollate globally using either yarn or npm.

For npm use:

sudo npm install -g percollate

For yarn, use:

sudo yarn global add percollate

Check the installed version by running:

$ percollate --version
2.2.0

To view the help page, run:

$ percollate --help
Usage: percollate <command> [options] url [url]...

Commands:

  pdf                Bundle web pages as a PDF file
  epub               Bundle web pages as an EPUB file.
  html               Bundle web pages as a HTML file.

Commmon options:

  -h, --help         Output usage information.
  -V, --version      Output program version.
  --debug            Print more detailed information.

  -o <output>,       Path for the generated bundle.
  --output=<path>

  --template=<path>  Path to a custom HTML template.

  --style=<path>     Path to a custom CSS file.
......

How to update Percollate on Linux

To keep the package up-to-date, you can run:

$ sudo npm install -g percollate@latest
# or
$ sudo yarn global upgrade --latest percollate
yarn global v1.22.19
[1/4] Resolving packages...
[2/4] Fetching packages...
[3/4] Linking dependencies...
[4/4] Rebuilding all packages...
success Saved lockfile.
success Saved 0 new dependencies.
Done in 1.72s.

Using Percollate to download web pages as PDF

The basic commands available are:

  • percollate pdf: Bundles one or more web pages into a PDF
  • percollate epub: Bundles one or more web pages into an EPUB file
  • percollate html: Bundles one or more web pages into an HTML file

Available options are:

  • -o, --output: The path of the resulting bundle; when omitted, the output file name is derived from the title of the web page
  • --individual: Export each web page as an individual file
  • --template: Path to a custom HTML template
  • --style: Path to a custom CSS file
  • --css: Additional CSS styles you can pass from the command line to override the default/custom stylesheet styles

See the examples below.

Transform a single web page to PDF:

percollate pdf --output filename.pdf https://example.com


To bundle several web pages into a single PDF, specify them as separate arguments to the command:

percollate pdf --output filename.pdf https://example.com/page1 https://example.com/page2


You can use common Unix commands and keep the list of URLs in a newline-delimited text file:

cat urls.txt | xargs percollate pdf --output filename.pdf
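
If the URL list contains blank lines or notes, xargs passes them to percollate as if they were URLs. The sketch below filters the list first. It rests on assumptions: clean_urls is a hypothetical helper, the "#" comment convention is ours rather than Percollate's, and the sample URLs are placeholders. The echo in front of percollate makes it a dry run that only prints the command.

```shell
# Hypothetical helper: print only the usable URLs from a list file,
# skipping blank lines and lines that start with "#".
clean_urls() {
  grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1"
}

# Build a small sample list in a temporary file (placeholder URLs).
url_list="$(mktemp)"
cat > "$url_list" <<'EOF'
# articles to read later
https://example.com/page1

https://example.com/page2
EOF

# "echo" keeps this a dry run that only prints the command;
# remove it to actually run percollate.
clean_urls "$url_list" | xargs echo percollate pdf --output filename.pdf

# Remove the temporary sample file.
rm -f "$url_list"
```

With your own list, replace "$url_list" with the path to urls.txt and drop the echo once the printed command looks right.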


To transform several web pages into individual PDF files at once, use the --individual flag:

percollate pdf --individual --output some.pdf https://example.com/page1 https://example.com/page2
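
If you would rather choose each file name yourself, one alternative is to loop over the URLs and call percollate once per page. This is a sketch under assumptions, not a Percollate feature: url_to_filename is a hypothetical helper that derives a name from the URL, and the echo makes the loop a dry run.

```shell
# Hypothetical helper: turn a URL into a file name. It drops the scheme,
# replaces every run of characters other than letters and digits with "-",
# trims a leading or trailing "-", and appends ".pdf".
url_to_filename() {
  printf '%s\n' "$1" \
    | sed -e 's|^[a-zA-Z]*://||' \
          -e 's|[^a-zA-Z0-9]\{1,\}|-|g' \
          -e 's|^-||' -e 's|-$||' \
          -e 's|$|.pdf|'
}

for url in https://example.com/page1 https://example.com/page2; do
  # "echo" keeps this a dry run; remove it to run percollate for real.
  echo percollate pdf --output "$(url_to_filename "$url")" "$url"
done
```

Each page then ends up in its own predictably named file, at the cost of starting percollate once per URL.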

Set Custom page size / margins

The default page size is A5 (portrait), but you can use the --css option to override it with any supported CSS size:

percollate pdf --output some.pdf --css "@page { size: A3 landscape }" http://example.com

Similarly, you can define:

Custom margins: @page { margin: 0 }
The base font size: html { font-size: 10pt }

Or any other style defined in the default/custom stylesheet.
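
Several of these overrides can be combined into one --css value, since the option takes ordinary CSS. The sketch below assumes that, and build_css is a hypothetical helper used only to assemble the string. It prints the resulting command for review instead of running it.

```shell
# Hypothetical helper: assemble one CSS string from a page size,
# a page margin, and a base font size.
# $1 = page size, $2 = margin, $3 = font size
build_css() {
  printf '@page { size: %s; margin: %s } html { font-size: %s }' "$1" "$2" "$3"
}

css="$(build_css "A4 landscape" "1cm" "10pt")"

# Print the full command for review; copy it to the terminal to run it.
printf 'percollate pdf --output some.pdf --css "%s" https://example.com\n' "$css"
```

Keeping the CSS in a variable also avoids quoting mistakes when the same styles are reused across several percollate calls.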

Thanks for using our guide to download web pages as PDF files.
