This is an updated version of the original athenapdf.

daniesy/athenapdf

The original AthenaPDF repo is deprecated. I've cloned the repo, then adapted and updated it with new features and dependencies.

HtmlConverted

Simple, Docker-powered PDF conversions.

HTMLConverter consists of an [Electron][electron] command line interface (CLI) tool and a [Go][go] microservice for converting HTML to PDF documents.

In Greek mythology, Athena transformed Arachne into a spider for challenging her as a weaver and/or weaving a tapestry that insulted the gods.

Examples:

  • Original: [Google isn’t even close as a tool for proper due diligence. Why not?][example-1] (Converted: [PDF][example-1-pdf] | [Aggressive][example-1-aggressive])
  • Original: [Panamanian Law Firm Is Gatekeeper To Vast Flow of Murky Offshore Secrets][example-2] (Converted: [PDF][example-2-pdf] | [Aggressive][example-2-aggressive])

When [aggressive mode][aggressive] is enabled, only the essential contents of a page are kept in the generated PDF document. It is a clutter-free version of the web page, perfect for reading.

Background

HtmlConverted is an open source project.

It was designed to [do one thing and to do it well][unixphil] (PDF conversions), to work together with other programs, and to handle text streams, because that is a universal interface.

It aims to give users an on-demand capability to convert HTML to PDF without frills.

At the lowest level, its [CLI][cli] component ([htmlconverter][cli]) was designed to be an alternative / drop-in replacement for [wkhtmltopdf], a popular CLI tool for HTML to PDF conversions. Because it runs in Docker, the CLI syntax is a bit more complex, but it is much more reliable.

(For what it's worth, wkhtmltopdf is great, but it has a horrible habit of crashing unexpectedly - especially when printing documents with invalid HTML, problematic CSS or other issues).

There is also a [microservice][weaver] component ([weaver][weaver]), allowing you to leverage HtmlConverted over HTTP.

Getting Started

CLI vs Microservice

Our [CLI][cli] tool will suffice for most simple, everyday HTML to PDF conversions.

However, for conversions at scale / PDF conversion as a service, we recommend getting started with our [microservice][weaver] component instead.

The microservice is packaged with htmlconverter, and you can run both components independently.

Docker

Both components are packaged and distributed as [Docker][docker] images.

The only dependency you will need is Docker, and the rest will be handled for you (even if you are running in an environment without a display server - [headless environment][headless]).

Quick Start

Before starting, ensure your [Docker][docker] environment is set up and ready to use.

For OSX / Windows users, ensure your [Docker Machine][docker-machine] is prepared and the appropriate environment variables are set.

CLI

  1. `docker pull daniesy/htmlconverter`
  2. `docker run --rm -v $(pwd):/converted/ daniesy/htmlconverter htmlconverter <input_path> [output_path]`
  3. See [cli][cli] for full documentation

The `[output_path]` can be omitted.

Example: `docker run --rm -v $(pwd):/converted/ daniesy/htmlconverter htmlconverter https://www.arachnys.com/the-long-road-to-achieving-true-perpetual-kyc/`

For Windows users, an additional forward slash must precede the volume when using Git Bash / MinGW:

`docker run --rm -v /$(pwd):/converted/ daniesy/htmlconverter htmlconverter https://www.arachnys.com/the-long-road-to-achieving-true-perpetual-kyc/`

Alternatively, if using the Windows command prompt:

`docker run --rm -v %cd%:/converted/ daniesy/htmlconverter htmlconverter https://www.arachnys.com/the-long-road-to-achieving-true-perpetual-kyc/`
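
If you run conversions often, the command above can be wrapped in a small shell function. The following is only a sketch: the image name and the `/converted/` mount point come from the Quick Start, while the `convert_page` function and the `DRY_RUN` variable are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Hypothetical wrapper around the documented CLI command.
# $1 = input path or URL, $2 = optional output file name.
convert_page() {
    # Prepend the documented docker invocation to the caller's arguments.
    set -- docker run --rm -v "$PWD":/converted/ \
        daniesy/htmlconverter htmlconverter "$@"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Assumption for this sketch: DRY_RUN=1 prints the command only.
        echo "$*"
    else
        "$@"
    fi
}
```

For example, after setting `DRY_RUN=1`, calling `convert_page https://example.com/ page.pdf` only prints the docker command it would run; with `DRY_RUN` unset, the same command is executed and the PDF lands in the current directory.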

Microservice

  1. `docker pull daniesy/htmlconverter-service`
  2. `docker run -p 8080:8080 --rm daniesy/htmlconverter-service`
  3. Inline conversion: `http://<docker-address>:8080/convert?auth=arachnys-weaver&url=https://www.arachnys.com/the-long-road-to-achieving-true-perpetual-kyc/`
  4. OR cURL, and redirect output to file: `curl http://dockerhost:8080/convert\?auth\=arachnys-weaver\&url\=https://www.arachnys.com/the-long-road-to-achieving-true-perpetual-kyc/ > out.pdf`
  5. See [weaver][weaver] for full documentation

The default authentication key is `arachnys-weaver`. This can be changed through the `WEAVER_AUTH_KEY` environment variable.
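
As a rough sketch of changing the key, the functions below start the service with a custom `WEAVER_AUTH_KEY` and then request a PDF using that key. Passing the variable with Docker's `-e` flag is an assumption about how you run the container, and the function names (`start_service`, `convert_url`, `fetch_pdf`) are hypothetical helpers, not part of the project.

```shell
#!/bin/sh
# Start the service with a custom auth key.
# $1 = the key to use instead of the default.
start_service() {
    docker run -p 8080:8080 --rm -e WEAVER_AUTH_KEY="$1" \
        daniesy/htmlconverter-service
}

# Build the /convert request URL.
# $1 = docker host, $2 = auth key, $3 = page URL to convert.
convert_url() {
    printf 'http://%s:8080/convert?auth=%s&url=%s\n' "$1" "$2" "$3"
}

# Download the converted PDF.
# $1..$3 as for convert_url, $4 = output file.
# -f makes curl exit non-zero on HTTP errors instead of saving the error page.
fetch_pdf() {
    curl -fsS "$(convert_url "$1" "$2" "$3")" -o "$4"
}
```

For instance, run `start_service my-secret-key` in one terminal and `fetch_pdf localhost my-secret-key https://example.com/ out.pdf` in another. Page URLs that contain their own query string would need to be URL-encoded first, which this sketch does not do.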

The microservice can be deployed scalably to [ECS][ecs] if you want to build your own conversion farm.

License

Please note htmlconverter is NEITHER affiliated with NOR endorsed by Google Inc. or GitHub Inc.

See LICENSE.