Keith C. Perry on 23 Oct 2014 13:42:49 -0700



Re: [PLUG] HTML 2 PDF converter?


K.S., that's PostScript printing, and the ps2pdf conversion seems rather trivial: when you print to a file on Linux, the .pdf option has been included for a while now.

J.P., along those lines, you might want to take an intermediate step and go through LaTeX.  I do all my writing in LyX and the .pdf output works well, links and all.  It may not gain you much, if anything, but that process seems to be very reliable and fast at this point.  HTML -> LaTeX -> PDF might be worth trying if speed is an issue.
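
A rough sketch of how that HTML -> LaTeX -> PDF chain could be scripted. This assumes pandoc for the HTML -> LaTeX hop and pdflatex for the LaTeX -> PDF hop. Both are assumptions, not the LyX workflow mentioned above, and pandoc may well need more coaxing for a large API dump. The function names and the file name "dump.html" are placeholders.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(html_path, workdir="."):
    """Return the two command lines for HTML -> LaTeX -> PDF.

    Assumes pandoc and pdflatex are the tools; swap in others as needed.
    """
    html = Path(html_path)
    tex = Path(workdir) / (html.stem + ".tex")
    # Hop 1: HTML to a standalone LaTeX document (-s adds the preamble).
    to_latex = ["pandoc", str(html), "-f", "html", "-t", "latex",
                "-s", "-o", str(tex)]
    # Hop 2: LaTeX to PDF, without stopping to prompt on errors.
    to_pdf = ["pdflatex", "-interaction=nonstopmode",
              "-output-directory=" + str(workdir), str(tex)]
    return [to_latex, to_pdf]


def convert(html_path, workdir="."):
    """Run both hops in order and return the expected PDF path."""
    for cmd in build_commands(html_path, workdir):
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(cmd[0] + " is not installed or not on PATH")
        subprocess.run(cmd, check=True)
    return Path(workdir) / (Path(html_path).stem + ".pdf")
```

Calling convert("dump.html") would leave dump.tex and dump.pdf in the current directory. Keeping the .tex file around makes it possible to time each hop separately or hand-tune the LaTeX before the second hop. Links generally need a second pdflatex pass to resolve fully, which this sketch does not do.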


~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Keith C. Perry, MS E.E.
Owner, DAO Technologies LLC
(O) +1.215.525.4165 x2033
(M) +1.215.432.5167
www.daotechnologies.com


From: "K.S. Bhaskar" <bhaskar@bhaskars.com>
To: "Philadelphia Linux User's Group Discussion List" <plug@lists.phillylinux.org>
Sent: Thursday, October 23, 2014 4:23:27 PM
Subject: Re: [PLUG] HTML 2 PDF converter?

I wonder where the magic happens when a browser prints a web page it is displaying to a file…

-- Bhaskar

On Thu, Oct 23, 2014 at 4:07 PM, JP Vossen <jp@jpsdomain.org> wrote:
Is anyone aware of a good HTML to PDF converter tool, ideally one that
can be scripted on Linux?

Use case: automated conversion of a large HTML file created by using
a web API to dump content out of a cloud provider into a training doc,
more or less.

I'm aware of a bunch of tools, but none thrill me.  In particular, size,
making links, doing good navigation and embedding fonts (gotta love the
marketing folks :) would all be nice.

1) xhtml2pdf just works but is very slow, does not create links and
creates large PDF files (32M).  It does nice heading navigation though.

2) wkhtmltopdf just works but does not create links and creates large
PDF files (21M).  It does thumbnails but not nice heading navigation.

3) html2ps + ps2pdf kinda worked but was really slow, creates really
ugly PDFs (but they do have links), and ps2pdf crashed even though it
did create a PDF.

4) Pandoc I can't get to work right and am out of time to fuss with it.

5) Fop I've played with (e.g., for DocBook) and as I recall it's a giant
pain, is written in Java (see giant pain), and I'm not sure it takes HTML
input anyway.

6) There are various on-line, PHP, Perl & Python modules/libraries, but
it seems like there should already be a good tool written that I don't
need the cloud for.  (I know, this content is already in the cloud so
who cares?  It's the principle of the thing. :)

Thanks,
JP
----------------------------|:::======|-------------------------------
JP Vossen, CISSP            |:::======|      http://bashcookbook.com/
My Account, My Opinions     |=========|      http://www.jpsdomain.org/
----------------------------|=========|-------------------------------
"Microsoft Tax" = the additional hardware & yearly fees for the add-on
software required to protect Windows from its own poorly designed and
implemented self, while the overhead incidentally flattens Moore's Law.
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug

