How to convert HTML to PDF using Python.

Published on 2008-12-17 19:37:00+00:00
PDF   Python   web  

I'm building web-based, data-driven apps using Django. Eventually (and perhaps unfortunately), I will need to generate some printer-friendly reports. PDF is the logical format for such files... so how am I going to convert my XHTML and CSS to a nice-looking PDF document?

The Django Book has a whole chapter dedicated to Generating Non-HTML Content. They seem to be fond of the ReportLab Toolkit. The caveat here, though, is that you need to know a bit about the internals of a PDF document. If you're familiar with those, the ReportLab Toolkit seems to be the way to go! It has many features, and it looks like a powerful PDF-generating tool.
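To give a feel for what "knowing the internals" means in practice, here's a minimal sketch of ReportLab's low-level canvas API, assuming the reportlab package is installed. The function names, margins and line spacing are my own illustrative choices, not anything taken from the Django Book. The point to notice is that you place every string yourself, in points, measured from the bottom-left corner of the page.

```python
from io import BytesIO


def line_positions(count, top=750, leading=18):
    """Return the y coordinate (in points) for each of `count` text lines.

    PDF coordinates start at the bottom-left corner of the page, so each
    successive line sits lower on the page, i.e. at a smaller y value.
    """
    return [top - i * leading for i in range(count)]


def build_report_pdf(lines):
    """Draw each string in `lines` on a single page and return the PDF bytes."""
    # Imported lazily so line_positions() works without ReportLab installed.
    from reportlab.lib.pagesizes import letter
    from reportlab.pdfgen import canvas

    buffer = BytesIO()
    pdf = canvas.Canvas(buffer, pagesize=letter)
    for text, y in zip(lines, line_positions(len(lines))):
        pdf.drawString(72, y, text)  # 72 points = 1 inch from the left edge
    pdf.showPage()  # finish the current page
    pdf.save()      # write the finished PDF into the buffer
    return buffer.getvalue()
```

Calling build_report_pdf(["Sales report", "North: 10"]) should hand back the raw bytes of a one-page PDF. Nothing wraps or paginates for you at this level, which is exactly the caveat mentioned above.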

Unfortunately, I know nothing about PDF internals, but I do know quite a bit about HTML and CSS. That's why xhtml2pdf.com caught my attention. If it delivers on its promises, it parses HTML and CSS and generates PDFs (imagine that)! There's also a handy ActiveState recipe using it: Recipe 572160: HTML/CSS to PDF converter.
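Here's roughly what that workflow might look like: build the report as ordinary HTML and CSS, then hand the string to the converter. This is a sketch, assuming a current xhtml2pdf release where the converter is imported as `from xhtml2pdf import pisa` (older releases shipped under a different module name, so check your version). The helper names and the sample stylesheet are made up for illustration.

```python
from html import escape
from io import BytesIO

# Illustrative stylesheet; any CSS supported by the converter could go here.
STYLE = "td { border: 1px solid #999; padding: 4px; }"


def build_report_html(title, rows):
    """Build a small HTML report from (name, value) pairs, escaping all text."""
    body_rows = "".join(
        f"<tr><td>{escape(str(name))}</td><td>{escape(str(value))}</td></tr>"
        for name, value in rows
    )
    parts = [
        "<html><head><style>", STYLE, "</style></head><body>",
        f"<h1>{escape(title)}</h1>",
        f"<table>{body_rows}</table>",
        "</body></html>",
    ]
    return "".join(parts)


def html_to_pdf(html):
    """Convert an HTML string to PDF bytes with xhtml2pdf."""
    # Imported lazily so build_report_html() works without xhtml2pdf installed.
    from xhtml2pdf import pisa

    buffer = BytesIO()
    status = pisa.CreatePDF(html, dest=buffer)
    if status.err:
        raise RuntimeError("xhtml2pdf reported errors during conversion")
    return buffer.getvalue()
```

With both pieces in place, html_to_pdf(build_report_html("Sales", [("North", 10)])) should return PDF bytes you can write to a file or send back from a view.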

I'm definitely going to check this (HTML2PDF.org) out... so expect an update on this!

Any other suggestions?

UPDATE: xhtml2pdf works well. There's also a great post by Greg Newman outlining how it's used in Django.
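
For the Django side, a view along these lines is one way to wire it up. To be clear, this is my own minimal sketch and not the code from Greg Newman's post. It assumes a reasonably recent Django and xhtml2pdf, and the template path "reports/summary.html", the context and the file name are hypothetical placeholders.

```python
def attachment_header(filename):
    """Build a Content-Disposition value that prompts a download."""
    return 'attachment; filename="{}"'.format(filename)


def report_pdf(request):
    """Render a template to HTML, convert it, and return it as a PDF response."""
    # Imported lazily so attachment_header() works without Django installed.
    from django.http import HttpResponse
    from django.template.loader import render_to_string
    from xhtml2pdf import pisa

    # Hypothetical template name and context, for illustration only.
    html = render_to_string("reports/summary.html", {"title": "Summary"})

    response = HttpResponse(content_type="application/pdf")
    response["Content-Disposition"] = attachment_header("summary.pdf")

    # HttpResponse is file-like, so the converter can write straight into it.
    status = pisa.CreatePDF(html, dest=response)
    if status.err:
        return HttpResponse("Could not generate the PDF.", status=500)
    return response
```

Hook report_pdf into your URLconf like any other view. The nice part is that the report is just another Django template, so all the HTML and CSS knowledge carries over.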