Brad's Blog

Published: 2016-04-29

Let's convert a Word Doc to HTML

html pandoc python word

tl;dr I wrote a Python script to convert Word documents to mostly-clean HTML. Get it at https://github.com/bradmontgomery/word2html.

Ah, Microsoft Word... That glorious business-class software used all around the world. It's perfect for those long, legal documents consisting of nothing but headers, paragraphs, and bulleted lists. All of which we can easily convert into simple HTML, right? Right? File > Save As > Web Page (.htm). Easy as... No wait, was that supposed to be File ...