Use Gmail to Convert Word Docs to HTML

by Derrick Story

If you have an MS Word doc that you want to convert to HTML, the last thing you'd ever use is the "Save as Web Page..." command in Word. Talk about terrible code! Instead, you can send the doc as an attachment to your Gmail account and use the "View as HTML" link. Once the page is displayed in your browser, go to "View Source" and copy the code. Most of it is very clean and quite usable. I'm surprised, however, that Google doesn't use the XHTML version of the break tag...
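
If the plain break tags bother you, they are easy to fix once you've copied the source. Here is a minimal Python sketch, not anything Gmail or Word provides: the function name `xhtml_breaks` is made up for illustration, and it assumes the markup is simple enough that a regular expression is safe (no `>` characters inside attribute values).

```python
import re

# Matches <br>, <BR>, <br/>, <br /> and <br clear="all">, but not <break>.
# Assumption: attribute values never contain "<" or ">".
BR_TAG = re.compile(r"<br(\s[^<>]*?)?\s*/?>", re.IGNORECASE)


def xhtml_breaks(html: str) -> str:
    """Rewrite every break tag in the self-closing XHTML form, <br />."""

    def repl(match: re.Match) -> str:
        # Keep any attributes as written, minus trailing whitespace.
        attrs = (match.group(1) or "").rstrip()
        return f"<br{attrs} />"

    return BR_TAG.sub(repl, html)


if __name__ == "__main__":
    print(xhtml_breaks("First line<br>Second line<BR>Third line"))
```

Tags that are already self-closed pass through unchanged, so it is safe to run the function more than once on the same text.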


2006-05-29 12:03:44
I have a rule that forwards all mail with MS Word attachments to Gmail ;-) Didn't notice the <br> tags.
Rick Schaut
2006-05-29 14:58:04

Actually, I use Word's Save as Web Page quite frequently. It does, however, require two things:

1) Use Word's Web-based styles (e.g. Normal (Web)) rather than Word's regular styles; and
2) Check the "Save only display information into HTML" radio button in the dialog box.

Do both, and the HTML you get is really rather clean.

Also, for step 1, if you've created a document and later want to save it to a relatively clean HTML document, you can use Word's style pane on the formatting palette to convert various styles and direct formatting to use the Web-based styles.

2006-05-30 06:36:55
OpenOffice does a pretty good job of HTML conversion. I pass the results through Tidy and a script that removes spurious "CLASS=westernn" attributes, where n is an arbitrary number.
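
For anyone curious, the attribute cleanup can be sketched in a few lines of Python. This is a rough illustration rather than the exact script: the name `strip_western_classes` is hypothetical, and it assumes the attribute always looks like `CLASS="western"` followed by optional digits, with or without quotes.

```python
import re

# Matches a class attribute whose whole value is "western" plus optional
# digits, e.g. CLASS="western", class=western3, Class='western12'.
# Other class values (such as "western-foo") are left alone.
WESTERN_CLASS = re.compile(
    r"""\s+class\s*=\s*(["']?)western\d*\1(?=[\s/>])""",
    re.IGNORECASE,
)


def strip_western_classes(html: str) -> str:
    """Remove the spurious class="westernN" attributes from the markup."""
    return WESTERN_CLASS.sub("", html)


if __name__ == "__main__":
    print(strip_western_classes('<P CLASS="western1">Hello</P>'))
```

It only does the attribute removal; running the file through Tidy is still a separate step.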

Thanks for the Word tips, Rick. I'll have to try that "only display information" option, now that I know what it is. I'd always assumed that it only saved the style sheet.

2006-07-08 04:40:48
You just saved me about an hour of heavy formatting work converting from Word to HTML. Where do I send the beer?