Word Doc to HTML: Take 2

by Erica Sadun

Hey Derrick!

Here's another, sometimes easier, way. Open the Doc file in TextEdit, choose File -> Save As, and pick HTML from the File Format pop-up.

That said, note that TextEdit reads many but not all Word docs, and that details often get lost in the conversion. On the upside, I feel a lot safer opening Word attachments in TextEdit than I do in Word.

0605TextEditHTMLscaled.jpg
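If you have a pile of Doc files, the same conversion can probably be scripted. Here's a minimal sketch in Python, assuming a Mac with the textutil command-line tool (it ships with Mac OS X and, as far as I know, uses the same text system as TextEdit, though its output may not match a TextEdit export exactly). The helper names build_textutil_command and convert_doc_to_html are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_textutil_command(doc_path, output_path=None):
    """Build the argument list for converting a Word doc to HTML with textutil.

    Hypothetical helper: the output defaults to the input name with an
    .html suffix.
    """
    src = Path(doc_path)
    dest = Path(output_path) if output_path else src.with_suffix(".html")
    return ["textutil", "-convert", "html", "-output", str(dest), str(src)]


def convert_doc_to_html(doc_path, output_path=None):
    """Run the conversion. Assumes macOS, where textutil is installed."""
    cmd = build_textutil_command(doc_path, output_path)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(cmd[4])  # cmd[4] is the output path passed to -output
```

Calling convert_doc_to_html("report.doc") should leave a report.html next to the original. The same caveat applies as with TextEdit itself: not every Word doc converts cleanly.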

jpkang writes: I think Derrick's point was that the HTML that Google produces is much cleaner/readable than the standards-compliant HTML that most text editors produce nowadays (with tons of CSS tags).


TextEdit rocks at this. Open TextEdit -> Preferences. Choose "Open and Save". Set the Document Type and Styling. (I prefer to pick No CSS from the Styling pop-up, XHTML 1.0 Strict from the Document Type pop-up, and Western (Mac OS Roman) from the Encoding pop-up.) Close the Preferences window.


0605TextEditHTML3.jpg


0605TextEditHTML4.jpg

7 Comments


2006-05-30 02:05:03
Might that be OS-X?


If so this is a bit like driving a Rolls Royce and penny-pinching when it comes to the seat covers.


Come on, guys, innovate.

jpkang
2006-05-30 07:12:38
I think Derrick's point was that the HTML that Google produces is much cleaner/readable than the standards-compliant HTML that most text editors produce nowadays (with tons of CSS tags).
jpkang
2006-05-30 07:37:11
I just learned something new--thanks!


I wonder if Guy Kawasaki knew about this feature, cf. http://blog.guykawasaki.com/2006/05/kawasakicom.html

Erica Sadun
2006-05-30 07:47:36
jpkang: It's an honor to serve. Glad you liked the info. -- Erica
Derrick
2006-05-30 07:55:21
Interesting that Erica posted this follow-up tip. Rich Siegel (he's Mr. Bare Bones himself) dropped me a line, pointing me to this document:


http://www.stg.brown.edu/edu/tips/word_to_html_with_bbedit_1.html


It describes another means for generating valid code from Word documents.

FARfetched
2006-05-31 07:40:46
Big problem: with the Weird file I tried this with, it converted all the headings to plain ol' paragraph elements. If you want something more than that, use "Embedded CSS" and then transform paragraphs with, say, "p.p1" style attributes to the proper element (h1 or whatever).


Or use OpenOffice; it does a mostly rational job of picking the right elements for paragraphs, lists, and headings. The less cleanup required, the better.
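A minimal sketch of the class-to-heading cleanup described above, in Python. It assumes the Embedded CSS export writes paragraphs as <p class="p1">...</p>; which class actually stands for a heading varies from file to file, so the mapping is something you fill in after looking at the exported styles. The function name promote_paragraphs is made up for illustration, and a regular expression like this only copes with simple, unnested paragraphs.

```python
import re

# Matches paragraphs of the form <p class="NAME">...</p> (assumed export shape).
_P_CLASS = re.compile(r'<p class="([^"]+)">(.*?)</p>', re.DOTALL)


def promote_paragraphs(html, class_to_tag):
    """Replace <p class="X"> paragraphs with the element mapped to X.

    class_to_tag is a dict such as {"p1": "h1"}; paragraphs whose class
    is not in the dict are left untouched.
    """
    def swap(match):
        tag = class_to_tag.get(match.group(1))
        if tag is None:
            return match.group(0)  # unknown class: keep the paragraph as-is
        return f"<{tag}>{match.group(2)}</{tag}>"

    return _P_CLASS.sub(swap, html)
```

For example, promote_paragraphs(page, {"p1": "h1", "p3": "h2"}) turns the p1 and p3 paragraphs into headings and leaves everything else alone.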

nkvd
2006-05-31 15:14:03
The first time a client proudly sent me an HTML file from Word (thinking it would save on billable hours) my jaw turned to goo. I've always loved TextEdit... this is just one more reason! Thanks for the tip!