Thursday, February 26, 2009

Word HTML to real HTML

Got a mail from a customer who wanted me to use the HTML from his mail in the mails that our system is going to generate. Fine, I thought when I saw it. It took only about 3 seconds of looking at the email's source code before it hit me: this is f#¤!ing Word-generated HTML. After that short setback I remembered listening to one of Stack Overflow's podcasts where Jeff talked about the problems they had with their WYSIWYG editor and how similar they were to decoding Word HTML. A quick Google search gave me http://www.textism.com/wordcleaner . Perfect, it did the job. Two caveats: it stripped all class declarations and styling, so that part I have to redo myself, and it's only free for files below 20 KB.

BTW, Jeff wrote his own parser, though it only works with HTML from the 2003 version of Word.
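
For the curious, here is a rough sketch in Python of the kind of cleanup a tool like this does. It is my own guess, not how wordcleaner or Jeff's parser actually works. The function name clean_word_html and the regular expressions are assumptions made up for illustration, and regex-based cleaning will trip over edge cases (for example a ">" inside an attribute value), so treat it as a starting point only.

```python
import re

# Hypothetical helper pattern: matches class/style/lang attributes inside a tag,
# whether the value is double-quoted, single-quoted, or unquoted (Word emits all three).
ATTR = re.compile(
    r"""\s+(?:class|style|lang)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)

def clean_word_html(html):
    """Strip common Word-specific markup from an HTML string (rough sketch)."""
    # Remove comments, which also covers Word's conditional blocks
    # such as <!--[if gte mso 9]> ... <![endif]-->.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # Remove namespaced Office tags like <o:p> and </o:p>, keeping any text between them.
    html = re.sub(r"</?\w+:\w+[^>]*>", "", html)
    # Remove class, style, and lang attributes, but only inside tags.
    html = re.sub(r"<[^>]+>", lambda m: ATTR.sub("", m.group(0)), html)
    return html

sample = (
    "<p class=MsoNormal style='margin:0cm'>Hello<o:p></o:p></p>"
    "<!--[if gte mso 9]><xml>x</xml><![endif]-->"
)
print(clean_word_html(sample))  # prints <p>Hello</p>
```

Like the online tool, this throws away all classes and styling, so any formatting you want to keep has to be added back by hand afterwards.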
