WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Word to HTML Conversion


From: Jukka K. Korpela
Date: Aug 10, 2004 1:23AM

On Mon, 9 Aug 2004, cfrench_us wrote:

> Can anyone recommend a good tool to convert Word docs to HTML that is
> 508 compliant.

Microsoft Word. :-) You need a human brain, too, though - with some
understanding of the problems that accessibility guidelines try to address.

If you use a recent enough version of Word, it has (in the File/Save As
dialogue) a "Compact HTML" or "Filtered HTML" format, which makes Word
produce much simpler markup. For old versions of Word, you can download
add-on software from the Microsoft site, which adds similar functionality
as an "Export to Compact HTML" option in the File menu.

Then you can use any simple editor to remove the style sheet that Word has
put into the Compact HTML file - a style sheet that tries to
preserve the exact visual formatting that the author used when composing
the document (this is virtually the opposite of what we are trying to
achieve in accessibility!). Then test the HTML file and start applying
usual accessibility checks and fixes. If you only aim at 508 compliance,
you can use e.g. the checklist
Note that checklists tend to be very general; they contain a lot of things
that are not relevant to documents converted from Word format. The key
things to note are, in my opinion, items (a) (make sure all images have
alt texts that work as substitutes for the images, e.g. in speech
presentation), (c) (sometimes Word-generated documents use color as the
only means of conveying some information - try to fix this by adding
suitable markup), (d) (test the generated HTML _before_ adding any styling
of your own; add markup for headings, emphasis, etc., if needed),
(g) and (h) (add identifying markup if you have tables).
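
The first two steps above (removing the Word-generated style sheet and
looking for images without alt texts) can be partly automated. The
following is only a minimal sketch in Python, assuming that the style
sheet sits in ordinary <style> elements in the saved file; the function
and class names are hypothetical, and nothing here is part of Word or of
any checklist. It uses only the standard library.

```python
import re
from html.parser import HTMLParser
from typing import List

# Matches a whole <style>...</style> element, including its contents.
# Assumption: the style sheet is embedded in <style> elements.
STYLE_BLOCK = re.compile(
    r"<style\b[^>]*>.*?</style\s*>", re.IGNORECASE | re.DOTALL
)


def strip_style_blocks(html: str) -> str:
    """Return the markup with every embedded <style> element removed."""
    return STYLE_BLOCK.sub("", html)


class MissingAltFinder(HTMLParser):
    """Collects the src of each <img> that has no alt attribute at all."""

    def __init__(self) -> None:
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "img":
            return
        attributes = dict(attrs)
        if "alt" not in attributes:
            self.missing.append(attributes.get("src") or "(no src)")


def images_missing_alt(html: str) -> List[str]:
    """List the src values of images that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    sample = (
        "<html><head><style>p.MsoNormal {margin:0cm}</style></head>"
        '<body><p class=MsoNormal>Text</p><img src="a.gif">'
        '<img src="b.gif" alt="Sales chart"></body></html>'
    )
    cleaned = strip_style_blocks(sample)
    print(cleaned)
    print(images_missing_alt(cleaned))
```

A script like this can only tell you that an alt attribute is missing.
Whether an existing alt text really works as a substitute for the image
still has to be judged by a human reader.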

A note on accessibility that goes beyond the lame 508 regulation and is
crucial to billions of potential users: check the language (cf. WAI
guideline 14: http://www.w3.org/TR/WCAG10/#gl-facilitate-comprehension ).
Since Word has good checking tools, they should be deployed before the
conversion from Word to HTML format. (Word can read HTML documents, but
when doing so it loses many of the accessibility improvements you've
made.)

I have noticed that when Word issues a message about a potential
misspelling or grammar error, it is usually right even when it is wrong.
That is, if the message is based on a wrong analysis of the text, then
this indicates that the text is ambiguous, easy to misunderstand, or just
too complex for many human readers as well - especially for people with
cognitive disabilities or limited skills in the language in question.
So even if you don't do all the fixes that Word suggests, you should
consider whether the text needs reformulation whenever Word complains.

Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/