WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: HTML vs. PDF - which takes less time and resources?


From: Olaf Drümmer
Date: May 9, 2013 2:00PM

Hi Rabab,

I would like to share a post here that I have just submitted in reply to a related question on the BCAB mailing list. It addresses to a certain degree the general part of your question, not the specific part (regarding your 20-page non-trivial Word document), so take it with a grain of salt...:


before you try to get your question answered, please ask a few more questions:

[1] what is your starting point? What is the quality of the material if it already exists, or what will it be once it gets created? Is it created specifically for the purpose of distributing it in an accessible fashion? Is it really just targeting people with no or limited vision, or do other types of disability also play a role? Is it for older people (often not so much inclined to deal with complex IT) or younger people (who don't mind learning/acquiring yet another piece of technology)? Will the content be consumed during work time at a desktop computer, or maybe also frequently on a mobile device or a laptop? Does it have to work on possibly outdated equipment? Is it really just for people with disabilities, or should it also be decently accessible to people without disabilities? Does it have to be absolutely the same for everyone, maybe out of legal considerations (how do you ensure that a conversion process from an inaccessible presentation form to a more accessible presentation form does not change or drop some of the content; what if a zero is dropped in your life insurance contract or your medication formulation)?

[2] what type of content is it? How is it going to be consumed by the target user group? Is it a lot of content, but meant more for reference reading, where you need to find stuff quickly and easily but will only read limited portions of the content? Then there is a strong case for an HTML approach along the lines of Wikipedia or similar. Is it intended to be read in one go? Then maybe EPUB is an option. Is it really a possibly large set of documents with more internal structure and presentational variation, e.g. tables, graphics? Then I'd claim a well tagged PDF is the way to go.

[3] From my point of view, it is less important what format you use, and much more important whether you prepare the content adequately. Failing to mark up headings or tables properly will leave you with inaccessible content regardless of which format you choose. Thus the really important question is which format is most likely to come out accessible, given your starting point/quality of existing material, the skill level of the people involved, the tools available, the budget and time available etc.

Just to give some examples: a magazine publisher will typically work with programs like Adobe InDesign. From InDesign (using recent versions of the program) it is quite feasible and economical to create accessible PDF or EPUB, but much harder to get HTML right or even export to Word well. If you are in a corporate or government environment, often Word is used, and as long as you want to map one Word document to one piece of content provided, your options essentially are to just distribute it as a Word file or as PDF (using Word and Adobe Acrobat you can create well tagged PDF right away). If you have an XML or HTML based content repository, maybe a web based editorial content management system, then HTML might be the easiest way forward. In all cases though - and this is more often overlooked than not - if the source content you have is not well structured, you will most probably either get garbage regardless of the format, or you will have to invest a lot to get it right.
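
To make point [3] a bit more concrete, here is a minimal sketch of a structure check, assuming the content has already been exported to HTML. It only looks for the two failures mentioned above (missing or skipped heading levels, and tables without header cells). The names `StructureAudit` and `audit` are made up for this illustration, it uses only Python's standard `html.parser` module, and it does not inspect PDF or Word files. It is a rough first filter, not a substitute for testing with real assistive technology.

```python
from html.parser import HTMLParser


class StructureAudit(HTMLParser):
    """Collect heading levels and count tables that have no <th> cells."""

    def __init__(self):
        super().__init__()
        self.heading_levels = []     # heading levels in document order, e.g. [1, 2, 3, 2]
        self.tables_without_th = 0
        self._table_stack = []       # one flag per open table: has a <th> been seen?

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.heading_levels.append(int(tag[1]))
        elif tag == "table":
            self._table_stack.append(False)
        elif tag == "th" and self._table_stack:
            self._table_stack[-1] = True

    def handle_endtag(self, tag):
        if tag == "table" and self._table_stack:
            if not self._table_stack.pop():
                self.tables_without_th += 1


def audit(html_text):
    """Return a list of human-readable structure problems (empty if none found)."""
    parser = StructureAudit()
    parser.feed(html_text)
    parser.close()

    problems = []
    if not parser.heading_levels:
        problems.append("no headings marked up")

    previous = 0
    for level in parser.heading_levels:
        if level > previous + 1:
            if previous == 0:
                problems.append(f"first heading is h{level}, not h1")
            else:
                problems.append(f"heading level jumps from h{previous} to h{level}")
        previous = level

    if parser.tables_without_th:
        problems.append(f"{parser.tables_without_th} table(s) without header cells")
    return problems


if __name__ == "__main__":
    # Visually this looks like a title plus a table, but nothing is marked up as such.
    sample = "<p><b>Title</b></p><table><tr><td>A</td></tr></table>"
    for problem in audit(sample):
        print(problem)
```

If a check like this already reports problems for the HTML export, the same weaknesses are most likely present in the source document, and they will carry over into a PDF or EPUB made from it as well.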

[4] New and not so new tools and assistive technology
While it can be challenging for people with disabilities to catch up with technological developments we have to take into account the fact that tools are getting better (and often less expensive). Just to mention a few recent developments:
- the free NVDA screen reader for Windows now handles tagged PDF very well; for example, in the latest release, extended and much improved table navigation features were added; so this helps with accessing well tagged PDF files
- Amazon at the beginning of May 2013 released a Kindle app for iOS that is really quite accessible - this helps with reading EPUBs (note: while Amazon favors their own MOBI-derived proprietary format, you can send your EPUB by email to your Kindle account, and it will be converted such that it can be read on your Kindle device or app)
- iBooks on iOS (iPod, iPad, iPhone) has been very accessible for EPUB for a while, PDF unfortunately is lagging behind
- the free callas pdfGoHTML plug-in for Adobe Acrobat on Mac or Windows converts tagged PDF into HTML and opens the HTML in the default browser, using either user-defined CSS styles or one of a couple of provided styles for low vision or dyslexic users. Where Adobe Reader's reflow has very unfortunate limitations, callas pdfGoHTML ultimately offers a content reflow mechanism, running in your favorite browser, where just about any aspect of the content presentation can be adjusted to the liking of the user.

Of course, if a user prefers to continue to work with a pdftotext tool from 2001 or JAWS version 5 or Lynx - these are valid choices - but it is to be asked whether others should make substantial investments to accommodate the idiosyncrasies of such essentially outdated technology, especially given some more capable recent options tend to be available more or less free of charge.


On 9 May 2013 at 20:15, Rabab Gomaa wrote:

> Hello,
> I am wondering if there are members proficient in producing accessible PDFs in the group to answer my question or guide me to useful resources about the topic.
> I am comparing HTML to PDF.
> Do you think tagging PDF takes more or less time and resources compared to tagging HTML?
> To put the question a different way: would delivering an accessible primary version be much quicker in PDF or in HTML, provided that the original file exists in Word format with 20 pages full of sections, subsections, complex data tables and complex images?
> Thank you,
> Rabab Gomaa