WebAIM - Web Accessibility In Mind

E-mail List Archives

PDF Accessibility


From: Duff Johnson
Date: Jun 26, 2013 5:36PM


These are good questions; I'll try to offer specific answers accordingly.

> What I don't really understand is what my tags should actually be

Broadly speaking, it's the same question as you have in HTML - do the tags reflect the document's structure? Many of the most important PDF tags are similar to HTML tags.
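
As a rough illustration of that similarity, here is a small lookup of common PDF structure tags and their nearest HTML counterparts. The pairings are approximate analogies, not an official mapping, and the helper name `html_analogue` is made up for this sketch.

```python
# Approximate HTML analogues for common PDF structure tags.
# These pairings are illustrative only; PDF and HTML semantics
# do not line up one-to-one.
PDF_TO_HTML = {
    "P": "p",
    "H1": "h1",
    "H2": "h2",
    "H3": "h3",
    "L": "ul / ol",
    "LI": "li",
    "Table": "table",
    "TR": "tr",
    "TH": "th",
    "TD": "td",
    "Figure": "img",
    "Link": "a",
}


def html_analogue(pdf_tag):
    """Return the rough HTML counterpart of a PDF tag, or None if the
    tag has no entry in this (deliberately incomplete) lookup."""
    return PDF_TO_HTML.get(pdf_tag)


print(html_analogue("TH"))      # a header cell in both worlds
print(html_analogue("Normal"))  # a custom tag name has no entry here
```

A tag such as <Normal> has no entry because it is not one of the standard PDF structure types; it is a name carried over from the authoring application's styles.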

> and how to fix them.

If using Adobe Acrobat XI, change the tag-type in the Tags panel (select the tag, edit the tag's name). You can, in principle, also use the Touch Up Reading Order Tool, but I'm (deliberately) avoiding that advice in this context.

> For example, when I run the Accessibility check on the document,
> it passes, but when I view the tags, they don't make sense to me.

Yes; Acrobat's Accessibility Checker, in its current incarnation, does not offer a means of reviewing or validating the appropriateness of tags. :-(

I suggest you check out callas's pdfGoHTML (it's free) as one way to visualize the document's semantics. The VIP reader (mentioned below) is another, but neither of these is a validation tool.

> I am usually not lucky enough to get simple <H#s> and <Ps>, instead I get
> <Normal> <1stpara><P> with one line of the paragraph instead of the
> entire paragraph or even just a carriage return.

This sounds entirely normal. Unfortunate, but normal. We await smarter software; we cheer on those brave developers who dive into the fray. From a software development point of view, attempting to add tags to existing PDF files using automation is a very challenging problem.

> Usually my bulleted
> lists turn out as some kind of paragraph style. Do I need to go through
> every tag and change the properties on it to make it perfect, like HTML,
> or if it passes the Accessibility check and the reading order is OK, is
> that sufficient?

It's a fair question, and it illustrates my point above; today's Acrobat Accessibility Checker ignores certain features that are critical to accessibility.

Both logical reading order (tag order) *and* tag semantics (H#, P, TH etc) must be correct. It is *not* OK to ignore incorrect tags. Checking for the appropriateness of tags is a critical aspect of accessibility remediation in PDF just as it is in HTML.

I've seen plenty of tables in PDF tagged as a "Figure," list-items tagged as paragraphs and headings tagged as Notes. Would that be OK in HTML? Hell no! So it's not OK in PDF either.
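
To make the idea of reviewing tags a little more concrete, here is a minimal sketch that walks a tag tree and lists tags whose names are not standard PDF structure types. It assumes the tags have already been extracted into nested `(name, children)` tuples; that representation, the function name `find_nonstandard`, and the sample tree are all invented for illustration and do not correspond to any particular tool. PDF files can map custom tag names to standard ones through a role map, so the sketch accepts one as an optional dict.

```python
# Standard structure types defined for tagged PDF (ISO 32000-1).
STANDARD_TAGS = {
    "Document", "Part", "Art", "Sect", "Div", "BlockQuote", "Caption",
    "TOC", "TOCI", "Index", "NonStruct", "Private",
    "P", "H", "H1", "H2", "H3", "H4", "H5", "H6",
    "L", "LI", "Lbl", "LBody",
    "Table", "TR", "TH", "TD", "THead", "TBody", "TFoot",
    "Span", "Quote", "Note", "Reference", "BibEntry", "Code", "Link",
    "Annot", "Ruby", "RB", "RT", "RP", "Warichu", "WT", "WP",
    "Figure", "Formula", "Form",
}


def find_nonstandard(node, role_map=None, path=""):
    """Return the paths of tags that are neither standard structure
    types nor mapped to one by role_map (hypothetical helper)."""
    role_map = role_map or {}
    name, children = node
    here = f"{path}/{name}"
    problems = []
    # A custom name is acceptable only if the role map resolves it
    # to a standard type.
    if role_map.get(name, name) not in STANDARD_TAGS:
        problems.append(here)
    for child in children:
        problems.extend(find_nonstandard(child, role_map, here))
    return problems


# Hypothetical tag tree resembling the one described in the question.
doc = ("Document", [
    ("H1", []),
    ("Normal", []),
    ("1stpara", [("P", [])]),
    ("L", [("LI", [("Lbl", []), ("LBody", [])])]),
])

print(find_nonstandard(doc))
print(find_nonstandard(doc, role_map={"Normal": "P", "1stpara": "P"}))
```

A check like this only tells you whether a tag name is recognized. It cannot tell you whether a table was tagged as a "Figure" or a heading as a Note; that judgment still requires a person to compare the tags against the document's actual structure.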

That said, some tags matter much more than others. For example, <Sect>, <Art>, <Div> and <Part> tags don't add value in current-generation AT. There are others (<Note> comes to mind). You should still tag correctly, though, because over time AT *does* slowly get better at reporting PDF document semantics to the reader.

> and I don't want to get into a political discussion about whether PDFs can be accessible.

It's an antiquated argument in any event. :-) The fact is that more and more software supports well-tagged PDF every day. Who really wants to "defend" software that doesn't support the accessibility features in the world's chosen electronic document file format?

Indeed, an all-new, totally free, PDF viewer for visually-impaired users on Mac, Windows and Linux was just released on Monday of this week! Here's the link for those who are interested:


Of course, it works beautifully with *well-tagged* PDF files and works no better than a 10-year-old cell phone on untagged PDF.

I'll be writing more about the exciting new VIP Reader down the road.

But I digress...

> I want to provide what is both legally required and what is desirable to the users.

That's as concise and reasonable a statement as one might wish for.