E-mail List Archives

PDF Container tags


From: Chagnon | PubCom.com
Date: Sep 28, 2015 1:56PM

This issue comes up quite frequently in our work.

People have hissy fits about the common container tags that become embedded
in PDF tag trees when a PDF is made from InDesign, Word, and other office
software. Everyone has a different take on their purpose, meaning, and
requirements. We're trying to clarify this issue for a student's work.

Questions (and links to reference material) follow:

The defined container tags in the PDF standard are <DOC>, <PART>,
<ART>, <SECT>, and <DIV>. They are only loosely defined in ISO 32000-1:2008
(Adobe's PDF32000_2008 document; see Table 333, beginning on page 583 of
the PDF). I say "loosely defined" because the only one that is adequately
defined is <DOC>, which is the root element of the tag structure. Everything
else falls within it. All the other definitions could be debated from now
until the cows come home.

1. Are any of these container tags recognized by today's screen
readers and other AT? The last time we checked (last spring), they were
ignored by screen readers and the PDF tags were read top-to-bottom down the
tag tree regardless of whether there were container tags here and there or
not. Is this still the case?

2. Does it matter if the <DOC> tag is there in the PDF tag tree?

3. From the user's point of view, is there any proposed purpose for
these container tags, now or in the future?

4. And what about <SPAN> tags? Do they still interfere with screen
readers and other AT?
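
As an illustration of the behavior described in question 1, here is a
minimal Python sketch. It is only a toy model of a tag tree, not how any
particular screen reader is implemented. The Tag class, the reading_order
function, and the sample trees are hypothetical names made up for this
example, and the tag names use the standard's spelling (Document, Part,
Art, Sect, Div). If container tags are skipped and the tree is walked
top-to-bottom, a tree with containers and a flat tree give the same
reading order.

```python
from dataclasses import dataclass, field
from typing import List

# Grouping ("container") structure types, spelled as in the PDF standard.
CONTAINER_TAGS = {"Document", "Part", "Art", "Sect", "Div"}


@dataclass
class Tag:
    """One node of a hypothetical, simplified tag tree."""
    name: str
    text: str = ""
    children: List["Tag"] = field(default_factory=list)


def reading_order(tag: Tag) -> List[str]:
    """Walk the tree depth-first, top to bottom, skipping container tags.

    Assumption: the reader announces only non-container tags, which is
    the behavior reported in question 1, not a documented AT algorithm.
    """
    spoken = []
    if tag.name not in CONTAINER_TAGS:
        spoken.append(f"{tag.name}: {tag.text}")
    for child in tag.children:
        spoken.extend(reading_order(child))
    return spoken


# The same content, once wrapped in container tags and once flat.
with_containers = Tag("Document", children=[
    Tag("Sect", children=[
        Tag("H1", "Title"),
        Tag("P", "First paragraph"),
    ]),
    Tag("Div", children=[
        Tag("P", "Second paragraph"),
    ]),
])

flat = Tag("Document", children=[
    Tag("H1", "Title"),
    Tag("P", "First paragraph"),
    Tag("P", "Second paragraph"),
])

# Under the assumption above, both trees are read identically.
assert reading_order(with_containers) == reading_order(flat)
```

Under this model the containers change nothing for the listener; whether
real AT behaves this way is exactly what the questions above ask.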

NOTE: I know these tags can have some purpose for those who create PDFs, but
I'm questioning their purpose for AT.

We couldn't find any references to these container tags when we searched the
PDF/UA standards.

We can't find any references to their correct usage in WCAG, either.

And what happened to the search utility on the WAI website?
http://www.w3.org/WAI/ It's now so difficult to find information there.

Thanks in advance,

--Bevi Chagnon