E-mail List Archives

Re: PDF Container tags


From: Jon Metz
Date: Sep 29, 2015 8:08AM

Hi Bevi,

Standard Grouping structure tags are not exposed to Assistive Technology
(AT) unless the user navigates the tag tree themselves. I'm writing
from my phone, so I can't view ISO 32000 at the moment and will have to
go by memory.

I *believe* these tags are intended for a "Strongly Structured" document,
that is, one that is nested using only H tags and subsequent tags.
Unfortunately (fortunately?) AT tends to have lackluster performance
dealing with Strongly Structured documents, so using Grouping tags becomes
optional, with the exception of the <Document> tag, which is required as
the root tag.

32000 does a rather poor job of explaining when to use them, and how. The
order in which they appear in the specification doesn't even help. I'll do
my best to recall their purpose though.

There can be only one <Document> tag, defined as the root. <Part> is used
to mark a segment of the overall document, such as an excerpt from a
book. There can be multiple <Part>s in a document, but if there's only one
<Part>, it's fairly superfluous. <Sect> tags can be nested within each
other, and typically denote separate sections, such as Chapters or Asides.
<Div> tags can also be nested inside each other, and can be useful for
independent groups of content, such as address info. I rarely use these
though, as nesting structure too deeply can actually make it difficult for
screen readers to access the content.

A special kind of Grouping Structure tag, <TOC>, contains a list of places
to find content in a document, and can only have other nested <TOC> and
<TOCI> tags. <TOCI> tags can have a <Lbl>, <Reference>, and <Link> tag
associated with them. Note that using <Lbl> outside an <L> tag might
generate a false error in PAC 2.
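
As a rough illustration of the nesting rules recalled above, here is a minimal Python sketch that models a tag tree and flags violations. The `Tag` class, the `ALLOWED_CHILDREN` table, and `check_tree` are hypothetical names invented for this example. The rules are the ones described in this message from memory, not the normative ISO 32000 text, and nothing here reads a real PDF.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tag:
    """A node in a simplified, in-memory model of a PDF tag tree."""
    name: str
    children: list[Tag] = field(default_factory=list)


# Allowed children as recalled in this message (an assumption, not copied
# from the specification). Tags missing from this table are unrestricted.
ALLOWED_CHILDREN = {
    "TOC": {"TOC", "TOCI"},
    "TOCI": {"Lbl", "Reference", "Link"},
}


def check_tree(root: Tag) -> list[str]:
    """Return a list of human-readable problems found in the tree."""
    problems: list[str] = []

    # The root tag is expected to be the single <Document>.
    if root.name != "Document":
        problems.append(f"root is <{root.name}>, expected <Document>")

    def walk(tag: Tag, path: str) -> None:
        allowed = ALLOWED_CHILDREN.get(tag.name)
        for child in tag.children:
            child_path = f"{path}/{child.name}"
            # Only one <Document> is permitted, and it must be the root.
            if child.name == "Document":
                problems.append(f"nested <Document> at {child_path}")
            # Enforce the restricted child sets for <TOC> and <TOCI>.
            if allowed is not None and child.name not in allowed:
                problems.append(
                    f"<{child.name}> not allowed inside <{tag.name}> at {child_path}"
                )
            walk(child, child_path)

    walk(root, root.name)
    return problems
```

For example, a <Document> holding a <Part> with nested <Sect> tags and a <TOC> made only of <TOCI> entries would produce an empty list, while a <P> placed directly inside a <TOC> would be reported.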

Now, the following is just my opinion. I prefer to use Grouping tags in my
documents for a few different reasons. First, they improve semantics when
submitting to Fed clients with disabilities who might have Acrobat Pro. In
my opinion, they provide a better idea about the structure of the Document. I
also tend to give Grouping tags a Title, so I can identify their purpose in
the document.

Further, it can be extremely useful when marking up a longer document
(especially PowerPoint conversions) in order to differentiate the
separate components of the file.

Finally, it can be useful when there are significant changes or errors in
only one part of the Document and I need to extract corrupt sections or
replace them with new iterations without retagging the entire thing. For
example, there are times when there's a significant error in the document
and I need to break out sections to find where the troubled page is. I just
break out a section at a time until I find the culprit.
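
The one-section-at-a-time approach can be sketched in Python as below. This is only an illustration under stated assumptions: `find_culprit` and the `document_has_error` callback are hypothetical names, the sections are plain labels rather than real PDF content, and the sketch assumes exactly one section is responsible for the error.

```python
from typing import Callable, List, Optional


def find_culprit(
    sections: List[str],
    document_has_error: Callable[[List[str]], bool],
) -> Optional[str]:
    """Remove one section at a time and report the one whose removal
    makes the error disappear. Returns None if no single removal helps."""
    for i, section in enumerate(sections):
        # Build the document without the section under suspicion.
        remaining = sections[:i] + sections[i + 1:]
        if not document_has_error(remaining):
            return section
    return None
```

In practice the check would be whatever manual or automated test exposes the problem (for example, re-running a checker on the reduced file). Here it is just a function passed in by the caller.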

In the end, it's really for the remediator and not the end user, although
it can make it easier for other purposes later, such as using a third party
app to map to HTML or even reflow the document, which can sort of be seen
in use via the free Callas PDFgoHTML plug-in.

Hope this helps.


On Tuesday, September 29, 2015, Moore,Michael (Accessibility) (HHSC) <
<EMAIL REMOVED> > wrote:

> <bevi>
> But to clarify my earlier question:
> Do these various container tags -- <DOCUMENT>, <PART>, <ART>, <SECT>, and
> <DIV> -- have any effect on screen readers and other AT?
> </bevi>
> These tags do not have any impact on any AT that I have tested with (JAWS,
> Window Eyes, NVDA, ZoomText, Magic, Dragon, or VoiceOver for either OSX or
> iOS). I have found them useful when remediating documents because they allow
> me to work on logical chunks in the tag tree. If you split a document,
> Acrobat Pro will place each split section into a <PART>, which makes it easy
> to work on a page at a time when you stitch things back together.
> Mike Moore
> Accessibility Coordinator
> Texas Health and Human Services Commission
> Civil Rights Office
> (512) 438-3431 (Office)
> -----Original Message-----
> From: WebAIM-Forum [mailto: <EMAIL REMOVED> ] On
> Behalf Of Chagnon | PubCom.com
> Sent: Monday, September 28, 2015 9:30 PM
> To: 'WebAIM Discussion List'
> Subject: Re: [WebAIM] PDF Container tags
> Thanks Ryan. You're confirming my opinion.
> And thanks for catching <Document> rather than <DOC>. Sometimes I get my
> various tagging languages/syntax flipped.
> But to clarify my earlier question:
> Do these various container tags -- <DOCUMENT>, <PART>, <ART>, <SECT>, and
> <DIV> -- have any effect on screen readers and other AT?
> They are the container tags specified in the PDF 2008 standard tag set. In
> my experience, we haven't noticed any screen readers acknowledging them in
> a document, nor stumbling over them either.
> Do they cause problems for AT?
> Do they provide any benefits for AT users?
> --Bevi Chagnon
> -----Original Message-----
> From: WebAIM-Forum [mailto: <EMAIL REMOVED> ] On
> Behalf Of Ryan E. Benson
> Sent: Monday, September 28, 2015 5:28 PM
> To: WebAIM Discussion List < <EMAIL REMOVED> >
> Subject: Re: [WebAIM] PDF Container tags
> Hi Bevi,
> > Are any of these container tags recognized by today's screen
> > readers and other AT?
> To my knowledge there is no way to navigate like you can with ARIA
> regions at this time.
> > Does it matter if the <DOC> tag is there in the PDF tag tree?
> DOC isn't a standard tag, so it should be mapped to Document. If not,
> custom tags are mapped to P if not defined in Acrobat, so the various
> structures could essentially be ignored. As for having a <Document>, it
> comes down to how much of a purist you are. Not having one will not break the
> document, unlike leaving out <html> and <body> in HTML.
> --
> Ryan E. Benson
> On Mon, Sep 28, 2015 at 3:56 PM, Chagnon | PubCom.com < <EMAIL REMOVED> >
> wrote:
> > This issue comes up quite frequently in our work.
> >
> > People have hissy fits about the common container tags that become
> > embedded in PDF tag trees when a PDF is made from InDesign, Word, and
> > other office software. Everyone has a different take on their purpose,
> > meaning, and requirements. We're trying to clarify this issue for a
> > student's work.
> >
> >
> >
> > Questions (and links to reference material follow):
> >
> >
> >
> > The defined container tags in the Adobe PDF standard are <DOC>,
> > <PART>, <ART>, <SECT>, and <DIV>. Their definitions are loosely
> > defined in the Acrobat PDF Standards 32000_2008 (see table 333
> > beginning on page 583,
> > http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf
> > ). I say "loosely defined" because the only one that is adequately
> > defined is <DOC> which is the root element of the tag structure.
> > Everything else falls within it. All the other definitions could be
> > debated from now until the cows come home.
> >
> > 1. Are any of these container tags recognized by today's screen
> > readers and other AT? The last time we checked (last spring), they
> > were ignored by screen readers and the PDF tags were read
> > top-to-bottom down the tag tree regardless of whether there were
> > container tags here and there or not. Is this still the case?
> >
> > 2. Does it matter if the <DOC> tag is there in the PDF tag tree?
> >
> > 3. From the user's point of view, is there any proposed purpose for
> > these container tags, now or in the future?
> >
> > 4. And what about <SPAN> tags, do they still interfere with screen
> > readers and AT?
> >
> >
> >
> > NOTE: I know these tags can have some purpose for those who create
> > PDFs, but I'm questioning their purpose by AT.
> >
> >
> >
> > We couldn't find any references to these container tags when we
> > searched the PDF/UA standards.
> >
> > We can't find any references to their correct usage in WCAG, either.
> >
> > And what happened to the search utility on the WAI website?
> > http://www.w3.org/WAI/ It's now so difficult to find information there.
> >
> >
> >
> > Thanks in advance,
> >
> > --Bevi Chagnon
> >
> >
> >
> >
> >