WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: PDF/UA question about <Figure> / <Caption> hierarchy (tagged pdf)

Number of posts in this thread: 3 (In chronological order)

From: Rick Davies
Date: Fri, May 31 2019 10:26AM
Subject: PDF/UA question about <Figure> / <Caption> hierarchy (tagged pdf)

This has got my head in a spin: is it best practice for a figure or image
with a caption to have a tagged PDF structure wherein the <Caption> structure
element is a sub-element of the <Figure> structure element? Or should they
both be at the same level? (Leaving <Div> grouping aside for now.)

This nice document:

'Tagged PDF Best Practice Guide'
https://www.pdfa.org/wp-content/uploads/2015/12/StructureElementsBestPracticeGuide_2016-01-19.pdf

gets near to providing the answer, but then veers off (page 6). Presumably
because ISO 32000-2 itself does not seem to be 100% clear on this. FWIW my reading
of ISO 32000-2 is that <Caption> *should* be a child of <Figure>. But I can't
find any confirmatory text or examples.

Checking the tagging in the PDF version of ISO 32000-2, it appears that neither
table titles nor figure captions are tagged as <Caption>, which I naively expected
should be the case. Instead they are tagged as <P>, at the same level either
preceding or following the table or figure. Is this by best practice design, or
might it be an inadvertency?

BTW, anyone know of specific, non-Adobe developer forums for discussing PDF/UA tagging?

TIA for any help or advice ...

Rick

From: Duff Johnson
Date: Fri, May 31 2019 11:15AM
Subject: Re: PDF/UA question about <Figure> / <Caption> hierarchy (tagged pdf)

Hi Rick,

> This has got my head in a spin: is it best practice for a figure or image
> with a caption to have a tagged PDF structure wherein the <Caption> structure
> element is a sub-element of the <Figure> structure element? Or should they
> both be at the same level? (Leaving <Div> grouping aside for now.)

That's a painful subject. The short version…

PDF 1.7 does not prohibit <Caption> enclosed by <Figure>.

BUT no current-generation AT (that I'm aware of) supports this construct; they stop when they encounter alternative text (i.e., on the <Figure>), and process no deeper :-(.

PDF 2.0 explicitly allows <Caption> enclosed by <Figure>.
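
To make the AT behaviour described above concrete, here is a minimal sketch in Python. The `Elem` class and the `read_aloud` function are hypothetical illustrations, not a real PDF or screen reader API; the model simply assumes a reader that announces alternative text and then descends no further.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Elem:
    """Hypothetical stand-in for a structure element (not a real PDF API)."""
    tag: str
    text: str = ""
    alt: Optional[str] = None
    kids: List["Elem"] = field(default_factory=list)


def read_aloud(elem: Elem) -> List[str]:
    """Assumed reader model: once an element carries alternative text,
    announce it and do not descend into its children."""
    if elem.alt is not None:
        return [f"{elem.tag}: {elem.alt}"]
    out = [f"{elem.tag}: {elem.text}"] if elem.text else []
    for kid in elem.kids:
        out.extend(read_aloud(kid))
    return out


# <Caption> nested inside <Figure>
nested = Elem("Sect", kids=[
    Elem("Figure", alt="Bar chart of sales", kids=[
        Elem("Caption", text="Figure 1. Sales by quarter"),
    ]),
])

# <Caption> as a sibling following <Figure>
siblings = Elem("Sect", kids=[
    Elem("Figure", alt="Bar chart of sales"),
    Elem("Caption", text="Figure 1. Sales by quarter"),
])

print(read_aloud(nested))    # the nested caption is never reached
print(read_aloud(siblings))  # the caption is announced after the figure
```

Under this assumed model the nested caption is silently lost, while the sibling arrangement keeps it.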

> 'Tagged PDF Best Practice Guide'
> https://www.pdfa.org/wp-content/uploads/2015/12/StructureElementsBestPracticeGuide_2016-01-19.pdf

Funny you should mention it: this document's complete replacement will be published in early June! I'll announce it here when it does...

> gets near to providing the answer, but then veers off (page 6). Presumably
> because ISO 32000-2 itself does not seem to be 100% clear on this. FWIW my reading
> of ISO 32000-2 is that <Caption> *should* be a child of <Figure>. But I can't
> find any confirmatory text or examples.

The guide to which you refer is actually about PDF 1.7, not PDF 2.0. In the forthcoming guide (which is also specific to PDF 1.7) the text says:

PDF 1.7 does not specify a mechanism to associate <Figure> structure elements with their <Caption> structure elements, or associate multiple figures together, or apply a caption to multiple figures.
In the context of <Figure> structure elements, it is recommended to locate the <Caption> structure element following the <Figure> structure element, as this practice ensures a reasonable context for the <Caption> is provided to users of relatively basic consumption software.

The new guide also provides this nugget:

PDF 2.0 updates the description of <Caption> as follows:

For lists and tables, a <Caption> structure element may be used as defined for the <L> (list) and <Table> structure elements. In addition, a <Caption> may be used for a structure element or several structure elements.

A structure element is understood to be "captioned" when a <Caption> structure element exists as an immediate child of that structure element. The <Caption> shall be the first or the last structure element inside its parent structure element. The number of captions cannot exceed 1.

While captions are often used with figures or formulas, they may be associated with any type of content.
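
The placement rule quoted above (at most one <Caption>, and it must be the first or last child of the element it captions) can be sketched as a small check. This is only an illustration: `caption_placement_ok` is a hypothetical helper working on a plain list of child tag names, not part of any PDF library or validator.

```python
from typing import List


def caption_placement_ok(child_tags: List[str]) -> bool:
    """Check the child tags of one parent against the quoted rule:
    at most one Caption, and it must be the first or last child."""
    positions = [i for i, tag in enumerate(child_tags) if tag == "Caption"]
    if not positions:
        return True  # no caption, so there is nothing to check
    if len(positions) > 1:
        return False  # more than one caption under the same parent
    return positions[0] in (0, len(child_tags) - 1)


# Children of a hypothetical <Figure> element
print(caption_placement_ok(["Caption", "P"]))             # caption first
print(caption_placement_ok(["P", "Caption", "P"]))        # caption in the middle
print(caption_placement_ok(["Caption", "P", "Caption"]))  # two captions
```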

> Checking the tagging in the PDF version of ISO 32000-2, it appears that neither
> table titles nor figure captions are tagged as <Caption>, which I naively expected
> should be the case. Instead they are tagged as <P>, at the same level either
> preceding or following the table or figure. Is this by best practice design, or
> might it be an inadvertency?

The PDF 2.0 document was produced with PDF 1.7 software… so don't look to its tagging for guidance!

> BTW, anyone know of specific, non-Adobe developer forums for discussing PDF/UA tagging?

You say "developer forums"… PDF Association members have access to internal Technical Working Groups, including one for PDF/UA. This is the group that develops the PDF Association's Best Practice Guides.

Hope this helps,

Duff.

From: chagnon
Date: Fri, May 31 2019 11:18AM
Subject: Re: PDF/UA question about <Figure> / <Caption> hierarchy (tagged pdf)

Starting with the easiest questions first:

Non-Adobe developer forums:
Well, WebAIM isn't an Adobe-controlled forum so this would be my best
recommendation to you. There are people from the ISO PDF and PDF/UA
committees who are on this list and chime in when needed, as well as folks
who are members of the PDF Association.

The "Tagged PDF Best Practice Guide" is, sadly, incomplete. Revised
versions for PDF/UA-1 and PDF/UA-2 are forthcoming.

What to do at this time:

My firm's analysis and reasoning is this:
Because PDF 2.0 is only beginning to be used by the industry, and
Because PDF/UA-2 has not yet been released, and
Because assistive technology manufacturers have not yet adopted these new
versions of the standards and retooled their AT to use them, and
Because content creation software such as Word, Adobe InDesign, and Adobe
Acrobat is in the process of building PDF/UA-2 tools for us to use to meet
these new standards, and
Because the U.S. federal regulation called Sec. 508 specifies PDF/UA-1 (and
WCAG 2.0), not a future standard that hasn't even been finished yet, and
Because when the new standards are in place and working, assistive
technologies will still have to read and process PDFs made to
the current PDF/UA-1 standard, (whew!)

We recommend (at this time):
-- tagging the caption as <Caption> when possible, but <P> will be
sufficient.
-- placing the <Caption> outside and before or after the <Figure> tag
because, at this time, some of the AT we test can't process a caption nested
inside a <Figure> tag. Heck, some of our AT don't recognize the Caption tag
at all!
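
As a rough sketch of how one might audit a tag tree against the recommendation above, the snippet below classifies each <Figure> by where its <Caption> sits. Everything here is an assumption for illustration: the `(tag, child_tags)` tuples are a hypothetical flattening of one level of the tag tree, and `classify_figures` is not a function from any real tool. Captions tagged as <P> cannot be told apart from ordinary paragraphs this way, so they are reported as "none".

```python
from typing import List, Tuple

# (tag, child_tags): a hypothetical flattening of one level of the tag tree
Node = Tuple[str, List[str]]


def classify_figures(siblings: List[Node]) -> List[str]:
    """For each Figure, report where its Caption sits:
    'nested', 'before', 'after', or 'none'."""
    results = []
    for i, (tag, kids) in enumerate(siblings):
        if tag != "Figure":
            continue
        if "Caption" in kids:
            results.append("nested")   # may be skipped by current AT
        elif i > 0 and siblings[i - 1][0] == "Caption":
            results.append("before")   # recommended: outside, preceding
        elif i + 1 < len(siblings) and siblings[i + 1][0] == "Caption":
            results.append("after")    # recommended: outside, following
        else:
            results.append("none")
    return results


page = [
    ("Caption", []), ("Figure", []),
    ("P", []),
    ("Figure", ["Caption"]),
    ("Figure", []), ("Caption", []),
]
print(classify_figures(page))
```

Any figure reported as "nested" is a candidate for moving its caption outside the <Figure> tag.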

(Personally, I try to put the caption before the figure in the tag order,
based on known educational pedagogies of telling someone what you're about
to tell them, but that's not always possible. In many cases it doesn't make
sense to have the user hear the complex Alt-Text and then get the caption
information. Switch those around if you think it will be more easily
comprehended.)

It's too soon to put PDF/UA-2 into practice; the pieces of the accessibility
puzzle are controlled by many different stakeholders. The standards must be
finalized first, then they must be adopted by governments around the world,
then the assistive technology manufacturers and content authoring
manufacturers must rebuild their tools to the new standards, and then,
finally, we have the most important stakeholder group, our colleagues who
use assistive technologies.

That is a large number of ducks that have to become aligned in a row to make
a new standard the norm!

That's my 2 cents' worth.

--Bevi

- - -
Bevi Chagnon, founder/CEO
- - -
PubCom: Technologists for Accessible Design + Publishing