WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Mismatch between PDF Object Properties Content Tag and Structure Tag

From: Philip Kiff
Date: Jun 11, 2020 6:57PM


First, I would double-check the Role Map to make sure that the incorrect
structure tag(s) are not somehow being mapped behind the scenes.
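
To make the Role Map check concrete, here is a minimal Python sketch. It
assumes the Role Map has already been copied out of the PDF (for example
by hand from the Tags panel) into a plain dictionary. The function names
and the sample map are hypothetical, and the list of standard types is
only a subset of the ones defined in the PDF specification.

```python
# Subset of the standard structure types from the PDF specification.
STANDARD_TYPES = {
    "Document", "Part", "Art", "Sect", "Div", "BlockQuote", "Caption",
    "P", "H", "H1", "H2", "H3", "H4", "H5", "H6",
    "L", "LI", "Lbl", "LBody",
    "Table", "TR", "TH", "TD", "THead", "TBody", "TFoot",
    "Span", "Quote", "Note", "Reference", "Link", "Figure", "Formula", "Form",
}


def resolve_role(tag, role_map):
    """Follow the Role Map chain for a tag until no further mapping applies."""
    seen = set()
    # The 'seen' set guards against a circular map such as A -> B -> A.
    while tag in role_map and tag not in seen:
        seen.add(tag)
        tag = role_map[tag]
    return tag


def remapped_standard_types(role_map):
    """Return the entries that remap a standard type to a different type."""
    return {
        source: target
        for source, target in role_map.items()
        if source in STANDARD_TYPES and source != target
    }


# Hypothetical Role Map, written out as a dictionary for illustration.
sample_role_map = {"MyTitle": "Heading1", "Heading1": "H1", "Table": "TH"}

print(resolve_role("MyTitle", sample_role_map))
print(remapped_standard_types(sample_role_map))
```

An entry like the "Table" to "TH" mapping in the sample is the kind of
behind-the-scenes remapping worth ruling out first.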

If the Role Map is not involved, then I wonder if the PDFs are coming
from specific software tools other than Adobe Acrobat? I've come across
some instances of PDF software tools doing some odd things with the
container tags in the Content panel that don't match the structure
tags. Your example screenshots show the use of Span containers. Both
CommonLook Office and the iText PDF library, for example, will generate
PDFs that apply Span containers to almost all the content in the Content
tree. Both of those tools, however, are capable of producing a
(more-or-less) accessible PDF by producing a correct Tag tree with
semantically and structurally correct tags, despite the divergence from
the containers in the Content tree.
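
As an illustration of that divergence, here is a rough Python sketch that
compares the container tags in a page's content stream with the tags in
the Tag tree. It assumes the content stream has already been
decompressed to text, that each container carries its MCID in an inline
property list, and that the MCID-to-structure-tag lookup has been built
separately. The sample stream, the lookup table, and the function names
are all made up for the example.

```python
import re

# Matches marked-content openers of the form: /Span <</MCID 0>> BDC
BDC_PATTERN = re.compile(r"/(\w+)\s*<<[^<>]*?/MCID\s+(\d+)[^<>]*?>>\s*BDC")


def container_tags(content_stream):
    """Map each MCID to the container tag used in the content stream."""
    return {
        int(mcid): tag
        for tag, mcid in BDC_PATTERN.findall(content_stream)
    }


def divergences(content_stream, structure_tag_by_mcid):
    """List (mcid, container tag, structure tag) where the two differ."""
    containers = container_tags(content_stream)
    return [
        (mcid, containers[mcid], structure_tag_by_mcid[mcid])
        for mcid in sorted(containers)
        if mcid in structure_tag_by_mcid
        and containers[mcid] != structure_tag_by_mcid[mcid]
    ]


# Hypothetical content stream in which every container is a Span.
sample_stream = """
/Span <</MCID 0>> BDC
BT /F1 12 Tf (Heading text) Tj ET
EMC
/Span <</MCID 1>> BDC
BT /F1 12 Tf (Cell text) Tj ET
EMC
"""

# Hypothetical Tag tree lookup: MCID -> structure tag.
sample_structure_tags = {0: "H1", 1: "TD"}

print(divergences(sample_stream, sample_structure_tags))
```

In a file like this one, every container would show up as a divergence
even though the Tag tree itself is correct.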

I don't yet understand how the object and tag dictionaries work in the
PDF format (!), so I don't exactly know where those structure/tag
mismatches get stored in the object or file. In my rudimentary testing
with NVDA and JAWS, these content tree structures did not cause problems
with access to the content in the accessible tag tree. So I am guessing
that the assistive technology that is having trouble with such files may
not be using the accessibility tag information to render the content?
I'd be curious to know.

I would note that in the cases that I've looked at, if I edit any of
those containers or tags with Acrobat Pro, then the mismatch
disappears. I don't think you will get this mismatch if you use Acrobat
Pro to autotag a file or use its various editing features.

None of this solves the mystery of where that mismatched
tag/container/object info is being stored, but it may help to know about
other similar cases.

Phil.

Philip Kiff
D4K Communications


On 2020-06-11 17:34, Jonathan Avila wrote:
> I've run into situations where assistive technology is not properly working with PDF documents and it's often traced back to a mismatch between the content type and the displayed structure tag. For example, the tag is a table tag in the tags panel, and in the content panel there either might not be a table container or perhaps there is a table container and the Container tag is a table. However, Acrobat shows the structure tag as a table header cell in the Object Properties dialog. If I look at the tag dictionary everything looks correct. If I look in the container panel and tags panel all is correct - but for some reason the Object Properties dialog will show an incorrect structure tag that isn't showing up anywhere else. I have put some screenshots showing the issue (with alt text) below - if anyone can provide information on how to address this let me know.
>
> [Object properties dialog showing table header under the content tab under structure tag.]
>
> [Object properties dialog showing table tag under the tag tab.]
>
> Jonathan