WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: Correct PDF/UA tagging structure for Indexes ?


Number of posts in this thread: 4 (In chronological order)

From: Rick Davies
Date: Tue, Jan 02 2024 10:05AM
Subject: Correct PDF/UA tagging structure for Indexes ?

Hello all,

The estimable PDF Association has explanatory technical notes about how to conform to the
PDF/UA standard. But those documents don't contain many illustrative examples and *no*
illustrative examples about index markup--I guess they are not intended as cookbooks. The
PDF Association also has several helpful example PDF/UA documents contributed by third
parties--none of them contain indexes. After weeks of searching, I have not been able to
find much or anything about PDF/UA index markup: no credibly tagged PDF/UA documents with
indexes. The old Adobe PDF 1.6 document has a magnificent index, but no Index tagging.
None of the successor ISO PDF standard documents has an index 🙁.

So I'm wondering a) if anyone knows where such examples may be found and b) how should a
developer of a PDF output generator produce PDF/UA tagging for the following index structure?
(The objective is that the PDF output-generator should create 'born accessible' PDFs,
without the need for any remediation.)

Rough and ready index example:
Index
___
A
___

Artichokes                  3, 15-38, 133
    cooking                16, 31, 32, 33
    in oil                         31, 32
    in butter                      31, 33
    growing
    hydroponically                  24-27
    au naturelle                    20-21

Artischocken                   66, 70, 90
    Kochen                     80, 81, 92

Avocados                       55-58, 133

...
___
B
___

Bananas 65, 65

In the above 'A' and 'B' are index sub-headings, aka group titles. And the 'au naturelle',
'Artischocken' and 'Kochen' represent index entries in different languages.

What would be the complete PDF/UA tagging required to express the above index structure,
including page links? Is there a tagging structure for the above that would be compatible
with all of UA-1, UA-2, PDF 1.7 and PDF 2 ?

OTOH this mailing list "is for anyone interested in discussing *web* accessibility issues" so
perhaps the above question would be better asked somewhere else? The PDF Association does
not seem to have a forum ...

All suggestions and comments very gratefully received 🙂.

Many thanks,

Rick

--
Rick Davies, Technical Sales Manager
Palm Gate, Kenmare,
Co. Kerry, Ireland
www.miramo.com

From: Duff Johnson
Date: Tue, Jan 02 2024 10:54AM
Subject: Re: Correct PDF/UA tagging structure for Indexes ?

Hi Rick,

Thanks for the excellent questions!

I can provide some partial answers…

Regarding illustrative examples… in fact, the PDF Association’s PDF Accessibility Liaison Working Group (LWG) launched a project to develop Techniques for Accessible PDF several years ago. The results of that project are previewed in this recent post:

https://pdfa.org/pdf-techniques-for-accessibility-a-new-model/

As you can see from the example provided, each Technique will include a PDF file demonstrating the technique in an “atomic” fashion to ensure clarity for both developers and end users.

The initial release of the group’s so-called “Fundamental” Techniques is slated for later this month, with subsequent releases of additional “use case” examples to cover the semantic constructs that can occur in documents (tables, lists, etc, etc.).

As to the Index element specifically; the PDF specification is… ah… deliberately vague regarding the substructure of this structure element, in part because there’s no such thing as “the right way” to design an index… sometimes a simple list is appropriate, sometimes a bunch of lists, or nested lists, or table(s)… and there can be complications such as references to page-ranges.
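The page-range complication mentioned above can be made concrete with a minimal sketch. It assumes locators are written as comma-separated page numbers with optional hyphenated ranges, as in Rick's example; the function name `parse_locators` is hypothetical and not part of any PDF library or specification.

```python
import re
from typing import List, Tuple


def parse_locators(text: str) -> List[Tuple[int, int]]:
    """Turn a locator string such as "3, 15-38, 133" into (first, last) page pairs.

    A single page is returned as a pair with first == last.
    """
    locators = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        # Accept a plain hyphen or an en dash between the two page numbers.
        match = re.fullmatch(r"(\d+)(?:\s*[-\u2013]\s*(\d+))?", part)
        if match is None:
            raise ValueError(f"unrecognised locator: {part!r}")
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        locators.append((first, last))
    return locators


print(parse_locators("3, 15-38, 133"))  # [(3, 3), (15, 38), (133, 133)]
```

Whether a range such as 15-38 becomes one link to its first page or something richer is exactly the kind of design decision the specification leaves open.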

That said, the PDF Association is soon to publish some new guidance in this area as a usage specification for tagged PDF in the PDF 2.0 context. The same requirements will also be included in the forthcoming PDF/UA-2. However, this guidance - while improving on that provided in PDF/UA-1 - won’t offer all that you seek, as it is constrained by the necessary flexibility in the concept of “Index”. Developing canonically-correct examples of various styles of indices is precisely the work of the PDF Accessibility LWG. Come and help us get it done! :-)

Regarding fora for the discussion...

- I note that your company is already a Full member of the PDF Association (:-) and therefore has comprehensive access to all our members-only and LWG communities. You are more than welcome to join - for example - the PDF Accessibility LWG mentioned above and pose your questions, thus helping that group to develop example(s) of valid structures for Index content. Just log in to the Member Area and go from there.

- The PDF Accessibility LWG - like all our Liaison Working Groups - is open to non-members who wish to contribute to projects such as Techniques development.

- Although we maintain several LinkedIn Groups dedicated to accessible PDF they have real limitations, and thus, we are actively considering some other type of forum for non-members. Hopefully I’ll have more news on this soon.

Thanks,

Duff Johnson

PDF Association, CEO
https://www.pdfa.org



From: Philip Kiff
Date: Tue, Jan 02 2024 1:45PM
Subject: Re: Correct PDF/UA tagging structure for Indexes ?

Like Duff says, that's a good question!

In the absence of other practical suggestions from the list, here's my
first crack at a tag tree showing how I would probably go about it if I
were scripting an automated method to generate PDF tags from a
well-structured source text:

<H2>
  Index
<H3>
  A
<L>
  <LI>
    <Lbl>
      Artichokes
    <LBody>
      <L>
        <LI>
          <LBody>
            <Link>
              3
            ,
        <LI>
          <LBody>
            <Link>
              15-38
            ,
        <LI>
          <LBody>
            <Link>
              133
      <L>
        <LI>
          <Lbl>
            cooking
          <LBody>
            <L>
              <LI>
                <LBody>
                  <Link>
                    16
                  ,
              <LI>
                <LBody>
                  <Link>
                    31
                  ,
              <LI>
                <LBody>
                  <Link>
                    32
                  ,
              <LI>
                <LBody>
                  <Link>
                    33
        <LI>
          <Lbl>
            in oil
          <LBody>
            <L>
              <LI>
                <LBody>
                  <Link>
                    31
                  ,
              <LI>
                <LBody>
                  <Link>
                    32
        <LI>
          <Lbl>
            in butter
          ....
  <LI>
    <Lbl>
      Artischocken
    ....
<H3>
  B
....

The general idea in the tag tree above is simply to tag each letter as a
heading and then use nested lists for the terms within each letter.

The case of the first item "Artichokes" is tricky because in my tagging
it has *TWO* nested lists at the same level, which may confuse some
readers. The first list is page numbers for Artichokes generally, and
the second list is the list of artichoke sub-categories. Page numbers
are then nested in a further sub-list within each sub-category item.
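For anyone scripting this, here is a minimal sketch of how such a tag outline could be generated from structured index data. It only prints an indented outline in the same informal notation as the tree above; it does not write PDF structure elements. The `Entry` class and the function names are hypothetical, chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Entry:
    """One index term; `pages` holds display strings such as "15-38"."""
    term: str
    pages: List[str] = field(default_factory=list)
    subentries: List["Entry"] = field(default_factory=list)


def page_list(pages: List[str], depth: int) -> List[str]:
    """Outline lines for a list of page links, one <LI> per locator."""
    pad = "  " * depth
    lines = [f"{pad}<L>"]
    for i, page in enumerate(pages):
        lines.append(f"{pad}  <LI>")
        lines.append(f"{pad}    <LBody>")
        lines.append(f"{pad}      <Link> {page}")
        if i < len(pages) - 1:
            # The separating comma sits in the LBody, outside the Link.
            lines.append(f"{pad}      ,")
    return lines


def entry_list(entries: List[Entry], depth: int) -> List[str]:
    """Outline lines for a list of terms; each LBody may hold two lists."""
    pad = "  " * depth
    lines = [f"{pad}<L>"]
    for entry in entries:
        lines.append(f"{pad}  <LI>")
        lines.append(f"{pad}    <Lbl> {entry.term}")
        lines.append(f"{pad}    <LBody>")
        if entry.pages:
            lines.extend(page_list(entry.pages, depth + 3))
        if entry.subentries:
            lines.extend(entry_list(entry.subentries, depth + 3))
    return lines


def index_outline(groups: List[Tuple[str, List[Entry]]]) -> str:
    """Heading per letter group, followed by that group's nested lists."""
    lines = ["<H2> Index"]
    for letter, entries in groups:
        lines.append(f"<H3> {letter}")
        lines.extend(entry_list(entries, 0))
    return "\n".join(lines)


artichokes = Entry(
    "Artichokes",
    ["3", "15-38", "133"],
    [Entry("cooking", ["16", "31", "32", "33"]), Entry("in oil", ["31", "32"])],
)
print(index_outline([("A", [artichokes])]))
```

The "Artichokes" item ends up with two sibling lists inside its LBody, which is the tricky case described above.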

I considered inserting a visually hidden label like "General" for the
first set of page numbers that refer to Artichokes generally, but I don't
think that would necessarily improve usability for actual users, and it
would make the tagging even more complicated than it already is.

I also considered flattening the list somewhat by inserting a visually
hidden phrase in parentheses "(artichokes)" before each sub-category of
artichokes. But I don't think this would make it easier to browse the
list with a screen reader and might just get in the way for some users.

To simplify the tagging, you might decide against nesting the page
numbers within their own lists and simply put all the page numbers for
an item within a single Paragraph tag with page numbers separated by
commas. The extra list level I'm using may be overkill. Though if any
entries include a dozen or more page numbers (which would be common in
an academic text) then putting the page numbers inside their own nested
list would be helpful for some users trying to make sense of a long list
of references. Screen reader software will announce the number of list
items before entering the list, and this allows a user to decide whether
to skip over the list or attempt to navigate it one item at a time - or
to use a different strategy to navigate it.
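A generator could choose between the two approaches per entry. The sketch below assumes a cut-off of twelve page references, taken from the "dozen or more" remark above. The threshold, the function name `tag_page_refs` and the tuple-based return shape are all illustrative assumptions, and the right value would need to come out of user testing.

```python
from typing import List, Tuple, Union

# Assumed cut-off ("a dozen or more"); tune after testing with real users.
LIST_THRESHOLD = 12

PageTag = Union[Tuple[str, str], Tuple[str, List[Tuple[str, str]]]]


def tag_page_refs(pages: List[str]) -> PageTag:
    """Pick a simple paragraph for short runs and a nested list for long ones."""
    if len(pages) >= LIST_THRESHOLD:
        # One list item per page reference, so the item count is announced.
        return ("L", [("LI", page) for page in pages])
    # Short run: a single paragraph with comma-separated page numbers.
    return ("P", ", ".join(pages))


print(tag_page_refs(["31", "32"]))  # ('P', '31, 32')
print(tag_page_refs([str(n) for n in range(1, 13)])[0])  # L
```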

Having said all that, I've never seen a long index actually marked up
like this. And indeed, I've never personally marked up an index like
this manually either. (It would take hours and hours!) So you would want
to test your output sample with screen reader and other users before
deciding on the final format if you are integrating it into software
that generates PDF output automatically.

The vast majority of indexes that I've seen in the wild aren't even
marked up as lists, and page numbers in most indexes I've seen aren't
even linked to the actual references they cite.

Phil.

Philip Kiff
D4K Communications


From: Rick Davies
Date: Wed, Jan 03 2024 11:24AM
Subject: Re: Correct PDF/UA tagging structure for Indexes ?

Thanks Duff and Philip for your detailed and very helpful replies. I'm blown away.

Index formats/layouts can indeed be highly complex and subject to variations in structure.
But I think (being as much or more blinkered and subject to delusion as the next person) the
example index shown is typical of index structures from the 19th century onwards--with the
exception that it did not contain any references to *primary* pages and page ranges.

I'll study both your posts in detail--lots to learn and material to consider!

Meanwhile it's perplexingly difficult to find *any* PDFs with UA tagged indexes 🙁.

Thanks again,

Rick

--
Rick Davies, Technical Sales Manager
Datazone Ltd, Tel: +353 64 66 289 64
Palm Gate, Greenane, Killarney, Fax: +353 64 66 289 65
Co. Kerry, Ireland
www.miramo.com