E-mail List Archives

Re: Use of <H> tag in PDF


From: Duff Johnson
Date: Aug 30, 2013 7:04AM

> On a similar note, what do you do when you run out of headings?

<giggle> Ah yes - one of the premier showcase examples of HTML being broken by definition.

What genius decided there exist only 6 levels of headings in all of creation? (Sorry Tim!)

Sadly, the same thinking (to a lesser extent) infected PDF back in the day as well. While PDF/UA provides for unlimited heading levels, the full solution in PDF will have to wait for PDF 2.0 (ISO 32000-2), which will add H7, H8, H9, etc. (without limit) to PDF itself. PDF 2.0 will also include a <Title> tag, thus providing a definitive way out of the undesirable practice of using H1 for the Title.
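To make the difference concrete, here is a minimal sketch of mapping an outline depth to a heading tag name, with and without a cap. The function name heading_tag and its max_level parameter are made up for illustration; they are not part of any PDF or HTML library.

```python
def heading_tag(depth, max_level=None):
    """Return the heading tag name for a 1-based outline depth.

    max_level=None models the uncapped H1..Hn scheme described above;
    max_level=6 models HTML's fixed set of six heading levels.
    Both the function and the parameter are hypothetical.
    """
    if depth < 1:
        raise ValueError("depth must be 1 or greater")
    if max_level is not None and depth > max_level:
        raise ValueError(
            f"no heading tag for depth {depth}; the limit is H{max_level}"
        )
    return f"H{depth}"


# Uncapped: a deeply nested legal document can keep going.
print(heading_tag(7))

# Capped at six: the same depth has nowhere to go.
try:
    heading_tag(7, max_level=6)
except ValueError as err:
    print(err)
```

With the cap removed, the author of a deeply nested document never has to "spend" a level on the title just to make the numbers work out.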

> I have been in situations where using an H1 for the document title meant I
> ran out of heading numbers further down the document. (It was an in-depth
> legal style document.) There I used H for the document title and H1s for
> the major sections.

Titles are not structural elements in a document. The document's title should be placed in the document's metadata "Title" field; it's fully accessible to AT in this context without reference to headings (or heading levels) at all.

H is not 'above' H1; there's no reason to think that AT would (or could) do the right thing in this case. What does H before H1 mean in structural terms? I have no idea. This "solution" is a hack that (if it works today on some AT) likely relies on some behavior that's going to change once the AT / PDF reader in question supports standards for accessible electronic documents (PDF/UA) and ISO 32000-2 (PDF 2.0).
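As a rough illustration of why this is fragile, here is a sketch of the kind of sanity check a validator might run over a document's tags. It assumes the tags have been flattened into a simple list of names in reading order, and the rules it encodes (don't mix H with H1..Hn, start at H1, don't skip levels going down) are a simplified illustration, not the conformance text of PDF/UA. The function name check_headings is hypothetical.

```python
import re

# Matches numbered heading tags of any level: H1, H2, ... H7, H12, etc.
NUMBERED = re.compile(r"^H([1-9][0-9]*)$")


def check_headings(tags):
    """Return a list of problems found in a flat, reading-order tag list.

    Simplified, illustrative rules only (an assumption, not spec text).
    """
    problems = []
    levels = []
    uses_plain_h = False

    for tag in tags:
        if tag == "H":
            uses_plain_h = True
            continue
        match = NUMBERED.match(tag)
        if match:
            levels.append(int(match.group(1)))

    if uses_plain_h and levels:
        problems.append("H is mixed with numbered headings")
    if levels and levels[0] != 1:
        problems.append(f"first numbered heading is H{levels[0]}, not H1")
    for prev, cur in zip(levels, levels[1:]):
        if cur > prev + 1:
            problems.append(f"H{prev} is followed by H{cur}, skipping a level")

    return problems


# The "H for the title, H1 for major sections" approach from the quoted message:
print(check_headings(["H", "P", "H1", "H2", "P", "H1"]))
```

A check along these lines has no way to interpret H as "the level above H1"; all it can do is report that two different heading schemes have been mixed.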

> Conceptually, as a PDF is multi-page you could consider it more like a
> website than a web page

Nah. It's a document - not a "page" and not a "website".

There were documents long before there were websites; there will be documents long after websites have been replaced with HyperText Clouds or whatever other name we'll use in 20 years' time.

Trying to squeeze documents into the conceptual framework of the web is one of the great goofs of the HTML-centric accessibility world.

> , but I'm still not sure what you'd do with the
> document title.

Place the title in the document's metadata; any PDF-aware AT readily finds it there. Tag title text appearing on the page with a <P> tag (unless it is also serving a structural role in addition to being a title).