WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Line Numbers in Legal PDFs

From: Duff Johnson
Date: Jul 2, 2019 5:23PM


Hi Peter,

The advice I offered in 2011 (!) to which you refer is now distinctly dated, but sadly, the state of the art hasn't moved much from PDF 1.7 (2006). There is a solution for line numbers in PDF 2.0 (the Artifact structure element type with subtype "LineNum"), but I am not yet aware of consumer implementations (viewers, AT) that support this feature. :-(

A couple of other (less helpful than I wish to be) notes below.

> On Jul 1, 2019, at 17:43, Peter Shikli < <EMAIL REMOVED> > wrote:
>
> Access2online analyzes and remediates webpages for compliance with both WCAG and Section 508. We've gotten a batch of legal documents in PDF format which have line numbers. These line numbers are not decorative. They form an integral part of legal references to the document. While searching around for an established way to address this kind of content, we came across a post from Duff Johnson on 02/22/11 where he says:
>
> "... those line-numbers do indeed need to be tagged in the PDF. If they aren't tagged, the AT user won't have a clue as to which line they are reading.
>
> Using a table is a viable approach... notwithstanding the pain & suffering required to do the work. Sadly, it's also the best approach at this time - better, in any event, than including the line-numbers in the tags for the paragraph text along with SPAN tags to indicate that these are line-numbers and not part of the content (which is the other approach)."
>
> That sounds great in theory, but in practice we quickly found a number of problems. Here are the first 3.
>
> a) Often the footnotes and blockquotes don't line up correctly with line numbers. A document may have 6 visual lines of text in 4 line numbers. A simple tabular approach doesn't work for that.

But in tagging you can focus on logical rather than visual position… or maybe I'm misunderstanding the problem - surely the line numbers aren't ambiguous? It might also depend on whether the footnotes are also themselves numbered lines.

> b) Content which would be a heading must be treated as a header if it is in a table, creating a complex table.
> This would be manageable, except when the header is associated with information on the next page. Each page requires a new table since the line numbers start over at 1 on each page, and so the header information can't be transmitted across tables.

Ugh. Well, there's nothing really stopping you from including the <H1> in your <TD>, and using the line number column as the ONLY <TH> cells in your "table", which is closer to semantically correct for such "tables" in any event.
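
For illustration only, here is a rough sketch of what that tag tree might look like for one page. It is plain Python that prints an indented outline, not a call into any PDF library: the `Tag` class and the `render` and `line_row` helpers are hypothetical names, and the sample text is invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tag:
    """Hypothetical stand-in for a PDF structure element (not a real library type)."""
    name: str
    text: str = ""
    kids: List["Tag"] = field(default_factory=list)


def render(tag: Tag, depth: int = 0) -> str:
    """Return an indented outline of the tag tree, one element per line."""
    label = f"<{tag.name}>" + (f" {tag.text}" if tag.text else "")
    lines = ["  " * depth + label]
    for kid in tag.kids:
        lines.append(render(kid, depth + 1))
    return "\n".join(lines)


def line_row(number: int, content: Tag) -> Tag:
    """One table row: the line number is the only TH, the content sits in a TD."""
    return Tag("TR", kids=[Tag("TH", str(number)), Tag("TD", kids=[content])])


# Invented sample content for a single page; numbering restarts on every page.
page_table = Tag("Table", kids=[
    line_row(1, Tag("H1", "STATEMENT OF FACTS")),
    line_row(2, Tag("P", "On or about the date in question,")),
    line_row(3, Tag("P", "the parties met to discuss terms.")),
])

print(render(page_table))
```

Note that a paragraph spanning several numbered lines ends up split across several rows, which is one of the reasons the table remains a hack.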

Another option (and I've not looked at this in detail) would be to tag the entire page as a list, with the line numbers tagged with <Lbl> and the line enclosed in an <LBody>. It will still be a lousy solution (as is the table) for several major reasons… but until there's more support for PDF 2.0 the available constructs just can't really handle this use-case well; I'm the first to admit it.
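
A comparable sketch of the list variant, again only as an indented outline built in plain Python. The `list_item` and `outline` helpers and the sample lines are made up for illustration. The <LI> wrapper is the usual nesting for list items in tagged PDF and is an addition here, not something spelled out above.

```python
def list_item(number, text):
    """One numbered line: the number goes in Lbl, the line's text in LBody."""
    return ("LI", [("Lbl", str(number)), ("LBody", text)])


def outline(node, depth=0):
    """Flatten a (name, body) tree into indented lines; body is text or a list of nodes."""
    name, body = node
    pad = "  " * depth
    if isinstance(body, str):
        return [f"{pad}<{name}> {body}"]
    lines = [f"{pad}<{name}>"]
    for kid in body:
        lines.extend(outline(kid, depth + 1))
    return lines


# Invented sample lines; the whole page becomes a single list.
sample_lines = [
    "On or about the date in question,",
    "the parties met to discuss terms.",
]
page_list = ("L", [list_item(n, t) for n, t in enumerate(sample_lines, start=1)])

print("\n".join(outline(page_list)))
```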

Yet another reason to… call your vendor and demand PDF 2.0 support!

> c) How do you handle footnotes? We move them in the reading order to occur at the end of the sentence where the reference is. With line numbers in a table this wouldn't be possible. Footnotes would have to interrupt the reading order, following the line number position, often in the middle of a paragraph spanning 2 pages, and therefore unrelated to the text before and after.

Good point. Without PDF 2.0, footnotes in line-numbered documents have to stay "in their line" I guess.

Well, we already knew the table idea was an ugly hack. :-(

> What we had been doing before finding this recommendation from Duff was very similar to using the span tags to designate line numbers ("the other approach"). Given the challenges of using a table for anything other than the simplest of legal documents, we're interested in any other thoughts on this matter, particularly anything since 2011.

More and more creation software vendors are announcing support for PDF 2.0… but consumer software… not so much.

Ask your vendor / developer to support modern PDF!

Duff.