WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: Re: PDF/A accessibility

Number of posts in this thread: 4 (In chronological order)

From: ckrugman
Date: Wed, Sep 07 2011 12:30AM
Subject: Re: PDF/A accessibility

This is interesting because in many instances I find that the raw
print stream provides the most accurate reading with JAWS.
In many documents words are run together, possibly because of how JAWS
is interpreting the document when reading.
Chuck
----- Original Message -----
From: "Duff Johnson" < = EMAIL ADDRESS REMOVED = >
To: "WebAIM Discussion List" < = EMAIL ADDRESS REMOVED = >
Sent: Wednesday, August 10, 2011 11:04 AM
Subject: Re: [WebAIM] PDF/A accessibility


> All,
>
> Karen McCall has pointed out to me that confusion on the question of PDF
> "reading order" may be due in part to my failure to use a term that may be
> more familiar to the AT community: "raw print stream".
>
> So... in brief, here's a translation of PDF terms - I hope this clarifies
> things....
>
> "reading order" = the raw print-stream (interesting for rendering)
>
> "logical reading order" = the sequence (and nesting) of PDF tags
> (interesting for accessibility)
>
> In the absence of tags (or software capable of reading tags), AT users are
> typically stuck with the raw print stream, which (naturally) leaves them
> unimpressed (at best) or utterly frustrated (the typical case).
>
> The solution:
>
> 1) The software used to consume PDFs (both the PDF reading software
> itself and the AT) must understand PDF tags.
>
> 2) The PDF in question must be tagged correctly, just as images must have
> alt. text, tables must include valid table structure, etc, etc.
>
> I hope this helps.
>
> Duff Johnson
>
> US Committee for ISO/DIS 14289 (PDF/UA), Chair
>
> p +1.617.283.4226
> e = EMAIL ADDRESS REMOVED =
> t http://www.twitter.com/duffjohnson
> w http://www.duff-johnson.com
>
>
>
> On Aug 10, 2011, at 8:29 AM, Humbert, Joseph A wrote:
>
>> I have to say that after reading your article that it seems to mislead
>> the audience. The article makes many correct statements concerning PDF
>> accessibility, except for:
>>
>> "And that's why we can safely and responsibly ignore reading order when
>> considering accessibility in PDF."
>>
>> It is true that if a PDF is tagged correctly and the software used to
>> consume can read the tags then the "logical reading order" will be
>> correct, but as you even point out in prior emails, "'logical reading
>> order' is the concept of interest". Therefore, some type of reading
>> order, "logical" in this case, is very important. For many users of
>> assistive technology (AT), the order in which the AT software reads the
>> content is how the user perceives the logical reading order. Headings and
>> other tags help to improve a user's understanding of logical reading
>> order and allow advanced users to navigate the document in their own
>> "logical" reading order, but nevertheless reading order IS important.
>>
>> Please don't get me wrong, the article's message about tagging PDF files
>> and using software that supports creating accessible PDFs is wonderful.
>> You seem to be well versed in accessibility issues. My interpretation of
>> Ron's point is that "reading order" in some form is important. I may
>> interject that I believe Ron's comments come from the point of view that
>> many programs which create PDF/A documents do not automatically tag the
>> document correctly, thus creating inaccessible PDFs.
>>
>> "Pagination" is also extremely important, particularly when a textbook or
>> print article is being converted to a PDF. When this happens, my opinion
>> is that the original "pagination" of the book/article should be
>> preserved. Ron, I'm not sure what your complaint with the PDF/A or PDF/UA
>> specification is in terms of "pagination" so you will have to comment
>> further.
>>
>> Back to the original question of the post, it seems as though the PDF/A
>> specification does not have any accessibility limitations included in the
>> specification. I have not read the full PDF/A specification so I cannot
>> be 100% sure. Unfortunately, the software used to implement the
>> specification in creation of a PDF/A file may produce accessibility
>> issues that may have to be addressed manually.
>>
>> Please forgive me if I have misinterpreted something.
>>
>> Joe Humbert, Assistive Technology and Web Accessibility Specialist
>> UITS Adaptive Technology and Accessibility Centers
>> Indiana University, Indianapolis and Bloomington
>> 535 W Michigan St. IT214 E
>> Indianapolis, IN 46202
>> Office Phone: (317) 274-4378
>> Cell Phone: (317) 644-6824
>> = EMAIL ADDRESS REMOVED =
>> http://iuadapts.Indiana.edu/
>>

From: Karlen Communications
Date: Wed, Sep 07 2011 4:42AM
Subject: Re: PDF/A accessibility

This is one of the reasons it is available. For many untagged documents it
can provide the best rendering of content depending on how complicated the
layout is and what the raw print stream is. It is also why we have the
option for left to right, top to bottom for untagged documents...or
'randomly tagged' documents (documents where Tags have been added but not
verified or corrected).

In some documents, when you have to use either OCR and then the "virtual Tags" or
just the "virtual Tags", the character spacing is off and words do run
together. Being able to get a different view of the content often helps in
decoding/reading content.

Cheers, Karen


From: Duff Johnson
Date: Wed, Sep 07 2011 6:36AM
Subject: Re: PDF/A accessibility

I'll offer a couple of minor clarifications.... (and here I will earn - once again - my well-deserved reputation for pedantry... I beg forgiveness in advance....)

On Sep 7, 2011, at 6:43 AM, Karlen Communications wrote:

> This is one of the reasons it is available.

The raw print stream is 'available' for two reasons:

1) It has to be available - otherwise users could not print the document.
2) From 1993-1999 the raw print stream was the ONLY way to extract content from PDF for accessibility (and other) purposes.

Neither of these reasons is directed towards accessibility. The raw print stream is not an accessibility model or mechanism in its own right. The only 'purpose' of the raw print stream is rendering the page on-screen or in-print. You may get acceptable results in a different utilization, but only as a coincidence, not by design.

> For many untagged documents it
> can provide the best rendering of content depending on how complicated the
> layout is and what the raw print stream is.

In the absence of tags, the raw print stream is the ONLY available option :-(. Of course, some software may try to 'be clever' - and interpret the print stream in order to (attempt to) impose logical structure... or "hack" the print stream for subsequent use by software that cannot read tags. These approaches have very very limited utility. Far better to simply insist on correctly tagged PDF (and on software with the brains to read PDF tags).
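
To make the distinction concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not code against any real PDF library: the Fragment and Tag classes, the coordinates, and the sample text are all hypothetical. It compares three orders for the same two-column page: the raw print stream, a "clever" top-to-bottom, left-to-right guess, and the order given by tags.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Fragment:
    """One piece of painted text (hypothetical model of page content)."""
    text: str
    x: float  # left edge, in points from the left of the page
    y: float  # baseline, in points from the bottom of the page


@dataclass
class Tag:
    """One node of a simplified tag tree (hypothetical model)."""
    name: str
    fragment_ids: List[int] = field(default_factory=list)
    children: List["Tag"] = field(default_factory=list)


# Fragments listed in the order the producing software happened to paint them.
page = [
    Fragment("Page 3", 280, 30),                 # 0: footer, painted first
    Fragment("right column, line 1", 320, 700),  # 1
    Fragment("right column, line 2", 320, 686),  # 2
    Fragment("Quarterly Report", 72, 750),       # 3: heading
    Fragment("left column, line 1", 72, 700),    # 4
    Fragment("left column, line 2", 72, 686),    # 5
]

# The tag tree states the intended order. The footer is left out of the tree,
# standing in for content that is not part of the logical structure.
tags = Tag("Document", children=[
    Tag("H1", fragment_ids=[3]),
    Tag("P", fragment_ids=[4, 5]),
    Tag("P", fragment_ids=[1, 2]),
])


def raw_order(fragments):
    """Raw print stream: text exactly in painting order."""
    return [f.text for f in fragments]


def positional_order(fragments):
    """Heuristic guess: top to bottom, then left to right."""
    return [f.text for f in sorted(fragments, key=lambda f: (-f.y, f.x))]


def logical_order(tag, fragments):
    """Logical reading order: depth-first walk of the tag tree."""
    out = [fragments[i].text for i in tag.fragment_ids]
    for child in tag.children:
        out.extend(logical_order(child, fragments))
    return out


for label, order in [
    ("raw print stream", raw_order(page)),
    ("positional guess", positional_order(page)),
    ("tag order", logical_order(tags, page)),
]:
    print(label + ":", " | ".join(order))
```

In this toy page the raw print stream starts with the footer, and the positional guess interleaves the two columns line by line. Only the walk over the tag tree yields the heading, then the left column, then the right column.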

> It is also why we have the
> option for left to right, top to bottom for untagged documents...or
> 'randomly tagged' documents (documents where Tags have been added but not
> verified or corrected).
>
> In some documents, when you have to use either OCR and then the "virtual Tags" or
> just the "virtual Tags", the character spacing is off and words do run
> together. Being able to get a different view of the content often helps in
> decoding/reading content.

There are many sources of "run-together" words, and they are all, without exception, evidence of a poorly constructed PDF file.
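
As one illustration only (the thread does not say which sources are meant), a common case is a file that positions each word separately and never encodes a space character, so the reading software has to guess word breaks from the gaps. The Python sketch below is a toy model under that assumption: the runs, their coordinates, and the join_runs helper are hypothetical and do not reflect any particular reader or AT product.

```python
from typing import List, Tuple

# (text, x_start, width) for each positioned run of glyphs, in points.
Run = Tuple[str, float, float]


def join_runs(runs: List[Run], gap_threshold: float) -> str:
    """Rebuild a line of text, inserting a space only where the horizontal
    gap between consecutive runs is larger than gap_threshold."""
    pieces = []
    prev_end = None
    for text, x_start, width in runs:
        if prev_end is not None and x_start - prev_end > gap_threshold:
            pieces.append(" ")
        pieces.append(text)
        prev_end = x_start + width
    return "".join(pieces)


# Three words painted 3 points apart, with no space characters in the file.
runs = [("raw", 72.0, 18.0), ("print", 93.0, 27.0), ("stream", 123.0, 36.0)]

print(join_runs(runs, gap_threshold=2.0))  # gaps exceed the threshold
print(join_runs(runs, gap_threshold=4.0))  # gaps fall below the threshold
```

With the smaller threshold the three words come out separated; with the larger one they run together. A file that encodes its spaces explicitly leaves nothing to guess.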

Duff Johnson

US Committee for ISO/DIS 14289 (PDF/UA), Chair

p +1.617.283.4226
e = EMAIL ADDRESS REMOVED =
t http://www.twitter.com/duffjohnson
w http://www.duff-johnson.com

From: Karlen Communications
Date: Wed, Sep 07 2011 7:03AM
Subject: Re: PDF/A accessibility

Agreed.

Cheers, Karen