WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: PDF language not recognized by screenreaders

From: Duff Johnson
Date: Jan 24, 2013 3:18PM


On Jan 24, 2013, at 4:27 PM, Andrew Kirkpatrick wrote:
>
> Gijs's point - as I understood it - was that setting the document language in Acrobat did not produce the desired result (i.e., AT using that language). This is, simply, because the document language setting in Acrobat acts on document-level metadata rather than on the content (or on the tags, for that matter).
>
> [AWK] The result, while undesirable, is entirely consistent with the way that PDF and AT are supposed to behave though.

That's true - it's simply that Acrobat is presenting functionality that it doesn't deliver. The user thought that Acrobat would actually enforce a language change (that is, change the content attributes); not an unreasonable expectation or desire.

> My recommendation is that InDesign not use the <Document> tag as the destination for the language (however determined for the document level in InDesign - I'll also be recommending a better way than currently employed) but instead use the document catalog /Lang entry.

Excellent; a nice economical approach that solves this particular problem.

> Insofar as Acrobat is a "PDF editor" it would make sense to have the document-language management feature include the ability to (optionally) override existing tag and content-level language settings.
>
> Preferably, the user should not be forced to return to the source InDesign file to address this problem in a reasonable manner.
>
> [AWK] Sure. I don't think that they are, even now. For example:
> Say I have a document with 100 paragraphs (or 1000, just picking a large number that I wouldn't want to manually and individually adjust) that I want to be in French, but InDesign thinks that the document is EN-US. I also have an English paragraph in my document. I make sure that the paragraph styles all reflect the correct language and export to PDF. The test in Acrobat reports that the language needs to be defined for the document, so I:
> 1) Add French as the language in the document properties dialog. This won't change how anything is read yet because the <document> tag has EN-US as the language.
> 2) I delete the language from the <document> tag. Now the entire document reads in French. This includes the English paragraph because InDesign was smart enough to not need to define the language for that paragraph as it thought that EN-US was the document language so it didn't need to be reassigned. All of the paragraphs which were set to French do have this indicated, which duplicates the document level setting in #1, but that's not a problem.
>
> That seems like a process I'd like to avoid, but isn't unreasonable, as repair goes.

I agree - it's not an unreasonable workaround for that use case. To enhance it, Acrobat should (probably) notice that the user has selected a language that clashes with the lang set in the <Document> tag, and offer to fix the tag to match (or possibly, just go ahead and do it without asking).
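
To make the two-step workaround above concrete, here is a minimal Python sketch of how the language could resolve for each tag. It is only a model: the Tag class and effective_langs function are hypothetical names, not part of Acrobat or any PDF library, and the sketch assumes the usual rule that a tag's own lang wins over its ancestors, which in turn win over the document-level setting.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Tag:
    """Hypothetical stand-in for a PDF structure element (not a real PDF API)."""
    name: str
    lang: Optional[str] = None               # explicit lang on this tag, if any
    children: List["Tag"] = field(default_factory=list)


def effective_langs(tag: Tag, inherited: str,
                    out: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Map each tag name to the language it ends up with.

    Assumption: a tag's own lang wins; otherwise it inherits from the nearest
    ancestor, and finally from the document-level (catalog) language.
    """
    if out is None:
        out = {}
    current = tag.lang or inherited
    out[tag.name] = current
    for child in tag.children:
        effective_langs(child, current, out)
    return out


# The exporter believed the document was EN-US, so the French paragraph is
# tagged explicitly and the English paragraph carries no lang of its own.
doc = Tag("Document", "en-US", [
    Tag("P-french", "fr-FR"),
    Tag("P-english"),
])

catalog_lang = "fr-FR"                        # step 1: set in document properties
before = effective_langs(doc, catalog_lang)   # <Document> still says en-US

doc.lang = None                               # step 2: delete lang from <Document>
after = effective_langs(doc, catalog_lang)

print(before)
print(after)
```

In this model, step 1 alone changes nothing for the paragraphs because the <Document> tag still supplies EN-US; after step 2 the untagged English paragraph inherits French along with everything else, which is the side effect described in the quoted steps.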

> Are you thinking about a different use case that is more problematic?


I'm thinking generically of the situation in which users try to manage lang settings post-creation but run into collisions when lower nodes frustrate their intent (and the implied capability of the software).

One use case would be this: users can and will, consciously or otherwise, write words, sentences or paragraphs of text in languages other than the one for which their application is currently set. It's a pretty standard-issue mistake. When I "mark selected text" as German in MS Word, I don't see any visual indication. I could easily forget...

In such cases the PDF gets made with incorrect lang settings. Depending on the creating application, these may be found in the content, the tags, or both.

When it comes down to it, it's the intent of the Acrobat user editing the tags in the PDF that needs to be accommodated in Acrobat. We have to assume that they are trying to fix the file. That means, essentially, that lang changes made to tags in Acrobat should (optionally) warn of conflicts "below" and offer to harmonize them. The Acrobat user could then "clean the slate", with the next move being to come in and set the lang on this or that tag as per their judgement.
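
As a rough illustration of the "warn of conflicts below and offer to harmonize" behaviour described above, here is a small Python sketch. Everything in it is an assumption made for illustration: Tag, conflicts_below, clear_below and set_lang are hypothetical names, not Acrobat features or calls from a real PDF library, and "harmonize" is taken to mean clearing the explicit lang on every lower tag so that it inherits the new value.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Tag:
    """Hypothetical stand-in for a PDF structure element (not a real PDF API)."""
    name: str
    lang: Optional[str] = None
    children: List["Tag"] = field(default_factory=list)


def conflicts_below(tag: Tag, new_lang: str) -> List[Tuple[str, str]]:
    """Return (name, lang) for each descendant whose explicit lang differs."""
    found: List[Tuple[str, str]] = []
    for child in tag.children:
        if child.lang is not None and child.lang.lower() != new_lang.lower():
            found.append((child.name, child.lang))
        found.extend(conflicts_below(child, new_lang))
    return found


def clear_below(tag: Tag) -> None:
    """Remove explicit lang from every descendant ("clean the slate")."""
    for child in tag.children:
        child.lang = None
        clear_below(child)


def set_lang(tag: Tag, new_lang: str, harmonize: bool = False) -> List[Tuple[str, str]]:
    """Set lang on a tag, report conflicts below, and optionally clear them."""
    conflicts = conflicts_below(tag, new_lang)
    tag.lang = new_lang
    if harmonize:
        clear_below(tag)
    return conflicts


# A section whose lower tags carry stale or mistaken lang settings.
sect = Tag("Sect", "en-US", [
    Tag("P1", "en-US", [Tag("Span", "en-US")]),
    Tag("P2", "de-DE"),
    Tag("P3"),
])

warnings = set_lang(sect, "fr-FR", harmonize=True)
print(warnings)
```

The returned list is what an editor could show as the warning; with harmonize left off, the lower tags keep their settings and continue to override the new value. After harmonizing, the user can set the lang again on whichever individual tags really are in another language.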

Greedy for fancy software as I am, I'd also want the ability to navigate the PDF based on lang settings in tags or content to preview the output file from that perspective, but that's wishful thinking, I'm sure!

Duff.