WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Longdesc and PDF (was:HTML5 Image Description Extension (longdesc) is Proposed Recommendation)


From: Olaf Drümmer
Date: Dec 7, 2014 5:29AM

Hi Bevy,

thanks a lot for your detailed reply!

A clarification before we look at some of the details: with links I meant PDF-internal links; I fully agree that the PDF file at hand must remain self-contained.

One thing that scares me:
> their agencies have interpreted the US federal law requiring
> electronic versions of all printed documents as meaning that the electronic
> version must match the printed version

From my point of view there should be no legal difference between additional information encoded via a longdesc-like mechanism (it's there, but not visible by default) and the same information provided in an annex (it's there and it is visible by default). Do people with disabilities not have the right to the exact same quality of information as everybody else? Do they have to accept a non-official version of the official document? To me it looks like either those agencies have to fix their understanding of US federal law, or the law has to be fixed.

Also, having an annex in a PDF file with additional explanatory text for complex graphics and other such content (and links in the main content that point to it) has the following advantages:
- it doesn't get in the way of readers who do not need it
- it is always there, and is more likely to undergo various quality assurance mechanisms (information not visible by default, like Alt text, has a much higher rate of spelling errors, grammatical errors, inconsistency between entries, incorrect status/not updated after a change to content, and much more; if Alt were forced to be visible all the time for everyone, that rate of errors would likely go down to that of the rest of the document)
- it's not a special mechanism just for people with disabilities, rather, it is represented using the same mechanism available and used for "ordinary" content, and for every reader of the content
- it also does not need a special tool or feature: one can create it in most document creation tools, whether Word, InDesign or something else; once someone understands the idea, it can be executed without further ado
- having content fully available by default in a fashion that is natural for the technology (HTML, or PDF, or… ) used, will have an educational impact - more people will recognise and understand it exists, and might consider providing for such additional content in their own documents (or documents they have others create for them); and all that without sending them a teacher / accessibility trainer
- why not allow access to the additional explanatory information even with very simple PDF presenting tools, like a printer, or a cheap PDF viewer on a low cost ebook device (which likely will not offer much in terms of accessibility features)? If it's there in the same way as any other content, it will always be there for everyone to consume at their discretion. Please keep in mind, such additional information is not only for people with very low or no vision.
- there is also one aspect that might hit PDF more than HTML: the only way in PDF to express content in a rich way is by using … PDF (as opposed to a plain text string; already Alt [and this applies to HTML as much as to PDF] has no mechanism to enrich the plain text - no tags/no substructure, no attributes, no features like links leading to even more information or cross references etc.); so a longdesc attribute in the form of a plain text string is not up to the task

Now, if someone were to insist that a longdesc equivalent in the form of a plain text string must be introduced into PDF - the mechanism is already there. It is called XMP metadata. XMP metadata provides a very rich mechanism to add information (which would not normally be visible) to visible page content. It can be associated with arbitrary objects, and with tags in the tagged PDF structure. Here a range of predefined metadata fields could be used, like dc:description (Dublin Core metadata field for description) or tiff:ImageDescription (EXIF metadata field for image description). It still wouldn't offer the richness of PDF (or HTML) content.
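To illustrate the XMP route, here is a minimal Python sketch (standard library only) that builds an XMP fragment carrying a long description in dc:description. The function name build_description_xmp and the sample text are made up for illustration; the sketch leaves out the xpacket wrapper, and it does not show how the result would be attached to a figure or structure element inside an actual PDF file, which depends on the PDF tool in use.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by XMP, RDF and Dublin Core.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Register prefixes so the serialized XML uses x:, rdf: and dc:.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, local):
    """Return the fully qualified ElementTree name for prefix:local."""
    return "{%s}%s" % (NS[prefix], local)


def build_description_xmp(text, lang="x-default"):
    """Hypothetical helper: wrap `text` in an XMP dc:description entry.

    dc:description is a language alternative in XMP, so the text goes
    into an rdf:Alt container with one rdf:li per language.
    """
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    dc_desc = ET.SubElement(desc, q("dc", "description"))
    alt = ET.SubElement(dc_desc, q("rdf", "Alt"))
    li = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: lang})
    li.text = text
    return ET.tostring(xmpmeta, encoding="unicode")


if __name__ == "__main__":
    # Sample text is invented for this example.
    print(build_description_xmp("Bar chart comparing four quarterly totals."))
```

Note that the description remains a plain text string: no substructure, no links, no cross references, which is exactly the limitation mentioned above.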

That said, I would like to urge everyone to include additional explanations for graphical content in the same manner as the rest of the content: rendered by default and expressed in the same language (HTML for HTML, PDF for PDF) as the rest of the content, rather than falling back to [1] just text that is [2] not normally visible/displayed/rendered.


On 7 Dec 2014, at 04:55, Chagnon | PubCom < <EMAIL REMOVED> > wrote:

> Olaf wrote: "exactly what would you envision in for PDFs in this regard?"
> I envision a defined feature, tag, attribute, or something else that would
> allow a long detailed description to be available to assistive technologies
> within the PDF, not outside it. It also needs to be easy to deploy for those
> who create documents (see the end of my comments for more).
> Olaf wrote: "Why not use a link?"
> Because links outside the PDF (such as to specific URL webpages) break over
> the lifetime of the PDF file. In large institutional environments
> (government agencies, corporations, major publishers), it's difficult,
> first, to define a specific URL while the document is being created (I.T.
> generally doesn't like specifying this before the document is finalized),
> and second, ensure that the link will still be working a year or more down
> the road.
> Plus, PDFs are often downloaded to individual computers and we can never
> assume that the end user will have a live internet connection when reading
> the document to visit a specified URL for the longdesc.
> Links inside the PDF would have to link to pages that are added to the PDF
> but are not in the printed version of the document. An example: a client's
> 360-page statistical document has approximately 100 statistical, complex,
> graphical charts. The printed version of the document is OK and doesn't need
> anything like longdesc. However the matching electronic PDF version would
> now have to have an additional 30-50 pages added at the back of the book to
> accommodate longdesc descriptions for those charts, sort of like a special
> appendix of longdesc descriptions. This produces an electronic PDF version
> that isn't identical to the printed hardcopy, and that gets my clients in a
> tizzy; their agencies have interpreted the US federal law requiring
> electronic versions of all printed documents as meaning that the electronic
> version must match the printed version. So an electronic version with an
> extra appendix doesn't meet their federal requirements.
> Olaf wrote:
> "The longdesc attribute as defined for HTML runs the risk of hiding the
> longdesc-referenced content from certain groups of users. ... What I do know
> for sure is that due to the reduced discoverability of longdesc it is
> inferior to simply using a link."
> Why is longdesc less discoverable or hidden from certain users?
> Is that the fault of the attribute itself, the W3C definition of longdesc,
> or the lack of programming by assistive technology manufacturers?
> If I recall correctly, the H1 tag wasn't discoverable by screen readers way
> back when. But over time, A.T. manufacturers have improved their software to
> discover H1 and many more tags and conventions that are now our standards.
> Thank goodness we didn't throw out heading tags back then just because
> assistive technologies wouldn't deal with them!
> What prevents longdesc from becoming a recommended
> standard/procedure/technique that can be used by those who create documents,
> as well as for A.T. manufacturers to develop functionality for their users?
> What's needed:
> I don't care whether it's longdesc or some other system of tag and/or
> attribute, but we need a method where the people who create this type of
> visually complex graphical data (e.g., charts, maps, flow charts, plans,
> technical drawings, etc.) can easily write a detailed explanation of their
> graphics.
> We need the author of the document to create this, not the accessibility
> technician who follows him, nor the web team, nor the editors, nor the
> graphic designers, nor anyone else in a typical publishing workflow who
> completes the document after the author is done.
> This task has to be done by the author. So right there, that leaves out
> links - internal PDF links or external website URL links - because the
> authors can't code them at their stage of a publication's workflow. This
> type of document usually takes a year or longer to write and I.T. can't
> determine where it will live on the website at that point.
> So a feature/utility as easy to use as Alt-Text in MS Word would be ideal:
> - Right-click and write the Alt-text for the graphic.
> - Right-click again and write the long description for the graphic.
> If it's attached to the graphic inside the document (such as an attribute on
> the <figure> tag like Alt-Text), then it stays with the graphic as the
> document is edited and reflowed, and it will carry through from MS Word into
> the exported PDF. Or if the content is converted to HTML it will travel with
> it. Or if the Word file is imported into InDesign for desktop publishing
> layout, it can carry through into that layout and its exported PDF. And if
> the document is stored in a content management system (CMS), that it travels
> with that version, too. Or when the document must conform to established
> publishing DTDs, schemas, and standards (such as PubMed's DTD for the NLM
> database).
> The functionality has to begin at the author stage and then travel through
> the half-dozen or more workflow stages of publishing, which is much greater
> than just Internet distribution alone.
> -BJC
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - -
> www.PubCom.com - Trainers, Consultants, Designers, Developers.
> Print, Web, Acrobat, XML, eBooks, and U.S. Federal Section 508
> Accessibility.
> Take a Sec. 508 Class in 2014 - www.Pubcom.com/classes