WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Longdesc and PDF (was: HTML5 Image Description Extension (longdesc) is Proposed Recommendation)

From: Chagnon | PubCom
Date: Dec 7, 2014 12:13PM


Thanks Olaf.
I agree with just about everything you said. (Wow, we've reached consensus
on something! grin)

Just to clarify: the US federal law about providing electronic versions of
all printed documents precedes Section 508 regulations. It was passed
sometime in the 1990s and was intended to provide electronic archival
versions of all published federal material. Essentially, the printed volumes
were deteriorating, and the original source files (word processing, desktop
publishing, etc.) were getting lost on a bazillion agency file servers.

So in that respect, that particular electronic version should match the
printed version, but I'm not a die-hard on this.

I have one concern that makes me hesitate about adding an appendix of
longdesc; in the sample I'm using (300+ pages with 100+ statistical charts)
it will add about 30-50 more pages to the printed document. My agencies
don't have the budget to cover those additional pages, so the idea has been
nixed by upper management. No additional money can be spent.

One option that has worked for some of these charts is to put a table with
the chart's data just below the graphical chart. This gives all readers,
disabled or not, access to the hard data and in a format that's navigable by
A.T. But again, that adds to the printed page count and therefore the cost
of printing and distributing.

XMP metadata might be the most efficient way to go, but heck, we don't even
have a predefined field for regular Alt-text, let alone longdesc!

Is there a chance that our WAI listmembers and Adobe's Andrew K can take
these ideas regarding XMP metadata back to their camps? I'm told these
fields can't be added to their programs until the fields are standardized at
the international level. I'm under the impression that means ISO and
WAI/W3C, correct?

We need something to fix this shortcoming fairly quickly.

—Bevi
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - - - - - - - - - - -
www.PubCom.com — Trainers, Consultants, Designers, Developers.
Print, Web, Acrobat, XML, eBooks, and U.S. Federal Section 508
Accessibility.
Take a Sec. 508 Class in 2014 — www.Pubcom.com/classes

-----Original Message-----
From: <EMAIL REMOVED>
[mailto: <EMAIL REMOVED> ] On Behalf Of Olaf Drümmer
Sent: Sunday, December 07, 2014 7:30 AM
To: WebAIM Discussion List
Subject: [WebAIM] Longdesc and PDF (was: Re: HTML5 Image Description
Extension (longdesc) is Proposed Recommendation)

Hi Bevi,

thanks a lot for your detailed reply!

A clarification before we look at some of the details: with links I meant
PDF-internal links; I fully agree that the PDF file at hand must remain
self-contained.

One thing that scares me:
> their agencies have interpreted the US federal law requiring
> electronic versions of all printed documents as meaning that the
> electronic version must match the printed version


From my point of view there should not be a legal difference between
additional information encoded via a longdesc-like mechanism (it's there,
but not visible by default) and the same information in an annex (it's there
and it is visible by default). Do people with disabilities not have the right to the
exact same quality of information as everybody else? Do they have to accept
a non-official version of the official document? To me it looks like those
agencies either have to fix their understanding of US federal law, or the
law has to be fixed.

Also, having an annex in a PDF file with additional explanatory text for
complex graphics and other such content (and links in the main content that
point to it) has the following advantages:
- it doesn't get in the way of readers who do not need it
- it is always there, and is more likely to undergo various quality
assurance mechanisms (information not visible by default, like Alt text, has
a much higher rate of spelling errors, grammatical errors, inconsistency
between entries, incorrect status/not updated after a change to content, and
much more; if Alt were forced to be visible all the time for everyone, that
rate of errors would likely go down to that of the rest of the document)
- it's not a special mechanism just for people with disabilities, rather, it
is represented using the same mechanism available and used for "ordinary"
content, and for every reader of the content
- it also does not need a special tool, one can create it in most document
creation tools, whether Word, InDesign or something else; once someone
understands the idea, it can be executed without further ado; no special
tools or features required.
- having content fully available by default in a fashion that is natural for
the technology (HTML, or PDF, or… ) used, will have an educational impact -
more people will recognise and understand it exists, and might consider
providing for such additional content in their own documents (or documents
they have others create for them); and all that without sending them a
teacher / accessibility trainer
- why not allow access to the additional explanatory information even with
very simple PDF presenting tools, like a printer, or a cheap PDF viewer on a
low cost ebook device (which likely will not offer much in terms of
accessibility features). If it's there in the same way as any other
content, it will always be there for everyone to consume at their
discretion. Please keep in mind, such additional information is not only for
people with very low or no vision.
- there is also one aspect that might hit PDF more than HTML: the only way
in PDF to express content in a rich way is by using … PDF (as opposed to a
plain text string; already Alt [and this applies to HTML as much as to PDF]
has no mechanism to enrich the plain text - no tags/no substructure, no
attributes, no features like links leading to even more information or cross
references etc.); so a longdesc attribute in the form of a plain text
string is not up to the task

Now, if someone were to insist that a longdesc equivalent in the form of a
plain text string must be introduced into PDF - the mechanism is already
there. It is called XMP metadata. XMP metadata provide for a very rich
mechanism to add information (which would not normally be visible) to
visible page content. It can be associated with arbitrary objects, and with
tags in the tagged PDF structure. Here a range of predefined metadata fields
could be used, like dc:description (Dublin Core metadata field for
description) or tiff:ImageDescription (EXIF metadata field for image
description). It still wouldn't offer the richness of PDF (or HTML) content.
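
As a rough illustration of what such a plain-text description could look like in XMP, here is a minimal Python sketch that builds an XMP packet body with a dc:description entry. The function name and the sample chart text are made up for this example. How the resulting metadata would be attached to a particular figure or structure element inside the PDF is not shown and would depend on the authoring tool.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by XMP, RDF and Dublin Core.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, name):
    """Return the namespace-qualified tag name ElementTree expects."""
    return "{%s}%s" % (NS[prefix], name)


def build_description_packet(long_description, lang="x-default"):
    """Hypothetical helper: wrap a plain-text description in dc:description.

    dc:description is a language alternative in XMP, so the text goes
    into an rdf:Alt container with one rdf:li per language.
    """
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    dc_desc = ET.SubElement(desc, q("dc", "description"))
    alt = ET.SubElement(dc_desc, q("rdf", "Alt"))
    li = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: lang})
    li.text = long_description
    return ET.tostring(xmpmeta, encoding="unicode")


if __name__ == "__main__":
    # Sample text is invented for illustration only.
    print(build_description_packet(
        "Bar chart comparing annual totals for four regions."
    ))
```

Note that the description ends up as a single text string per language, which is exactly the limitation described above: no tags, no substructure, no links.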

That much said, I would like to urge everyone to include additional
explanations for graphical content in the same manner as the rest of the
content: rendered by default, expressed in the same language (HTML for HTML,
PDF for PDF) as the rest of the content, and not fall back to [1] just text
that is [2] not normally visible/displayed/rendered.



Olaf



On 7 Dec 2014, at 04:55, Chagnon | PubCom < <EMAIL REMOVED> > wrote:

> Olaf wrote: "exactly what would you envision in for PDFs in this regard?"
> I envision a defined feature, tag, attribute, or something else that
> would allow a long detailed description to be available to assistive
> technologies within the PDF, not outside it. It also needs to be easy
> to deploy for those who create documents (see the end of my comments
> for more).
>
> Olaf wrote: "Why not use a link?"
> Because links outside the PDF (such as to specific URL webpages) break
> over the lifetime of the PDF file. In large institutional environments
> (government agencies, corporations, major publishers), it's difficult,
> first, to define a specific URL while the document is being created (I.T.
> generally doesn't like specifying this before the document is
> finalized), and second, ensure that the link will still be working a
> year or more down the road.
>
> Plus, PDFs are often downloaded to individual computers and we can
> never assume that the end user will have a live internet connection
> when reading the document to visit a specified URL for the longdesc.
>
> Links inside the PDF would have to link to pages that are added to the
> PDF but are not in the printed version of the document. An example: a
> client's 360-page statistical document has approximately 100
> statistical, complex, graphical charts. The printed version of the
> document is OK and doesn't need anything like longdesc. However the
> matching electronic PDF version would now have to have an additional
> 30-50 pages added at the back of the book to accommodate longdesc
> descriptions for those charts, sort of like a special appendix of
> longdesc descriptions. This produces an electronic PDF version that
> isn't identical to the printed hardcopy, and that gets my clients in a
> tizzy; their agencies have interpreted the US federal law requiring
> electronic versions of all printed documents as meaning that the
> electronic version must match the printed version. So an electronic
> version with an extra appendix doesn't meet their federal requirements.
>
> Olaf wrote:
> "The longdesc attribute as defined for HTML runs the risk of hiding
> the longdesc-referenced content from certain groups of users. ... What
> I do know for sure is that due to the reduced discoverability of
> longdesc it is inferior to simply using a link."
>
> Why is longdesc less discoverable or hidden from certain users?
> Is that the fault of the attribute itself, the W3C definition of
> longdesc, or the lack of programming by assistive technology
> manufacturers?
>
> If I recall correctly, the H1 tag wasn't discoverable by screen
> readers way back when. But over time, A.T. manufacturers have improved
> their software to discover H1 and many more tags and conventions that are
> now our standards.
> Thank goodness we didn't throw out heading tags back then just because
> assistive technologies wouldn't deal with them!
>
> What prevents longdesc from becoming a recommended
> standard/procedure/technique that can be used by those who create
> documents, as well as for A.T. manufacturers to develop functionality for
> their users?
>
> What's needed:
> I don't care whether it's longdesc or some other system of tag and/or
> attribute, but we need a method where the people who create this type
> of visually complex graphical data (e.g., charts, maps, flow charts,
> plans, technical drawings, etc.) can easily write a detailed
> explanation of their graphics.
>
> We need the author of the document to create this, not the
> accessibility technician who follows him, nor the web team, nor the
> editors, nor the graphic designers, nor anyone else in a typical
> publishing workflow who completes the document after the author is done.
>
> This task has to be done by the author. So right there, that leaves
> out links - internal PDF links or external website URL links - because
> the authors can't code them at their stage of a publication's
> workflow. This type of document usually takes a year or longer to
> write and I.T. can't determine where it will live on the website at that
> point.
>
> So a feature/utility as easy to use as Alt-Text in MS Word would be ideal:
> - Right-click and write the Alt-text for the graphic.
> - Right-click again and write the long description for the graphic.
>
> If it's attached to the graphic inside the document (such as an
> attribute on the <figure> tag like Alt-Text), then it stays with the
> graphic as the document is edited and reflowed, and it will carry
> through from MS Word into the exported PDF. Or if the content is
> converted to HTML it will travel with it. Or if the Word file is
> imported into InDesign for desktop publishing layout, it can carry
> through into that layout and its exported PDF. And if the document is
> stored in a content management system (CMS), it travels with that
> version, too. Or when the document must conform to established
> publishing DTDs, schemas, and standards (such as PubMed's DTD for the NLM
> database).
>
> The functionality has to begin at the author stage and then travel
> through the half-dozen or more workflow stages of publishing, which is
> much greater than just Internet distribution alone.
>
> -BJC
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - -
> - - - - - - - - - - - - - - -
> www.PubCom.com - Trainers, Consultants, Designers, Developers.
> Print, Web, Acrobat, XML, eBooks, and U.S. Federal Section 508
> Accessibility.
> Take a Sec. 508 Class in 2014 - www.Pubcom.com/classes