WebAIM - Web Accessibility In Mind

E-mail List Archives



From: Cliff Tyllick
Date: Sep 24, 2010 6:00PM

I apologize in advance for not including prior messages in this comment, but I would like to take this discussion in a different direction -- that is, to explain my perspective, childish though it may be ;-), on the proper role of an attribute within HTML5 that enables a connection to a description of an image. This will be a bit of a trip, but if you are invested in the discussion of whether to include longdesc in HTML5, stick with me.

People who use screen readers have told me that they need images to be described to them if they are to interact successfully with the sighted world. Perhaps the issue is as trivial as not having to ask for help to be able to retrieve the picture of the boss in the purple shirt, not the pink one, from the intranet. Still, for various reasons they need to be able either to find out how to describe an image or to find an image from its description.

Other people who use screen readers have told me that they need to know only the intended meaning of each image -- they insist, "Don't clutter up my ears with a description of meaningless pictures!"

In short, the debate in HTML4 and XHTML was whether we should do this:

or this:
alt="our dear leader in his regal purple shirt"

So we have two different needs, but within HTML5 we have only one attribute for accommodating those needs -- alt, which, as I understand it, is now explicitly denoted as the place to convey the meaning of the image, not the place to describe it.

In HTML4 and XHTML, we did have a place to describe the image -- the longdesc attribute -- but, as I understand it, there was no distinction other than length between the intended purpose of alt and the intended purpose of longdesc. Both were meant for the description of meaningful images; longdesc was to be used when that description was too long for alt.

When might that description get too long for alt? Well, one example would be when the image is a detailed graph or a complicated chart.

So far, how often are those kinds of images presented in html? In my experience, less frequently than we see longdesc used. Usually such images would be in documents that started out in a word processor or publishing package, were converted to PDF, and were then loaded to the Web in that format. In some cases, the source file for the PDF was also loaded to the Web.

And where, pray tell, do you put longdesc in a Word document, a PageMaker file, or a PDF? You don't. Those applications have no place to store a longdesc attribute. So the engines for generating the content that includes the kinds of images that most need a lengthy description afford no way to add that attribute to each image. Small wonder that people who insist on having examples of where longdesc is used find few, if any, that satisfy their skepticism.

And what about the other end of the communication pathway? As Josh from Ireland pointed out, user agents have not come up to speed with interpreting longdesc when it *is* there.

So this raises another problem with getting longdesc used: How does content get put on the Web?

Am I the only person involved in this discussion who works in a setting where the people who create the content would not be able to function without WYSIWYG html editors? Our agency has scores of Web developers. But except for a very few, the work these folks do as Web developers doesn't fit into their job description at all. It's one of those "other duties as assigned," so it never really enters into their performance appraisal.

We have relatively few employees who can stay up to date on accessibility and coding conventions. There's no way we can review, let alone fix, every content item loaded to our website. So it has been part of our job to educate, cajole, and wheedle these scores of developers into producing valid code and accessible content. We've made a lot of progress, even though we still have far to go.

But we would have gotten nowhere if we hadn't picked our battles and focused on the issues that made the biggest impact -- structure, plain language, valid (x)html, and proper use of the alt attribute.

In the grand scheme of things, longdesc -- its purpose poorly characterized in the specification; its need rare in content presented in html; the ability to add it to images that needed it nonexistent; the support for it by screen readers absent -- got left out.

We hadn't gotten there yet. We were waiting for the tools to become available and for the production of content to shift to html-first and away from word processor + PDF only.

How could we use longdesc for those images? Well, beyond the obvious purpose of affording a lengthy explanation of the image to people who can't see it, longdesc could provide a very real benefit to everyone who needs to know the finest detail behind a presentation. If people who could see could easily get to the lengthy description, authors could use it to go into the arcane arguments and rationalizations that are behind their illustrations, but that most readers don't even want to know about:
- Why did you find it valid to suppress the zero on this graph?
- Why is this histogram a valid presentation of this data set?
- Why is the 33rd percentile income, and not the median income, the key value?

In other words, now that you've clearly described the construction of this image, tell me why this presentation is valid. People who can't see would benefit from at least the first part. People who can see would benefit from the second part, but also would sometimes benefit from the first. Have you never seen something in an image for the first time after the person who created it explained it to you? And have you never discovered something in an image that the person who created it failed to recognize?

A place to link to descriptions of meaningful images -- descriptions beyond the level of detail that would typically be given in a caption or the accompanying article -- would give us richer capabilities in presenting, analyzing, interpreting, and understanding the meaning behind those images.

But we don't have that in HTML5 any more.

And so what's the solution? I've seen many proposed, but not one is native to HTML. And that means that if I am going to get my content creators to do it, first I am going to have to get them to learn something that is called by another name.

But they're already feeling like they've learned far more than they should have to know under their job description. And in many cases, their supervisors agree. Heck, even I agree! So what are my chances of getting them to learn something called ARIA?

None. It won't happen.

That is what I work with every day. And I'll bet it describes the websites of many, many people who don't have the time to follow, let alone participate in, long-winded discussions of where we should take HTML from here.

But you've let me digress. I started out talking about the needs of people who rely on screen readers, and here you've let me ramble on and on about the realities of distributed Web development.

What was it the people who rely on screen readers need to know about images online?

Sometimes, if an image means anything, they need to know what it means. For this, we have alt text.

Sometimes they need to have an idea how a person who can see the image would describe it. For this, we have a patchwork of partial solutions.

But we could have solved it simply and completely in HTML5 with the right specification:
- Use alt to convey succinctly the meaning you expect a person who can see this image to get from it. If the image is to convey no meaning, leave this attribute empty.
- Use longdesc to link to a description of the image, even if the image is to convey no meaning. When the user requests, the user agent -- including browsers -- should open this description without leaving the current position in the document. Closing the description should return the user to the same location.

So we would have instances like these:

alt="stop" longdesc="[link to 'red octagon']"

alt="" longdesc="[link to description of a calming pastoral scene]"

alt="earnings have doubled from first quarter (Q1) to third quarter (Q3)" longdesc="[link to description of the graph followed by an explanation of why Q1 is being compared to Q3 and not to Q1 of the previous fiscal year]"
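
Written out as complete img elements, the three instances above might look roughly like the sketch below. It assumes the specification proposed here, not HTML5 as it stands, and it follows the HTML4 convention that the value of longdesc is a link to the description rather than the description itself. The image file names, the descriptions.html page, and the fragment identifiers are all hypothetical, chosen only for illustration.

```html
<!-- Sketch only: every file name and fragment identifier below is hypothetical. -->

<!-- Meaningful image: alt carries the meaning ("stop");
     longdesc links to what it looks like (a red octagon). -->
<img src="stop-sign.png"
     alt="stop"
     longdesc="descriptions.html#stop-sign">

<!-- Image that conveys no meaning: alt is left empty,
     but a description is still available to anyone who asks for it. -->
<img src="meadow.jpg"
     alt=""
     longdesc="descriptions.html#meadow">

<!-- Chart: alt states the takeaway; longdesc links to the description
     of the graph and the explanation of why Q1 is compared to Q3. -->
<img src="earnings-chart.png"
     alt="earnings have doubled from first quarter (Q1) to third quarter (Q3)"
     longdesc="descriptions.html#earnings-chart">
```

Keeping all the descriptions on one page with fragment identifiers is just one possible arrangement; a separate page per image would serve the same purpose.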

People who feel the need to know what images look like would have a reliable location to look for that information. People who never want to be bothered with that information would never be troubled by it.

And if we had such an attribute built into HTML5, I wouldn't care if we called it "longdesc" or, so long as longdesc continues to be supported as a deprecated attribute, just "desc" or "imgdesc" or some other name. The important point is that we have two distinctly different communication needs and in HTML5 as it exists today we are accommodating only one of them. We need to accommodate both.

To accommodate both needs anticipates and prepares for the future. To accommodate only the need that has been accommodated thus far is to be forever looking backwards.

That's my personal opinion, based on real experience.