Semantic automation occurs when user agents, such as browsers and screen readers, create meaning and relationships where the presented meaning and relationships are missing, ambiguous, or incorrect. In short, it’s applying algorithms to try to fix things that are probably broken. It’s computers guessing for good.
As an example in the accessibility realm, if a form control does not have an associated label, the JAWS and VoiceOver screen readers implement algorithms to auto-associate adjacent text with the control. In short, they guess what the label probably is. While this can improve the user experience in many cases, this semantic automation often fails. Even a line break or spanned text can break the current algorithms. And worse, an incorrect label for a control might be read when the layout is complex or differs from the norm (such as when labels for checkboxes are placed to the left of the checkboxes).
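To make the failure mode concrete, consider a hypothetical fragment like this (the field names and text are invented). Nothing here programmatically ties the visible text to either control, so a screen reader must guess from proximity, and for the checkbox the visible label sits to its left:

```html
<!-- No <label> elements: a screen reader must guess which text
     belongs to which control. -->
<p>Email address</p>
<input type="text" name="email">

<!-- The visible label is to the LEFT of the checkbox; a proximity-based
     algorithm may instead announce the unrelated text that follows it. -->
Subscribe to newsletter <input type="checkbox" name="subscribe"> (one message per week)
```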
When computers guess, the results are often not very good. But guessing is usually better than nothing.
Automation and Evaluation
I switched my primary evaluation platform from JAWS to VoiceOver some time ago because, until recently, VoiceOver did not implement semantic automation. It was very literal. If a text box was not properly labeled, it simply identified the presence of the text box, even if there was descriptive text next to it. With the release of iOS 5 and Lion, VoiceOver now auto-associates adjacent text. When done correctly, this can be very helpful to users, but for evaluation there’s no way to know whether label text is actually associated or whether VoiceOver or JAWS is just assuming it should be. And there’s no option to disable this functionality.
This creates a situation where screen reader evaluation and even user testing may not accurately reveal underlying accessibility issues. But this raises the question: if the user agent fixes the issue most of the time, is it really an issue at all?
Automation and Conformance
To conform to the Web Content Accessibility Guidelines (WCAG) 2.0, authors must implement accessibility themselves. The guidelines don’t address or allow semantic automation. But what if they did? Most of the impactful success criteria could be automated by user agents to some extent:
- 1.1.1 – Alternative text: Image analysis could be performed to determine the content or description of an image.
- 1.2 – Captions and transcripts: Audio recognition could be done to auto-generate a transcript and captions, similar to YouTube’s automatic captioning functionality.
- 1.3.1 – Information and relationships: Headings could be assumed based on text size, length, and location. Form labels could be auto-associated. Table headers could be assumed based on styling and table structure. Lists could be auto-generated when numbers, bullets, or other sequential markers are used.
- 1.3.2 – Meaningful sequence: The reading and navigation order of content could be based on the visual layout, rather than the underlying markup.
- 1.4.1 – Color, 1.4.3 – Contrast: Browsers could automatically replace colors or increase contrast if they don’t meet certain thresholds.
- 1.4.5 – Images of text: Character recognition could be implemented to replace images of text with true text.
- 2.4.1 – Bypass blocks: A user agent could analyze the document and define navigable page areas based on structure and visual presentation. VoiceOver does this now with auto-webspots.
- 2.4.4 – Link purpose: Screen readers could analyze link context to turn “Click here” into meaningful, descriptive text.
- 3.1.1 and 3.1.2 – Language of Page and Parts: The computer could determine the language of content automatically, or even translate it.
And there’s more. These types of semantic automation would all be very beneficial to users with disabilities, but they will never be as good as authors just doing it right, as in the markup sketched below.
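For contrast, here is a hypothetical fragment (the content is invented) showing the explicit semantics that make several of the guesses above unnecessary. Each line supplies programmatically what a user agent would otherwise have to infer:

```html
<html lang="en">  <!-- 3.1.1: language of page declared, not detected -->
...
<img src="chart.png" alt="Sales rose 40% in 2011">  <!-- 1.1.1: real alternative text -->
<h2>Quarterly Results</h2>  <!-- 1.3.1: a true heading, not just large styled text -->
<table>
  <tr><th scope="col">Quarter</th><th scope="col">Revenue</th></tr>  <!-- 1.3.1: programmatic table headers -->
  <tr><td>Q1</td><td>$10,000</td></tr>
</table>
<a href="report.html">Read the full quarterly report</a>  <!-- 2.4.4: descriptive link text -->
<p lang="fr">Bonjour tout le monde</p>  <!-- 3.1.2: language of parts -->
```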
Defining the boundary between what the page author intends and what the browser can automatically do for the user is difficult. Should a screen reader automatically perform image analysis on an image that is missing alternative text? Should it do so even though this would present incorrect content much of the time? How would a user know whether the screen reader is presenting true page semantics or automated semantics? How can the algorithms be improved to avoid spectacular failures in semantic automation? And so on.
Then Why Bother?
If screen readers automatically and correctly associate labels for 95% of form controls, why bother using label elements? If computers can usually determine table headers, heading structure, video transcripts, and the like, is it worth the effort to do it on my own? Of course, the only way to ensure that accessibility is done right is for authors to do it right. Semantic automation will never be perfect, yet because accessibility is about the human experience, it’s the obligation of the assistive technology to provide the best experience, regardless of the page’s accessibility or lack thereof.
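And doing it right is often trivial. A minimal sketch (the field name is invented): one for/id pairing makes the association programmatic, so no screen reader ever has to guess:

```html
<!-- Explicitly associated: every screen reader announces "Email address"
     for this field, with no proximity guessing involved. -->
<label for="email">Email address</label>
<input type="text" id="email" name="email">
```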
The question, then, is what will ultimately lead to optimal accessibility: avoiding semantic automation so that authors are more motivated, and required, to do it right, or implementing eternally-less-than-perfect semantic automation with the knowledge that authors might never bother to do it right? As with most things in accessibility, the answer is probably somewhere in the middle. What do you think?