Gian Sampson-Wild’s A List Apart article provides an excellent overview of WCAG 2.0 testability issues. As with Joe Clark’s article and many others before it, it’s difficult to view the article as entirely objective when the author is clearly carrying a grudge against the WCAG working group. Despite my inherent suspicions, Gian provides several strong arguments for removing or modifying testability in WCAG 2.0. To summarize, Gian argues that the WCAG working group’s requirement that all success criteria be testable has resulted in less-than-optimal guidelines.
Benefits of testability
Limiting the guidelines to only those that are measurable and testable has some advantages. It allows WCAG 2.0 to be adopted or adapted in broader realms, particularly in legal arenas where it could arguably have the broadest impact. In many cases, it makes determining conformance much easier. It also allows conformance to be determined remotely and in a more (though by no means entirely) automated manner.
On the other hand, Gian argues that because of the testability requirement, many accessibility techniques are not in WCAG 2.0, despite the fact that they would increase accessibility. Removing the testability constraint would allow recommendations for increasing access for those with some cognitive disabilities – these recommendations would almost certainly be untestable.
She also argues that many of the existing guidelines and success criteria are not very testable as is. I agree. The first and perhaps broadest success criterion (1.1.1) requires text alternatives that present “equivalent information”. But what is “equivalent information” and how would you ever test this? WCAG 2.0 defines something as testable if a machine can clearly answer “yes” or “no” to the test or if “at least 80% of knowledgeable human evaluators would agree on the conclusion.” 80% seems to be a totally arbitrary percentage. And how do you even define whether a conclusion is agreed upon? There is essentially no way of testing whether the human testability definition has been met.
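To make the 80% rule concrete, here is a minimal sketch of what an agreement check might look like. It assumes each evaluator reports a simple "pass" or "fail" verdict and that "agreement" means giving the most common verdict; neither assumption comes from WCAG 2.0, and the function names are hypothetical.

```python
from collections import Counter


def agreement_share(verdicts):
    """Return the most common verdict and the share of evaluators who gave it."""
    if not verdicts:
        raise ValueError("need at least one verdict")
    verdict, count = Counter(verdicts).most_common(1)[0]
    return verdict, count / len(verdicts)


def meets_threshold(verdicts, threshold=0.8):
    """True if the most common verdict reaches the assumed agreement threshold."""
    _, share = agreement_share(verdicts)
    return share >= threshold


# Hypothetical panel of ten evaluators: seven say "pass", three say "fail".
panel = ["pass"] * 7 + ["fail"] * 3
print(agreement_share(panel), meets_threshold(panel))
```

Even this toy version only works for yes/no verdicts on a fixed panel. It says nothing about who counts as a "knowledgeable" evaluator, how many evaluators are needed, or how to score open-ended judgments such as whether alt text is "equivalent".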
WCAG 2.0 also introduces some success criteria that contain seemingly capricious levels of testability. For instance, readability at a “lower secondary level” has nothing to do with the actual audience and seems to be an arbitrary measurement. Still, it is supposedly testable – though I’m not sure how. Determining whether language is appropriate for a site’s content is no less measurable by humans than determining whether alternative text provides “equivalent information”. The introduction of such testability levels prevents developers from generating content that best fits their unique audience; instead, it provides an arbitrary measure to which all content, and presumably all users, must conform.
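One way a reading level could be turned into a number is a readability formula. The sketch below uses the Flesch-Kincaid grade level formula with a crude vowel-group syllable counter. This is my assumption about how such a test might be automated, not something WCAG 2.0 prescribes, and mapping “lower secondary level” onto a particular grade range would be a further assumption.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count runs of vowels (a heuristic, not a dictionary lookup)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Estimate the U.S. school grade level of text with the Flesch-Kincaid formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


simple = "The cat sat on the mat."
dense = "Organizational accessibility conformance necessitates comprehensive evaluation."
print(flesch_kincaid_grade(simple), flesch_kincaid_grade(dense))
```

A score like this is repeatable, but it measures only sentence length and syllable counts. It knows nothing about the audience or the subject matter, which is exactly the concern raised above.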
Is human testability even possible?
Much (or perhaps most) of accessibility is subject to interpretation, and that interpretation will vary greatly. Regarding alternative text, I challenge you to find ANY image that presents content from ANY web site and then get 8 out of 10 accessibility experts to agree on what the alt text should be. Beyond that, I think I can safely guarantee that 8 out of 10 WCAG working group members wouldn’t agree on the alt text for the W3C web site logo. Taking the testability requirement quite literally would result in the vast majority of accessible pages not reaching even Level A conformance, because you could not prove that 80% of evaluators would agree on all subjective aspects. Perhaps more important, “human evaluators” shouldn’t be determining appropriate alternative text anyway – content creators should.
Is there a solution?
So we have a dilemma regarding testability. If WCAG 2.0 sticks to its testability mandate and keeps its slightly limited and complex success criteria, it risks marginalizing itself because developers cannot prove testability at the 80% level. Alternatively, it can allow non-testable, pseudo-testable, and more far-reaching recommendations to be included and then risk criticism and lack of adoption because it is not testable. Lack of testability was, after all, one of the primary complaints regarding WCAG 1.0.
Throwing out all testability would be a grave mistake. However, the working group would benefit greatly from taking another look at the 80% agreement level for proving testability and from revisiting its mandate that all WCAG 2.0 success criteria be testable.