WebAIM - Web Accessibility In Mind

E-mail List Archives

RE: Why specify a DOCTYPE? Why validate?

From: John Foliot - bytown internet
Date: May 30, 2002 7:25AM


I'm asked this often, and the best layman's explanation I can give is that
the DTD defines the dictionary/thesaurus of the page elements so that the
user agent can "understand" what is being said. I use this analogy:

"If I told you I left my lunch in my boot, you'd think I was very strange.
But if I was standing in London, England you would probably know that my
lunch was in the trunk of my car."

Semantics is everything, but if the common reference point(s) is/are not
specified "up front" then the *possibility* of confusion exists. In
January, Wired Magazine ran a great article about the importance of
Standards (http://www.wired.com/wired/archive/10.01/standards.html). It's a
good read - highly recommended.

The DTD declares the Standard that the document is authored to. User agents
can be programmed to interpret custom tags if they have been properly
documented in a Custom DTD (IE/Microsoft is already starting to support
this). Good user agents are also programmed to properly implement the
existing tags... but the programs need to expect that the tags being used
are being used properly. To me, this makes clear sense. (BTW - This is also
the premise of XML: define your tags, tell the user agent what they do, then
use your tags.)
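
As a rough illustration of "declaring the Standard up front", here is a minimal Python sketch of how a tool might read that declaration before deciding how to interpret a page. It is not how any particular browser works; the names DoctypeFinder and find_doctype are made up for this example, and it relies only on the standard library's html.parser module.

```python
from html.parser import HTMLParser


class DoctypeFinder(HTMLParser):
    """Records the first DOCTYPE declaration seen in a document."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # handle_decl receives the text between "<!" and ">".
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl


def find_doctype(html_text):
    """Return the DOCTYPE declaration text, or None if the page has none."""
    finder = DoctypeFinder()
    finder.feed(html_text)
    finder.close()
    return finder.doctype


if __name__ == "__main__":
    page = (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
        '"http://www.w3.org/TR/html4/strict.dtd">'
        "<html><head><title>Example</title></head><body></body></html>"
    )
    print(find_doctype(page))
```

A page with no declaration returns None, which is exactly the "no common reference point" case: the tool has to guess which dictionary the author was using.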

Take the LONGDESC attribute associated with the <img> element. Right now
(to my knowledge), only IBM Homepage Reader properly supports this
attribute. This is a good thing on IBM's part, and I wish other user agents
would support this attribute properly as well. But when the IBM developers
were programming the browser, they referred to the spec or standard to know
how the attribute should be properly used, so that their agent could
properly render the final output. To be used properly, longdesc should
equal the URL of the text description of the image. But I've seen it used as
<... longdesc="blah,blah,blah about the image">, which is clearly wrong.
Should IBM add lines and lines of code to compensate for incorrectly
developed web pages, or not include the support of LONGDESC at all because
doing so is just too difficult, or should the user agent just not deliver
the "junk" being input into it? If the page author had used the attribute
correctly (and verified that it was being used correctly by validating the
document), then the problem would not exist, and the attribute would be rendered
correctly. To me, again, this makes clear sense.
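
To make the longdesc example concrete, here is a small Python sketch of the kind of check an author could run on a page before publishing. It is only a heuristic under stated assumptions: the helper names (looks_like_url, suspect_longdescs) are invented for illustration, and "contains whitespace" is used as a crude stand-in for "this is prose, not a URL". A real validator checks far more than this.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


def looks_like_url(value):
    """Heuristic: a URL reference has no whitespace and has a scheme or a path."""
    value = value.strip()
    if not value or any(ch.isspace() for ch in value):
        return False
    parts = urlparse(value)
    return bool(parts.scheme or parts.path)


class LongdescChecker(HTMLParser):
    """Collects longdesc values on <img> elements that do not look like URLs."""

    def __init__(self):
        super().__init__()
        self.suspect = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased from HTMLParser.
        if tag != "img":
            return
        for name, value in attrs:
            if name == "longdesc" and not looks_like_url(value or ""):
                self.suspect.append(value)


def suspect_longdescs(html_text):
    """Return the longdesc values that appear to be prose instead of a URL."""
    checker = LongdescChecker()
    checker.feed(html_text)
    checker.close()
    return checker.suspect


if __name__ == "__main__":
    page = (
        '<img src="chart.gif" alt="Sales chart" longdesc="chart-description.html">'
        '<img src="dog.gif" alt="Dog" longdesc="blah,blah,blah about the image">'
    )
    for value in suspect_longdescs(page):
        print("longdesc does not look like a URL:", value)
```

The first image passes because its longdesc points at a document; the second is flagged because the description was typed straight into the attribute.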

As user agents become thinner and thinner clients, the processing power they
have to compensate for bad code diminishes. I don't know a
whole bunch about the current state of wireless here in Canada (note to
self - more research), but I do know that a friend of mine who has a RIM
Blackberry can access web pages wirelessly. It grabs the HTML docs and
converts them to WML on the fly (very cool!). Pages which validate to HTML
4.01 are pretty good, XHTML pages are VERY good, but crummy code... yech. Thing
is, all those pages probably "look" the same in IE5.5 or IE6, but once you
start to repurpose the content, it introduces a whole new set of previously
unthought-of variables. Yet the valid XHTML comes through with flying
colours. Why? Because it has been properly formatted and developed. How
do we know this? Because the author bothered to verify that the page was
"well formed" or valid against the standard it was written to. And to know
which standard is being used, we need to declare it. Thus, the Document
Type Declaration (the DOCTYPE), which names the DTD.

JF