WebAIM - Web Accessibility In Mind

E-mail List Archives

Re: Question about web page "remediation"

From: Iaffaldano, Michelangelo
Date: May 17, 2013 1:03PM


Jukka, there is an area where "such processing has a positive impact on accessibility". A few years ago, in one large website migration project, we found that, for example, whenever the pages had
<SPAN><FONT SIZE=3 COLOR=BLUE>Bla bla</FONT></SPAN><BR><BR>

they really meant
<h1>Bla bla</h1>

These constructs (i.e. semantics implied in presentational markup) were applied with surprising consistency, so that we were able to do some batch repairs and strip the rest. The manual cleanup after we ran the agent was minimal.
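A batch repair of this kind could be sketched roughly as follows. This is only an illustration, not the agent we actually ran: the function name promote_headings is hypothetical, and the pattern assumes the heading idiom is always a SPAN wrapping a blue FONT element with plain text inside, properly closed and followed by two BR tags.

```python
import re

# Assumed idiom: <SPAN><FONT ... COLOR=BLUE>text</FONT></SPAN><BR><BR>
# The text is restricted to plain characters (no nested tags) so that one
# match cannot accidentally swallow unrelated markup.
HEADING_PATTERN = re.compile(
    r"<span>\s*<font\b[^>]*\bcolor\s*=\s*[\"']?blue[\"']?[^>]*>"
    r"(?P<text>[^<]+)"
    r"</font>\s*</span>\s*(?:<br\s*/?>\s*){2}",
    re.IGNORECASE,
)


def promote_headings(html: str) -> str:
    """Replace the presentational heading idiom with a real <h1> element."""
    return HEADING_PATTERN.sub(
        lambda m: f"<h1>{m.group('text').strip()}</h1>\n", html
    )


if __name__ == "__main__":
    sample = "<SPAN><FONT SIZE=3 COLOR=BLUE>Bla bla</FONT></SPAN><BR><BR>"
    print(promote_headings(sample))
```

Anything that does not match the pattern is left untouched, which is what keeps the later manual cleanup small when the idiom is used consistently.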

However, I do not know whether this is the case for Jan's project.

Michelangelo


-----Original Message-----
From: Jukka K. Korpela [mailto: <EMAIL REMOVED> ]
Sent: May 17, 2013 3:11 AM
To: <EMAIL REMOVED>
Subject: Re: [WebAIM] Question about web page "remediation"

2013-05-17 2:57, Jan Heck wrote:

> Does anyone happen to know of a tool that will strip out inline styles
> and other "nasty" stuff in old legacy web pages?

There are probably tools that do such things, but I doubt whether such processing has any (positive) impact on accessibility. After all, what can you do with <p style="font-family: Verdana">? I mean, if you were a computer program. You could introduce an id attribute for the element, carefully generating an id value that is unique on the page, remove the style attribute, and put a style sheet rule like #dvbkzh12 { font-family: Verdana } into a style element or an external style sheet. This would not make the source code more readable, and it would not affect the visual or other rendering, so what would be the point?
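To make the mechanical nature of that transformation concrete, here is a minimal sketch of what such a program might do. It is not taken from any real tool: the name externalize_styles and the gen1, gen2, ... id scheme are made up for illustration, and the pattern only handles start tags whose sole attribute is a double-quoted style attribute.

```python
import re

# Assumption: the element has exactly one attribute, style="...".
STYLE_ATTR = re.compile(r'<(?P<tag>[a-zA-Z][\w-]*)\s+style="(?P<css>[^"]*)"\s*>')


def externalize_styles(html: str) -> tuple[str, str]:
    """Swap each style attribute for a generated id; return (html, css)."""
    # Collect ids already present so generated ones stay unique on the page.
    used_ids = set(re.findall(r'\bid="([^"]*)"', html))
    rules = []
    counter = 0

    def replace(match):
        nonlocal counter
        while True:
            counter += 1
            new_id = f"gen{counter}"
            if new_id not in used_ids:
                break
        used_ids.add(new_id)
        rules.append(f"#{new_id} {{ {match.group('css').strip()} }}")
        return f'<{match.group("tag")} id="{new_id}">'

    new_html = STYLE_ATTR.sub(replace, html)
    return new_html, "\n".join(rules)


if __name__ == "__main__":
    page, css = externalize_styles('<p style="font-family: Verdana">text</p>')
    print(page)
    print(css)
```

The declarations are moved verbatim, so the rendering stays the same and the markup carries no more meaning than before.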

If you meant that inline styles would be just removed, then, well, you would change the visual appearance of the page, and it would be very hard to predict the total effect - or its impact, if any, on accessibility.

> I don't believe HTML Tidy does that

HTML Tidy has the option of "cleaning up" inline styles (style attributes) and presentational markup. Here is an example, using the online version http://infohound.net/tidy/ on the document

<title>foo</title>
<p style="font-size: 120%">bar</p>
<font face=Verdana>foo</font>

The result is:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy for Linux/x86 (vers 25 March 2009), see www.w3.org" />

<title>foo</title>
<style type="text/css">
/*<![CDATA[*/
span.c2 {font-family: Verdana}
p.c1 {font-size: 120%}
/*]]>*/
</style>
</head>

<body>
<p class="c1">bar</p>
<span class="c2">foo</span>
</body>
</html>

In which sense would this be an improvement? The class names do not say anything, so the code has become less readable. The names are conveniently short, at the cost of being generated in a faulty way: HTML Tidy does not check whether the original page actually contains the classes c1 and c2!
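The missing check could look something like the sketch below. It says nothing about how HTML Tidy works internally; the helper names existing_classes and fresh_class_names are hypothetical, and only double-quoted class attributes in the markup are considered (class names that appear solely in a style sheet would be missed).

```python
import re
from itertools import count


def existing_classes(html: str) -> set[str]:
    """Collect every class name used in double-quoted class attributes."""
    names = set()
    for value in re.findall(r'\bclass\s*=\s*"([^"]*)"', html, re.IGNORECASE):
        names.update(value.split())
    return names


def fresh_class_names(html: str, how_many: int, prefix: str = "c") -> list[str]:
    """Return generated names (c1, c2, ...) that skip any name already in use."""
    taken = existing_classes(html)
    fresh = []
    for n in count(1):
        if len(fresh) == how_many:
            break
        candidate = f"{prefix}{n}"
        if candidate not in taken:
            fresh.append(candidate)
    return fresh


if __name__ == "__main__":
    page = '<p class="c1 intro">bar</p>'
    print(fresh_class_names(page, 2))
```

With a page that already uses c1, the generated names start from c2, so existing rules for c1 are not silently applied to new elements.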

> I'm trying to help a non-profit make their site accessible, and the
> code is horrendous.

Horrendous code generally needs to be abandoned, not improved. The more horrendous it is, the more it costs to fix it, and the cost is generally larger than the cost of redesigning and rewriting the site.

But even horrendous code can be relatively accessible. If it's not broken, don't fix it. If it has essential accessibility problems, then you may consider fixing them if you can - this can be fairly easy in some cases, or it can be hopeless (without rewriting everything). It really depends on the code and on the accessibility problems.

Yucca