WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: Question about web page "remediation"

Number of posts in this thread: 8 (In chronological order)

From: Jan Heck
Date: Thu, May 16 2013 5:57PM
Subject: Question about web page "remediation"

Does anyone happen to know of a tool that will strip out inline styles and
other "nasty" stuff in old legacy web pages? I don't believe HTML Tidy
does that, and other than tedious find and replace procedures, I don't
know a quicker way to do this. I'm trying to help a non-profit make their
site accessible, and the code is horrendous.

Thanks,
Jan

From: Lucy Greco
Date: Thu, May 16 2013 6:09PM
Subject: Re: Question about web page "remediation"

I wonder if the new Deque Amaze would help. I might be misspelling it, but I
am sure it's Deque that has it; they announced it at CSUN this year. Lucy

Lucia Greco
Web Access Analyst
IST-Campus Technology Services
University of California, Berkeley
(510) 289-6008 skype: lucia1-greco
http://webaccess.berkeley.edu
Follow me on twitter @accessaces

From: Jan Heck
Date: Thu, May 16 2013 6:21PM
Subject: Re: Question about web page "remediation"

Thanks, Lucy. Apparently, Deque Amaze is a server-side solution, way
beyond the means of this organization. It sure is an interesting concept,
though.

From: Jukka K. Korpela
Date: Fri, May 17 2013 1:11AM
Subject: Re: Question about web page "remediation"

2013-05-17 2:57, Jan Heck wrote:

> Does anyone happen to know of a tool that will strip out inline styles and
> other "nasty" stuff in old legacy web pages?

There are probably tools that do such things, but I doubt whether such
processing has any (positive) impact on accessibility. After all, what
can you do with <p style="font-family: Verdana">? I mean if you were a
computer program. You could introduce an id attribute for the element,
carefully generating an id value that is unique on the page, remove the
style attribute, and put a style sheet rule like #dvbkzh12 {
font-family: Verdana } into a style element or an external style sheet.
This would not make the source code more readable, and it would not
affect the visual or other rendering, so what would be the point?
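
To make that mechanical rewrite concrete, here is a minimal Python sketch of it. Everything in it is an assumption made for illustration: the function name `externalize_styles`, the `gen` id prefix, and the restriction to double-quoted `style` attributes are hypothetical choices, not how any particular tool works.

```python
import re
from itertools import count

# Matches a start tag that carries a double-quoted style attribute.
# Assumption: single-quoted or unquoted style values are not handled.
STYLE_ATTR = re.compile(r'<(\w+)([^<>]*?)\sstyle="([^"]*)"([^<>]*)>')
ID_ATTR = re.compile(r'\sid="([^"]+)"')


def externalize_styles(html):
    """Move inline styles into id-based rules.

    Returns (new_html, css_rules). The rendering is meant to stay the
    same; only the location of the declarations changes.
    """
    used_ids = set(ID_ATTR.findall(html))  # ids already present on the page
    counter = count(1)
    rules = []

    def swap(match):
        tag, before, style, after = match.groups()
        existing = ID_ATTR.search(before + after)
        if existing:
            # Reuse the element's own id instead of adding a second one.
            element_id = existing.group(1)
            id_attr = ""
        else:
            # Generate an id that is unique on this page.
            element_id = f"gen{next(counter)}"
            while element_id in used_ids:
                element_id = f"gen{next(counter)}"
            used_ids.add(element_id)
            id_attr = f' id="{element_id}"'
        rules.append(f"#{element_id} {{ {style.strip()} }}")
        return f"<{tag}{before}{id_attr}{after}>"

    return STYLE_ATTR.sub(swap, html), rules


new_html, rules = externalize_styles('<p style="font-family: Verdana">text</p>')
print(new_html)  # <p id="gen1">text</p>
print(rules)     # ['#gen1 { font-family: Verdana }']
```

The output shows the problem: the generated id carries no meaning, so the markup is no easier to read than before.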

If you meant that inline styles would be just removed, then, well, you
would change the visual appearance of the page, and it would be very
hard to predict the total effect - or its impact, if any, on accessibility.

> I don't believe HTML Tidy does that

HTML Tidy has the option of "cleaning up" inline styles (style
attributes) and presentational markup. Here is an example, using the
online version http://infohound.net/tidy/ on the document

<title>foo</title>
<p style="font-size: 120%">bar</p>
<font face=Verdana>foo</font>

The result is:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy for Linux/x86 (vers 25 March 2009), see www.w3.org" />

<title>foo</title>
<style type="text/css">
/*<![CDATA[*/
span.c2 {font-family: Verdana}
p.c1 {font-size: 120%}
/*]]>*/
</style>
</head>

<body>
<p class="c1">bar</p><span class="c2">foo</span>
</body>
</html>

In which sense would this be an improvement? The class names do not say
anything, so the code has become less readable. The names are
conveniently short, at the cost of being generated in a faulty way: HTML
Tidy does not check whether the original page actually contains the
classes c1 and c2!
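
For comparison, checking for such collisions is not hard. The following Python sketch, using only the standard library, collects the class names already used in the markup and skips them when generating new ones. The names `ClassCollector` and `fresh_class_names` are hypothetical, and the sketch only looks at `class` attributes; it is an assumption that class names referenced solely in style sheets or scripts can be ignored.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class name that appears in a class attribute."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def fresh_class_names(html, how_many, prefix="c"):
    """Return generated class names that do not clash with the page."""
    collector = ClassCollector()
    collector.feed(html)
    collector.close()
    names = []
    n = 1
    while len(names) < how_many:
        candidate = f"{prefix}{n}"
        if candidate not in collector.classes:
            names.append(candidate)
        n += 1
    return names


page = '<p class="c1 intro">a</p><span class="c3">b</span>'
print(fresh_class_names(page, 2))  # ['c2', 'c4']
```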

> I'm trying to help a non-profit make their
> site accessible, and the code is horrendous.

Horrendous code generally needs to be abandoned, not improved. The more
horrendous it is, the more it costs to fix it, and the cost is
generally larger than the cost of redesigning and rewriting the site.

But even horrendous code can be relatively accessible. If it's not
broken, don't fix it. If it has essential accessibility problems, then
you may consider fixing them if you can - this can be fairly easy in
some cases, or it can be hopeless (without rewriting everything). It
really depends on the code and on the accessibility problems.

Yucca

From: Olaf Drümmer
Date: Fri, May 17 2013 10:24AM
Subject: Re: Question about web page "remediation"

Am 17 May 2013 um 09:11 schrieb Jukka K. Korpela:

> There are probably tools that do such things, but I doubt whether such
> processing has any (positive) impact on accessibility.

I think the main advantage is you can see where your accessibility problems are....

Just think of headings done with heading tags versus headings done with larger text sizes and bolding through local formatting.

Once you know what's really going on, you could put together a decent CSS and be done with it.
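
A rough way to surface such pseudo-headings for review is sketched below in Python (standard library only). The heuristics are assumptions chosen for illustration: any `font` element with a `size` attribute, or any element whose inline style sets `font-size` or `font-weight`, is reported as a candidate. The names `looks_like_heading`, `FakeHeadingFinder` and `find_fake_headings` are hypothetical, and a person still has to decide which candidates are real headings.

```python
from html.parser import HTMLParser


def looks_like_heading(tag, attrs):
    """Heuristic (an assumption): local formatting that mimics a heading."""
    style = (attrs.get("style") or "").lower()
    if tag == "font" and "size" in attrs:
        return True
    return "font-size" in style or "font-weight" in style


class FakeHeadingFinder(HTMLParser):
    """Collect the text of elements that are styled like headings."""

    def __init__(self):
        super().__init__()
        self.capturing = None  # tag name of the suspect element being read
        self.depth = 0         # nesting depth of that same tag name
        self.text = []
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        if self.capturing:
            if tag == self.capturing:
                self.depth += 1
            return
        if looks_like_heading(tag, dict(attrs)):
            self.capturing, self.depth, self.text = tag, 1, []

    def handle_endtag(self, tag):
        if tag == self.capturing:
            self.depth -= 1
            if self.depth == 0:
                # Normalize whitespace before reporting the candidate.
                text = " ".join("".join(self.text).split())
                if text:
                    self.candidates.append(text)
                self.capturing = None

    def handle_data(self, data):
        if self.capturing:
            self.text.append(data)


def find_fake_headings(html):
    finder = FakeHeadingFinder()
    finder.feed(html)
    finder.close()
    return finder.candidates


sample = (
    '<p><font size="5"><b>Our Mission</b></font></p>'
    '<p>Body text</p>'
    '<p style="font-size: 150%; font-weight: bold">Contact</p>'
)
print(find_fake_headings(sample))  # ['Our Mission', 'Contact']
```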


Olaf

From: John E Brandt
Date: Fri, May 17 2013 11:36AM
Subject: Re: Question about web page "remediation"

Assuming it is a flat/static site with just HTML, you can try something like
CSE HTML Validator for finding "nasty" code and remedying it for you with a
built-in "fixer" (although I usually prefer to do the editing manually).
See http://www.htmlvalidator.com/

Depending on the size of the site/number of files....it might be faster to
just start building a new site, cut and paste only the content, then layer in
layouts and other embellishments (links, headings, etc.). It just might be
faster in the long run.

~j



John E. Brandt
www.jebswebs.com
= EMAIL ADDRESS REMOVED =
207-622-7937
Augusta, Maine, USA

From: Iaffaldano, Michelangelo
Date: Fri, May 17 2013 1:03PM
Subject: Re: Question about web page "remediation"

Jukka, there is an area where "such processing has a positive impact on accessibility". A few years ago, in one large website migration project, we found that, for example, whenever the pages had
<SPAN><FONT=3 COLOR=BLUE>Bla bla</FONT><SPAN><BR><BR>

they really meant
<h1>Bla bla</h1>

These constructs (i.e. semantics implied in presentational markup) were applied with surprising consistency, so that we were able to do some batch repairs and strip the rest. The manual cleanup after we ran the agent was minimal.
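
A batch repair of this kind can be sketched in a few lines of Python. This is an illustrative assumption, not the agent we actually ran: the pattern `FAKE_H1` is modelled on the example above, the function names `promote_fake_headings` and `repair_tree` are hypothetical, and the default `cp1252` encoding is a guess about legacy pages. The pattern only accepts heading text that contains no nested tags, so that it cannot swallow unrelated markup.

```python
import re
from pathlib import Path

# Presentational construct that really means <h1>. Assumption: the text
# inside the font element is plain (no nested tags), and the span after
# it may be written as <span> or </span>.
FAKE_H1 = re.compile(
    r'<span>\s*<font\b[^>]*\bcolor="?blue"?[^>]*>([^<]+)</font>'
    r'\s*</?span>\s*(?:<br\s*/?>\s*){2}',
    re.IGNORECASE,
)


def promote_fake_headings(html):
    """Return (repaired_html, number_of_replacements)."""
    return FAKE_H1.subn(r"<h1>\1</h1>", html)


def repair_tree(root, encoding="cp1252"):
    """Apply the repair to every .htm/.html file under root, in place.

    Returns a list of (path, replacements) for the files that changed.
    Run it on a copy of the site, never on the only copy.
    """
    changed = []
    for path in Path(root).rglob("*.htm*"):
        if not path.is_file():
            continue
        original = path.read_text(encoding=encoding)
        repaired, hits = promote_fake_headings(original)
        if hits:
            path.write_text(repaired, encoding=encoding)
            changed.append((path, hits))
    return changed


sample = "<SPAN><FONT=3 COLOR=BLUE>Bla bla</FONT><SPAN><BR><BR>"
print(promote_fake_headings(sample))  # ('<h1>Bla bla</h1>', 1)
```

Whatever the pattern does not match is left untouched, so the remaining presentational markup can be reviewed and stripped separately.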

However I do not know if this is the case for Jan's project.

Michelangelo


From: Jan Heck
Date: Fri, May 17 2013 1:15PM
Subject: Re: Question about web page "remediation"

This is indeed one of the cases we're dealing with. When you say "batch
repairs," are you talking about applying a "find and replace" function to
multiple pages at a time within a standard web editor, or is there another
program you used?

Thanks to everyone who has responded so far!

Jan
