WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: Use of the LANG attribute

Number of posts in this thread: 6 (In chronological order)

From: Swan, Henny
Date: Tue, Feb 15 2005 11:21AM
Subject: Use of the LANG attribute

Hello All,

We're having some discussions internally regarding use of the language attribute in sites. Generally it is clear when to mark up text that is not in the natural language of the page; however, there is a slight grey area that is not so easy to define. This is best illustrated by looking at an online supermarket. How far do we go in marking up foodstuffs and foreign brands as French, Italian and so on?

Some of it seems quite clear. Phrases in recipes that are clearly foreign are obvious candidates for being marked up, for example a cooking technique given in French that has not been assimilated into the English language.

Other words arguably do not need to be marked up because of their common usage in English, for example "Champagne". Marking this up as French may actually mislead a screen reader user about how it is pronounced. In addition, brand names could be left as English, and not marked up, as people may only be accustomed to hearing them pronounced with an English accent.

Words that are clearly anglicised could therefore be considered exempt. One benchmark we are looking at is checking whether the word is in the Oxford Dictionary. Other words that could be considered exempt come from languages that are written in a different script, for example chow mein, the Chinese for stir-fry noodles (the Chinese pronunciation is nothing like the English one and would be barely recognisable).

But what of grey-area words like orecchiette (a type of pasta) or panna cotta, which people may not know? Or how about "Cuvee Royale Blanquette de Limoux Brut": does all of it get marked up, some of it, or none of it?

What we are really trying to do is establish a base guideline for deciding when a word should be identified as being in a foreign language and when it should not. Any input, thoughts, ideas or comments would be great.

Many thanks, Henny




---
Henny Swan
Website Accessibility Consultant
T: 020 7391 2044
E: = EMAIL ADDRESS REMOVED =



From: Robinson, Norman B - Washington, DC
Date: Tue, Feb 15 2005 11:52AM
Subject: Re: Use of the LANG attribute

The actual RFC gives examples of purpose and use.
http://www.ietf.org/rfc/rfc1766.txt.

As a practical matter, I think it should be used for content
external/embedded from the web page you are on, such as an audio file or
video so you can determine what language it is without understanding
that language. I simply wouldn't identify the word as a foreign
language for the online shopping application you reference. As a
customer I want to send my five year old to order "Duck La'Orange" and
expect it to be a BRANDING or PRODUCT selection title, not understand it
means "orange duck" (if that is, in fact, what it means ;)

Also, I understand you were specific to LANG attribute, but I wanted to
also mention some of the other language dependent codings that might
affect your discussion:

The DOCTYPE is for marking your content target. E.g., <!DOCTYPE HTML
PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">. The DOCTYPE Language
(listing of possible codes:
http://www.oasis-open.org/cover/iso639a.html) Specifies the public
text language, the natural language encoding system used in the creation
of the referenced object - the default web content.

Note, to keep this relevant to this list's purpose, this can be important
to assistive technologies in general. I.e., Web Accessibility Initiative
checkpoint 3.2 (http://www.w3.org/WAI/wcag-curric/sam29-0.htm).

Regards,

Norman Robinson


From: Webmaster
Date: Tue, Feb 15 2005 12:44PM
Subject: Re: Use of the LANG attribute

Hello Henny,

Since you asked for comments, I would mention from my personal
experience that it is not OK to change specific French and German
letters into similar English ones, at least not if French or German
people will see the text.

The encoding ISO-8859-1 supports the specific French letters, so you may
write the French word "la cuv

From: John Foliot - WATS.ca
Date: Tue, Feb 15 2005 1:04PM
Subject: Re: Use of the LANG attribute

norman.b.robinson wrote:
> As a
> customer I want to send my five year old to order "Duck La'Orange" and
> expect it to be a BRANDING or PRODUCT selection title, not
> understand it
> means "orange duck" (if that is, in fact, what it means ;)

Well, yes, "Duck a l'Orange" does indeed mean Duck in Orange sauce; it's a
recipe/presentation, not a Brand Name (AFAIK). But Norman, I think you miss
part of the point here.

One benefit of the lang attribute today is that recent versions of the
mainstream screen readers can switch language modules "on the fly" if so
instructed.

Consider the words "croissant", "d

From: Jukka K. Korpela
Date: Tue, Feb 15 2005 1:16PM
Subject: Re: Use of the LANG attribute

On Tue, 15 Feb 2005, norman.b.robinson wrote:

> The actual RFC gives examples of purpose and use.
> http://www.ietf.org/rfc/rfc1766.txt.

RFC 1766 was obsoleted by RFC 3066 in January 2001, i.e. over four years
ago.

> As a practical matter, I think it should be used for content
> external/embedded from the web page you are on, such as an audio file or
> video so you can determine what language it is without understanding
> that language.

As a theoretical matter, language codes can be assigned to non-HTML
resources in several ways, but the LANG attribute affects only content in
the document itself (though embedded content is subject to dispute); for
linked content, you would use HREFLANG.
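
A minimal sketch of that distinction in markup. The file name
(recette.html) and the sample sentences are assumptions made up for
illustration, not taken from the site under discussion. LANG labels the
language of text inside the document, while HREFLANG describes the
resource a link points to.

```html
<!-- LANG: the phrase itself is French text inside an English paragraph -->
<p lang="en">Poach the veal as for a
  <span lang="fr">blanquette de veau</span>.</p>

<!-- HREFLANG: the link text is English, the linked page is French -->
<p lang="en">See the
  <a href="recette.html" hreflang="fr">original recipe (in French)</a>.</p>
```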

As a practical matter, it hardly matters at present. Have you got some
evidence of actual use of such metainformation? The only use I know of
is that browsers may let the user query the properties of an element and
get information like its language. But very few people know about such
possibilities.

> I simply wouldn't identify the word as a foreign
> language for the online shopping application you reference. As a
> customer I want to send my five year old to order "Duck La'Orange" and
> expect it to be a BRANDING or PRODUCT selection title, not understand it
> means "orange duck" (if that is, in fact, what it means ;)

The words are still in some language, even if they constitute a proper
name. Which language would that be? According to WAI guidelines, you must
use markup to indicate any language change in a document. Whether the
requirement is reasonable is a different matter - WAI guidelines
themselves don't satisfy the requirement. But for purposes such as speech
generation, markup for language in a product title would appear to be the
right thing.
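
As an illustration only, here is one way such a product title might be
marked up. It takes the wine name from the original question and assumes
the surrounding page is declared as English; the "75cl" text and the
heading level are invented for the example.

```html
<!-- Page language is assumed to be set elsewhere, e.g. <html lang="en"> -->
<h2>
  <span lang="fr">Cuvee Royale Blanquette de Limoux Brut</span>, 75cl
</h2>
```

A speech synthesiser that honours LANG could then read the name with
French pronunciation rules and return to English for the rest of the
heading.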

> The DOCTYPE is for marking your content target. E.g., <!DOCTYPE HTML
> PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">. The DOCTYPE Language
> (listing of possible codes:
> http://www.oasis-open.org/cover/iso639a.html)

That document is neither normative nor up-to-date any more. Authoritative
information about ISO 639 codes is available from
http://www.loc.gov/standards/iso639-2/

The string "EN" in the DOCTYPE declaration indicates the language of the
Document Type Definition. Roughly speaking, it is the language that was
used when writing the HTML specification. It has nothing to do with the
language of the content of an HTML document, and it would be an error to
replace it by anything else.
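
To make the point concrete, a sketch of a French-language page, assuming
HTML 4.01 Strict and placeholder content: the DOCTYPE keeps "EN" because
that describes the DTD, and the language of the content is declared
separately with LANG on the HTML element.

```html
<!-- "EN" here is the language of the DTD; leave it unchanged -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<!-- The language of the page content is declared with LANG -->
<html lang="fr">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
  <title>Recettes</title>
</head>
<body>
  <p>Bienvenue.</p>
</body>
</html>
```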

> Specifies the public text language,

Yes, but few people know what that _means_.

> the natural language encoding system used in the creation
> of the referenced object - the default web content.

Absolutely not.

> Note, to keep this relevant to this list's purpose, this can be important
> to assistive technologies in general. I.e., Web Accessibility Initiative
> checkpoint 3.2 (http://www.w3.org/WAI/wcag-curric/sam29-0.htm).

You are referring to curriculum material, which might be useful reading.
It is not normative however. And that particular page contains, among
other things, incorrect information (which is in contradiction with W3C
recommendations on HTML) about the DOCTYPE declaration, for example.
Still worse, the page is _about_ formal syntax. Let's just pretend that
we didn't see that piece of confusion.

--
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

From: Robinson, Norman B - Washington, DC
Date: Tue, Feb 15 2005 1:28PM
Subject: Re: Use of the LANG attribute

Right on - now I see your point, and after pulling up the RFC (and the update to that RFC) I understand how the alternates could be referenced by a tool/screen reader.

Providing them as you have in your message makes sense to me! Hope I didn't add to the confusion!

Regards,

Norman
