E-mail List Archives

Thread: ISO Language codes and AT

Number of posts in this thread: 8 (In chronological order)

From: Karl Groves
Date: Mon, Nov 24 2014 9:20AM
Subject: ISO Language codes and AT

Quick question: Which is the proper ISO language code standard
supported by ATs? ISO 639-1, 639-2, 639-3, or do they all work?

TIA

--

Karl Groves
www.karlgroves.com
@karlgroves
http://www.linkedin.com/in/karlgroves
Phone: +1 410.541.6829

Modern Web Toolsets and Accessibility
https://www.youtube.com/watch?v=_uq6Db47-Ks

www.tenon.io

From: Jukka K. Korpela
Date: Mon, Nov 24 2014 9:39AM
Subject: Re: ISO Language codes and AT

2014-11-24 18:20, Karl Groves wrote:

> Quick question: Which is the proper ISO language code standard
> supported by ATs? ISO 639-1, 639-2, 639-3, or do they all work?

Quick answer: in my limited experience, ISO 639-1 (alpha-2) codes work
to the extent that language codes work at all, which basically means
support for a small set of languages in some software. So lang="en" does
not do harm, probably, as long as the content is actually in English,
and on sunny days it might do some good; but lang="eng" will probably be
mostly ignored, and lang="en-US" might actually work less than lang="en".

It might even be argued (and maybe I'm doing so now) that AT should not
even care about the declared language, as it is so often wrong; this is
what Google does.

In any case, most AT software can handle only a small set of languages
anyway, so any fine-grained language markup would be wasted.

Yucca

From: Steve Faulkner
Date: Mon, Nov 24 2014 9:54AM
Subject: Re: ISO Language codes and AT

On 24 November 2014 at 16:39, Jukka K. Korpela < = EMAIL ADDRESS REMOVED = > wrote:

> Quick answer: in my limited experience, ISO 639-1 (alpha-2) codes work to
> the extent that language codes work at all, which basically means support
> for a small set of languages in some software. So lang="en" does not do
> harm, probably, as long as the content is actually in English, and on sunny
> days it might do some good; but lang="eng" will probably be mostly ignored,
> and lang="en-US" might actually work less than lang="en".


NVDA is localized into 45 languages,

but I think it is the speech synthesizer that matters more.
Language info for eSpeak, which is bundled with NVDA:


English Voices

en
is the standard default English voice.

en-us
American English.

en-sc
English with a Scottish accent.

en-n
en-rp
en-wm

are different English voices. These can be considered caricatures of
various British accents: Northern, Received Pronunciation, West Midlands
respectively.

also see 3.4 Other Languages
http://espeak.sourceforge.net/languages.html
--

Regards

SteveF
HTML 5.1 <http://www.w3.org/html/wg/drafts/html/master/>

From: Olaf Drümmer
Date: Mon, Nov 24 2014 10:03AM
Subject: Re: ISO Language codes and AT

Usually it should not be too difficult for a software developer to take language codes into account where needed, regardless of which flavour they come in.

From the authoring end they should be used in a fashion as specific as necessary. So if it's not clear whether content is strictly US American English, the Queen's English or some other variation, just use "en". If it is relevant to be more specific, use something like "en-US" or "en-GB".

As for how the declared language gets used: please do not forget that it's not just for voice selection in text to speech; it is also relevant for hyphenation (and spell checking). Since content might be displayed on small form factor screens like those on mobile devices, good hyphenation (based on algorithms for the right language) may become even more relevant.

So by all means, please insert language codes, and encourage others to do so. For most contexts, it should be easy to get this right.

As for incorrect language declarations (however common they may be): tools that use text to speech, screen readers in particular, should offer an 'override' option that lets users easily ignore the language/voice determined from the content and use a language of their own choosing instead.
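
A minimal sketch of what such handling could look like on the software side, assuming a hypothetical pick_voice helper and a plain list of voice names. It is not how any particular screen reader or synthesizer works; it only illustrates falling back from a specific tag such as "en-US" to "en", and letting a user override win over the declared language.

```python
def pick_voice(declared_lang, available_voices, user_override=None):
    """Return the best-matching voice name, or None if nothing matches.

    All names here are hypothetical; this is an illustration, not the
    behaviour of any real AT product.
    """
    # A user override always wins over whatever the content declares.
    if user_override:
        return user_override
    if not declared_lang:
        return None
    # Compare case-insensitively, but return the voice name as listed.
    voices = {v.lower(): v for v in available_voices}
    parts = declared_lang.strip().lower().split("-")
    # Try the full tag first, then drop trailing subtags: en-us, then en.
    while parts:
        candidate = "-".join(parts)
        if candidate in voices:
            return voices[candidate]
        parts.pop()
    return None
```

With voices "en" and "fr" installed, pick_voice("en-US", ["en", "fr"]) falls back to "en", while pick_voice("eng", ["en", "fr"]) finds nothing, because the sketch does no mapping between alpha-3 and alpha-2 codes.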

Olaf



From: Karl Groves
Date: Mon, Nov 24 2014 10:22AM
Subject: Re: ISO Language codes and AT

I'm interested in the practical level of support for language codes
among ATs. While I'm well aware of the ISO standards for the
languages, I'm ignorant of which standards the ATs actually support
and understand.

Steve Faulkner mentioned that the synthesizer matters more, and
provided reference to eSpeak, which states: "Language voices generally
start with the 2 letter ISO 639-1 code for the language. If the
language does not have an ISO 639-1 code, then the 3 letter ISO 639-3
code can be used."

Unfortunately I can't find similar data for other synthesizers.

On Mon, Nov 24, 2014 at 12:09 PM, Haritos-Shea, Katie
< = EMAIL ADDRESS REMOVED = > wrote:
> I use these URLs to point developers to use the language tags from either the ISO Language Codes, or the IANA Language Registry.
> ISO (http://www.sitepoint.com/web-foundations/iso-2-letter-language-codes/)
> IANA (http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry)
>


From: Jukka K. Korpela
Date: Mon, Nov 24 2014 10:28AM
Subject: Re: ISO Language codes and AT

2014-11-24 18:54, Steve Faulkner wrote:

> On 24 November 2014 at 16:39, Jukka K. Korpela < = EMAIL ADDRESS REMOVED = > wrote:
>
>> Quick answer: in my limited experience, ISO 639-1 (alpha-2) codes work to
>> the extent that language codes work at all, which basically means support
>> for a small set of languages in some software. So lang="en" does not do
>> harm, probably, as long as the content is actually in English, and on sunny
>> days it might do some good; but lang="eng" will probably be mostly ignored,
>> and lang="en-US" might actually work less than lang="en".
>
>
> NVDA is localized into 45 languages,
>
> but I think it is the speech synthesizer that matters more.
> Language info for eSpeak, which is bundled with NVDA:

Yes, I think it’s the support for languages in speech synthesis that
matters, rather than the localization of the AT itself (which is an
important issue too, but for other reasons). And what matters here is
whether the AT software automatically switches reading mode according to
lang attributes or otherwise recognizes ISO codes.

> en-sc
> English with a Scottish accent.

That’s interesting. According to authoritative specifications on the use
of language codes, en-sc or, using the preferred spelling, en-SC means
English as spoken in the Seychelles.

But independently of this, it seems that NVDA only recognizes ISO 639-1
(alpha-2) codes for primary languages, not the other ISO 639 codes,
supporting my note.

Yucca

From: John Foliot
Date: Mon, Nov 24 2014 1:31PM
Subject: Re: ISO Language codes and AT

Karl Groves wrote:

> I'm interested in the practical level of support for language codes
> among ATs. While I'm well aware of the ISO standards for the languages,
> I'm ignorant of which standards the ATs actually support and
> understand.



According to Freedom Scientific, JAWS currently supports the following
languages and language codes:

- French, lang="fr"
- Spanish, lang="es"
- Portuguese, lang="pt"
- German, lang="de"
- Russian, lang="ru"
- Finnish, lang="fi"
- Italian, lang="it"
- Greek, lang="el"
- Polish, lang="pl"
- Chinese, lang="zh-cn"

(source: http://www.freedomscientific.com/Training/Surfs-Up/Languages.htm -
hint: look at the source code :-) )



VoiceOver claims support for the following languages:
http://support.apple.com/en-us/HT201917



...but being as I am towards iFruit hardware, I am not sure a) whether VO
supports language changes on the fly, or b) which ISO codes it supports
officially, although this page may give you a hint:
http://support.apple.com/tr-tr/HT3562 - again a peek at the source code
suggests that ISO 639-2 *may* be the answer:

<snip>
<link rel="alternate" hreflang="en-kw"
href="http://support.apple.com/en-kw/HT3562">
<link rel="alternate" hreflang="en-bh"
href="http://support.apple.com/en-bh/HT3562">
<link rel="alternate" hreflang="en-jo"
href="http://support.apple.com/en-jo/HT3562">
<link rel="alternate" hreflang="en-qa"
href="http://support.apple.com/en-qa/HT3562">
</snip>

(There are a whole pile more!)

HTH.



JF

------------------------------

John Foliot
Web Accessibility Specialist
W3C Invited Expert - Accessibility

Co-Founder, Open Web Camp

From: Jukka K. Korpela
Date: Mon, Nov 24 2014 2:16PM
Subject: Re: ISO Language codes and AT

2014-11-24 22:31, John Foliot wrote:

> this page may give you a hint:
> http://support.apple.com/tr-tr/HT3562 - again a peek at the source code
> suggests that ISO 639-2 *may* be the answer:

I don’t see how the source code relates to ISO 639-2, which is about
three-letter (primary) codes for languages (alpha-3). The language codes
are two-letter codes as defined in ISO 639-1 (alpha-2), partly followed
by a two-letter country code. Country codes, and the way to combine them
with primary language codes, are defined in other standards and
specifications.
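
To make the distinction concrete, here is a rough sketch that splits a tag into its primary language subtag and an optional two-letter country subtag, and reports whether the primary subtag is a two-letter (alpha-2) or three-letter (alpha-3) code. The function name describe_tag and the simplified pattern are assumptions for illustration; real language tags allow further subtags (script, variant, extensions) that this sketch deliberately rejects, and it does not check the codes against any registry.

```python
import re

# Simplified pattern: a 2- or 3-letter primary language subtag, optionally
# followed by a hyphen and a 2-letter country subtag. Anything more complex
# (for example a script subtag) is treated as unsupported in this sketch.
_TAG = re.compile(r"^([A-Za-z]{2,3})(?:-([A-Za-z]{2}))?$")


def describe_tag(tag):
    """Split a simple language tag; return None if it does not fit the pattern."""
    match = _TAG.match(tag)
    if match is None:
        return None
    language, region = match.groups()
    return {
        # Conventional casing: lowercase language, uppercase country.
        "language": language.lower(),
        "region": region.upper() if region else None,
        # Two letters is the ISO 639-1 shape; three letters is the
        # ISO 639-2 / 639-3 shape. Length only, no registry lookup.
        "code_length": "alpha-2" if len(language) == 2 else "alpha-3",
    }
```

Under these assumptions, describe_tag("en-kw") yields language "en" (alpha-2) with region "KW", whereas describe_tag("eng") yields an alpha-3 language with no region.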

> <link rel="alternate" hreflang="en-kw"
> href="http://support.apple.com/en-kw/HT3562">

As an aside, “en-kw” is somewhat strange. I don’t remember having heard
of any particular features of English as spoken in Kuwait. The code “en-kw”
appears to be an attempt at combining language information with country
information, rather than identifying a particular form of language.
Though formally valid, this appears to be a category error in practice.
Besides, the page itself declares lang="en", i.e. English in general. By
the protocols, hreflang attributes are advisory, and the language
information provided by the resource itself should have priority.

Yucca