E-mail List Archives

Re: ISO Language codes and AT

From: Olaf Drümmer
Date: Nov 24, 2014 10:03AM


Usually it should not be too difficult for a software developer to take language codes into account where needed, regardless of which flavour they come in.
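As a rough illustration of accepting codes in several flavours, here is a minimal Python sketch. The function name `primary_language` and the small lookup table are hypothetical and cover only a handful of languages; a real tool would need the full ISO 639 tables and proper language-tag parsing.

```python
# Illustrative subset only: ISO 639-2/3 alpha-3 codes mapped to ISO 639-1 alpha-2.
ALPHA3_TO_ALPHA2 = {
    "eng": "en",
    "deu": "de",
    "ger": "de",  # ISO 639-2 bibliographic code for German
    "fra": "fr",
    "fre": "fr",  # ISO 639-2 bibliographic code for French
    "fin": "fi",
}


def primary_language(tag: str) -> str:
    """Return the lower-case primary language subtag, preferring the alpha-2 form."""
    # Accept "en-US" as well as the common "en_US" variant; drop region subtags.
    primary = tag.strip().replace("_", "-").split("-")[0].lower()
    # Unknown codes are passed through unchanged rather than rejected.
    return ALPHA3_TO_ALPHA2.get(primary, primary)
```

With this sketch, "en", "en-US" and "eng" all resolve to the same primary language, so a tool that has only an English voice or an English hyphenation dictionary can still use it.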

On the authoring side, language codes should be as specific as necessary. So if it's not clear whether content is strictly US American English, the Queen's English or some other variation, just use "en". If it is relevant to be more specific, use something like "en-US" or "en-GB".

As for where the language indication comes into play: please do not forget that it is not just for voice selection in text to speech; it is also relevant for hyphenation (and spell checking). Since content may be displayed on small form factor screens such as those on mobile devices, good hyphenation (based on algorithms for the right language) may become even more relevant.

So by all means, please insert language codes, and encourage others to do so. For most contexts, it should be easy to get this right.

Regarding instances (many or not) of incorrect language indication: tools that make use of text to speech in particular, such as screen readers, should have an 'override' option that makes it easy for users to ignore the language/voice determined from information in the content and instead use a language of the user's choosing.
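The override idea could look roughly like the following Python sketch. The function `effective_language`, its parameters and the "en" fallback are assumptions made for illustration; they do not describe how any particular screen reader works.

```python
from typing import Optional


def effective_language(
    declared: Optional[str],
    user_override: Optional[str],
    default: str = "en",  # hypothetical fallback when nothing else is known
) -> str:
    """Pick the language to use for speech, letting the user's choice win."""
    # An explicit user override always takes precedence over the content.
    if user_override and user_override.strip():
        return user_override.strip()
    # Otherwise trust the language declared in the content, if there is one.
    if declared and declared.strip():
        return declared.strip()
    # Nothing declared and no override: fall back to the default.
    return default
```

For example, content wrongly marked as "en-US" would still be read with a German voice if the user has set the override to "de".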

Olaf


On 24 Nov 2014, at 17:39, "Jukka K. Korpela" < <EMAIL REMOVED> > wrote:

> 2014-11-24 18:20, Karl Groves wrote:
>
>> Quick question: Which is the proper ISO language code standard
>> supported by ATs? ISO 639-1, 639-2, 639-3, or do they all work?
>
> Quick answer: in my limited experience, ISO 639-1 (alpha-2) codes work to the extent that language codes work at all, which basically means support for a small set of languages in some software. So lang="en" probably does no harm, as long as the content is actually in English, and on sunny days it might do some good; but lang="eng" will probably be mostly ignored, and lang="en-US" might actually work less well than lang="en".
>
> It might even be argued (and maybe I'm doing so now) that AT should not even care about declared language, as it is so often wrong; this is what Google does.
>
> In any case, most AT software can handle only a small set of languages anyway, so any fine-grained language markup would be wasted.
>
> Yucca