WebAIM - Web Accessibility In Mind

E-mail List Archives

Thread: JAWS and special characters pronunciation


Number of posts in this thread: 19 (In chronological order)

From: Druckman,Geri
Date: Tue, Dec 31 2013 1:36PM
Subject: JAWS and special characters pronunciation
No previous message | Next message →

Hi,

I am testing a medically related site about gene mutations. Some of the text contains special characters in Greek (e.g., the character alpha).
I have added the symbol and the proper pronunciation to the JAWS dictionary. I tested it with both character encodings &alpha; and &#945;; both work very well as an individual character, and JAWS will indeed read "alpha", but when it is part of a word it reads it as "ah".

So a gene named p110α (with the symbol for alpha) will be read as "pe one hundred ten ah", though when arrowing through the word character by character it reads "pe one one zero alpha".

Any ideas how to encode the page (HTML), or otherwise make JAWS properly read "pe one hundred ten alpha"?
This will help with other scientific papers (encoded in HTML for online reading) that contain other special characters.

Thank you, and a Happy New Year!

Geri Druckman
Web Development Specialist - Accessibility
Department of Internet Services
MD Anderson Cancer Center
T 713-792-6293 | F 713-745-8134

From: Birkir R. Gunnarsson
Date: Tue, Dec 31 2013 4:17PM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

A crude way to do this is to wrap the character in a <div>. In very quick testing, a span does not do the trick (though this was very brief testing). This also requires some repositioning with CSS, but it forces the screen reader (at least JAWS) to read the character by itself.
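A minimal sketch of that workaround, with a made-up class name (note that a <div> is not valid inside a <p>, so the surrounding markup may need restructuring):

```html
<style>
  /* keep the block-level wrapper flowing inline with the rest of the word */
  div.greek { display: inline; }
</style>

The gene p110<div class="greek">&alpha;</div> is discussed below.
```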
I would also suggest adding screen reader instructions to the page, informing the user that the article uses a lot of special characters, so they need to make sure their screen reader is set to pronounce them (for instance, NVDA ignores most non-alphabet characters in its default pronunciation setting).
Hope this helps.
Happy 2014
-B



--
Work hard. Have fun. Make history.

From: Olaf Drümmer
Date: Tue, Dec 31 2013 4:47PM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

Wouldn't inserting a zero-width space do the trick? Strictly speaking it breaks the one word into two, though they still look like one word to the sighted user (but be aware that searching for the one word might then begin to fail...).
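For reference, the zero-width space is U+200B, so the markup would look something like this (with the search caveat just mentioned):

```html
<!-- &#8203; is U+200B, a zero-width space between "p110" and alpha -->
<p>p110&#8203;&alpha;</p>
```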

In general I would add that it is more important to fix poorly working tools than to hack around in content to avoid the limitations of one tool out of many. But that's just my personal point of view…

Olaf



Olaf Drümmer
Florastraße 37
13187 Berlin
Tel 030.42022239
Fax 030.42022240
= EMAIL ADDRESS REMOVED =

From: Birkir R. Gunnarsson
Date: Tue, Dec 31 2013 7:11PM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

I totally agree that we cannot always shift all responsibility to the content developer. The problem with this particular conundrum is that screen readers have one set of rules, but the TTS voices they use often have their own independent rules (a painful lesson learned from working with a developer to design a TTS engine for Icelandic).
What all screen readers should uniformly support is announcing a character differently when it is put inside a span; it should not take a block-level element to get that done.
By the way, if you have a copy of an online example, I would happily file a bug both with Freedom Scientific, makers of JAWS, and with NVDA.
What we need to do more of is alert assistive technology vendors to issues for end users and content developers, so they are aware of the problems and have a chance to react to them.
It does not solve the problem for you today, but it could ensure that next year you and everybody else do not have to invent hacks to get around them.
Cheers
-B




--
Work hard. Have fun. Make history.

From: Chagnon | PubCom
Date: Wed, Jan 01 2014 5:44PM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

Birkir wrote:
" What all screen readers should uniformly support is to announce a character differently when put inside a span, it should not take a Block level element to get that done. "

Whatever solution is developed by the industry, it needs to also work for non-HTML documents, such as MS Word, PowerPoint, and Acrobat PDFs.

Right now, putting a span tag requires hand-tooling in the resulting PDF file, not a very efficient method for the millions (and probably billions) of ordinary documents created every day. The native Word and PowerPoint files need to be just as accessible as the PDF exported from them, as well as their HTML counterparts.

1) We need to broaden our focus: all information should be accessible, not just HTML websites, and we need to encourage the key players (which are the AT manufacturers, Microsoft, and Adobe) to develop solutions that will work for HTML, Word, PDF, ePUBs and forthcoming technologies. Hand-coding a span tag around these elements won't get done by ordinary workers who create the majority of documents.

2) We also need to use Unicode to its full capabilities for these characters.

There are different Unicode characters for the Greek letter pi used in written material (Unicode 03C0) versus the mathematical symbol pi used in formulae (Unicode 03D6), although both appear visually the same to the human eye.

It's similar for all sorts of dashes: the hyphen has about 12 variations, but the normal hyphen is Unicode 2010, the mathematical minus sign is Unicode 2212, the en dash is Unicode 2013, and the em dash is Unicode 2014. Each of these glyphs has a different purpose in language and technical documents.
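In HTML these can be written as explicit character references, which makes the intended character unambiguous in the source:

```html
<!-- four visually similar but semantically distinct characters -->
a well&#x2010;known issue        <!-- hyphen -->
5 &#x2212; 3                     <!-- minus sign -->
pages 10&#x2013;12               <!-- en dash -->
one thought&#x2014;then another  <!-- em dash -->
```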

It would help if we and the industry could develop standards for how these variations will be voiced and treated by AT. One solution is for screen readers to pick up the Unicode name from the character when they encounter it.

In a series of technical documents we just completed for a client, plus and minus signs peppered the narrative, as in "adults 21+" and "a coefficient of −.125". Our screen reader testers didn't even know that the characters were there, and so they misread a great deal of the information in the documents. AT users shouldn't have to play mind reader and figure out that they have to force their technology to voice individual characters; instead, we, the document creators, need a way to signal all technologies to voice the character with its Unicode name.

And that has to be done in the source document, such as MS Word, Adobe InDesign, and an HTML editor.

—Bevi Chagnon

— PubCom.com — Trainers, Consultants, Designers, and Developers.
— Print, Web, Acrobat, XML, eBooks, and U.S. Federal Section 508 Accessibility.

From: Birkir R. Gunnarsson
Date: Wed, Jan 01 2014 7:10PM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

I agree.
I, myself, stumbled into the accessibility arena because I could not
obtain accessible CFA/FRM certification study materials, though it was
probably a blessing in disguise.
http://www.daisy.org/stories/birkir-gunnarsson-part-2

I think access to STEM materials (and we are talking STEM content
here, even if the format is not that of STEM instructional material
per se) is a huge problem and prevents millions of people with
disabilities from pursuing exciting, innovative and profitable
careers.
Having dabbled in math accessibility as well, I can attest that A.T. vendors generally do not treat math/STEM accessibility as a high priority, because so few users request it.
The reason so few users do is that many are discouraged and never try to pursue a STEM degree or career, so we have a classic chicken-and-egg problem (which I wish we could turn into a bacon-and-egg problem).
I am still partially pinning my hopes on increased adoption of the MathML standard for pure math, but you are right that document authors really need a way to tell screen readers what content should be read; leaving it entirely to the user and the user's default settings is a dangerous game.
I know CSS3 has some sort of speech control, but I do not claim nearly sufficient expertise to understand it, or to what degree it is applicable (though it is on my wish list of topics to master in 2014, along with a lot of other things). PDF, Word, and other formats require their own standards, and screen readers need to converge on a standard as well, and the whole interface needs to be supported by authoring applications, rendering apps, and possibly the O.S. as well as the screen reader, not to mention that the TTS engine used may have its own pronunciation rules, which can override the screen reader default. So we are facing a gigantic mess of standards and technologies, and STEM definitely pushes the limit.
I am hoping ePub3 along with MathML may start pushing the envelope in this area, but I honestly am at a loss as to how this could realistically be achieved in the big picture.
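For reference, the CSS3 speech control in question is the CSS Speech module; a sketch of its syntax follows (the class name is made up, and screen reader support for these properties was essentially nonexistent at the time):

```css
/* ask a speech renderer to read the symbol one character at a time */
.gene-symbol {
  speak-as: spell-out;
}
```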
My initial answer was supposed to be limited in scope to how this particular problem could be addressed in this particular situation, but it is hard not to turn it into a long rant about how important it is that we fix the bigger picture, I admit. Reading your post was refreshing, and I found myself grinning inanely and even saying the words "heck yeah" out loud.
If you have ideas, info, or relevant links on how to address the bigger picture, feel free to contact me on- or off-list. I am always curious to keep abreast of the latest and greatest in document accessibility.
All that being said, I have to be careful not to stray too far into how-to-fix-the-world discussion on this list, so I will stop here.
*grin*
Happy new year everyone!






From: Chagnon | PubCom
Date: Wed, Jan 01 2014 11:01PM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

Great comments, Birkir.
I think the world is ripe for pressing for accessible STEM materials. This
could easily be a successful campaign that would benefit all AT users and
open the doors for many students to enter higher-paying tech and science
jobs. What a revolution that would be for the community!

<< PDF/Word/other formats require their own standards >>
I'm not going to assume that.
In fact, I think Word, PowerPoint, InDesign, and Acrobat PDF should have the
same accessibility tags and standards as HTML.

Example: I have never been able to find a good reason why the PDF
accessibility tag for lists is the generic <L>, while HTML defines 2 list
tags, <UL> and <OL>. It's as if Adobe invented its own wheel rather than
using HTML's perfectly good wheel. No additional value was added by
switching to <L>, and the consequence to AT developers is that they now must
write code to recognize all three variations of list tags.

The accessible STEM issue is large, but I believe it can be tackled one step at a time if we break it down into logical segments, that is, identify the accessibility feature that would have the greatest impact with the least cost or retooling.

To me, accurate voicing and recognition of Unicode characters is the first step, because 1) almost everything else hinges on it, including MathML, and 2) the identifiers are already defined by the Unicode standard itself. Example: the two pi characters, one for Greek language and the other for math. We just need the AT manufacturers to pick up that info from the Unicode font. It would also help to get this concept of using the correct Unicode character into WCAG as a standard.

And then we have to teach those who create documents how to use the correct character in their documents. Not difficult, but it does require some training to understand why they have to choose the correct pi, and how to choose it.

There are hundreds more characters like the 2 pi characters in Unicode.

—Bevi Chagnon

PS: Birkir, have you had bacon and egg cups? Had them this morning for New
Year's brunch. It's a muffin-shaped bacon cup (also called a bacon bowl)
filled with egg, cheese, guacamole, etc. My apologies to the vegans and
vegetarians on the list. And to the dieters, too.
http://www.instructables.com/id/Mini-Bacon-Cups/ and
https://www.buyperfectbacon.com/

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - - - - - - - - - - -
www.PubCom.com — Trainers, Consultants, Designers, Developers.
Print, Web, Acrobat, XML, eBooks, and U.S. Federal Section 508
Accessibility.
New Sec. 508 Workshop & EPUBs Tour in 2013 — www.Workshop.Pubcom.com


From: Jukka K. Korpela
Date: Thu, Jan 02 2014 12:51AM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

2014-01-02 2:44, Chagnon | PubCom wrote:

> Birkir wrote: " What all screen readers should uniformly support is
> to announce a character differently when put inside a span, it should
> not take a Block level element to get that done."
>
> Whatever solution is developed by the industry, it needs to also work
> for non-HTML documents, such as MS Word, PowerPoint, and Acrobat
> PDFs.

That would be nice, but strings like "p110α" are a real challenge to software that tries to pronounce them properly. Knowing the context, you might think it self-evident that the letter alpha is to be read as "alpha", pronounced by the rules of the enclosing language. But in general, a string consisting of Latin letters, digits, and Greek letters might have its last part meant to be read as Greek text. And then reading alpha as "ah" is actually natural.

It is difficult to formulate a good algorithm for dealing with strings
like that. And I suppose they are relatively rare. Strings like
"α-Tocopherol" are more common. I wonder if they are handled properly;
at least there is the hyphen, which suggests some kind of (morpheme)
boundary.

> There are different Unicode characters for the Greek letter pi used
> in written material (Unicode 03C0) versus the mathematical symbol pi
> used in formulae (Unicode 03D6), although both appear visually the
> same to the human eye.

No, Unicode 03D6 is GREEK PI SYMBOL "ϖ", which looks more or less like a small omega. It is used as a technical symbol, but I don't think any standard assigns a specific meaning to it; it has various meanings in different fields of physics, and it is not replaceable by the normal letter pi. The common mathematical constant 3.1416... is denoted by the normal Greek small letter pi.

> Similar for all sorts of dashes; the hyphen has about 12 variations
> but the normal hyphen is Unicode 2010, the mathematical minus sign
> Unicode 2212, the en-dash is Unicode 2013, and the em-dash is Unicode
> 2014. Each of these glyphs has a different purpose in language and
> technical documents.

And they are widely confused with each other.

> It would help if we and the industry could develop standards for how
> these variations will be voiced and treated by AT. One solution is
> for screen readers to pick up the Unicode name from the character
> when they encounter it.

That would be a wrong move, in general. First, Unicode names have not
been designed for such use. They are symbolic identifiers for
characters. Second, they consist of English or anglicized words and are
quite unsuitable when the text language is not English. Speech synthesis
might need to fall back to saying the Unicode name when there is no
other useful information, but it's really just a fallback.

For example, "+" should normally be read as "plus" in English, not as
its Unicode name "plus sign". Speech synthesizers really need tables of
names of (or pronunciations for) special characters. And the names are
something that various language communities should define, and register
somewhere, so that software vendors can pick them up. (Unfortunately,
CLDR, the Common Locale Data Repository, though it provides localized
names for many things, does not address names of characters yet.)

> In a series of technical documents we just completed for a client,
> plus and minus signs peppered the narrative, as in "adults 21+" and
> "a co-efficient of −.125". Our screen reader testers didn't even
> know that the characters were there and so they misread a great deal
> of the information in the documents. AT users shouldn't have to play
> mindreader and figure out that they have to force their technology to
> voice individual characters: instead, we the document creators, need
> a way to signal all technologies to voice the character with its
> Unicode name.

I think you would want "−.125" to be read as "minus point one two five"
rather than "minus sign full stop digit one digit two digit five".

In practice, at least on web pages, we encounter "-.125" much more often
than "−.125". It's difficult to say how the Ascii hyphen, or
"hyphen-minus" to use the Unicode name, should be pronounced in
different contexts. Probably it should be read as "hyphen" when it is
not apparently part of a hyphenated word or a standalone symbol
surrounded by spaces (in which case it should probably be treated as a
punctuation dash).

Yucca

From: Olaf Drümmer
Date: Thu, Jan 02 2014 6:48AM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

… sorry, I have to jump on here: ;-)

Am 2 Jan 2014 um 07:01 schrieb "Chagnon | PubCom" < = EMAIL ADDRESS REMOVED = >:

> Example: I have never been able to find a good reason why the PDF
> accessibility tag for lists is the generic <L>, while HTML defines 2 list
> tags, <UL> and <OL>. It's as if Adobe invented its own wheel rather than
> using HTML's perfectly good wheel. No additional value was added by
> switching to <L>, and the consequence to AT developers is that they now must
> write code to recognize all three variations of list tags.

the main reason probably is that HTML needs to be able to distinguish between ordered and unordered lists, in order to create the proper bullets or numbering...

In PDF, all is said and done in this regard: the bullets or numbers or whatever are already part of the page content, so no hint is needed as to how to generate them. Rather, the bullet or numbering is enclosed in an Lbl tag. BTW, this is much more flexible than HTML, which may have to use (presentational) CSS hacks where PDF can actually remain more semantic ;-)


Those wishing to give a suitable hint to a tool repurposing the PDF into HTML are encouraged to use the ListNumbering attribute for the given list.
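For illustration, a tagged-PDF list sketched in HTML-like shorthand (the structure types L, LI, Lbl, and LBody are real tagged-PDF element names; the angle-bracket syntax here is just notation, not literal PDF source):

```
<L>                <!-- list; a ListNumbering attribute (e.g. "Decimal") can hint at the style -->
  <LI>             <!-- list item -->
    <Lbl>1.</Lbl>  <!-- the visible bullet/number, already part of the page content -->
    <LBody>First item text</LBody>
  </LI>
</L>
```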



Olaf

From: Bourne, Sarah (ITD)
Date: Thu, Jan 02 2014 8:40AM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

Imagine I were trying to search for "p110α" (with or without AT). Any workaround for screen readers that breaks that string into separate "words" could break its findability by search. For instance, inserting even a very teeny, tiny space would make it "p110" and "α" to a search engine. I would lean towards including instructions for screen reader users, such as recommendations for configuring the dictionary, and suggesting reading uncommon words or numbers character by character when accuracy is essential.

It might stretch the definition of "abbreviation," but perhaps you could use ABBR and have the TITLE spell out the correct pronunciation? For instance, <abbr title="p 110 alpha">p110α</abbr>

(I have no idea how search engines handle multiple Unicode names and other encodings for characters that appear to be the same. That makes my head hurt to think of!)

sb
Sarah E. Bourne
Director of Assistive Technology &
Mass.Gov Chief Technology Strategist
Information Technology Division
Commonwealth of Massachusetts
1 Ashburton Pl. rm 1601 Boston MA 02108
617-626-4502
= EMAIL ADDRESS REMOVED =
http://www.mass.gov/itd

From: Chagnon | PubCom
Date: Thu, Jan 02 2014 9:39AM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

Good points by all.

Sarah wrote: "It might stretch the definition of "abbreviation," but perhaps
you could use ABBR, and have the TITLE spell out the correct pronunciation?
For instance, <abbr title="p 100 alpha"> p110á </abbr>"

Or maybe a new tag where the author can designate how the character should
be pronounced. That could solve these issues:

- The screen reader doesn't recognize the character, doesn't voice it at all, and essentially makes it invisible. The new tag could force the screen reader to voice it.

- The incorrect Unicode character is used, such as language pi versus mathematical pi. The new tag would voice it as intended by the author.

- English voicing in a non-English document. The new tag could designate the
correct pronunciation in the document's language, rather than English.

One other factor: we still don't have the tools to do this type of tagging -
not even ABBR - in MS Word, PowerPoint, InDesign and other source documents.
And we also need to have this tag correctly translated into the PDFs
exported from these source programs.

So maybe a new tag like <PRONOUNCE> added to the standard, plus the tools in
authoring programs to apply the tag, and Acrobat retaining and recognizing
the tag would do the trick.

-Bevi Chagnon
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- - - - - - - - - - - - - - -
www.PubCom.com - Trainers, Consultants, Designers, Developers.
Print, Web, Acrobat, XML, eBooks, and U.S. Federal Section 508
Accessibility.
New Sec. 508 Workshop & EPUBs Tour in 2013 - www.Workshop.Pubcom.com

From: Druckman,Geri
Date: Thu, Jan 02 2014 11:01AM
Subject: Re: JAWS and special characters pronunciation
← Previous message | Next message →

So after testing with a dash and with hidden text (a space/gap hidden using display: none; was still read by JAWS as "pe one hundred ten ah"), neither of which worked the way I wanted, I also tried Sarah's suggestion of using an abbreviation, and <abbr title="p110 &alpha;">p110&alpha;</abbr> (note the space in the abbreviation title attribute) worked perfectly! Thank you Sarah for that! And it's also searchable in the proper string format.

There's a minor catch: reading abbreviations is not enabled by default in JAWS. Without enabling it, JAWS ignores the existence of the <abbr> tag. Maybe Freedom Scientific should make it a default checked option, or is there a specific reason why it is unchecked by default?

Now, all that said, as Murphy's law says, "If you have 10 problems, and you find a solution for each one of those 10 problems, the 10th solution will generate an 11th problem." This solution works well in Internet Explorer, where the <abbr> tag does not show as underlined to non-A.T. users (Chrome doesn't show it, and JAWS doesn't "see" it in Chrome either, but then again, JAWS is optimized for Internet Explorer). In Firefox, on the other hand, it's a different story: first, JAWS does not read the <abbr> tag when using Firefox, and second, Firefox shows abbreviations as underlined, since <abbr> uses the title attribute, which makes this solution look awkward to sighted Firefox users.

So what is the lesser "evil"? A solution that works in one particular
browser for A.T. users, but will make a page look awkwardly "peppered"
with underlined, meaningless words to sighted users in a different browser?

After doing some googling (thank you Larry and Sergey; what would I have
done without you guys?!), here is what I came up with, which works
nicely in Internet Explorer and doesn't look ugly in Firefox.
Sadly this will not work in anything but web pages, but it's a start.

In the HTML I used:

<abbr title="p110 &alpha;">p110&alpha;</abbr>

Then to remove the line decoration (Hint: it's NOT {text-decoration:
none;}), I added in the CSS:

abbr[title] {border-bottom-width: 0;}

Eureka! It works!
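For anyone following along, here are the two pieces together, as a sketch
(the surrounding sentence is invented for illustration; the selector assumes
the underline is drawn as a bottom border, which is how Firefox renders it):

```html
<!-- HTML: the visible text keeps the real alpha entity, so the string
     "p110α" remains searchable in-page; the space inside the title
     attribute is what makes JAWS voice "alpha" instead of "ah". -->
<p>Mutations in <abbr title="p110 &alpha;">p110&alpha;</abbr> are common.</p>

<!-- CSS: Firefox underlines any <abbr> carrying a title attribute.
     That underline is a bottom border, so text-decoration: none has
     no effect; zeroing the border width removes it. -->
<style>
  abbr[title] { border-bottom-width: 0; }
</style>
```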

Thank you all for your valuable input!


Geri Druckman

Web Development Specialist - Accessibility
Department of Internet Services
MD Anderson Cancer Center
T 713-792-6293 | F 713-745-8134






On 1/2/14 9:40 AM, "Bourne, Sarah (ITD)" < = EMAIL ADDRESS REMOVED = > wrote:

>Imagine I were trying to search for "p110α" (with or without AT.) Any
>workaround for screenreaders that breaks that string into separate
>"words" could break its findability by search. For instance, inserting
>even a very, teeny, tiny space would make it "p110" and "α" to a search
>engine. I would lean towards including instructions for screen reader
>users, such as recommendations for configuring the dictionary and
>suggesting reading non-common words or numbers character-by-character
>when accuracy is essential.
>
>It might stretch the definition of "abbreviation," but perhaps you could
>use ABBR, and have the TITLE spell out the correct pronunciation? For
>instance, <abbr title="p110 alpha"> p110α </abbr>
>
>(I have no idea how search engines handle multiple Unicode names and
>other encodings for characters that appear to be the same. That makes my
>head hurt to think of!)
>
>sb
>Sarah E. Bourne
>Director of Assistive Technology &
>Mass.Gov Chief Technology Strategist
>Information Technology Division
>Commonwealth of Massachusetts
>1 Ashburton Pl. rm 1601 Boston MA 02108
>617-626-4502
> = EMAIL ADDRESS REMOVED =
>http://www.mass.gov/itd
>

From: Chagnon | PubCom
Date: Thu, Jan 02 2014 11:51AM
Subject: Re: JWAS and special characters pronunciation
← Previous message | Next message →

Olaf wrote: "the main reason probably is that HTML needs to be able to
distinguish between ordered and unordered list, in order to create the
proper bullets or numbering...In PDF, all is said and done in this regard -
the bullets or numbers or whatever are already part of the page content..."

I understand, but I still don't think this justifies having 2 different sets
of tags for different file formats.
Does either set of code provide a better experience for AT users?
When reading a PDF, would screen reader users like to hear "bulleted list"
and "numbered list" rather than just "list" and a bunch of label jibberish
that often isn't voiced?

..
We're asking billions of people around the world to make a substantial
change to the way they do their everyday job.

We're asking them — and sometimes legislating them — to make an accessible
document.

If we want them to go the extra mile and make their documents accessible for
us, then we'd better make it as simple as possible for them to do so.
Otherwise we won't get their psychological "buy-in," making it more
difficult to achieve our goal: to provide equal access to all forms of
information to all people with disabilities.

If I want authors, editors, and designers who create content to change their
behavior and make accessible documents, why tell them to use <UL> and <OL>
when they're making the HTML version of the document, and <L> when they're
creating a PDF? This makes it less easy, more confusing to the average
writer.

Considering that the people who create these documents must create a
bazillion of them every day in all sorts of file formats, it's better to
streamline the standards and have all formats use the same set of tags for
accessibility. Since HTML was addressing accessibility standards long before
everyone else, they set the standard. It doesn't do any good to ignore that
standard later down the road and create a different set of rules for PDF.

I teach several thousand people a year how to make accessible documents. At
some point in the training, every student asks why it's the <L> tag in PDF
while everywhere else it's <UL> and <OL>.

I don't have a good explanation for them.

—Bevi Chagnon


From: Chagnon | PubCom
Date: Thu, Jan 02 2014 12:02PM
Subject: Re: JWAS and special characters pronunciation
← Previous message | Next message →

Geri wrote: "Sadly this will not work in anything else but web pages, but
it's a start."

Thanks, Geri. Great workaround for HTML.

Maybe 2014 can become the year we focus on making documents accessible?

So much vital information and content is hidden from the AT community, and
MS Office and InDesign lack many tools to do the job right.

I keep wondering when the accessibility lawsuits will start.

-Bevi Chagnon


From: Duff Johnson
Date: Thu, Jan 02 2014 3:06PM
Subject: Re: JWAS and special characters pronunciation
← Previous message | Next message →

> Olaf wrote: "the main reason probably is that HTML needs to be able to
> distinguish between ordered and unordered list, in order to create the
> proper bullets or numbering...In PDF, all is said and done in this regard -
> the bullets or numbers or whatever are already part of the page content..."
>
> I understand, but I still don't think this justifies having 2 different sets
> of tags for different file formats.

Recall that HTML (certainly, prior to HTML 5) is simply dumbed-down SGML, a language that *is* (unlike HTML) eminently capable of representing most any sort of textual STEM content.

When people use SGML, the end-product (ironically enough) is usually a PDF.

Another way to look at it is this: since there's a lot of information in the world that is and will continue to be represented in ways other than as web pages, there is more than one important format that AT developers need to support.

The big, wide ever-morphing world of HTML/CSS/JavaScript is clearly one; no-one says otherwise. The specific, largely static and otherwise highly constrained world of Tagged PDF, however, is clearly another.

PDF works for STEM publishing precisely because it's completely flexible in terms of representation and highly flexible (if presently less well supported by most AT) in terms of semantics.

It is not a hit on HTML to point out that it's not infrequently inadequate (or the browsers are inadequate, take your pick) for STEM publishing needs. Getting away from such complexities was (and remains) part of HTML's original charm vs. SGML.

> Does either set of code provide a better experience for AT users?
> When reading a PDF, would screen reader users like to hear "bulleted list"
> and "numbered list" rather than just "list" and a bunch of label jibberish
> that often isn't voiced?

When we talk about choosing to support various formats, we are talking about technical aspirations. In this context, I'd say that what AT users need, in principle, is to be able to receive the content the author provided.

That content may be plain text or it may be rich in one or more of many ways - that's simply impossible to circumscribe (especially when we're discussing STEM content).

> If I want authors, editors, and designers who create content to change their
> behavior and make accessible documents, why tell them to use <UL> and <OL>
> when they're making the HTML version of the document, and <L> when they're
> creating a PDF? This makes it less easy, more confusing to the average
> writer.

This sort of question - resolving tags over here vs. attributes over there - is for implementers to solve. If authoring software were done right, it would be transparent to users, who don't want to know what's going on under the hood.

> Considering that the people who create these documents must create a
> bazillion of them every day in all sorts of file formats, it's better to
> streamline the standards and have all formats use the same set of tags for
> accessibility. Since HTML was addressing accessibility standards long before
> everyone else, they set the standard. It doesn't do any good to ignore that
> standard later down the road and create a different set of rules for PDF.

In my view that's equivalent to stating that all documents should be HTML. Even if that were true in theory (and there are good arguments against it), it's not what we see in practice. The world appears to need PDF - it uses the stuff more and more, as Google Trends continues to make clear...

http://duff-johnson.com/2013/02/22/apparently-pdf-isnt-boring/

> I teach several thousand people a year how to make accessible documents. At
> some point in the training, every student asks why it's the <L> tag in PDF
> while everywhere else it's <UL> and <OL>.
>
> I don't have a good explanation for them.

As Olaf pointed out, check out the definition of the ListNumbering attribute (Table 347) in ISO 32000. PDF enables a rich expression of list labels.
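As a sketch of what that looks like inside a tagged PDF (based on the
ListNumbering attribute of the List attribute owner in ISO 32000-1;
the object numbers here are invented for illustration):

```
% A PDF <L> structure element whose attribute object declares
% how its list labels are numbered.
12 0 obj
<< /Type /StructElem
   /S /L                              % structure type: List
   /P 9 0 R                           % parent structure element
   /A << /O /List                     % attribute owner: List
         /ListNumbering /Decimal >>   % or /Disc, /UpperRoman, /LowerAlpha, ...
>>
endobj
```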

I'm not saying it's perfect, and I'm not saying AT supports this sort of thing today. But if we're asking what AT developers "should" do, I think the answer's pretty clear: they should go ahead and fully support PDF instead of pretending the world will all of a sudden decide that a final-form document is somehow no longer important. Really, how likely is that?

On the other hand, we already see major API developers providing advanced support for tagged PDF and PDF/UA, and more desktop applications that help author and consume tagged PDF. PDF 2.0, which we should hopefully see in 2015, will lay the bedrock for advanced implementations utilizing MathML and much more.

It's certainly true that not all implementations follow desirable standards. Ask the vendors for better, and get others to do likewise! That's precisely how this stuff changes.

Duff Johnson

p +1.617.283.4226
e = EMAIL ADDRESS REMOVED =
w http://duff-johnson.com

From: Olaf Drümmer
Date: Thu, Jan 02 2014 5:06PM
Subject: Re: JWAS and special characters pronunciation
← Previous message | Next message →

Hi Bevi,

with all due (and honest) respect:

Am 2 Jan 2014 um 19:51 schrieb "Chagnon | PubCom" < = EMAIL ADDRESS REMOVED = >:

> If I want authors, editors, and designers who create content to change their
> behavior and make accessible documents, why tell them to use <UL> and <OL>
> when they're making the HTML version of the document, and <L> when they're
> creating a PDF? This makes it less easy, more confusing to the average
> writer.

this is not how most people create documents. Instead, in the better case, they'd use a 'make this a list and use bullets' or a 'make this a list and use numbering' button. (In the worse case they put bullets or numbers at the beginning of paragraphs or new lines.)

Then it's up to the tool that saves out the HTML or PDF or … to get the tags right. Yes, the tag sets are organised slightly differently, but it's no big deal. It's only a few developers who have to get the coding right, not the actual user.

Inserting/applying the tags directly/manually is the exception, and should not be the basis for deriving any rules or recommendations for the general use case.

Olaf

From: Chagnon | PubCom
Date: Thu, Jan 02 2014 10:48PM
Subject: Re: JWAS and special characters pronunciation
← Previous message | Next message →

Olaf wrote: "this is not how most people create documents... Then it's up to
the tool that saves out the HTML or PDF or … to get the tags right... It's
only a few developers who have to get the coding right, not the actual
user."

You're right. Most writers, editors, and other content creators aren't
tagging their documents.

But they should.

The publishing workflow is adapting to new technologies and accessibility
can - and should - meld with it. In large enterprises such as government
agencies and corporations, school systems, and publishers, they're working
toward a multi-channel publishing model where a PDF, HTML webpage, and an
EPUB are generated from the same content. Automated content management
systems are becoming the norm.

Think of this as homogenization of communication across all media, including
accessibility and tags.

To make this work, writers and designers have to learn the basics of tagging
and understand that when they use the Heading 1 formatting style in MS Word,
it will generate an H1 tag in HTML, PDF, and EPUB.

They also need to do a basic check of the exported PDF and EPUB file to
ensure that at least the basic tags are correct. And they need to check and
fix their PDFs. OK, leave the hard complicated stuff to accessibility
experts, but designating that the Greek letter alpha be voiced as alpha and
not ah should be done by the writer with a tool or tag or whatever in MS
Word, while they're writing the document.

Because we don't have good enough tools and we use a backwards workflow,
these organizations end up with thousands of documents waiting for the
accessibility experts to correct. One of my large clients has a backlog of
over a year's worth of documents waiting to be made accessible. Some of
their documents take 2-3 days to correct. Some documents require that the
accessibility expert recreate it to correct the bozo junk the creator did to
it.

And the labor costs have the company wondering if all that money is worth
it. The agency they work for is looking into using the undue burden loophole
in Section 508.

Is this what we want for accessibility?
Stalemate and not progress?
Barriers rather than open doors?

Right now the company is posting non-accessible versions with a notice that
the accessible versions will be available "soon." Read "soon" as "one year."
That means that AT users won't have full access to that information until a
year after everyone else does. This is a mindboggling backlog of documents
waiting to be made accessible. And I find it everywhere, not just this one
client I used as an example.

Accessibility requires much more than the PDF tool to get the tags right
when the PDF is exported from MS Office and InDesign.

Whoever creates the original source document needs to create a good quality,
correctly styled, formatted document. These tasks can't be left for the
coders and accessibility specialists. It's too late in the workflow. (Karen
McCall and others, feel free to chime in!)

So not only do MS Office and InDesign need to give writers and designers the
tools to create a decent source document, but document creators need to take
responsibility for making them accessible, at least for the basics.

Olaf, we have to find a better way to get the job done.
What we have now isn't working, it's still shutting out the majority of
information from AT users and that's not acceptable for me.

-Bevi Chagnon

From: Olaf Drümmer
Date: Fri, Jan 03 2014 3:00AM
Subject: Re: JWAS and special characters pronunciation
← Previous message | Next message →

Hi Bevi,

again, I take the liberty to disagree:

Am 3 Jan 2014 um 06:48 schrieb Chagnon | PubCom < = EMAIL ADDRESS REMOVED = >:

> Olaf wrote: "this is not how most people create documents... Then it's up to
> the tool that saves out the HTML or PDF or … to get the tags right... It's
> only a few developers who have to get the coding right, not the actual
> user."
>
> You're right. Most writers, editors, and other content creators aren't
> tagging their documents.

the tagging is done by the tool used to create the document! The only thing the author has to get right is to use style sheets and built-in features properly:
- heading styles for headings
- list feature for lists
- table feature for tables
and so forth. Besides doing correctly something one has to do anyway (using the right style sheets doesn't cause any extra work, nor does using the list or table feature in a program like Word or InDesign), there are usually only two pieces of extra work on a document that one might otherwise skip:
- take care of alternate text for images, charts, diagrams etc.
- take care of the metadata (at least enter a good title in the metadata)
Everything else needs to be taken care of by the authoring tool.

And by the way - this is not just a minor aspect: unless tools get it right (or begin to get it right), we will not see widespread production and distribution of accessible documents, whether HTML, PDF, or some other format - beyond documents from federal agencies (and how often do people read documents from federal agencies, compared to other document types? Aren't those other documents much more important?). On top of that, putting low-level tagging burdens on users is a dramatic waste of resources! Users should be doing intelligent things, and developers of tools should get their act together once, instead of users compensating for suboptimal features in tools millions of times, again and again.


> Because we don't have good enough tools

So what do you think is easier to fix: hundreds of developers, or hundreds of millions of users? Who has an incentive, or could be provided with one (like pressure, money, awareness, proudness, ...)?


Olaf

From: SaravanaMoorthy.P
Date: Fri, Jan 03 2014 5:11AM
Subject: Re: JWAS and special characters pronunciation
← Previous message | No next message

The <abbr> with a title attribute will work; however, as someone mentioned, it's not the default behavior of JAWS, and users have to change their settings.
Geri suggested that Freedom Scientific should make this (read out title) checked by default in JAWS; but if it were checked, JAWS would read both the link text and the title text (which would frustrate the user; maybe that is the reason Freedom Scientific made this option off by default). Further, in some CMSs the title text is the same as the link text (added automatically), and in that case the screen reader would pronounce the same text twice.

Note: NVDA by default pronounces both the link text and the title attribute.

Apart from <abbr> and the other comments floating around here, you can do this too....

<div> p110<span aria-hidden="true">&alpha;</span> <span style="position:absolute; left:-1000px; top:auto">alpha</span> </div>

Here, visually it will be p110α; however, "α" will be hidden from screen reader users and the off-screen text "alpha" will be pronounced instead - a combination of ARIA and the old traditional off-screen technique. The limitation here is that the user agents (browser and screen reader) must support ARIA :)
I haven't tested the above approach, as I don't have access to screen readers at work. However, I hope this will work, or someone here can confirm.
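Equally untested, but a slightly more robust variant of the same idea uses
a reusable off-screen class instead of the inline style (the class name
here is illustrative, not from any post above):

```html
<!-- aria-hidden="true" removes the raw alpha character from the
     accessibility tree; the off-screen span supplies the word
     "alpha" in its place. -->
<div>
  p110<span aria-hidden="true">&alpha;</span><span class="visually-hidden">alpha</span>
</div>

<style>
  /* Kept in the accessibility tree, removed from the visual layout. */
  .visually-hidden {
    position: absolute;
    left: -10000px;
    width: 1px;
    height: 1px;
    overflow: hidden;
  }
</style>
```

Note that Sarah's earlier caveat applies here too: splitting the string into
two elements may break in-page search for "p110α" as a whole.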

Regards,
Saran.