Re: Chinese/Japanese/Korean names and their romanizations in a French article


From: Pierre
Date: Mar 31, 2008 2:40AM

Sorry for not responding sooner, I was a bit busy... and thank you all
so much for your advice! I wasn't expecting such an interesting
conversation about this topic... :)

Inline comments and responses follow.

2008/3/25, Jukka K. Korpela < <EMAIL REMOVED> >:
> It's surely advisable to include romanized text, since most people who
> know French don't understand CJK characters and would find them
> particularly difficult, if not alienating, since they bear no
> resemblance to characters that they know.

Exactly. Oddly enough, people don't feel the same about Vietnamese,
since it looks like Latin characters...

> On such grounds, I think you should use romanized forms in the primary
> text and include the CJK form inside parentheses, rather than vice
> versa. You should indicate the romanization system used, since there are
> several systems in use; when using pinyin, linking to
> http://www.pinyin.info might be a good idea (so that you won't need to
> explain the details yourself).

I can explain the romanization systems used on the "About this
Website" page, I guess. This is what I've already done for the RSS
system, and it seems to work quite well.

Putting the original form in parentheses is ok for short references
(like the name of an author, which is 3 or 4 characters most of the
time), but not really suitable for, say, a discography... When you
have to write down 10 years of music from a Taiwanese rock singer,
with very long titles in both Chinese characters and pinyin, it's a
pain to write all of those inline...

It's basically for this kind of use that I was thinking of writing
the Chinese form first, with a "span title" containing the romanized
version, and maybe a French translation in parentheses...
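
A rough sketch of that idea, just to make it concrete. The lang value
"zh-Hant" and the list structure are assumptions for illustration, and
the translation is left in English here rather than French:

```html
<!-- Sketch only: original title first, romanization in the title
     attribute, translation in parentheses. -->
<ul>
  <li>
    <span lang="zh-Hant"
          title="àishàng biérén shì kuàilè de shì">愛上別人是快樂的事</span>
    (Loving Others is a Happy Thing)
  </li>
</ul>
```

With this markup the romanization only shows up as a tooltip, so it
depends on the browser and on the reader hovering over the text.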

> Consider using tone marks, too, in the romanization. They may help
> readers, and they should not be too distracting. The main problem with
> them is technical: if you use diacritic marks (on vowels) as tone marks,
> then some of the letter+diacritic combinations might not be present in
> the font in use, or some assistive software might be unable to process
> them properly.

Yes. We haven't had many problems so far, except that some fonts lose
their bold style on the Japanese "ō", for instance... It's not really
a big deal, since the character is still displayed.

> The parentheses aren't really necessary for any formal reason, since the
> style of CJK characters distinguishes them from the text in Latin
> letters. However, parentheses carry the suggestion that the
> parenthesized text is, well, parenthetical, i.e. that it is usually not
> essential for understanding the main content. In a sense, they are
> a promise: you may ignore whatever is inside parens, and you need not
> panic just because there are mysterious characters there.

Right, this is what parentheses are for (that, and the smileys of course...!) :)

> None of them is adequate for a transcribed form or for an original form.
> There is no semantic HTML markup for such a purpose. Use <span> if you
> need to turn such text into an element for styling or other purposes.

That's what I thought, afterwards... I've always wanted to stick to
HTML tags, but as you say, those are unsuitable for my current

> > and then
> > use the title and lang attributes to display the romanization and the
> > language it comes from.
> Forget it. The romanized form is far too important to be left to depend
> on browser features. It should be in the content, not hidden in
> attributes. And remember that lang selectors don't work on IE 6.

About the lang selector: I know it doesn't work in IE, but the
attribute is there purely for semantic use, and if you target it in
the CSS, it's a plus for people using browsers such as Firefox and
Opera (to display the language after the link, for instance).
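
Something along these lines, as a minimal sketch. It assumes the links
carry an hreflang attribute (my assumption for this example; the same
idea works with the lang attribute and the :lang() selector):

```html
<style>
  /* Append the language code after any link that declares one. */
  a[hreflang]:after {
    content: " [" attr(hreflang) "]";
  }
</style>

<p>
  See <a href="http://www.pinyin.info" hreflang="en">pinyin.info</a>
  for details about pinyin.
</p>
```

Browsers that don't understand the selector or generated content just
skip the rule, so the link itself still works everywhere.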

About the title attribute: you're probably right, unfortunately... but
I think it also depends on the length of the romanized text you want
to display. As mentioned earlier, when it's just an author's or
singer's name, it's ok, but when it comes to album titles, it's
For instance:
愛上別人是快樂的事 (àishàng biérén shì kuàilè de shì) (Loving Others is a Happy Thing)
is definitely too long, I think!

> > I could display the romanized version between
> > brackets when the article is printed, and use it as a "tooltip" when
> > the article is read online.
> That's unnecessarily unsafe and complicated.

Why is that? The CSS trick to print the URL of links or the title
attribute works quite well in most browsers (I think even IE7
understands it...)...
But maybe you mean the tooltip thing is unsafe?
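
For reference, the kind of print rule I have in mind looks roughly
like this. It is only a sketch, and it assumes the romanization sits
in the title attribute of a span:

```html
<style>
  @media print {
    /* Print the target URL after each link. */
    a[href]:after {
      content: " <" attr(href) ">";
    }
    /* Print the romanization (stored in title) in parentheses. */
    span[title]:after {
      content: " (" attr(title) ")";
    }
  }
</style>
```

On screen nothing changes, so the title keeps acting as a tooltip. A
browser without generated-content support would print the page without
the romanization, which may be the unsafe part.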

> > What would be the best method to use in order to display such names in
> > a French text and to keep "readability" thanks to the romanized
> > versions of the characters?
> For _readability_, you would use Latin letters only, but this might not
> be feasible. Is there a reason to present the original CJK form? What
> will users benefit from it? If there is some real gain, it probably
> means that the CJK text should be part of the content, inline. But
> sometimes you might "hide" it behind a link, like
> <a href="#wubai">Wu Bai</a>
> and you would have somewhere an element with id="wubai" that presents
> the CJK form and possibly also the complete romanization, with tone
> marks. Something like
> <p id="wubai">Wu Bai (Chinese: 伍佰; pinyin: Wǔ Bǎi; Taiwanese Minnan:
> Gō·-pah), born 14 January 1968) is the stage name of a rock singer from
> Taiwan, Wu Chun-lin (Chinese: 吳俊霖; pinyin: Wǔ Jǔnlín; Taiwanese: Ngô·
> Chùn-lîm).</p>

SPIP, the CMS I'm using, lets you create notes. At first I thought
notes should only be used for elements parallel to the text (like
facts, details about something, etc.), but you're right, I may
consider using them to give the original form...

> > I heard about a ruby tag <http://www.w3.org/TR/1998/WD-ruby-19981221/>;
> > but it seems it's not implemented in any "classical" browsers
> > (Firefox, Opera, Internet Explorer) the way I'd like to use it...
> IE has a working, though limited, support to Ruby, and as others have
> remarked, Ruby has been designed to "degrade gracefully" on
> non-supporting browsers, provided that an author uses correct markup.
> But this isn't really a job for Ruby, for several reasons. To begin
> with, Ruby text is (on IE) by default very small, and although you can
> usually change this with CSS, what would you actually do? You don't want
> gross line spacing, do you?
> Ruby might be interesting for _some_ purposes even outside its original
> scope, but it's not for normal transcriptions.

That's right. The perfect use for Ruby comes with poems, haikus, or...
Karaoke! :)
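
For the record, the gracefully degrading markup would look something
like this, as far as I understand it. It is a sketch in the simple
ruby form, and the lang values are my own assumption:

```html
<!-- Browsers with ruby support show the pinyin above the characters;
     others ignore the tags and show "伍佰 (Wǔ Bǎi)". -->
<ruby lang="zh-Hant">伍佰<rp> (</rp><rt lang="zh-Latn">Wǔ Bǎi</rt><rp>)</rp></ruby>
```

The rp elements hold the fallback parentheses, which are hidden
whenever the ruby text is rendered above the base text.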

Thank you very much for this interesting discussion. I'm going to
discuss what to do with the other editors, based on your thoughts.

Have a good day!

Pierre Equoy
