tech-userlevel archive


Re: wide characters and i18n

>For anyone who's not interested in the gory details of this
>sort of stuff, please stop reading now.  It only gets uglier;
>the world is a complex place, my Japanese friends have even
>more objections to Unicode as "one size fits all" than I do
>which I won't attempt to explain here, even if I were sure I
>remembered them all.

You know, this sort of illustrates the problem I've always had with
I18N, which is: what the hell are you talking about?

I try to understand, I really do ... I've been trying to understand for
approximately 10 years now.  But every time I try to read something written
by someone who understands what is going on, I get lost, and I have never
really seen anyone explain the answers to some basic questions:

- How, exactly, are UTF-8 and Unicode related?
- What, exactly, is a "code point"?
- What, exactly, do people mean by "normalization" in this context?
- How do locales interoperate with UTF-8/Unicode?
- And, most importantly: what do I, as a programmer, need to do to make
  my application work with all of the above?  I read the posted Plan 9 link,
  and I guess that in some cases I need to deal with "Runes" (if I were
  programming on Plan 9), but it's still not exactly clear.

I'm not saying anyone should feel obligated to answer these questions (but,
hey, if you have a good reference, I'd be glad to read it), but I'm trying
to illustrate the information gap that prevents some people from participating
in these discussions in a meaningful way.

I try to be a good international citizen, I really do ... but in a practical
sense it seems to be _so_ complicated that I basically just punt and end
up doing what I always do ... and it seems that as long as I'm 8-bit clean,
that makes me and most of the Europeans happy enough (although it tends
to piss off Japanese and Chinese users, and I'm sorry about that).
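
To make "8-bit clean" concrete, here is a minimal sketch in C of the kind
of byte-wise handling I mean.  The function name ascii_upcase is made up
for illustration and is not from any real program.  It assumes the text
is in an encoding where bytes below 0x80 always mean ASCII (true of the
ISO-8859 family and of UTF-8).

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: uppercase the ASCII
 * letters a-z in a buffer and pass every other byte through unchanged.
 */
void
ascii_upcase(unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		/* Only touch bytes that are plain ASCII lowercase letters. */
		if (buf[i] >= 'a' && buf[i] <= 'z')
			buf[i] = (unsigned char)(buf[i] - 'a' + 'A');
		/* Bytes >= 0x80 are left alone: no masking with 0x7f. */
	}
}
```

Using unsigned char keeps high-bit bytes from turning negative, and nothing
strips the top bit, so "caf" followed by the UTF-8 bytes 0xc3 0xa9 comes
out as "CAF" followed by the same two bytes.  The assumption breaks for
encodings such as Shift_JIS, where the second byte of a two-byte character
can fall in the ASCII letter range, so the same loop would corrupt that
text.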

