netbsd-users: Re: Any effort putting all things into unicode?

Subject: Re: Any effort putting all things into unicode?
To: None <netbsd-users@netbsd.org>
From: Thor Lancelot Simon <tls@rek.tjls.com>
List: netbsd-users
Date: 09/12/2003 13:58:53

On Fri, Sep 12, 2003 at 07:26:26PM +0200, Matthias Buelow wrote:
> Zhang Weiwu writes:
> 
> >I just wonder why, many leading open unix-like OS (BSDs and linux) which
> >still have LOTS of trouble in I18N don't simply use unicode (say,
> >utf16le, as used in WinNT all versions) all the way? I've been working
> 
> Redhat Linux 9 does that (at least on some scale) and it caused so
> many problems that the first thing we did after installing it was
> to switch locale settings back to iso-8859-1.  Among the many problems
> were:
[list of problems snipped]

Another problem is that handling Unicode is *slow*.  At a minimum, you
have to do twice the work per character of input or output, and it is difficult
to avoid doing a large part of that extra work even for traditional 8-bit
character sets when you have built your C library etc. to support Unicode.

If your processor has "string" instructions, they operate on *bytestrings*,
not strings-of-symbols-of-arbitrary-length, so you can forget about using
them any more.  Yay.
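
To make the byte-versus-symbol point concrete, here is a rough sketch in C.
It is not taken from any particular C library, and it uses UTF-8 rather than
the UTF-16 proposed upthread purely because it is easy to show with ordinary
C strings. The function names byte_length and utf8_codepoints are made up for
illustration, and the counting routine assumes well-formed input.

```c
#include <stddef.h>
#include <string.h>

/* Byte-oriented length: the work is "find the first zero byte", which is
 * the kind of scan that optimized strlen() implementations (and, on some
 * CPUs, hardware string instructions) are built to do quickly. */
size_t byte_length(const char *s)
{
    return strlen(s);
}

/* Hypothetical helper: count UTF-8 code points, assuming well-formed input.
 * Every byte has to be classified, because continuation bytes (bit pattern
 * 10xxxxxx) belong to the previous symbol and must not be counted. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    const unsigned char *p;

    for (p = (const unsigned char *)s; *p != '\0'; p++) {
        if ((*p & 0xC0) != 0x80)   /* not a continuation byte */
            n++;
    }
    return n;
}
```

For a pure ASCII string the two functions return the same number, but the
second one still pays for the per-byte test. For a string containing a
two-byte sequence (for example the bytes 0xC3 0xAF), the byte count and the
symbol count differ, so code that used to treat "length in bytes" as "length
in characters" has to be rewritten around the slower loop.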

I worked at a major email outsourcing company for about a year, several
years ago.  While I was there, we made a big push into the Asian market.
Running the same software, adapted for Unicode by some very bright people,
we could do only 80% of the throughput on our "international" servers
that we could do on our standard servers.  Just building the Perl interpreter
with Unicode support makes *non-Unicode* I/O at least 25% slower...