Re: wide characters and i18n

To: tech-userlevel%netbsd.org@localhost
Subject: Re: wide characters and i18n
From: Joerg Sonnenberger <joerg%britannica.bec.de@localhost>
Date: Sun, 11 Jul 2010 00:42:28 +0200

On Sat, Jul 10, 2010 at 11:15:10PM +0100, Sad Clouds wrote:
> I'm not sure how portable it is to assume that input character data is
> in UTF-8 format. Some articles suggest to let the user set locale
> environment variables and let C library routines perform the correct
> conversion from multi-byte to wchar_t characters. This should be
> MT-safe with restartable multi-byte functions, as long as setlocale()
> is not called. This basically binds you to one locale at run time.

Depending on your environment, the UTF-8 assumption is questionable.
In many European countries, either one of the ISO-8859 charsets or
a Unicode encoding (UTF-8 or UTF-16) is used. IIRC China tends to use
its own character sets a lot too.

You are correct about the setlocale() issue. There have been discussions
about supporting multiple locales at the same time, but nothing has been
implemented (yet).
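
A minimal sketch of the locale-based approach, assuming setlocale() is
called once at startup and never again. decode_mb is a hypothetical
helper name, not an existing API. It decodes a multi-byte string in the
current locale's encoding with the restartable mbrtowc(), keeping the
conversion state in a caller-owned mbstate_t rather than in hidden
static storage:

```c
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/*
 * Decode src (multi-byte, current locale's encoding) into dst.
 * Returns the number of wide characters stored, or (size_t)-1 on an
 * invalid or truncated sequence. Output is silently cut off at dstlen.
 * decode_mb is a hypothetical name used for illustration only.
 */
size_t
decode_mb(const char *src, wchar_t *dst, size_t dstlen)
{
	mbstate_t st;
	size_t left, n = 0;

	assert(src != NULL && dst != NULL);
	left = strlen(src);
	memset(&st, 0, sizeof(st));	/* initial conversion state */

	while (left > 0 && n < dstlen) {
		size_t r = mbrtowc(&dst[n], src, left, &st);

		if (r == (size_t)-1 || r == (size_t)-2)
			return (size_t)-1;	/* invalid or incomplete */
		if (r == 0)
			break;			/* decoded a NUL character */
		src += r;
		left -= r;
		n++;
	}
	return n;
}
```

The program would call setlocale(LC_CTYPE, "") once in main() before
starting any threads, so that the encoding is taken from the user's
environment. Each thread then uses its own mbstate_t as above.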

> If you need to convert character encodings which are different from the
> current locale, then I guess the only option is to use something like
> iconv or custom conversion functions...

Use iconv. It is part of the Single UNIX Specification (SUS), and
libiconv provides a portable implementation for systems that (still)
don't provide it natively.
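
A rough sketch of a one-shot conversion with iconv, independent of the
current locale. convert_buf is a hypothetical helper name, and the
encoding names in the usage note ("UTF-8", "ISO-8859-1") are assumed to
be supported by the local iconv implementation. Error handling is kept
minimal: it does not distinguish E2BIG from EILSEQ or EINVAL.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Convert inlen bytes at in from encoding "from" to encoding "to".
 * Returns the number of bytes written to out, or (size_t)-1 on failure.
 * convert_buf is a hypothetical name used for illustration only.
 */
size_t
convert_buf(const char *to, const char *from,
    const char *in, size_t inlen, char *out, size_t outlen)
{
	iconv_t cd;
	char *inp = (char *)in;	/* POSIX prototype takes char **; some
				   systems declare const char ** instead */
	char *outp = out;
	size_t inleft = inlen, outleft = outlen, r;

	assert(in != NULL && out != NULL);
	cd = iconv_open(to, from);
	if (cd == (iconv_t)-1)
		return (size_t)-1;	/* conversion pair not supported */

	r = iconv(cd, &inp, &inleft, &outp, &outleft);
	if (r != (size_t)-1)
		/* flush any pending shift sequence for stateful encodings */
		r = iconv(cd, NULL, NULL, &outp, &outleft);
	iconv_close(cd);

	return (r == (size_t)-1) ? (size_t)-1 : outlen - outleft;
}
```

For example, convert_buf("UTF-8", "ISO-8859-1", "caf\xe9", 4, out,
sizeof(out)) should turn the single Latin-1 byte 0xE9 into the two
UTF-8 bytes 0xC3 0xA9. Real code would loop on E2BIG with a larger
output buffer instead of failing.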

Joerg

