Subject: Re: codeset recoding engine
To: None <firstname.lastname@example.org>
From: Erik Bertelsen <email@example.com>
Date: 11/14/1999 09:15:37
On Sun, Nov 14, 1999 at 08:51:30AM +0900, firstname.lastname@example.org wrote:
> >> > I think you need two conversions:
> >> > kernel: filesystem-charset to utf-8
> >> > then
> >> > userland: utf-8 to LC_CHARSET.
> Sorry I may not follow the discussion, but...
> Please don't ever, ever hardcode something to utf-8. There are
> character sets that contain characters that are not covered in utf-8.
> It is NOT universal.
Please be careful about the terminology: in my understanding, UTF-8 is -not-
a character code (character set), but an encoding of characters into a
sequence of bytes. ASCII characters are encoded as themselves; all other
characters use bytes with the high bit set, so an 8-bit-clean channel is needed.

UTF-8 may be used to encode characters from several character codes (sets), e.g.
LATIN-1 and UNICODE. Note that even for LATIN-1, UTF-8 is not the identity
mapping: the characters 0x80-0xFF each become two bytes.
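
To make the LATIN-1 point concrete, here is a minimal sketch in Python (my choice of language for illustration only; nothing in this thread depends on it). The helper name latin1_to_utf8 is hypothetical. It recodes by decoding the bytes to abstract characters and then encoding those characters as UTF-8:

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Recode Latin-1 bytes to UTF-8 via abstract characters (hypothetical helper)."""
    return data.decode("latin-1").encode("utf-8")


ascii_only = b"cafe"     # every byte is below 0x80
accented = b"caf\xe9"    # 0xE9 is e-acute in Latin-1

# Pure ASCII passes through unchanged.
print(latin1_to_utf8(ascii_only) == ascii_only)

# The single byte 0xE9 becomes the two bytes 0xC3 0xA9.
print(latin1_to_utf8(accented).hex())
```

The first line printed is True and the second is 636166c3a9, so the byte string only survives unchanged when it is pure ASCII.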
I also think (but am not 100% sure) that UTF-8 is able to encode the full
ISO 10646 character repertoire if needed.
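
A rough way to check the coverage claim, again as a Python sketch with a hypothetical helper name (utf8_length): encode a few code points from different ranges and look at the byte counts. One caveat that is an assumption about the reader's environment rather than part of this thread: Python follows the current UTF-8 definition (RFC 3629), which stops at U+10FFFF, whereas the original definition allowed sequences of up to six bytes for the 31-bit ISO 10646 code space.

```python
def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses for one code point (hypothetical helper)."""
    return len(chr(code_point).encode("utf-8"))


# One sample from each length class: ASCII, Latin-1 range,
# rest of the Basic Multilingual Plane, and beyond the BMP.
for cp in (0x41, 0xE9, 0x20AC, 0x10FFFF):
    encoded = chr(cp).encode("utf-8")
    # Decoding the bytes must give back the same character.
    assert encoded.decode("utf-8") == chr(cp)
    print(f"U+{cp:04X} -> {utf8_length(cp)} byte(s): {encoded.hex()}")
```

The four samples take 1, 2, 3 and 4 bytes respectively, and each one decodes back to the original character.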