Re: Proposal: _ctype_ table bitwidth change
On Tue, Mar 22, 2011 at 03:30:27AM -0400, der Mouse wrote:
> > 0xa0 as Unicode Code Point is not representable as unsigned char with
> > UTF-8 encoding.
> That doesn't even make sense.
Yes, it does.
> UTF-8 takes Unicode codepoints and produces not octets but sequences of
> octets. The Unicode code point 0xa0 is representable, as a codepoint,
> as unsigned char; in this it is no different from any other integer in
> the range 0..255. It is representable as an octet sequence via
> encodings such as UTF-8. These two concepts should not be confused.
No, it isn't. There is no valid UTF-8 encoding of 0xA0 using a single
octet. Period. I haven't said anything else.
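
To make the distinction concrete, here is a minimal sketch in C. The
function names utf8_encode and utf8_is_single_octet are hypothetical
helpers written for this illustration, not part of libc or of the
_ctype_ proposal. It shows that the integer 0xA0 fits in an unsigned
char, while the UTF-8 encoding of code point U+00A0 is the two-octet
sequence 0xC2 0xA0, and that a lone 0xA0 octet is a continuation byte
rather than a complete sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode one Unicode code point as UTF-8.
 * Returns the number of octets written (1..4), or 0 if cp is a
 * surrogate or lies beyond U+10FFFF.
 */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                    /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/*
 * Hypothetical helper: a single octet is a complete UTF-8 sequence
 * only when its high bit is clear (0x00..0x7F).
 */
int utf8_is_single_octet(unsigned char b)
{
    return b < 0x80;
}
```

Calling utf8_encode(0xA0, buf) returns 2 and fills buf with 0xC2 0xA0,
while utf8_is_single_octet(0xA0) is false. So a table indexed by
unsigned char can hold an entry at index 0xA0, but in a UTF-8 locale
that index never corresponds to a complete character.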