tech-userlevel archive


Re: Proposal: _ctype_ table bitwidth change



At Wed, 23 Mar 2011 07:51:23 -0400 (EDT), der Mouse wrote:
> 
> That some charsets have some codepoints that can't fit in unsigned char
> (at least when, as on NetBSD, unsigned char is just one octet) just
> means that is*() aren't useful for more than just 256 of their possible
> codepoints, not that they somehow get retconned to take just one octet
> of a storage encoding of a codepoint.

NO-BREAK SPACE, which is 0xC2 0xA0 in en_US.UTF-8, obviously falls into
the some-codepoints-that-can't-fit-in-unsigned-char category.
No retcon here.

If you assume some particular internal representation of NO-BREAK SPACE
and claim that it fits in unsigned char based on that assumption,
then it seems to me that you are the one retconning.
After all, the is*() functions do not care about the internal representation.
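
To make the 0xC2 0xA0 point concrete, here is a minimal sketch of
generic UTF-8 encoding. utf8_encode() and nbsp_selfcheck() are
hypothetical helpers written for this message, not anything in
NetBSD's libc. It only shows that U+00A0 occupies two octets in that
encoding, so neither octet on its own is "the character".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode one Unicode scalar value as UTF-8.
 * Returns the number of octets written to out, or 0 if cp is a
 * surrogate or lies beyond U+10FFFF.
 */
size_t
utf8_encode(uint32_t cp, unsigned char out[4])
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;	/* surrogates are not encodable */
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}

/* NO-BREAK SPACE (U+00A0) takes two octets: 0xC2 0xA0. */
void
nbsp_selfcheck(void)
{
	unsigned char buf[4];

	assert(utf8_encode(0x00A0, buf) == 2);
	assert(buf[0] == 0xC2 && buf[1] == 0xA0);
}
```

Passing 0xA0 alone to isspace() would amount to assuming a
representation (for example, the bare codepoint value) that the locale's
encoding does not actually use.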

Ken

