tech-userlevel archive


Re: Proposal: _ctype_ table bitwidth change



On Tue, Mar 22, 2011 at 08:41:04AM +0900, T.SHIOZAKI wrote:
> Here,
>   - 0xa0 is representable as an unsigned char, and
>   - 0xa0 is not a space character.

As a Unicode code point, 0xa0 is not representable as a single unsigned
char in UTF-8: U+00A0 encodes as the two bytes 0xc2 0xa0. This is
different from a wchar_t, which holds an internal encoding (in this case,
most likely UCS-2 or UCS-4).
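
To make the encoding point concrete, here is a minimal sketch of a UTF-8
encoder for code points below U+0800. The function name utf8_encode_small
is hypothetical and exists only for this illustration; it is not part of
libc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: encode a code point below U+0800 as UTF-8.
 * Returns the number of bytes written (1 or 2), or 0 if the code
 * point is outside the range this sketch handles.
 */
static size_t
utf8_encode_small(uint32_t cp, unsigned char out[2])
{
	assert(out != NULL);

	if (cp < 0x80) {
		/* ASCII range: one byte, identical to the code point. */
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		/* Two bytes: 110xxxxx 10xxxxxx. */
		out[0] = (unsigned char)(0xc0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3f));
		return 2;
	}
	return 0;
}
```

Calling utf8_encode_small(0xa0, buf) returns 2 and stores 0xc2 0xa0 in
buf, so the byte 0xa0 only ever appears as the second half of that
sequence.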

> Thus, to conform to the standard, the behavior of isspace(0xa0) should
> be defined and it should return 0, even if 0xa0 is not a valid character.

It is not "even if". You are reversing cause and effect. 0xa0 is not a
valid space character, since it is not a valid character by itself.
That's why all functions but isascii should fail.
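
A rough sketch of that reasoning, assuming a UTF-8 locale: a lone 0xa0 is
a continuation byte, so it is not a complete character and a byte-wise
isspace has nothing to say yes to. The names utf8_classify_byte and
sketch_isspace_utf8 are hypothetical and are not the actual libc
implementation.

```c
/* Role a single byte can play inside a UTF-8 stream. */
enum utf8_byte_kind {
	UTF8_ASCII,		/* 0x00..0x7f: a complete character */
	UTF8_CONTINUATION,	/* 0x80..0xbf: 10xxxxxx, never stands alone */
	UTF8_LEAD,		/* 0xc2..0xf4: starts a multibyte sequence */
	UTF8_INVALID		/* 0xc0, 0xc1, 0xf5..0xff: never valid */
};

static enum utf8_byte_kind
utf8_classify_byte(unsigned char b)
{
	if (b < 0x80)
		return UTF8_ASCII;
	if (b < 0xc0)
		return UTF8_CONTINUATION;
	if (b >= 0xc2 && b <= 0xf4)
		return UTF8_LEAD;
	return UTF8_INVALID;
}

/*
 * Hypothetical byte-wise isspace for a UTF-8 locale: only bytes that
 * are complete characters on their own can be classified as space.
 */
static int
sketch_isspace_utf8(int c)
{
	if (c < 0 || c > 0xff)
		return 0;
	if (utf8_classify_byte((unsigned char)c) != UTF8_ASCII)
		return 0;
	/* ' ' plus the range \t, \n, \v, \f, \r. */
	return c == ' ' || (c >= '\t' && c <= '\r');
}
```

Here sketch_isspace_utf8(0xa0) returns 0 because 0xa0 is classified as
UTF8_CONTINUATION before any space test is made.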

My argument is that we still need and want a separate table, but it can
and should have the same format as the full rune table, i.e. effectively
variant 1.
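
This message does not spell out the details of variant 1, so the
following is only an illustration of the general idea: a separate,
byte-indexed table whose entries use a wide mask type like a rune table
would. Every name here (SK_SPACE, SK_DIGIT, sk_ctype_tab, sk_ctype_init,
sk_isspace), the 32-bit entry width and the bit assignments are
assumptions made for the sketch, not the existing _ctype_ layout.

```c
#include <stdint.h>

/* Hypothetical class bits, stored in a wide (32-bit) entry. */
#define SK_SPACE	0x00000001u
#define SK_DIGIT	0x00000002u

/* One slot for EOF (-1) plus one per unsigned char value. */
#define SK_TABLE_SIZE	(1 + 256)

static uint32_t sk_ctype_tab[SK_TABLE_SIZE];

/* Fill the table for a locale where only ASCII bytes are characters. */
static void
sk_ctype_init(void)
{
	int c;

	for (c = '\t'; c <= '\r'; c++)
		sk_ctype_tab[c + 1] |= SK_SPACE;
	sk_ctype_tab[' ' + 1] |= SK_SPACE;
	for (c = '0'; c <= '9'; c++)
		sk_ctype_tab[c + 1] |= SK_DIGIT;
	/* Entries for EOF and 0x80..0xff stay zero. */
}

static int
sk_isspace(int c)
{
	if (c < -1 || c > 255)
		return 0;
	return (sk_ctype_tab[c + 1] & SK_SPACE) != 0;
}
```

After sk_ctype_init(), sk_isspace(' ') is nonzero while sk_isspace(0xa0)
is 0, since the entry for 0xa0 carries no class bits.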

Joerg

