tech-userlevel archive


Re: Proposal: _ctype_ table bitwidth change



> The most important point is that is* functions accept an octet, not a
> code point.

They do?  Where is this defined?

Historically, it has been false: is*() has been documented to accept
"characters", which I can't read as anything but codepoints.

That some charsets have codepoints that can't fit in unsigned char
(at least when, as on NetBSD, unsigned char is just one octet) just
means that is*() are useful for at most 256 of those charsets'
codepoints, not that they somehow get retconned to take just one octet
of a storage encoding of a codepoint.

At least, that's how I read it.  Is there a spec somewhere which spells
this out precisely?
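
To make the two readings concrete, here is a minimal C sketch, not a
definitive statement of what any spec intends.  It assumes an 8-bit
unsigned char, the default "C" locale, and UTF-8 as the storage encoding
in the example string.  The helper names (in_ctype_domain,
checked_isalpha, count_alpha_octets) are hypothetical, invented for
illustration.  The one fact it leans on is that ISO C's <ctype.h>
restricts the argument of is*() to values representable as unsigned
char, plus EOF.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>   /* EOF */

/* Hypothetical helper: is v a legal argument for the is*() functions?
 * Assumption: the legal domain is 0..UCHAR_MAX plus EOF. */
static int in_ctype_domain(long v)
{
    return v == EOF || (v >= 0 && v <= UCHAR_MAX);
}

/* Hypothetical helper: "codepoint" reading.  The value is treated as a
 * whole character, and anything outside the domain is rejected up front
 * rather than passed to isalpha(). */
static int checked_isalpha(long v)
{
    assert(in_ctype_domain(v));
    return isalpha((int)v) != 0;
}

/* Hypothetical helper: "octet" reading.  Each byte of the storage
 * encoding is fed to isalpha() on its own, cast to unsigned char so
 * that a negative plain char never reaches it. */
static size_t count_alpha_octets(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (isalpha((unsigned char)*s))
            n++;
    return n;
}
```

Under these assumptions, a codepoint such as 0x142 is simply outside
what is*() can be asked about, while the two UTF-8 octets of U+00E9
("\xC3\xA9") are each legal arguments but are not letters in the "C"
locale, so count_alpha_octets() reports none.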

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

