tech-userlevel archive

Re: Proposal: _ctype_ table bitwidth change



At Tue, 22 Mar 2011 10:18:52 +0200, Alan Barrett wrote:
> On Tue, 22 Mar 2011, Takehiko NOZAKI wrote:
> >{
> >        setlocale(LC_ALL, "en_US.UTF-8");
> >        printf("isspace:%d\n", isspace((unsigned char)0xA0));
> >        printf("iswspace:%d\n", iswspace((wchar_t)0xA0));
> >}
> >
> >this code prints:
> >
> >isspace:0
> >iswspace:1
> 
> If you want to say "give me the wchar_t value that corresponds 
> to Unicode character U+00A0" then I think you have to use some 
> combination of iconv and mbtowc; as far as I know, you are not 
> allowed to assume anything about the underlying representation of 
> wchar_t, so you can't just cast.

You are right, but you are missing the point.

For the problem discussed in this thread, it does not matter how you
obtain the value 0xA0 (Nozaki-san took a shortcut to keep the example
simple). He only showed that there are some values (0xA0 here)
for which is*() and isw*() must return different results.

The point here is that we cannot share the *contents* of the tables,
as illustrated in the above example.
# FWIW, I have no opinion on whether the *formats* of the tables should
# be unified, because I have not read the relevant code.
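
For anyone who wants to repeat the comparison without the cast, here is
a rough sketch. It assumes a UTF-8 locale is installed and converts the
UTF-8 encoding of U+00A0 with mbrtowc() instead of casting; in such a
locale the single byte 0xA0 is only a continuation byte, not a complete
character. The names classify_nbsp and struct nbsp_result are made up
for illustration, and the locale name is whatever the caller passes in
(for example "en_US.UTF-8").

```c
#include <ctype.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper type: how U+00A0 is classified both ways. */
struct nbsp_result {
	int ok;         /* 1 if the locale was set and the conversion worked */
	int byte_space; /* isspace() on the single byte 0xA0 */
	int wide_space; /* iswspace() on the wchar_t obtained for U+00A0 */
};

/*
 * locale_name is an assumption supplied by the caller; it may not be
 * installed, in which case ok stays 0 and the other fields are unset (0).
 */
struct nbsp_result
classify_nbsp(const char *locale_name)
{
	struct nbsp_result r = { 0, 0, 0 };
	static const char utf8_nbsp[] = "\xC2\xA0"; /* U+00A0 in UTF-8 */
	mbstate_t st;
	wchar_t wc;
	size_t n;

	if (setlocale(LC_ALL, locale_name) == NULL)
		return r;

	/* Convert the two-byte sequence; anything but 2 means "not UTF-8". */
	memset(&st, 0, sizeof(st));
	n = mbrtowc(&wc, utf8_nbsp, sizeof(utf8_nbsp) - 1, &st);
	if (n != 2)
		return r;

	r.ok = 1;
	r.byte_space = isspace((unsigned char)0xA0) != 0;
	r.wide_space = iswspace((wint_t)wc) != 0;
	return r;
}
```

Calling classify_nbsp("en_US.UTF-8") and printing byte_space and
wide_space should reproduce the experiment above when ok is 1. The
actual values depend on the C library's locale data, so no particular
output is claimed here.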

Regards,
Ken

