tech-misc archive


Re: wchar_t encoding?



Paul Koning wrote:-

> As for jis/kuten, that's what Neil mentioned.  I know next to nothing
> about this but from what I read on Wikipedia it appears that JIS-0208 is
> a subset of Unicode.  So I'm puzzled why jis/kuten would be used as the
> wchar_t encoding. 

Because it's a very fast conversion: the single byte form just
encodes the ku / ten (row / column) with some bitshifts etc., and
converting to other kuten-based encodings (Big5 etc.) is then also
very simple.  But there is no simple mapping from kuten to Unicode;
I think you'd need a table of roughly 8000 entries (and another one
to map back).  I also believe there are a few codepoints that don't
have a one-to-one mapping to Unicode, so a roundtrip conversion
isn't guaranteed.
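
To illustrate the "bitshifts" point, here is a minimal Python sketch of
how a JIS X 0208 ku / ten pair turns into the byte sequences of the
common kuten-based encodings (7-bit JIS, EUC-JP, Shift_JIS) using
arithmetic alone, with no lookup table.  The function names are made up
for this example and are not taken from any libc; this is not a
description of how any particular wchar_t implementation stores its
values.

```python
# Sketch only: function names are hypothetical, chosen for illustration.
# ku and ten are the 1-based row and column of the 94x94 JIS X 0208 grid.

def _check_kuten(ku, ten):
    if not (1 <= ku <= 94 and 1 <= ten <= 94):
        raise ValueError("ku and ten must both be in 1..94")

def kuten_to_jis(ku, ten):
    """7-bit JIS code (as used inside ISO-2022-JP): add 0x20 to each."""
    _check_kuten(ku, ten)
    return bytes([ku + 0x20, ten + 0x20])

def kuten_to_euc_jp(ku, ten):
    """EUC-JP: same layout with the high bit set, i.e. add 0xA0 to each."""
    _check_kuten(ku, ten)
    return bytes([ku + 0xA0, ten + 0xA0])

def kuten_to_shift_jis(ku, ten):
    """Shift_JIS: two rows share one lead byte, so halve the row index."""
    _check_kuten(ku, ten)
    lead = ((ku - 1) >> 1) + 0x81
    if lead > 0x9F:            # skip the half-width katakana range
        lead += 0x40
    if ku & 1:                 # odd row: lower half of the trail range
        trail = ten + 0x3F
        if trail >= 0x7F:      # 0x7F is never used as a trail byte
            trail += 1
    else:                      # even row: upper half of the trail range
        trail = ten + 0x9E
    return bytes([lead, trail])

def euc_jp_to_kuten(data):
    """Inverse of kuten_to_euc_jp for a two-byte JIS X 0208 sequence."""
    ku, ten = data[0] - 0xA0, data[1] - 0xA0
    _check_kuten(ku, ten)
    return ku, ten
```

For example, kuten 4-2 (hiragana "あ") comes out as 0x24 0x22 in 7-bit
JIS, 0xA4 0xA2 in EUC-JP and 0x82 0xA0 in Shift_JIS.  Getting from the
same kuten pair to U+3042 cannot be done with arithmetic like this; it
needs a per-character table.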

Neil.

