Subject: Re: wsfont encoding
To: None <soda@sra.co.jp>
From: Marcus Comstedt <marcus@idonex.se>
List: tech-kern
Date: 02/02/2001 22:04:04
(Moving this to tech-kern due to popular demand...)

>>>>> "Noriyuki" == Noriyuki Soda <soda@sra.co.jp> writes:

  Noriyuki> Only Big5 has most glyphs, but it cannot be used for Japanese and Korean.
  Noriyuki> Because certain glyphs in Big5 has different rendering image with
  Noriyuki> Japanese and Korean.
  Noriyuki> If you can understand CJK ideogram, please look at Codepoint 0xAABD
  Noriyuki> (Unicode 0x76f4) in Big5. The corresponding character is 0x443d 
  Noriyuki> (3630 in raw/column code) in JIS X0208, and it is 0x7241 in (8233 in
  Noriyuki> raw/column code) in KSC5061.

  Noriyuki> Thus, using CJK ideogram font for other country doesn't have sense.

That's why I said GB to Big5, since both can be used for Chinese.


  Noriyuki> For single byte encodings, only wscons userlevel codeset which should 
  Noriyuki> be supported by default is the same encoding with font encoding.

This is much too restrictive for the general case.  If you have, for
example, an 8859-15 codeset and an 8859-1 font, you should be able to
see the glyphs for the overlapping characters.  This was my initial
requirement.  So how will this be accomplished?
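
To make the "overlapping characters" case concrete, here is a minimal
sketch that goes through Unicode as a pivot.  It is only an
illustration, not wscons code: it borrows Python's bundled codecs as a
stand-in for per-charset tables, and build_font_table is a made-up
name.

```python
# Illustration only: Python's codecs stand in for per-charset tables.
# build_font_table is a hypothetical helper, not an existing wscons API.

def build_font_table(codeset, font_encoding):
    """Map each byte of `codeset` to a glyph index in `font_encoding`.

    An entry is None when the font has no glyph for that character.
    """
    table = []
    for byte in range(256):
        try:
            char = bytes([byte]).decode(codeset)    # codeset byte -> Unicode
            glyph = char.encode(font_encoding)[0]   # Unicode -> font index
        except UnicodeError:
            glyph = None                            # outside the overlap
        table.append(glyph)
    return table


table = build_font_table("iso8859_15", "iso8859_1")
missing = [hex(b) for b, g in enumerate(table) if g is None]
print(missing)
```

With these two charsets, only the handful of positions where 8859-15
departs from 8859-1 (the euro sign at 0xA4, for instance) come out
without a glyph; everything else maps straight through.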


  Noriyuki> We should supply UTF-8 (new standard) and ISO-2022 (VT100 standard) 
  Noriyuki> too. But those don't have to be included in kernel by default 
  Noriyuki> (Some of those should be added in /etc/rc, though). Other encodings 
  Noriyuki> should be optional (put the table on userland, user configuration
  Noriyuki> is required).

Having support for different encodings as loadable modules is probably
a good idea.  But I'm still curious about how this user configuration
will work.  If I configure ISO-2022 with (among others) the 8859-15
codeset and want to use 8859-4 fonts, where will the translation
tables/functions come from?


  Noriyuki> For multibyte encodings, things are completely different.
  Noriyuki> Because ISO-2022 and EUC are both encoding scheme, only certain 
  Noriyuki> configuration (private final character for ISO-2022, 4 integers
  Noriyuki> for Gx graphic plane setting for EUC) is needed, and no conversion 
  Noriyuki> table is needed in these case.

Final characters and EUC plane settings can probably be put in the
kernel by default; they are not that big.  This includes final
characters for single-byte encodings as well.
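
As a rough idea of how small such a default table could be, here is a
sketch of a final-character lookup for ISO-2022 designation sequences.
The handful of entries are the registered final characters as I
understand them; the names DESIGNATIONS, INTERMEDIATES and
parse_designation are invented for this example, and nothing here
reflects an actual kernel layout.

```python
# Sketch only: a tiny ISO-2022 designation table and parser.
# All names are hypothetical; the entries are a small sample, not a full registry.

ESC = 0x1B

# (bytes per character, set size, final character) -> charset
DESIGNATIONS = {
    (1, 94, "B"): "ASCII",
    (1, 94, "J"): "JIS X 0201 Roman",
    (1, 96, "A"): "ISO 8859-1 right half",
    (2, 94, "A"): "GB 2312",
    (2, 94, "B"): "JIS X 0208",
    (2, 94, "C"): "KS C 5601",
}

# Intermediate byte -> (Gx index, set size)
INTERMEDIATES = {
    "(": (0, 94), ")": (1, 94), "*": (2, 94), "+": (3, 94),
    "-": (1, 96), ".": (2, 96), "/": (3, 96),
}


def parse_designation(seq):
    """Return (Gx index, charset name or None) for a designation sequence."""
    if len(seq) < 3 or seq[0] != ESC:
        raise ValueError("not a designation sequence")
    body = seq[1:].decode("ascii")
    nbytes = 1
    if body.startswith("$"):            # '$' marks a multibyte set
        nbytes, body = 2, body[1:]
    if nbytes == 2 and len(body) == 1:  # short form ESC $ F designates G0
        body = "(" + body
    if len(body) != 2 or body[0] not in INTERMEDIATES:
        raise ValueError("unsupported designation")
    gx, size = INTERMEDIATES[body[0]]
    return gx, DESIGNATIONS.get((nbytes, size, body[1]))


print(parse_designation(b"\x1b$B"))
print(parse_designation(b"\x1b-A"))
```

Each entry is just a few bytes, so even a fairly complete table of
registered sets, single-byte ones included, would stay small.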


  // Marcus