Subject: Re: Japanese with wscons?
To: Ignatios Souvatzis <is@beverly.kleinbus.org>
From: None <itojun@iijlab.net>
List: tech-kern
Date: 01/12/2001 19:12:50
>> 	i don't think escape sequence works here (at least you cannot go out
>> 	from UTF8 using escape sequences, I believe)  maybe a sysctl to switch
>> 	encodings?
>Maybe I misunderstand, but how would you display a multilanguage document on
>the screen with this?

	i was talking about switching between different encodings.  i'll
	present sequences of bytes for a couple of modes.

	if you are in euc-jp encoding mode, characters with 8th bit set =
	japanese.  you can only mix ASCII and a couple of japanese character sets.
		ABCD\301\302\303\304
		    ~~~~~~~~~~~~~~~~
		    two japanese letters

	if you are in euc-kr encoding mode, characters with 8th bit set =
	korean.  you can only mix ASCII and the korean character set.
		ABCD\301\302\303\304
		    ~~~~~~~~~~~~~~~~
		    two korean letters
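
	a minimal sketch in python of the two cases above.  assumptions on my
	side: python's stock "euc_jp" and "euc_kr" codecs stand in for the
	console's encoding mode, and high_bit_bytes is a helper name made up
	for illustration.  the same eight bytes come out as two kanji in one
	mode and two hangul in the other:

```python
# the byte string from the two examples above (octal \301 == 0xc1, etc.)
raw = b"ABCD\301\302\303\304"


def high_bit_bytes(data):
    """Return only the bytes that have the 8th bit set (non-ASCII in EUC)."""
    return bytes(b for b in data if b & 0x80)


# interpret the same bytes under two different encoding modes
as_jp = raw.decode("euc_jp")   # ASCII + JIS X 0208
as_kr = raw.decode("euc_kr")   # ASCII + KS X 1001

print(high_bit_bytes(raw))      # the four bytes that form the two letters
print(len(as_jp), len(as_kr))   # 4 ASCII letters + 2 double-byte letters each
print(as_jp[:4] == as_kr[:4])   # the ASCII part is the same in both modes
print(as_jp[4:] == as_kr[4:])   # the 8th-bit part is not
```

	nothing in the byte stream itself says which mode is meant; the
	receiver has to know it by some other means.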

	if you are in X11 ctext encoding mode, the following sequences
	should display some Japanese, Chinese and Korean text.  you can mix them
	just fine.  also you can mix in iso-8859-x characters just fine.
		\033$(Bblabla\033$\033$(Gblabla\033$(Cblabla\033
		       ~~~~~~            ~~~~~~	      ~~~~~~
		       3 japanese	3 chinese	3 korean
		       letters		letters		letters
		\033,Ablabla\033(Bblabla
		      ~~~~~~	  ~~~~~~
		      iso-8859-1  ascii

	with iso-2022-jp-2 mode the sequence is almost identical.
		\033(Bblabla\033$\033$(Gblabla\033$(Cblabla\033
		      ~~~~~~            ~~~~~~	      ~~~~~~
		       3 japanese	3 chinese	3 korean
		       letters		letters		letters
		\033,Ablabla\033(Bblabla
		      ~~~~~~	  ~~~~~~
		      iso-8859-1  ascii

	the four examples above are all iso-2022 variants.  iso-2022 is like
	a framework.  iso-2022-jp, euc-kr, euc-jp and X11 ctext are instances of
	the iso-2022 framework.
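
	a rough sketch of the "framework" idea in python.  assumptions on my
	side: it uses python's "iso2022_jp_2" codec and the G0 designations
	from RFC 1554 (ESC $ B, ESC $ A, ESC $ ( C), so the byte strings are
	not the ones from the examples above; the constant names and to_euc
	are made up for illustration.  the two bytes "AB" select a different
	letter depending on which character set was designated last, and
	setting the 8th bit on them gives the euc form of the same letter:

```python
ESC = b"\x1b"

# iso-2022-jp-2 designations into G0 (RFC 1554)
TO_JISX0208 = ESC + b"$B"    # japanese
TO_GB2312 = ESC + b"$A"      # chinese
TO_KSC5601 = ESC + b"$(C"    # korean
TO_ASCII = ESC + b"(B"

# one stream, three languages; every letter is spelled with the bytes "AB"
stream = (b"ab" + TO_JISX0208 + b"AB" + TO_GB2312 + b"AB"
          + TO_KSC5601 + b"AB" + TO_ASCII + b"ab")
text = stream.decode("iso2022_jp_2")


def to_euc(two_bytes):
    """Set the 8th bit on a 7-bit double-byte code, giving its EUC form."""
    return bytes(b | 0x80 for b in two_bytes)


print(len(text))                                    # 2 + 3 + 2 characters
print(text[2] == to_euc(b"AB").decode("euc_jp"))    # same japanese letter
print(text[3] == to_euc(b"AB").decode("gb2312"))    # same chinese letter
print(text[4] == to_euc(b"AB").decode("euc_kr"))    # same korean letter
```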

	if you are in UTF8 mode, you can send a UTF8 stream and print some text.
	however, due to han unification, some Japanese/Chinese/Taiwanese
	characters will be displayed with the wrong glyph.  the problem does not
	exist for the iso-2022 variants above.
	(i don't have any example here)
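
	a small python sketch of the information that gets lost.  the
	character is my own pick, not one from this thread: U+76F4, which is
	commonly cited as being drawn differently in japanese and chinese
	fonts.  in euc-jp and gb2312 it is two different byte strings, so the
	encoding tells you which glyph is wanted; after conversion both
	become the same code point and the same UTF8 bytes:

```python
han = "\u76f4"                     # one unified han ideograph

jp_bytes = han.encode("euc_jp")    # the japanese encoding of the letter
cn_bytes = han.encode("gb2312")    # the chinese encoding of the letter

# different byte strings in the legacy encodings ...
print(jp_bytes != cn_bytes)

# ... but a single code point once decoded, hence identical UTF8
from_jp = jp_bytes.decode("euc_jp")
from_cn = cn_bytes.decode("gb2312")
print(from_jp == from_cn)
print(from_jp.encode("utf-8") == from_cn.encode("utf-8"))
```

	a terminal that only sees the UTF8 bytes has to guess which font to
	use for them.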

itojun