Subject: Re: iconv(3) not working properly with euc-kr
To: None <junyoung@netbsd.org>
From: T.SHIOZAKI <tshiozak@netbsd.org>
List: tech-userlevel
Date: 09/25/2003 05:01:29
From: Bang Jun-Young <junyoung@netbsd.org>
Subject: iconv(3) not working properly with euc-kr
Date: Tue, 23 Sep 2003 17:55:06 +0900
Message-ID: <20030923085506.GA2611@krishna>

> This program converts an utf-8 string to an euc-kr string and prints
> it out. When I ran it on FreeBSD using GNU iconv, the result was
> correct:
> 
> $ ./iconvtest | hexdump -C
> 00000000  be c6 b8 b6 b5 b5 0a                              |.......|
> 00000007
> 
> OTOH, on NetBSD-current every character was (mis)converted to 0x3f:
> 
> sh-2.05b$ ./iconvtest | hexdump -C
> 00000000  3f 3f 3f bf 01 0a                                 |???...|
> 00000006

The KSC5601 conversion tables in the tree are outdated; that was my
mistake. I have made new tables and put them at
ftp://ftp.netbsd.org/pub/NetBSD/misc/tshiozak/misc/ksc/

How to install them:
  bzcat KSC5601%UCS.src.bz2 | mkcsmapper > KSC5601%UCS.mps
  bzcat UCS%KSC5601.src.bz2 | mkcsmapper > UCS%KSC5601.mps
  sudo install -c -m 0444 KSC5601%UCS.mps UCS%KSC5601.mps \
      /usr/share/i18n/csmapper/KS/
  

Because the NetBSD master repository is currently being migrated,
I will commit them later.


> "bf 01 0a" is garbage left in unused space of the output buffer.

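iconv(3) does not NUL-terminate its output by itself, so it is better
to fix the program, either by converting the terminating NUL as well: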

-	utf8_strlen = strlen(utf8_str);
+	utf8_strlen = strlen(utf8_str)+1;

or by printing only the bytes that were actually converted:

-	printf("%s\n", euckr_str);
+	printf("%.*s\n", (int)(sizeof(euckr_str) - euckr_strlen), euckr_str);
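
A third option, sketched below, is to terminate the output explicitly
at the position where iconv(3) stopped writing. This is only a minimal
sketch: the original test program is not shown in this thread, so the
function name convert_cstring() and its interface are hypothetical, and
it assumes the POSIX prototype of iconv() with a "char **" input
argument.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any library): convert the
 * NUL-terminated string src from encoding `from` to encoding `to`,
 * storing a NUL-terminated result in dst (capacity dstsize bytes).
 * Returns the number of bytes written, excluding the NUL, or -1 on
 * any error, including a destination buffer that is too small.
 */
long
convert_cstring(const char *to, const char *from,
    const char *src, char *dst, size_t dstsize)
{
	iconv_t cd;
	char *in = (char *)src;	/* some systems declare this const char ** */
	char *out = dst;
	size_t inleft = strlen(src);	/* the NUL itself is not converted */
	size_t outleft;
	size_t r;

	if (dstsize == 0)
		return -1;
	outleft = dstsize - 1;		/* reserve one byte for the NUL */

	cd = iconv_open(to, from);
	if (cd == (iconv_t)-1)
		return -1;

	r = iconv(cd, &in, &inleft, &out, &outleft);
	if (r != (size_t)-1) {
		/* Flush any pending shift sequence of a stateful encoding. */
		r = iconv(cd, NULL, NULL, &out, &outleft);
	}
	iconv_close(cd);
	if (r == (size_t)-1)
		return -1;

	/* iconv() advanced `out`; everything after it is unused space. */
	*out = '\0';
	return (long)(out - dst);
}
```

With this, a call such as
convert_cstring("EUC-KR", "UTF-8", utf8_str, euckr_str, sizeof(euckr_str))
leaves euckr_str safe to print with a plain "%s", because the garbage
in the unused part of the buffer is cut off by the explicit NUL.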


--
Takuya SHIOZAKI