Subject: Re: iconv(3) not working properly with euc-kr
To: None <junyoung@netbsd.org>
From: T.SHIOZAKI <tshiozak@netbsd.org>
List: tech-userlevel
Date: 09/25/2003 05:01:29
From: Bang Jun-Young <junyoung@netbsd.org>
Subject: iconv(3) not working properly with euc-kr
Date: Tue, 23 Sep 2003 17:55:06 +0900
Message-ID: <20030923085506.GA2611@krishna>
> This program converts an utf-8 string to an euc-kr string and prints
> it out. When I ran it on FreeBSD using GNU iconv, the result was
> correct:
>
> $ ./iconvtest | hexdump -C
> 00000000 be c6 b8 b6 b5 b5 0a |.......|
> 00000007
>
> OTOH, on NetBSD-current every character was (mis)converted to 0x3f:
>
> sh-2.05b$ ./iconvtest | hexdump -C
> 00000000 3f 3f 3f bf 01 0a |???...|
> 00000006
The conversion tables for KSC5601 are outdated; that is my mistake.
I have made new tables and put them at
ftp://ftp.netbsd.org/pub/NetBSD/misc/tshiozak/misc/ksc/
To install them (the sources are bzip2-compressed, so use bzcat):
bzcat KSC5601%UCS.src.bz2 | mkcsmapper > KSC5601%UCS.mps
bzcat UCS%KSC5601.src.bz2 | mkcsmapper > UCS%KSC5601.mps
sudo install -c -m 0444 KSC5601%UCS.mps UCS%KSC5601.mps \
/usr/share/i18n/csmapper/KS/
Because the NetBSD master repository is currently being migrated,
I will commit them later.
> "bf 01 0a" is garbage left in unused space of the output buffer.
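iconv(3) does not NUL-terminate its output, so it is better to fix the
program in one of two ways. Either convert the terminating NUL as well:
- utf8_strlen = strlen(utf8_str);
+ utf8_strlen = strlen(utf8_str)+1;
or print only the bytes that were actually written:
- printf("%s\n", euckr_str);
+ printf("%.*s\n", (int)(sizeof(euckr_str) - euckr_strlen), euckr_str);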
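
For reference, here is a minimal sketch of a conversion routine that
tracks the output length itself, in the spirit of the second fix above.
The helper name utf8_to_euckr and its parameters are hypothetical and
are not taken from the original test program. It assumes the system's
iconv(3) knows the names "EUC-KR" and "UTF-8", and it reserves one byte
of the buffer so that the result can always be NUL-terminated.

```c
#include <assert.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the original program): convert the
 * NUL-terminated UTF-8 string src to EUC-KR, storing the result in dst.
 * Returns the number of bytes written, not counting the terminating
 * NUL, or -1 on any failure (unknown encoding, invalid input, or an
 * output buffer that is too small).
 */
static long
utf8_to_euckr(const char *src, char *dst, size_t dstsize)
{
	iconv_t cd;
	char *in, *out;
	size_t inleft, outleft, r;

	assert(src != NULL && dst != NULL);
	if (dstsize == 0)
		return -1;

	/* Some systems declare iconv()'s second argument as const char **. */
	in = (char *)src;
	out = dst;
	inleft = strlen(src);
	outleft = dstsize - 1;		/* keep one byte for the NUL */

	cd = iconv_open("EUC-KR", "UTF-8");
	if (cd == (iconv_t)-1)
		return -1;

	r = iconv(cd, &in, &inleft, &out, &outleft);
	iconv_close(cd);
	if (r == (size_t)-1)
		return -1;

	/* iconv() only decrements outleft; terminate the string ourselves. */
	*out = '\0';
	return (long)((dstsize - 1) - outleft);
}
```

A caller could then print the result with either
printf("%s\n", buf) or printf("%.*s\n", (int)n, buf), where n is the
return value, because the buffer is terminated and its length is known.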
--
Takuya SHIOZAKI