Subject: Re: iconv(3) not working properly with euc-kr
To: T.SHIOZAKI <tshiozak@netbsd.org>
From: Bang Jun-Young <junyoung@netbsd.org>
List: tech-userlevel
Date: 09/25/2003 11:25:16

On Thu, Sep 25, 2003 at 05:01:29AM +0900, T.SHIOZAKI wrote:
>
> From: Bang Jun-Young <junyoung@netbsd.org>
> Subject: iconv(3) not working properly with euc-kr
> Date: Tue, 23 Sep 2003 17:55:06 +0900
> Message-ID: <20030923085506.GA2611@krishna>
>
> > This program converts a UTF-8 string to an EUC-KR string and prints
> > it out. When I ran it on FreeBSD using GNU iconv, the result was
> > correct:
> >
> > $ ./iconvtest | hexdump -C
> > 00000000 be c6 b8 b6 b5 b5 0a |.......|
> > 00000007
> >
> > OTOH, on NetBSD-current every character was (mis)converted to 0x3f:
> >
> > sh-2.05b$ ./iconvtest | hexdump -C
> > 00000000 3f 3f 3f bf 01 0a |???...|
> > 00000006
>
> The conversion tables for KSC5601 are old, because of my mistake.
> I made new tables and put them into
> ftp://ftp.netbsd.org/pub/NetBSD/misc/tshiozak/misc/ksc/
>
> How to install them:
> bzcat KSC5601%UCS.src.bz2 | mkcsmapper > KSC5601%UCS.mps
> bzcat UCS%KSC5601.src.bz2 | mkcsmapper > UCS%KSC5601.mps
> sudo install -c -m 0444 KSC5601%UCS.mps UCS%KSC5601.mps \
> /usr/share/i18n/csmapper/KS/
>
>
> Because the master repository of NetBSD is now migrating,
> I will commit them later.

It worked like a champ! Thanks for fixing it. :-)

>
>
> > "bf 01 0a" is garbage left in unused space of the output buffer.
>
> It is better to fix the program as:
>
> - utf8_strlen = strlen(utf8_str);
> + utf8_strlen = strlen(utf8_str)+1;
>
> or
>
> - printf("%s\n", euckr_str);
> + printf("%.*s\n", (int)(sizeof(euckr_str) - euckr_strlen), euckr_str);

That's because I assumed the converted data would always be correct.
That assumption turned out to be wrong. :-)
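
For the archive, here is a rough sketch of a third variant that
terminates the output explicitly once iconv() returns, so the caller
never prints whatever was left in the unused part of the buffer. The
helper name utf8_to_euckr() and its outsize parameter are made up for
illustration and are not taken from the original test program. Some
systems declare the input argument of iconv() as const char **, so the
cast may need adjusting there.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the original iconvtest program):
 * convert a NUL-terminated UTF-8 string to EUC-KR and NUL-terminate
 * the result.  Returns the number of bytes written to out, not
 * counting the terminating NUL, or (size_t)-1 on any error.
 */
size_t
utf8_to_euckr(const char *in, char *out, size_t outsize)
{
	iconv_t cd;
	char *inp = (char *)in;	/* iconv() wants a non-const pointer here */
	char *outp = out;
	size_t inleft = strlen(in);
	size_t outleft;
	size_t rv;

	if (outsize == 0)
		return (size_t)-1;
	outleft = outsize - 1;	/* keep one byte free for the NUL */

	cd = iconv_open("EUC-KR", "UTF-8");
	if (cd == (iconv_t)-1)
		return (size_t)-1;

	/*
	 * iconv() advances outp and decrements outleft as it writes.
	 * A positive return value is the number of non-reversible
	 * conversions; this sketch does not treat that as an error.
	 */
	rv = iconv(cd, &inp, &inleft, &outp, &outleft);
	iconv_close(cd);
	if (rv == (size_t)-1)
		return (size_t)-1;

	*outp = '\0';		/* terminate right after the converted bytes */
	return (size_t)(outp - out);
}
```

With this, printf("%s\n", buf) is safe after a successful call, because
the string ends exactly where the converted bytes end. Checking the
return value also catches the case where the output buffer is too
small.
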
Jun-Young
--
Bang Jun-Young <junyoung@NetBSD.org>