Subject: Re: iconv(3) not working properly with euc-kr
To: T.SHIOZAKI <tshiozak@netbsd.org>
From: Bang Jun-Young <junyoung@netbsd.org>
List: tech-userlevel
Date: 09/25/2003 11:25:16
On Thu, Sep 25, 2003 at 05:01:29AM +0900, T.SHIOZAKI wrote:
> 
> From: Bang Jun-Young <junyoung@netbsd.org>
> Subject: iconv(3) not working properly with euc-kr
> Date: Tue, 23 Sep 2003 17:55:06 +0900
> Message-ID: <20030923085506.GA2611@krishna>
> 
> > This program converts a UTF-8 string to an EUC-KR string and prints
> > it out. When I ran it on FreeBSD using GNU iconv, the result was
> > correct:
> > 
> > $ ./iconvtest | hexdump -C
> > 00000000  be c6 b8 b6 b5 b5 0a                              |.......|
> > 00000007
> > 
> > OTOH, on NetBSD-current every character was (mis)converted to 0x3f:
> > 
> > sh-2.05b$ ./iconvtest | hexdump -C
> > 00000000  3f 3f 3f bf 01 0a                                 |???...|
> > 00000006
> 
> The conversion tables for KSC5601 are old, because of my mistake.
> I made new tables and put them into
> ftp://ftp.netbsd.org/pub/NetBSD/misc/tshiozak/misc/ksc/
> 
> How to install them:
>   bzcat KSC5601%UCS.src.bz2 | mkcsmapper > KSC5601%UCS.mps
>   bzcat UCS%KSC5601.src.bz2 | mkcsmapper > UCS%KSC5601.mps
>   sudo install -c -m 0444 KSC5601%UCS.mps UCS%KSC5601.mps \
>       /usr/share/i18n/csmapper/KS/
>   
> 
> Because the NetBSD master repository is currently being migrated,
> I will commit them later.

It worked like a champ! Thanks for fixing it. :-)

> 
> 
> > "bf 01 0a" is garbage left in unused space of the output buffer.
> 
> It is better to fix the program as:
> 
> -	utf8_strlen = strlen(utf8_str);
> +	utf8_strlen = strlen(utf8_str)+1;
> 
> or
> 
> -	printf("%s\n", euckr_str);
> +	printf("%.*s\n", (int)(sizeof(euckr_str) - euckr_strlen), euckr_str);

That's because I assumed the converted data would always be correct.
That assumption turned out to be wrong. :-)
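
For the archives, here is a minimal sketch of how both suggestions could be
folded into one helper: it derives the converted length from the output
pointer that iconv(3) advances, and terminates the buffer explicitly instead
of trusting whatever is left in the unused space. convert_string() is a
hypothetical name, not part of the original iconvtest program, and the sketch
assumes an iconv() prototype that takes "char **" for the input pointer; on
systems that declare it as "const char **" the cast needs adjusting.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the original iconvtest program).
 * Converts the NUL-terminated string `in` from encoding `from` to
 * encoding `to`, writes the result to `out` and NUL-terminates it.
 * Returns the number of bytes written (terminator excluded), or -1
 * on any error, including an output buffer that is too small.
 */
long
convert_string(const char *to, const char *from,
    const char *in, char *out, size_t outsize)
{
	iconv_t cd;
	char *inp = (char *)in;	/* assumption: iconv() wants char ** */
	char *outp = out;
	size_t inleft = strlen(in);
	size_t outleft;
	size_t rv;

	if (outsize == 0)
		return -1;
	outleft = outsize - 1;	/* reserve one byte for the terminator */

	cd = iconv_open(to, from);
	if (cd == (iconv_t)-1)
		return -1;

	rv = iconv(cd, &inp, &inleft, &outp, &outleft);
	if (rv != (size_t)-1) {
		/* Flush any pending shift sequence (no-op for EUC-KR). */
		rv = iconv(cd, NULL, NULL, &outp, &outleft);
	}
	iconv_close(cd);
	if (rv == (size_t)-1)
		return -1;

	/* outp now points just past the last converted byte. */
	*outp = '\0';
	return (long)(outp - out);
}
```

With this, the test program could call
convert_string("EUC-KR", "UTF-8", utf8_str, euckr_str, sizeof(euckr_str))
and print the result with either "%s" or "%.*s" and the returned length.
Note that a successful return only means iconv() reported no error; with
stale conversion tables the output can still be full of '?' characters.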

Jun-Young

-- 
Bang Jun-Young <junyoung@NetBSD.org>