NetBSD-Users archive


Re: sending/receiving UTF-8 characters from terminal to program



On Fri, Jan 20, 2023 at 15:09:44 +0100, r0ller wrote:

> Well, checking what printf results in, I get:
> 
> $printf 'néz'|hexdump -C
> 00000000  6e e9 7a                                          |n.z|
> 00000003
> $printf $'n\uE9z'|hexdump -C
> 00000000  6e c3 a9 7a                                       |n..z|
> 00000004
> 
> It's definitely different from what you got for 'néz'. What does
> that mean?

In the second example you specify \uE9, which is the Unicode code point
for e with acute (U+00E9).  It is then unconditionally converted by
printf to UTF-8 (which is two bytes: 0xc3 0xa9) on output.
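To make the two-byte result concrete, here is a minimal sketch in
Python (not part of the original exchange; the helper name
utf8_two_byte is made up for illustration).  It builds the UTF-8 bytes
for U+00E9 by hand and compares them with what the standard codec
produces:

```python
def utf8_two_byte(code_point: int) -> bytes:
    """Encode a code point in U+0080..U+07FF as two UTF-8 bytes.

    Hypothetical helper for illustration; real code would just call
    str.encode("utf-8").
    """
    if not 0x80 <= code_point <= 0x7FF:
        raise ValueError("only the two-byte range is handled here")
    lead = 0xC0 | (code_point >> 6)            # 110xxxxx: upper five bits
    continuation = 0x80 | (code_point & 0x3F)  # 10xxxxxx: lower six bits
    return bytes([lead, continuation])


print(utf8_two_byte(0xE9).hex(" "))          # c3 a9
print("n\u00e9z".encode("utf-8").hex(" "))   # 6e c3 a9 7a
```

The second line of output matches the bytes hexdump showed for the
\uE9 case.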

Your terminal input seems to be in ISO 8859-1.  The 0xe9 in the first
example is "LATIN SMALL LETTER E WITH ACUTE", i.e. Unicode code point
U+00E9, which Latin-1 encodes as the single byte 0xE9.  So your terminal
inserted 0xe9 when you pressed that key.  Maybe you need to specify the
-u8 option or the utf8 resource?  (I mostly use NetBSD headless, so I
haven't been following the current status of UTF-8 support in X.)
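
If the program on the receiving end cannot rely on the terminal being
set up for UTF-8, one possible workaround is to check the incoming
bytes and transcode them.  The following is only a sketch in Python:
the function name to_utf8 is hypothetical, and it assumes that anything
which is not valid UTF-8 is Latin-1, which is a heuristic and not
something the thread establishes.

```python
def to_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return raw unchanged if it is valid UTF-8, else transcode it.

    Assumption: input that fails UTF-8 validation is in the fallback
    encoding (Latin-1 by default).
    """
    try:
        raw.decode("utf-8")      # validation only; result is discarded
        return raw
    except UnicodeDecodeError:
        # A lone 0xe9 followed by 'z' is not a valid UTF-8 sequence,
        # so Latin-1 input like b"n\xe9z" ends up here.
        return raw.decode(fallback).encode("utf-8")


print(to_utf8(b"n\xe9z").hex(" "))       # 6e c3 a9 7a
print(to_utf8(b"n\xc3\xa9z").hex(" "))   # 6e c3 a9 7a
```

Both the Latin-1 and the UTF-8 form of the string come out as the same
four bytes.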

-uwe

