NetBSD-Users archive


Re: sending/receiving UTF-8 characters from terminal to program



Well, checking what printf outputs, I get:

$ printf 'néz' | hexdump -C
00000000  6e e9 7a                                          |n.z|
00000003
$ printf $'n\uE9z' | hexdump -C
00000000  6e c3 a9 7a                                       |n..z|
00000004

The first result (6e e9 7a) is definitely different from what you got for 'néz' (6e c3 a9 7a). What does that mean?

Thanks,
r0ller
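
On what the difference might mean, here is a minimal sketch, assuming the terminal (or the locale the shell runs in) is producing ISO-8859-1 (Latin-1) rather than UTF-8. The single byte e9 is how Latin-1 encodes `é', while UTF-8 encodes the same character (U+00E9) as the two bytes c3 a9. If that assumption holds, converting the first byte sequence with iconv should reproduce the second. The helper name latin1_to_utf8_hex is hypothetical, and od is used instead of hexdump only to keep the output compact.

```sh
# latin1_to_utf8_hex (hypothetical helper): treat stdin as ISO-8859-1,
# convert it to UTF-8, and dump the resulting bytes in hex.
latin1_to_utf8_hex() {
    iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1
}

# The bytes printf emitted for the typed 'néz' (octal 351 = hex e9).
printf 'n\351z' | latin1_to_utf8_hex
# prints the bytes: 6e c3 a9 7a
```

If the converted bytes match what $'n\uE9z' produced, the typed `é' was most likely sent as Latin-1 and not as UTF-8.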

On 1/20/23 9:55 AM, RVP wrote:
On Fri, 20 Jan 2023, r0ller wrote:

Thanks for your efforts to reproduce it :) I just don't get why it works for you with the same locales and why it doesn't for me. Are there any other settings that affect encoding besides LC variables and LANG?


Since we seem to have the same flookup binary, check against the
magyar.fst I used:

https://github.com/r0ller/alice/tree/master/hi_android/foma

Next check that the input you're feeding to flookup actually _is_
UTF-8. Both /bin/sh and bash output UTF-8 if given Unicode code
points in the form `\uNNNN' inside $'...' quoting. So,

$ printf 'néz' | hexdump -C
00000000  6e c3 a9 7a                                       |n..z|
00000004
$ printf $'n\uE9z' | hexdump -C
00000000  6e c3 a9 7a                                       |n..z|
00000004
$

If that works, then check those UTF-8 bytes against whatever the
terminal emulator generated from your keystrokes for the `é'
in `néz'.

-RVP
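
The check against the terminal's bytes can be sketched as follows. This is only an illustration: typed_bytes and is_utf8 are hypothetical helper names, and it assumes that iconv rejects malformed UTF-8 with a non-zero exit status, which common implementations do.

```sh
# typed_bytes (hypothetical): read one line from stdin and dump its bytes
# in hex. Run it interactively, type néz, and press Enter.
typed_bytes() {
    IFS= read -r line
    printf '%s' "$line" | od -An -tx1
}

# is_utf8 (hypothetical): succeed only if stdin is well-formed UTF-8.
# Assumption: iconv exits non-zero on an illegal input sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Non-interactive demonstration with the two byte sequences from this thread.
printf 'n\303\251z' | is_utf8 && echo "6e c3 a9 7a: valid UTF-8"
printf 'n\351z'     | is_utf8 || echo "6e e9 7a: not valid UTF-8"
```

Running typed_bytes in the same terminal that feeds flookup shows what the keystrokes actually produce; anything other than c3 a9 for the `é' suggests the terminal emulator is not sending UTF-8.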

