NetBSD-Users archive


Re: sending/receiving UTF-8 characters from terminal to program



On Thu, 19 Jan 2023, Rhialto wrote:

I think there is some encoding confusion going on here. I still use
boring old Latin-1 (iso 8859-1), and I saw the two occurrences of the
word n<something>z differently.

echo néz|flookup magyar.fst
echo néz|flookup magyar.fst

In the first case, the letter in the middle looks like a single e with an
acute accent; in the second, it shows up as 2 characters, probably a UTF-8
encoding.
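
The one-letter versus two-character difference can be seen at the byte level. The following is a minimal sketch, not taken from the thread: `hexbytes` is a hypothetical helper name, and the bytes are written as octal escapes so the result does not depend on the terminal's own encoding.

```shell
# hexbytes is a hypothetical helper: dump stdin as hex bytes.
hexbytes() { od -An -tx1; }

# "néz" in ISO 8859-1: the é is the single byte e9 (octal 351).
printf 'n\351z' | hexbytes        # bytes: 6e e9 7a

# "néz" in UTF-8: the é is the two bytes c3 a9 (octal 303 251).
printf 'n\303\251z' | hexbytes    # bytes: 6e c3 a9 7a
```

A Latin-1 terminal shows the two bytes c3 a9 as two separate characters, which matches the second occurrence described above.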


Yeah, some kind of encoding mismatch is responsible. Everything worked
for me because

a) all the text I copy-pasted was UTF-8 (even the 2nd one, which,
   unsurprisingly, didn't work)

b) flookup was OK with UTF-8 input (or it converted my UTF-8 input to match
   the text encoding in magyar.fst)

c) text in magyar.fst was in UTF-8/Unicode (or, if another encoding, then
   flookup did the conversion before doing the text lookup.)

b) and c) are educated guesses.
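
If the guesses in b) and c) hold, a Latin-1 terminal would need its input converted before the lookup. A rough sketch under that assumption follows; `to_utf8` is a hypothetical helper name, and whether flookup really expects UTF-8 here is not confirmed.

```shell
# Assumption: the terminal sends ISO 8859-1 and flookup wants UTF-8.
# to_utf8 is a hypothetical helper name, not part of flookup.
to_utf8() { iconv -f ISO-8859-1 -t UTF-8; }

# Check the conversion on its own: é (octal 351) becomes bytes c3 a9.
printf 'n\351z\n' | to_utf8 | od -An -tx1

# On a Latin-1 terminal, the lookup would then become something like:
#   echo néz | to_utf8 | flookup magyar.fst
```

Running the conversion on input that is already UTF-8 would garble it again, so this only helps when the terminal really is sending Latin-1.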

-RVP

