NetBSD-Users archive


Re: sending/receiving UTF-8 characters from terminal to program



On Fri, Jan 20, 2023 at 11:26:49PM +0000, RVP wrote:
> On Fri, 20 Jan 2023, Valery Ushakov wrote:
> 
> > On Fri, Jan 20, 2023 at 15:09:44 +0100, r0ller wrote:
> > 
> > > Well, checking what printf results in, I get:
> > > 
> > > $printf 'néz'|hexdump -C
> > > 00000000  6e e9 7a                                          |n.z|
> > > 00000003
> > > $printf $'n\uE9z'|hexdump -C
> > > 00000000  6e c3 a9 7a                                       |n..z|
> > > 00000004
> > > 
> > > It's definitely different from what you got for 'néz'. What does
> > > that mean?
> > 
> > In the second example you specify \uE9 which is the unicode code point
> > > for e with acute.  It is then unconditionally converted by printf to
> > UTF-8 (which is two bytes: 0xc3 0xa9) on output.
> > 
> > Your terminal input is in 8859-1 it seems.
> > 
> 
> That's it. The terminal emulator is not generating UTF-8 from the keyboard
> input.
> 
> > Maybe you need to specify the -u8 option or the utf8 resource?
> > 
> 
> That would work. So would running uxterm instead of xterm, but all of
> these mess up command-line editing: Alt+key is converted into a character
> code instead of an ESC+key sequence.

Perhaps you're referring to the eightBitInput resource (see the xterm manpage).

> R0ller, do this:
> 
> 1. Add your locale settings in ~/.xinitrc (or ~/.xsession if using xdm):
> 
> export LANG=hu_HU.UTF-8
> export LC_CTYPE=hu_HU.UTF-8
> export LC_MESSAGES=hu_HU.UTF-8

R0ller wasn't clear about whether this was done (outside the terminal).

Actually, R0ller didn't mention whether the terminal was the graphical
environment or the console (from the comments, I assumed the latter).
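
One way to settle whether the settings from step 1 actually reached the
shell is to ask for the locale's character map from inside the terminal.
A minimal sketch, assuming a POSIX sh and the standard locale utility;
the helper name is_utf8_charmap is made up for illustration:

```shell
#!/bin/sh
# is_utf8_charmap is a hypothetical helper, not an existing tool.
# It succeeds when the given character-map name denotes UTF-8.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# "locale charmap" prints the encoding of the current LC_CTYPE setting.
current=$(locale charmap 2>/dev/null)

if is_utf8_charmap "$current"; then
    echo "UTF-8 locale is active (charmap: $current)"
else
    echo "not a UTF-8 locale (charmap: ${current:-unknown})"
fi
```

If this reports a non-UTF-8 character map inside the terminal, the exports
in ~/.xinitrc or ~/.xsession were probably not inherited by it.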
 
> 2. In ~/.Xresources, tell xterm to use the current locale when generating
>    characters:
> 
> XTerm*locale: true

that's redundant, since the default "medium" will give the same effect :-)
 
>    The `-lc' option does the same thing. If using uxterm, the class-name
>    becomes `UXTerm'.
> 
> 
> 
> 
> On Fri, 20 Jan 2023, Robert Elz wrote:
> 
> > I believe bash will take your current locale into account
> > when doing that [...]
> > 
> 
> That's correct. But as r0ller had a UTF-8 locale set, I didn't mention that.
> However, it is better to be precise, so thank you!
> 
> -RVP

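For reference, the two hexdump results quoted at the top can be reproduced
without depending on what the keyboard sends. This is only a sketch,
assuming a POSIX sh with od, tr, sed and iconv available; hex_bytes is a
hypothetical helper name:

```shell
#!/bin/sh
# hex_bytes is a hypothetical helper: it prints stdin as space-separated
# hex bytes, normalizing the whitespace that od emits.
hex_bytes() {
    od -An -tx1 | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'
}

# e with acute as one ISO-8859-1 byte: 0xe9 (octal 351).
printf 'n\351z' | hex_bytes
echo

# The same character as UTF-8: 0xc3 0xa9 (octal 303 251).
printf 'n\303\251z' | hex_bytes
echo

# Converting the 8859-1 bytes with iconv gives the UTF-8 form.
printf 'n\351z' | iconv -f ISO-8859-1 -t UTF-8 | hex_bytes
echo
```

The first line of output corresponds to what the terminal sent in the
8859-1 case, and the last two to what a UTF-8 terminal is expected to send.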
-- 
Thomas E. Dickey <dickey%invisible-island.net@localhost>
https://invisible-island.net



