lib/44603: editline el_gets drops many UTF-8 characters

To: lib-bug-people%netbsd.org@localhost,gnats-admin%netbsd.org@localhost,netbsd-bugs%netbsd.org@localhost
Subject: lib/44603: editline el_gets drops many UTF-8 characters
From: steve.vernon%citrix.com@localhost
Date: Fri, 18 Feb 2011 23:05:00 +0000 (UTC)

>Number:         44603
>Category:       lib
>Synopsis:       editline el_gets drops many UTF-8 characters
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Feb 18 23:05:00 +0000 2011
>Originator:     Steven Vernon
>Release:        sources as of 2011/02/04
>Organization:
Citrix
>Environment:
>Description:
When using el_gets() in editline, which is called by the readline emulation 
function readline(), multi-byte characters are always dropped. This is 
incorrect for UTF-8 because many UTF-8 characters are multi-byte (all non-ASCII 
characters).
>How-To-Repeat:
Use either el_gets() or readline() when compiled for UTF-8 (build with 
WIDECHAR, which is the default) and set the locale to some UTF-8 variant, such 
as en_US.UTF-8 (e.g. set the environment variable LC_ALL to this).
>Fix:
el_gets() unconditionally sets IGNORE_EXTCHARS before calling el_wgets() (and 
then resets it after the call). This causes read_char() to drop multi-byte 
characters.

There are 2 possible solutions:
1) Only set IGNORE_EXTCHARS if CHARSET_IS_UTF8 is not set (and don't unset it 
after the call to el_wgets()), as is done in el_getc().
2) Have read_char() not honor IGNORE_EXTCHARS if CHARSET_IS_UTF8 is set. 
Offhand, this seems like the better, more correct solution, but it could 
affect more paths through the code. If you do this, you should probably 
remove the code in el_getc() that conditionally sets and unsets 
IGNORE_EXTCHARS.
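
To illustrate the difference between the current behavior and solution 1, 
here is a minimal standalone sketch. It is a toy model, not libedit code: the 
flag values and the model_*() helpers are hypothetical, and model_read() only 
approximates read_char() by dropping every byte with the high bit set whenever 
IGNORE_EXTCHARS is set.

```c
#include <stddef.h>

/* Hypothetical flag values for this sketch; the real ones are in libedit. */
#define CHARSET_IS_UTF8  0x10u
#define IGNORE_EXTCHARS  0x20u

/*
 * Toy stand-in for read_char(): copy `in` to `out`, dropping non-ASCII
 * bytes while IGNORE_EXTCHARS is set. Returns the number of bytes kept.
 */
static size_t
model_read(unsigned flags, const char *in, char *out)
{
	size_t n = 0;

	for (; *in != '\0'; in++) {
		if ((flags & IGNORE_EXTCHARS) && ((unsigned char)*in & 0x80))
			continue;
		out[n++] = *in;
	}
	out[n] = '\0';
	return n;
}

/* Behavior described above: set the flag unconditionally, then reset it. */
static size_t
model_gets_current(unsigned *flags, const char *in, char *out)
{
	size_t n;

	*flags |= IGNORE_EXTCHARS;
	n = model_read(*flags, in, out);
	*flags &= ~IGNORE_EXTCHARS;
	return n;
}

/* Solution 1: set the flag only when the charset is not UTF-8. */
static size_t
model_gets_option1(unsigned *flags, const char *in, char *out)
{
	if (!(*flags & CHARSET_IS_UTF8))
		*flags |= IGNORE_EXTCHARS;
	return model_read(*flags, in, out);
}
```

With CHARSET_IS_UTF8 set and the 5-byte input "caf\xc3\xa9", 
model_gets_current() keeps only the 3 ASCII bytes, while model_gets_option1() 
keeps all 5. Without CHARSET_IS_UTF8, both variants drop the non-ASCII bytes.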

More testing on UTF-8 should be done.
