tech-userlevel archive


Re: Next steps for /bin/sh



On Fri, Mar 11, 2016 at 07:53:16AM +0700, Robert Elz wrote:
> 
> One of those changes was to stop using the shell's own private isalpha()
> macros (they have different names - and "stop using" meant to redefine them
> in terms of <ctype.h> and isalpha() etc.)
> 
> In 2010, FreeBSD undid that change, with a commit log entry that
> reads ...
> 
>    sh: Do not use locale for determining if something is a name.
> 
>    This makes it impossible to use locale-specific characters in variable
>    names.
> 
>    Names containing locale-specific characters make scripts only work with the
>    correct locale setting. Also, they did not even work in many practical cases
>    because multibyte character sets such as utf-8 are not supported.
> 
>    This also avoids weirdness if LC_CTYPE is changed in the middle of a script.

There is a lot of code out there that assumes that isdigit() checks
for exactly the characters "0123456789" (and that also assumes they are
consecutive). The original ctype functions made this test cheap,
but I'm not at all certain that the locale-specific functions
are either fast or do what the writer intended.

The same is (probably) true when using isalpha() to check for [a-zA-Z]
in variable names. You typically don't want to allow locale-specific
characters, probably not even the ones in the 8859-x character sets.

isxxx() functions that always test against the C locale, whatever
LC_CTYPE is set to, would be sane in many places.
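
As a rough illustration of what I mean, here is a minimal sketch of
locale-independent tests. The names c_isdigit(), c_isalpha() and
c_is_name() are made up for this example and are not taken from any
existing sh source. The letter range checks assume an ASCII-compatible
execution character set (the C standard only guarantees that the digits
are contiguous).

```c
#include <stdbool.h>
#include <stddef.h>

/* True only for '0'..'9', regardless of the current LC_CTYPE. */
bool c_isdigit(int c)
{
	return c >= '0' && c <= '9';
}

/* True only for [a-zA-Z]; assumes ASCII-style contiguous letters. */
bool c_isalpha(int c)
{
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/*
 * True if s is a non-empty name of the form [A-Za-z_][A-Za-z0-9_]*.
 * Bytes with the high bit set (8859-x letters, UTF-8 sequences) are
 * always rejected.
 */
bool c_is_name(const char *s)
{
	size_t i;
	int c;

	if (s == NULL || s[0] == '\0')
		return false;
	c = (unsigned char)s[0];
	if (!c_isalpha(c) && c != '_')
		return false;
	for (i = 1; s[i] != '\0'; i++) {
		c = (unsigned char)s[i];
		if (!c_isalpha(c) && !c_isdigit(c) && c != '_')
			return false;
	}
	return true;
}
```

With something like this, c_is_name("foo_1") is accepted, while "1foo"
and a name containing the 8859-1 byte 0xE9 are rejected whatever the
locale is, and changing LC_CTYPE in the middle of a script has no effect.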

	David

-- 
David Laight: david%l8s.co.uk@localhost

