tech-userlevel archive


Re: sed(1) and LC_CTYPE



On Wed, Jul 26, 2023 at 05:27:51PM +0000, Taylor R Campbell wrote:
> > Date: Wed, 26 Jul 2023 17:32:03 +0200
> > From: tlaronde%polynum.com@localhost
> > 
> > If setting LC_CTYPE to this:
> > 
> > $ export LC_CTYPE=fr_FR.ISO8859-15
> > 
> > and then:
> > 
> > $ echo "éé" | sed 's/é/\é/g'
> > sed: 1: "s/é/\é/g": RE error: trailing backslash (\)
> > 
> > Where does the program manage to find a backslash, i.e. 0134, when
> > 'é' is 0351?
> 
> Exactly what bytes are passed as an argument to sed?  Can you write a
> program that will hexdump argv[1] and pass the same argument to it?
> 
> Next step, if that reveals the expected 0xe9, is to find exactly what
> string is passed to regcomp inside sed.

RVP has sent a diff against regex/regcomp.c, attached to GNATS
bin/57544. In one place, an int c receives a signed char obtained by
GETNEXT(), so the char is sign-extended when it is promoted.

I will try to compile a fixed NetBSD 10.0_BETA libc to see if this
solves the problem.
-- 
        Thierry Laronde <tlaronde +AT+ polynum +dot+ com>
                     http://www.kergis.com/
                    http://kertex.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C

