tech-userlevel archive


Re: sed(1) and LC_CTYPE



On Wed, Jul 26, 2023 at 12:19:39PM -0400, Mouse wrote:
> > $ export LC_CTYPE=fr_FR.ISO8859-15
> 
> > $ echo "éé" | sed 's/é/\é/g'
> > sed: 1: "s/é/\é/g": RE error: trailing backslash (\)
> 
> I agree that's broken.
> 
> > Since, to my knowledge, we do not support anything via iconv or
> > whatever, shouldn't we assume simply a string of bytes à la C, that
> > is:
> 
> Seems to me there's a deeper problem.  Even if something like iconv
> _were_ available, fr_FR.ISO8859-15 is a single-octet character set, so
> 
> > -	(void) setlocale(LC_ALL, "");
> > +	(void) setlocale(LC_ALL, "POSIX");
> 
> should, it seems to me, make no difference.  Am I misunderstanding?

Indeed - and it only makes a difference on architectures where plain
char is signed:

$ export LC_CTYPE=fr_FR.ISO8859-15
$ echo "éé" | sed 's/é/\é/g'
sed: 1: "s/é/\é/g": RE error: trailing backslash (\)
$ uname -a
NetBSD seven-days-to-the-wolves.aprisoft.de 10.99.6 NetBSD 10.99.6 (GENERIC) #642: Wed Jul 26 10:32:32 CEST 2023  martin%seven-days-to-the-wolves.aprisoft.de@localhost:/work/src/sys/arch/amd64/compile/GENERIC amd64


but:

$ export LC_CTYPE=fr_FR.ISO8859-15
$ echo "éé" | sed 's/é/\é/g'
éé
$ uname -a
NetBSD big-apple.aprisoft.de 10.99.4 NetBSD 10.99.4 (POWERMAC_G5.MP) #100: Tue Jun 27 19:48:49 CEST 2023  martin%seven-days-to-the-wolves.aprisoft.de@localhost:/work/src/sys/arch/macppc/compile/POWERMAC_G5.MP macppc


Fun :-)

Thierry, can you please file a PR?

Martin

