NetBSD-Bugs archive
bin/54424: awk: broken character classes in UTF-8 locale: only the first matches
>Number: 54424
>Category: bin
>Synopsis: awk: broken character classes in UTF-8 locale: only the first matches
>Confidential: no
>Severity: serious
>Priority: high
>Responsible: bin-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jul 31 23:05:00 +0000 2019
>Originator: Martijn Dekker
>Release: 9.0_BETA
>Organization:
modernish
>Environment:
NetBSD localhost 9.0_BETA NetBSD 9.0_BETA (GENERIC) #0: Tue Jul 30 16:52:10 UTC 2019 mkrepro%mkrepro.NetBSD.org@localhost:/usr/src/sys/arch/amd64/compile/GENERIC amd64
>Description:
When a UTF-8 locale is active, /usr/bin/awk matches only the first character class listed in a bracket expression and ignores any later ones, even when matching simple ASCII characters.
I've confirmed this on NetBSD 8.1 as well. I've not tested earlier versions.
The /usr/bin/awk implementations on OpenBSD, FreeBSD and macOS (also nawk variants) do not have this problem, nor does the current upstream version (20190717).
>How-To-Repeat:
$ echo x | LANG=C awk '/[[:digit:][:alpha:]]/' # ok
x
$ echo x | LANG=en_US.UTF-8 awk '/[[:digit:][:alpha:]]/' # WRONG
$ echo x | LANG=en_US.UTF-8 awk '/[[:alpha:][:digit:]]/' # ok
x
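
The three commands above can be wrapped in a small script to check other awk binaries or locales. This is only a sketch: the function name check_bracket_order and the AWK variable are made up for illustration and are not part of the original report. It sets LC_ALL rather than LANG so that any LC_* variables already in the environment cannot mask the result.

```sh
#!/bin/sh
# Hypothetical helper (not from the original report): check whether a
# bracket expression holding two character classes matches "x" no matter
# which class is listed first.
#   $1 = awk command to test, $2 = locale name
check_bracket_order() {
    awk_cmd=$1
    locale=$2
    status=ok
    for re in '[[:digit:][:alpha:]]' '[[:alpha:][:digit:]]'; do
        # A working awk prints the input line "x" for both orderings.
        out=$(echo x | LC_ALL=$locale "$awk_cmd" "/$re/")
        if [ "$out" != x ]; then
            echo "$locale: /$re/ failed to match"
            status=WRONG
        fi
    done
    echo "$locale: $status"
}

# AWK is an assumed override variable; it defaults to the awk found in PATH.
check_bracket_order "${AWK:-awk}" C
check_bracket_order "${AWK:-awk}" en_US.UTF-8
```

If the en_US.UTF-8 locale is not installed, awk may silently fall back to the C locale, and the second call would then report ok without actually exercising the bug.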
>Fix: