Subject: bin/30294: nawk doesn't handle RS as a RE but as a single character
To: None <gnats-admin@netbsd.org, netbsd-bugs@netbsd.org>
From: None <John.P.Darrow@wheaton.edu>
List: netbsd-bugs
Date: 05/21/2005 00:28:00
>Number:         30294
>Category:       bin
>Synopsis:       nawk doesn't handle RS as a RE but as a single character
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat May 21 00:28:00 +0000 2005
>Originator:     John Darrow
>Release:        NetBSD 2.0_STABLE and beyond
>Organization:
	Wheaton College Computing Services
>Environment:
System: NetBSD jdarrowpiii1g.wheaton.edu 2.0_STABLE NetBSD 2.0_STABLE (JDARROW) #1: Fri Mar 11 05:20:17 CST 2005 jdarrow@haran.wheaton.edu:/var/obj.release-2/sys/arch/i386/compile/JDARROW i386
Architecture: i386
Machine: i386
4505671 288 -r-xr-xr-x  1 root  wheel  133550 Feb 12 07:27 /usr/bin/awk*
/usr/bin/awk: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for NetBSD 2.0, dynamically linked (uses shared libs), not stripped

>Description:
	nawk allows a complete string to be put into the RS variable,
	but does not treat that string as a regular expression for
	record splitting purposes - instead, it splits only on the
	first character of the string.
>How-To-Repeat:
Feed the following line of data:

	1.1.1.1 2.2.2.2 3.3.3.3 4.4.4.4 5.5.5.5

on standard input to the following code fragment, which worked under 1.6.2's awk (gawk):

	awk ' BEGIN { FS="."; RS="[[:space:]]"; } { x=10000+(256*(256*(256*$1+$2)+$3)+$4)%2000000000; printf "%d deny ip host %d.%d.%d.%d any\n",x,$1,$2,$3,$4; } '

Only one line is returned:

	16853009 deny ip host 1.1.1.1 any

However, feed it the odd-looking line:

	1.1.1.1[2.2.2.2[3.3.3.3[4.4.4.4[5.5.5.5

and five results come back (the records are split on "[", the first
character of RS), the five I _wanted_ from the spaced line:

	16853009 deny ip host 1.1.1.1 any
	33696018 deny ip host 2.2.2.2 any
	50539027 deny ip host 3.3.3.3 any
	67382036 deny ip host 4.4.4.4 any
	84225045 deny ip host 5.5.5.5 any

(The same thing happens if I use "[ \t\n]+" instead of the POSIX
character class inside brackets "[[:space:]]".)
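
A shorter check of the same behavior can be sketched as follows. It only counts records, leaving out the ACL arithmetic. The helper name count_records and the AWK override variable are assumptions made for this illustration; they are not part of the original report.

```shell
# count_records is a hypothetical helper: it prints how many records awk
# sees on standard input when RS is the bracket expression from the report.
# AWK is an assumed override for selecting the awk binary under test.
count_records() {
	"${AWK:-awk}" 'BEGIN { RS = "[[:space:]]" } END { print NR }'
}

printf '1.1.1.1 2.2.2.2 3.3.3.3 4.4.4.4 5.5.5.5\n' | count_records
```

Going by the results above, an affected awk would be expected to print 1 here, while an awk that treats RS as a regular expression (such as gawk) should print 5.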

>Fix:
	Not known.  Workaround: revert to gawk.
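
Another possible workaround, sketched here under the assumption that the input may be preprocessed, is to turn every run of whitespace into a newline with tr so that awk's default RS is enough. The function name to_acl and the NF == 4 guard (which skips empty or malformed lines) are additions for this sketch, not part of the original fragment.

```shell
# Workaround sketch: normalize whitespace to newlines before awk reads the
# data, so the default newline RS suffices and no regex RS is needed.
# to_acl is a hypothetical name; the NF == 4 guard is an added safety check.
to_acl() {
	tr -s '[:space:]' '\n' |
	awk 'BEGIN { FS = "." }
	NF == 4 {
		x = 10000 + (256 * (256 * (256 * $1 + $2) + $3) + $4) % 2000000000
		printf "%d deny ip host %d.%d.%d.%d any\n", x, $1, $2, $3, $4
	}'
}

printf '1.1.1.1 2.2.2.2 3.3.3.3 4.4.4.4 5.5.5.5\n' | to_acl
```

This avoids depending on how a particular awk interprets a multi-character RS, at the cost of an extra process in the pipeline.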

>Unformatted:
 (code changes between netbsd-2 and -current appear to be minimal, so
 this bug would still be present as of 2005-05-20 19:57 UTC)