Subject: Re: bin/30294
To: None <gnats-admin@netbsd.org, netbsd-bugs@netbsd.org>
From: John Darrow <John.P.Darrow@wheaton.edu>
List: netbsd-bugs
Date: 07/09/2005 20:14:02
The following reply was made to PR bin/30294; it has been noted by GNATS.

From: John Darrow <John.P.Darrow@wheaton.edu>
To: jdolecek@netbsd.org
Cc: gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
	John.P.Darrow@wheaton.edu
Subject: Re: bin/30294
Date: Sat, 9 Jul 2005 14:01:53 -0600

 On Sat, Jul 02, 2005 at 08:51:52PM +0000, jdolecek@netbsd.org wrote:
 > It behaves according to its documentation - RS is described as
 > 'input record separator' contrary to e.g. FS 'regular expression used
 > to separate fields'. Apparently it's not supposed to be a RE, so
 > this doesn't seem to be a bug.
 
 I beg to differ:
 
 1. It's a feature regression.  A script written according to the man
 page of "the system awk shipped with 1.6.2 and earlier" no longer
 works with "the system awk shipped with 2.0 and later".  _IF_ this
 sort of feature regression is acceptable, it should be marked with
 BIG WARNINGS in the man page, and the 2.0+ awk should _at least_ print
 a warning (if not exit with an error) if a program attempts to assign
 more than one character to RS.
 
 2. The phrase "input record separator" does not provide any semantic
 information as to the format of such a separator.  I cannot find
 anywhere in the 2.0+ awk man page that specifies that RS (or ORS) is
 limited to a single character.  Given its position shortly after FS in
 the man page, it becomes a very reasonable assumption that the
 semantics are identical, simply omitted the second time to avoid
 longwindedness and redundancy in the man page.
 
 3. It violates the POLA (principle of least astonishment) for the
 similarly-named FS "Field Separator" and RS "Record Separator" to have
 such very different semantics.
 
 4. While RS being a regular expression may be considered an
 "extension" by purists, awk already implements other "extensions",
 such as making every character a separate field when FS (or the
 third argument to the split function) is the null string.  (The
 1.6.2/gawk man page explicitly explains that such behavior is an
 extension, and gawk disables it with --traditional.)
 
 5. awk already implements special case handling for the null RS
 case (RS="").  Such special handling is not mentioned in the man page,
 though it _was_ in the 1.6.2/gawk man page.  If awk can special case
 for that non-single-character RS case, why shouldn't it also be able
 to handle other non-single-character RS cases, _especially_ when it
 already does so for FS?
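
For reference, the two RS cases from points 1 and 5 can be compared
side by side.  This is only a sketch: the input strings and variable
names are made up for illustration, and it uses whatever awk is first
in the PATH.  Paragraph mode (RS="") is specified by POSIX, so its
result should be the same everywhere; the result for a multi-character
RS depends on which awk is installed, so no expected output is given
for that case.

```shell
# Case 1: RS="" (paragraph mode).  Records are separated by blank
# lines, and newline acts as a field separator in addition to FS.
paragraphs=$(printf 'a b\nc\n\nd e\n' |
    awk 'BEGIN { RS = "" } { print NR ": NF=" NF }')
echo "$paragraphs"

# Case 2: a multi-character RS.  The record count printed here depends
# on whether the installed awk treats RS as a regular expression or
# only honors its first character.
multi=$(printf 'one--two--three\n' |
    awk 'BEGIN { RS = "--" } END { print NR }')
echo "records seen with RS=\"--\": $multi"
```

An awk that treats RS as a regular expression should report three
records in the second case; one that uses only the first character
would split on every '-' and report more.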