Subject: bin/650: possible bug with regular expressions in awk
To: None <gnats-admin@sun-lamp.cs.berkeley.edu>
From: None <ram@cs.arizona.edu>
List: netbsd-bugs
Date: 12/20/1994 08:35:05
>Number:         650
>Category:       bin
>Synopsis:       backslash does not escape *
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people (Utility Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Dec 20 08:35:03 1994
>Originator:     
>Organization:
"	"
>Release:        
>Environment:
	
System: netbsd 1.0 i386


>Description:
	The awk sub() function does not treat a backslash-escaped * in its
	regular expression argument as a literal asterisk.
>How-To-Repeat:


BTW -- I "ported" send-pr to my Solaris machine to send this email report.
Sorry if I left something out.


	The file bar contains:
janis:ram {58} cat bar
MOVIE RATINGS REPORT

New  Distribution  Votes  Rank  Title
      .0.03020..      11   5.5  $
      .121100...      11   4.0  $1,000,000 Duck
      0011211000     142   5.0  'burbs, The
      0000122100     418   6.4  'Crocodile' Dundee
      .2.224....       5   4.6  After School
      ..2.24....       7   4.9  After the Fall of New York
      ...0522...      14   5.6  After the Fox
 *    ...242.2..       5   5.6  After the Rehearsal
janis:ram {59} 


Use this awk program to see the problem:

#!/usr/bin/awk -f

BEGIN {
  in_report=0
}

/MOVIE RATINGS REPORT/ {
  in_report=1
}

in_report==1 && /[0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.][0-9\.]/ {
  new_line = $0;
  sub( "^ \* ", " X ", new_line);
  print new_line
}


Running "cat bar | filter_ratings" gives:
 X .0.03020..      11   5.5  $
 X .121100...      11   4.0  $1,000,000 Duck
 X 0011211000     142   5.0  'burbs, The
 X 0000122100     418   6.4  'Crocodile' Dundee
 X .2.224....       5   4.6  After School
 X ..2.24....       7   4.9  After the Fall of New York
 X ...0522...      14   5.6  After the Fox
 X *    ...242.2..       5   5.6  After the Rehearsal

Apparently sub is matching all leading white space, when the backslash
should force the regexp to match a space, an asterisk, and a space.
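
One possible explanation, which this report does not confirm: awk scans a
string constant used as a regular expression twice, first as a string and
then as a regexp. An awk that drops the backslash from the unrecognized
string escape \* during the first scan hands sub the regexp "^ * ", which
matches any run of leading spaces, as in the output above. The sketch below
shows two spellings that avoid the ambiguity. The function names mark_new
and mark_new_str are hypothetical and are not part of filter_ratings.

```shell
# Hypothetical helpers, not from the original report.
# Each replaces a leading " * " marker with " X " and prints every line.

mark_new() {
  # A regex literal is scanned once, so \* reaches the matcher
  # as a literal asterisk.
  awk '{ sub(/^ \* /, " X "); print }'
}

mark_new_str() {
  # In a string constant the backslash is doubled: "^ \\* " becomes
  # the regexp ^ \*  after string-escape processing.
  awk '{ sub("^ \\* ", " X "); print }'
}
```

With either spelling, "cat bar | mark_new" should change only the line that
begins with " * " and leave the other lines as they are.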

>Fix:
	I have no idea.
>Audit-Trail:
>Unformatted: