Subject: AWK vs. gawk.
To: None <netbsd-help@netbsd.org>
From: Richard Rauch <rkr@olib.org>
List: netbsd-help
Date: 05/11/2004 05:30:50

Right around the time that NetBSD -current switched from using gawk to
using nawk (I think) as the system AWK, I had written a smallish
AWK script to parse Doxygen documentation and spit out man pages.
(Doxygen can generate *roff, but it's not really usable for man
pages.  (^&)

A large part of the AWK script consists of gensub() calls.

Shortly after I wrote it, I updated one of my two -current
machines (more or less -current; (^&) and discovered that
gawk was replaced with another AWK.  I found this out when
my script stopped working.  (^&

I installed gawk from pkgsrc as an interim solution, but
I'd like the script to work with the native AWK.


I am particularly bothered by a missing feature from the NetBSD
AWK, in the regular expressions: I need to anchor some of my
matches to beginnings or ends of strings.  For example, one of
my gensub calls is:

  ret = gensub ("^[ ]*\\([ ]*", "", "g", ret );

(I use the [brackets] as a habit since I often want space-or-TAB,
though here it's just spaces.)

What this does is strip leading blanks, followed by the open-paren,
of a function argument list.

This should work without the ^, on the principle of matching the
first thing it finds.  But then, later, I want to strip the *trailing*
close-paren and blanks:

  ret = gensub ("[ ]*\\)[ ]*$", "", "g", ret );

...which should actually require the $ to be there.  (E.g.,
consider:

    void func (void (*fptr) (void));

...)
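
For comparison, here is a minimal sketch of the same two anchored
substitutions written without gensub(). gensub() is a gawk extension and
is not part of POSIX awk, so it may be worth ruling that out before
blaming the anchors. The sketch uses the standard sub() function with
regex literals rather than strings. The helper name strip_parens and the
sample input are made up for illustration and are not from the original
script.

```shell
#!/bin/sh
# Sketch: anchored stripping with POSIX sub() instead of gawk's gensub().
# strip_parens is a hypothetical helper name used only for this example.
strip_parens() {
  printf '%s\n' "$1" | awk '
    {
      ret = $0
      # Leading blanks plus the open-paren, only at the start of the string.
      sub(/^[ ]*\([ ]*/, "", ret)
      # The close-paren plus trailing blanks, only at the end of the string.
      sub(/[ ]*\)[ ]*$/, "", ret)
      print ret
    }'
}

strip_parens "  ( void (*fptr) (void) )  "
```

With an awk that honors ^ and $, this should print "void (*fptr) (void)":
the inner close-parens survive because only the match ending at the end
of the string is removed. Since each pattern is anchored, it can match at
most once, so sub() is enough and no "g" flag is needed.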


As near as I can tell, the NetBSD awk does not recognize
^ or $ in regular expressions.  Is there a way to enable
them?  Are they supposed to be there, but not working due
to a bug (maybe I should update my userland)?  Or am I
just out of luck?

It appears that it should be available.  (The awk page
references egrep, which implies that $ and ^ are part
of the basic regular expression syntax.)
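
One way to check this directly, independent of gensub(), is a small
probe like the sketch below. It passes the patterns as strings (dynamic
regexes), as the script does, and uses only the standard sub() function.
The function name probe_anchors and the sample string are illustrative
assumptions.

```shell
#!/bin/sh
# Sketch: probe whether the awk on PATH honors ^ and $ in a dynamic regex.
# probe_anchors and the sample string "xaxax" are illustrative only.
probe_anchors() {
  echo "xaxax" | awk '{
    s = $0
    t = $0
    sub("^x", "", s)   # should remove only the leading x
    sub("x$", "", t)   # should remove only the trailing x
    print s, t
  }'
}

probe_anchors
```

If the anchors work, the output should be "axax xaxa". If ^ and $ were
being treated as ordinary characters, neither pattern would match and
the output would be "xaxax xaxax".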


-- 
  "I probably don't know what I'm talking about."  http://www.olib.org/~rkr/
