Subject: AWK vs. gawk
To: None <netbsd-help@netbsd.org>
From: Richard Rauch <rkr@olib.org>
List: netbsd-help
Date: 05/11/2004 19:05:36
Sorry for not replying directly.  I don't have direct replies on
my system (I never subscribe to these lists; (^&).  Thanks for the
replies, though.  (If you did send a direct reply, it may have
been postponed.  There was a prolonged outage to my mail server
today.)


To James K. Lowden:

I could not see a re_format(7) reference in the awk man page on my
system.  Possibly if I update userland again, I'll get a more
up-to-date man page.  (The re_format(7) page, itself, is there.  It
just doesn't seem to be mentioned in the awk page.)

The ? is not listed as an end-of-line anchor.  ? is for 0-or-1
occurrences of the preceding atom.  (So, matched against the whole
string, "a?" would match "" or "a", but not "aa".)
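
A quick way to check that reading of ? is sketched below.  The function
name is made up for illustration, and the pattern is anchored with ^ and
$ so that the whole line has to match; unanchored, "a?" would match
somewhere in any string.

```shell
# Report whether the whole argument matches "a?" (zero or one "a").
matches_optional_a() {
    printf '%s\n' "$1" |
        awk '{ if ($0 ~ /^a?$/) print "match"; else print "no match" }'
}

matches_optional_a ''
matches_optional_a 'a'
matches_optional_a 'aa'
```

Only the last call should report "no match".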

The AWK man page does reference, e.g., egrep(1).  The egrep(1) page
covers the usual suspects, and then describes "basic" regular
expressions as a restricted subset.

From that I infer that AWK should support $ for end-of-string.
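
That inference is easy enough to test.  A minimal sketch, assuming only
POSIX awk features (the function name is hypothetical):

```shell
# Report whether the argument ends with a literal ")".
ends_with_paren() {
    printf '%s\n' "$1" |
        awk '{ if ($0 ~ /\)$/) print "yes"; else print "no" }'
}

ends_with_paren 'func (void)'
ends_with_paren 'func (void) {'
```

If $ anchors at end-of-string, the first call prints "yes" and the
second prints "no".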



To Jukka Salmi:

Hm.  The / bounding of the expressions seems to be okay.  Maybe
I should go that way.  I used quotes because I was led to believe
that the r.e. should be a string.  (It looks like generally one
can use either quotes or slashes, though.)


E.g.,

$ echo 'Hello!' | awk '{print gensub("ello", "i", "g", $0)}'

...prints "Hi!" as I expect.
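
For comparison, here is a sketch of the same substitution using gsub(),
which is in POSIX awk (gensub() is not).  It shows the pattern given
both as a /.../ literal and as a quoted string.  The function names are
made up for illustration.

```shell
# Pattern written as a regular-expression literal.
hello_to_hi_literal() {
    printf '%s\n' "$1" | awk '{ s = $0; gsub(/ello/, "i", s); print s }'
}

# Same pattern written as a string, used as a dynamic regex.
hello_to_hi_string() {
    printf '%s\n' "$1" | awk '{ s = $0; gsub("ello", "i", s); print s }'
}

hello_to_hi_literal 'Hello!'
hello_to_hi_string 'Hello!'
```

Both calls print "Hi!".  One difference to keep in mind: in the string
form, a backslash has to be doubled to reach the regex engine, because
the string is processed for escapes first.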


Re. the semicolon: Oops.  That was a typo.  The real string being
matched is NOT a C function prototype, but rather a C function header:

/*!
    ...doxygen commentary...
*/
void func (
  void (*fptr) (void)
)
{
    ...function body...
}


The AWK program is transforming the doxygen commentary & function header
into a *roff-style man page.  The code works by first collecting lines
of data and classifying them (it doesn't really grok C, but for the case
at hand, the code is in a style where that's not really necessary).
After it's collected the lines in suitable groups, merging some and
putting others in AWK associative arrays (indexed by number, perversely;
(^&), it then does a cascade of operations---mostly gensub() calls---on
the data to transform it, and then plugs suitable classes of data into
*roff lines.  So it catenates the header above into:

void func (  void (*fptr) (void))

...and then tries to strip off the "void func (  " prefix and ")" suffix.

It does work.  It just currently requires gawk to run.
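
The catenate-and-strip step on its own can be sketched with nothing but
sub(), which might help with awks that lack gensub().  This is not the
real program, only an illustration; the function name and the two
patterns are assumptions chosen to fit the header shown above.

```shell
# Read a function header on stdin, join its lines, strip the wrapper.
strip_header() {
    awk '
        { header = header $0 }             # catenate lines, no separator
        END {
            sub(/^[^(]*\( */, "", header)  # drop "void func (  "
            sub(/\)$/, "", header)         # drop the final ")"
            print header
        }
    '
}

printf '%s\n' 'void func (' '  void (*fptr) (void)' ')' | strip_header
```

For the three header lines above this prints "void (*fptr) (void)".  The
first pattern stops at the first "(", so it relies on the function name
and return type containing no parentheses.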

-- 
  "I probably don't know what I'm talking about."  http://www.olib.org/~rkr/