Subject: Re: Replacement for grep(1)
To: Jamie Howard <firstname.lastname@example.org>
From: Dag-Erling Smorgrav <email@example.com>
Date: 07/07/1999 21:31:10
Jamie Howard <firstname.lastname@example.org> writes:
> On Sun, 4 Jul 1999, Archie Cobbs wrote:
> > There are two special cases of bracket expressions: the
> > bracket expressions `[[:<:]]' and `[[:>:]]' match the null
> > string at the beginning and end of a word respectively.
> > Perhaps this will help with -w?
> Yes, I received a patch from Simon Burge which implements this. It also
> beats using [^A-Za-z] and [A-Za-z$] as I was and GNU grep does.
No, because there are scripts out there (e.g. ports/Mk/bsd.port.mk)
which rely on this behaviour.
I suggest you explore the magic of the nmatch and pmatch arguments to
regexec() :) Specifically, the pattern matched a word if:
((pmatch[0].rm_so == 0 || !isalpha(line[pmatch[0].rm_so-1]))
&& (pmatch[0].rm_eo == len || !isalpha(line[pmatch[0].rm_eo])))
This is off the top of my head, from reading the man page: you'll have
to try it out to see if it works.
You might want to replace isalpha with something less restrictive,
such as isalnum(), or:
#define isword(x) (isalnum(x) || (x) == '_')
(judging from empirical observation, the latter corresponds to what
GNU grep does)
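
For illustration, here is a minimal sketch of the word check described
above, using isword() in place of isalpha(). The helper name
match_is_word() is hypothetical, and it assumes that re was compiled with
regcomp() and that line is NUL-terminated at line[len]. It is untested
against any particular grep.

```c
#include <ctype.h>
#include <regex.h>
#include <stddef.h>

/* Word-constituent test: alphanumerics plus underscore. */
#define isword(x) (isalnum((unsigned char)(x)) || (x) == '_')

/*
 * Hypothetical helper: returns 1 if the first match of re in line is
 * bounded on both sides by a non-word character or a line boundary,
 * and 0 otherwise (including when there is no match at all).
 */
static int
match_is_word(const regex_t *re, const char *line, size_t len)
{
    regmatch_t pmatch[1];   /* pmatch[0] describes the whole match */

    if (regexec(re, line, 1, pmatch, 0) != 0)
        return 0;
    return (pmatch[0].rm_so == 0 ||
            !isword(line[pmatch[0].rm_so - 1])) &&
           ((size_t)pmatch[0].rm_eo == len ||
            !isword(line[pmatch[0].rm_eo]));
}
```

Note that this only inspects the first match regexec() reports. A line
such as "foobar foo" would be rejected for the pattern "foo" even though
a later occurrence is a whole word, so a complete -w would presumably
have to retry further along the line.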
As for full-line matches (-x), simply check that
(pmatch[0].rm_so == 0 && pmatch[0].rm_eo == len)
This should save you from playing games with back-references.
(both code snippets assume that line points to a line of text from the
input and that len is the length of that line minus the newline)
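
A rough sketch of the full-line check under those assumptions follows.
The helper name match_is_line() is hypothetical. It strips one trailing
newline itself to obtain len, and it relies on regexec() reporting the
leftmost-longest match, as POSIX specifies.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the match reported by regexec()
 * spans the entire line, 0 otherwise.  A single trailing newline is
 * overwritten with NUL and is not counted in len.
 */
static int
match_is_line(const regex_t *re, char *line)
{
    regmatch_t pmatch[1];   /* pmatch[0] describes the whole match */
    size_t len = strlen(line);

    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';
    if (regexec(re, line, 1, pmatch, 0) != 0)
        return 0;
    return pmatch[0].rm_so == 0 && (size_t)pmatch[0].rm_eo == len;
}
```

Because a full-line match starts at offset 0 and is as long as any match
can be, a leftmost-longest regexec() should report it whenever it exists,
so no retry loop appears to be needed for -x.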
Dag-Erling Smorgrav - email@example.com