tech-userlevel archive


Re: TRE regex



On Wed, 8 Jun 2016 23:31:07 -0700
Alistair Crooks <agc%pkgsrc.org@localhost> wrote:

> On 6 June 2016 at 18:35, James K. Lowden <jklowden%schemamania.org@localhost>
> wrote:
> > Back in 2009, Matthias-Christian Ott ported Ville Laurikari's regex

> It was brought into base. The USE_LIBTRE definition causes things to
> happen in libc if it's defined.

I'm not on current, but didn't think I was too far behind, given that
so many years had passed.  Thanks for the pointer and the work.  

> Strange things happen when you attempt to compile a basic regexp with
> an implementation expecting an extended regexp, to the point where
> build.sh would not complete.

Easy to imagine.  IMO, since basic regex has been "obsolete" since the
Late Bronze Age, it would be better to move build.sh to use ERE, and
(then) make that the default in sed.  Either that, or "Having two kinds
of REs is a botch" is just whining.
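
To make the BRE/ERE mismatch concrete, here is a minimal sketch against
the POSIX <regex.h> interface.  The helper names re_matches() and
check_bre_vs_ere() are mine, purely for illustration; nothing here is
taken from build.sh, libc, or TRE.  It only shows that one and the same
pattern string means different things depending on whether REG_EXTENDED
is passed to regcomp(3).

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: compile `pattern` with `cflags` and test `text`.
 * Returns 1 on match, 0 on no match, -1 if the pattern does not compile.
 */
int
re_matches(const char *pattern, int cflags, const char *text)
{
	regex_t re;
	int rc;

	if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
		return -1;
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0 ? 1 : 0;
}

/* Hypothetical self-check: same pattern, two dialects, different answers. */
void
check_bre_vs_ere(void)
{
	/* BRE: \( \) group, so the pattern matches "abc". */
	assert(re_matches("a\\(b\\)c", 0, "abc") == 1);
	/* ERE: \( \) are literal parentheses, so "abc" no longer matches. */
	assert(re_matches("a\\(b\\)c", REG_EXTENDED, "abc") == 0);
	assert(re_matches("a\\(b\\)c", REG_EXTENDED, "a(b)c") == 1);

	/* BRE: | is an ordinary character; ERE: | is alternation. */
	assert(re_matches("a|b", 0, "b") == 0);
	assert(re_matches("a|b", REG_EXTENDED, "b") == 1);
}
```

A script written with BRE grouping in mind can therefore quietly stop
matching, rather than fail loudly, when the engine underneath expects
ERE.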

> I do have a partial fix for that - take a look at the recently-added
> regextend(3) in othersrc/external/bsd -
> but until I've finished bringing that into libc, tre-based regexps
> will have to wait.

[will have a look]

> > In case you are feeling complacent about NetBSD's regex, the awk
> > documentation relies on it, and falls short.  Awk claims to
> > implement regex per egrep(1) 

> The awk documentation describes Bell Labs egrep, for fairly obvious
> reasons. The egrep in NetBSD is from GNU grep.

I suspected as much, Alistair, thanks for confirming.  You see, I've
been using NetBSD for just 17 years, so I rely on the manual.  I realize
that sometimes one has to know the history and lore to understand how
things work, but I don't think that's a good thing.  

It seems awk regex has no NetBSD documentation, as many readers of
this list were doubtless aware. The reference to egrep has been obsolete
since nawk was adopted (NetBSD 2.0) because afaik we've used GNU grep
since well before then.  

At Christos's behest, I looked at external/historical/nawk/dist/b.c.
I don't see any way to convert it to use the Posix API, nor any
appetite to write a new awk.  I guess just documenting its regex
implementation is the best we can hope for.

> https://swtch.com/~rsc/regexp/
> 
> Highly recommended.

Agreed, absolutely.  

--jkl

