Subject: Re: Question about NetBSDs implementation of sed.
To: None <netbsd-help@NetBSD.org>
From: Jukka Salmi <j+nbsd@2007.salmi.ch>
List: netbsd-help
Date: 03/14/2007 22:21:22
Hi,
Glen Johnson --> netbsd-help (2007-03-14 14:32:29 -0400):
> NetBSD-help,
> I am learning how to use sed and can't seem to figure out how to have
> sed stop on word boundaries. Yes, I read both man sed and man 7
> re_format. I am at a loss. I went on line and googled for my
> solution. From there I came up with some possibilities but nothing
> seems to work for NetBSDs sed. Here they are:
> I made a small file to test with:
> j.txt
> ----
> This is junk
> This is the junkiest!
> This is prejunk
>
> Trial #1:
> --------
> $ sed -E 's/junk\b/great/' j.txt
> This is junk
> This is the junkiest!
> This is prejunk
>
> BUT, you say \b is in the GNU implementation. Very true but I had to try.
\b matches a word boundary in Perl, but not in NetBSD's sed.
> Trial #2
> --------
> ~/junk> sed -E 's/\<junk\>/great\!/' j.txt
> This is junk
> This is the junkiest!
> This is prejunk
>
> The \< \> is even mentioned in NetBSDs grep, and ed man pages. I
> thought this had to be it. Still wrong answer. What about the -E that
> isn't that supposed to enable extended regular expressions?
Yes, it is, but NetBSD's sed does not treat \< and \> specially in
either basic or extended regexes.
> So how does one use sed to search on word boundaries?
Using the definition of a word boundary from perlre(1), you could use
e.g.:
$ sed 's/\([^a-zA-Z0-9_]\)junk/\1great/' j.txt
This is great
This is the greatiest!
This is prejunk
In case you want to match `junk' at the beginning of a line, too,
you'll either need another substitution command:
$ sed 's/\([^a-zA-Z0-9_]\)junk/\1great/;s/^junk/great/' j.txt
or to use an extended expression:
$ sed -E 's/(^|[^a-zA-Z0-9_])junk/\1great/' j.txt
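Note that the expressions above only check the character in front of
`junk', which is why `junkiest' turned into `greatiest'. A minimal
sketch that also checks the character after it, assuming the same
perlre(1) notion of a word character and a sed that accepts -E (the
variable name `result' is just for illustration):

```shell
# Recreate the test file from the original question.
printf '%s\n' 'This is junk' 'This is the junkiest!' 'This is prejunk' > j.txt

# Group 1: start of line, or a non-word character in front of `junk'.
# Group 2: a non-word character after `junk', or end of line.
# Both groups are put back so only `junk' itself is replaced.
result=$(sed -E 's/(^|[^a-zA-Z0-9_])junk([^a-zA-Z0-9_]|$)/\1great\2/' j.txt)
printf '%s\n' "$result"
```

With the three-line j.txt only the first line should change, to
`This is great'. Because the surrounding characters are consumed by the
match, adding the g flag may still miss a second `junk' that is
separated from the first by a single character.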
However, quoting re_format(7):
The syntax for word boundaries is incredibly ugly.
;-)
HTH, Jukka
--
bashian roulette:
$ ((RANDOM%6)) || rm -rf ~