Subject: Re: Question about NetBSDs implementation of sed.
To: None <netbsd-help@NetBSD.org>
From: Jukka Salmi <j+nbsd@2007.salmi.ch>
List: netbsd-help
Date: 03/14/2007 22:21:22
Hi,

Glen Johnson --> netbsd-help (2007-03-14 14:32:29 -0400):
> NetBSD-help,
> I am learning how to use sed and can't seem to figure out how to have
> sed stop on word boundaries.  Yes, I read both man sed and man 7
> re_format.  I am at a loss.  I went on line and googled for my
> solution.  From there I came up with some possibilities but nothing
> seems to work for NetBSDs sed.  Here they are:
> I made a small file to test with:
> j.txt
> ----
> This is junk
> This is the junkiest!
> This is prejunk
> 
> Trial #1:
> --------
> $ sed -E 's/junk\b/great/' j.txt
> This is junk
> This is the junkiest!
> This is prejunk
> 
> BUT, you say \b is in the GNU implementation.  Very true but I had to try.

\b matches a word boundary in Perl (and, as you note, in GNU sed), but
not in NetBSD's sed.


> Trial #2
> --------
> ~/junk> sed -E 's/\<junk\>/great\!/' j.txt
> This is junk
> This is the junkiest!
> This is prejunk
> 
> The \< \> is even mentioned in NetBSDs grep, and ed man pages.  I
> thought this had to be it.  Still wrong answer.  What about the -E that
> isn't that supposed to enable extended regular expressions?

Yes, it is, but \< and \> are not treated specially by sed in either
basic or extended regexes.


> So how does one use sed to search on word boundaries?

Using the definition of a word character from perlre(1), you could
check the leading boundary with e.g.:

$ sed 's/\([^a-zA-Z0-9_]\)junk/\1great/' j.txt
This is great
This is the greatiest!
This is prejunk

In case you want to match `junk' at the beginning of a line, too,
you'll either need another substitution command:

$ sed 's/\([^a-zA-Z0-9_]\)junk/\1great/;s/^junk/great/' j.txt

or to use an extended expression:

$ sed -E 's/(^|[^a-zA-Z0-9_])junk/\1great/' j.txt
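
The commands above only check the character before `junk', which is
why `junkiest' still turns into `greatiest'.  A minimal sketch that
checks the trailing boundary as well, using the same extended-regex
approach, might look like the following.  The replace_word function
name is my own and purely illustrative; the script also recreates
j.txt so that it is self-contained:

```shell
#!/bin/sh
# Recreate the three-line test file from the original question.
printf '%s\n' 'This is junk' 'This is the junkiest!' 'This is prejunk' > j.txt

# replace_word is a hypothetical helper, not part of sed.
# Group 1 matches the start of the line or a non-word character before "junk";
# group 2 matches a non-word character after "junk" or the end of the line.
# Both groups are put back so that only "junk" itself is replaced.
replace_word() {
    sed -E 's/(^|[^a-zA-Z0-9_])junk([^a-zA-Z0-9_]|$)/\1great\2/' "$1"
}

replace_word j.txt
```

With this, only the first line of j.txt should change.  Note that the
delimiter characters are consumed by the match, so with the g flag two
occurrences separated by a single character would need extra care.
As far as I know, re_format(7) also documents [[:<:]] and [[:>:]] for
matching the start and end of a word, but that syntax is not portable
(GNU sed, for one, does not accept it).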

However, quoting re_format(7):

	The syntax for word boundaries is incredibly ugly.

;-)


HTH, Jukka

-- 
bashian roulette:
$ ((RANDOM%6)) || rm -rf ~