NetBSD-Bugs archive
Re: bin/51171: sed does not match newlines in regexps properly
The following reply was made to PR bin/51171; it has been noted by GNATS.
From: Jarle Greipsland <jarle%uninett.no@localhost>
To: gnats-bugs%NetBSD.org@localhost, kre%munnari.OZ.AU@localhost
Cc:
Subject: Re: bin/51171: sed does not match newlines in regexps properly
Date: Fri, 27 May 2016 12:05:18 +0200 (CEST)
Robert Elz <kre%munnari.OZ.AU@localhost> writes:
> From: jarle%uninett.no@localhost
> Message-ID: <20160527074000.50F8B7AABE%mollari.NetBSD.org@localhost>
> | -------- script.sed ---------
> | 1{h;d;}
> | 2{H;d;}
> | 3{H
> | x
> | # Pattern space: line1 \n line2 \n line3 (without spaces)
> | # Now, delete the first character of line1 and line2
> | s/^[^\n]\([^\n]*\n\)[^\n]/\1/
> | }
> | -----------------------------
> |
> | On NetBSD 6, the command
> | (echo abc; echo def; echo ghi) | sed -f script.sed
> | will print:
> | bc
> | ef
> | ghi
> | which is what I would expect.
>
> If it does, it is a bug: the expression [^\n] matches a character
> that is neither a '\' nor an 'n', and has nothing at all to do with
> newlines. No escape characters work inside [] (though there is a whole
> set of magic combinations that mean specific things).
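A quick probe, offered only as a sketch, for checking which reading a
given sed applies to [^\n]. The function name probe_bracket_newline is
made up for illustration. The two outcomes it distinguishes are the
literal reading described above and the reading where \n inside
brackets means a newline, which is what GNU sed appears to do according
to this thread.

```shell
#!/bin/sh
# Hypothetical helper: reports how the local sed reads [^\n].
# The input line is the three characters: n, backslash, x.
probe_bracket_newline() {
    out=$(printf 'n\\x\n' | sed 's/[^\n]/_/g')
    case $out in
        # Only x was replaced: brackets hold a literal backslash and n.
        'n\_') echo "literal" ;;
        # Everything was replaced: \n was taken to mean newline.
        '___') echo "newline" ;;
        *)     echo "unknown: $out" ;;
    esac
}

probe_bracket_newline
```

A result of "literal" means scripts relying on [^\n] to skip newlines
will not behave as intended on that sed.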
You are right. I shall have to adjust my expectations. And
someone might want to adjust sed's behavior in NetBSD 6. And GNU
sed also, it would seem. Oh well. Lesson learned: don't rely on
the behavior of \n in brackets.
This problem report should probably be closed.
> As best I can tell (having looked for it for ages) there is no way in
> sed to match "anything other than a newline". I resorted to s/\n/X/
> where X was a character I knew could not appear in the text (because
> earlier commands had removed all instances), followed by [^X] in the
> expression to do the work, followed by s/X/${nl}/ (${nl} is a literal
> newline). Truly ugly, but I believe the only way possible.
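The placeholder workaround described above can be sketched roughly as
follows, applied to the script from the original report. Two things
here are assumptions rather than part of the thread: the placeholder
'|' is assumed never to occur in the input, and the y command is used
in place of the s/\n/X/ and s/X/${nl}/ pair, since y accepts \n for
newline in both of its strings and so avoids embedding a literal
newline. The function name strip_first_chars is made up for
illustration.

```shell
#!/bin/sh
# Sketch: delete the first character of line1 and line2 of a
# three-line input without using \n inside a bracket expression.
# Assumes the character | never appears in the input.
strip_first_chars() {
    sed '
1{h;d;}
2{H;d;}
3{
H
x
# Pattern space now holds line1 \n line2 \n line3.
# Turn every newline into the placeholder.
y/\n/|/
# [^|] now stands in for "anything other than a newline".
s/^[^|]\([^|]*|\)[^|]/\1/
# Turn the placeholders back into newlines.
y/|/\n/
}'
}

(echo abc; echo def; echo ghi) | strip_first_chars
```

With the abc/def/ghi input from the report this should print bc, ef
and ghi on separate lines, on any sed that follows the usual rules for
y.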
Or, even uglier, one could try to do dummy \n->\n substitutions
for positions where one does not wish a \n to match, and use
control flow to branch to the appropriate substitutions.
> The best solution I can think of is to add a new char class that contains
> just newline, say [:nl:], and then use [^[:nl:]], but no sed that I am
> aware of does anything like that.
That would have been nice, yes.
-jarle
--
we all hack on a broken subroutine, a broken subroutine, a broken subroutine...
-- Kenneth Stailey