NetBSD-Bugs archive
Re: bin/59803: sed(1) conditional branch command confuses subsequent line addressing
The following reply was made to PR bin/59803; it has been noted by GNATS.
From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: RVP <rvp%SDF.ORG@localhost>
Cc: gnats-bugs%netbsd.org@localhost
Subject: Re: bin/59803: sed(1) conditional branch command confuses subsequent line addressing
Date: Mon, 01 Dec 2025 09:17:38 +0700
    Date:        Mon, 1 Dec 2025 00:10:12 +0000 (UTC)
    From:        RVP <rvp%SDF.ORG@localhost>
    Message-ID:  <71936a11-411a-9ac3-c4cf-e5aef68f3393%SDF.ORG@localhost>
| but, there's a major difference between these two: even in a stream,
| sed can _always_ retrieve the current line number
Sure, but that's not the point. The point is what a dual address
command means, and it isn't the same as what you're imagining.
It is easy to be seduced by that reading when the command to be executed
is something simple, like 'p', 'd' or 's', but it isn't always that simple.
Consider the case where what is happening is that extracts from the
text are being accumulated in the hold space from a range of input
lines - when the first line of the range is encountered, things
are initialised (the hold space is cleared, or whatever is needed),
and when the final line is encountered, the hold space is used in
whatever fashion is intended.
If you never actually encounter the first line, the init is going to
be skipped, and if the commands are executed on following lines, what
will result will be a mess.
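To make that concrete, here is a toy Python model (not sed, and not taken from any sed source) of the accumulate-in-the-hold-space pattern. It assumes a hypothetical "positional" rule under which the range body runs on any line numbered between the two addresses. The function name, the `skipped` parameter (lines where a branch jumps past the command) and the "stale" placeholder are all invented for illustration.

```python
def accumulate(lines, first, last, skipped=()):
    """Model a range body that collects lines in a hold space.

    Hypothetical "positional" rule: the body runs on every line whose
    number lies in [first, last], unless a branch skipped the command
    on that line (line numbers listed in `skipped`).
    """
    hold = ["stale"]      # whatever an earlier part of the script left behind
    out = []
    for n, text in enumerate(lines, start=1):
        if n in skipped or not (first <= n <= last):
            continue
        if n == first:
            hold = []     # init: clear the hold space
        hold.append(text)
        if n == last:
            out.append(" ".join(hold))   # use the accumulated text
    return out


lines = ["a", "b", "c", "d", "e", "f"]
print(accumulate(lines, 3, 5))               # ['c d e']
print(accumulate(lines, 3, 5, skipped={3}))  # ['stale d e']
```

With line 3 skipped, the init never runs, yet the body still fires on lines 4 and 5 and emits the leftover hold-space contents along with the new text.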
The same applies to the end line of the range - if that one isn't
encountered, the commands simply keep on being applied - nothing has
caused them to stop. I suspect that your two-line patch didn't handle
that case; I also suspect that handling it would be a little more complex.
But suppose that in the OP's example the "3,$" were instead "3,5" (with the
input containing more than 5 lines), that line 5 happened to be the one where
the substitute occurred, and that the 't' caused the dual address command to
be skipped. Then the range remains active and will apply to lines 6, 7, 8, ...
continuing until line 5 is actually processed (which is unlikely in that
scenario!)
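The scenario can be traced with a small Python sketch. It models only the rule as this message states it (start when the command is evaluated on the first address, stop when it is evaluated on the second); it is not any particular sed implementation, and real implementations may special-case a missed numeric end address. `range_hits` and `skipped` are hypothetical names, with `skipped` listing the lines on which a branch jumps past the command.

```python
def range_hits(nlines, first, last, skipped=()):
    """Return the line numbers on which a 'first,last' command runs.

    The range starts only when the command is evaluated on line `first`
    and stops only when it is evaluated on line `last`. On lines listed
    in `skipped` the command is never reached, so no address is checked.
    """
    active = False
    hits = []
    for n in range(1, nlines + 1):
        if n in skipped:
            continue              # command never reached on this line
        if not active:
            if n != first:
                continue
            active = True         # first address found: start
        hits.append(n)
        if n == last:
            active = False        # second address found: stop
    return hits


print(range_hits(8, 3, 5))               # [3, 4, 5]
print(range_hits(8, 3, 5, skipped={5}))  # [3, 4, 6, 7, 8]
print(range_hits(8, 3, 5, skipped={3}))  # []
```

The second call is the "3,5" case described above: the end address is never seen, so the command keeps applying to every later line. The third is the mirror image: the start address is never seen, so the range never opens.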
It isn't generally difficult to write sed scripts that handle this kind
of thing properly (which often means not using explicit line numbers, other
than perhaps 1 and $) provided that one understands how sed is defined to
work - and dual address commands are not defined to be "any line that happens
to be at or after the first address and at or before the second address",
they are "start (only) when the first address is found", and "stop when the
second address is found (and only then)".
Don't fall into the trap of "it just seems obvious that it should ..." and
change the behaviour of commands without a careful analysis of why they
are the way they are.
kre