tech-userlevel archive


Re: shell prefix/suffix removal with quoted word



    Date:        Fri, 27 Jul 2018 13:39:18 +0200
    From:        Edgar Fuß <ef%math.uni-bonn.de@localhost>
    Message-ID:  <20180727113917.GD48007%trav.math.uni-bonn.de@localhost>

  | It has been brought to my attention that quoting the "word" in sh's 
  | substring processing causes word to be matched literally rather than 
  | being treated as a pattern.

Yes.   Or rather, more accurately, it is still treated as a pattern,
but one with no meta-characters (everything is a literal) - just as
in regular expressions (which shell patterns are not) the R.E.
	name
is a perfectly valid RE ("grep name file..." works) -  just a kind of
boring one.
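
A minimal sketch of the same point in shell terms, using case rather
than the # operator. The helper names glob_match and literal_match are
made up for this illustration; the only thing being shown is that the
quoted form is still matched as a pattern, just one in which every
character is a literal.

```shell
#!/bin/sh
# glob_match STRING TEXT: TEXT is expanded unquoted, so any ? * [ in it
# act as pattern meta-characters.
glob_match() {
	case $1 in
	$2)	return 0 ;;
	*)	return 1 ;;
	esac
}

# literal_match STRING TEXT: TEXT is expanded inside double quotes, so
# every character of it is a literal in the resulting pattern.
literal_match() {
	case $1 in
	"$2")	return 0 ;;
	*)	return 1 ;;
	esac
}

glob_match    abc   'a?c' && echo "unquoted a?c matches abc"
literal_match abc   'a?c' || echo "quoted a?c does not match abc"
literal_match 'a?c' 'a?c' && echo "quoted a?c matches the string a?c"
# With no meta-characters in the text the two forms agree.
glob_match name name && literal_match name name && echo "name matches name"
```

All four echo lines should be printed.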

  | 	x="abc"
  | 	y="?"
  | 	echo "${x#"$y"}"
  | outputs "abc", while
  | 	x="abc"
  | 	y="?"
  | 	echo "${x#$y}"
  | outputs "bc".

Yes.  But those are the simple cases.   The recent pattern
changes to sh deal with what happens when
	y='\?'
where the same rule you expressed applies, but in a much
messier context.   This is where sh used to perform rather
poorly (in HEAD it is now, I think, much better) and
many (perhaps most) other shells also have "issues".
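
For reference, the simple cases quoted above can be checked with a
short script like the following. This is only a sketch: strip_pattern
and strip_literal are hypothetical helper names, and the y='\?' case is
deliberately left out, since that is exactly where shells disagree.

```shell
#!/bin/sh
# strip_pattern STRING WORD: WORD is expanded unquoted, so its
# meta-characters are active when the prefix is removed.
strip_pattern() {
	printf '%s\n' "${1#$2}"
}

# strip_literal STRING WORD: WORD is expanded inside double quotes, so
# it can only remove a prefix that is character-for-character identical.
strip_literal() {
	printf '%s\n' "${1#"$2"}"
}

strip_pattern abc   '?'		# prints bc   (? matched the a)
strip_literal abc   '?'		# prints abc  (no literal ? to remove)
strip_literal '?bc' '?'		# prints bc   (the literal ? is removed)
```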

  | I can't see this behaviour specified by SUS

There is a work item to improve the way that pattern
matching is specified in the next edition of the POSIX
spec (some of what is there now is wrong, worse than
just missing).   How effective this will turn out to be is
yet to be seen (there are people who prefer to "fix" things
in a way that requires minimal changes, rather than just
ripping out what is there and replacing it with something
better, which is what this really needs.)

  | nor mentioned in sh(1).

I have uncommitted changes to the pattern section of
sh(1) which I hope will eventually improve things there.

I am not yet really happy with the new wording though,
so they remain uncommitted (as in previous episodes,
I prefer writing C to English...)

  | bash and ksh seem to behave the same.

Yes, this has never really been in doubt.  It has been that
way ever since ksh added the # and % operators -- and
changed the quoting rules inside var expansions to the
rational form that you showed above, from the irrational
form that was in the original Bourne shell (probably because
of PDP-11 space limitations) and that remains to this day
for all the other forms ( "${var-"word"}" means, as far as
quoting is concerned, something totally different from
"${var#"word"}" ).

kre


