tech-userlevel: Re: a proposal for two new libc functions: shquote() and shquotev()

Subject: Re: a proposal for two new libc functions: shquote() and shquotev()
To: Christos Zoulas <christos@zoulas.com>
From: Chris G. Demetriou <cgd@netbsd.org>
List: tech-userlevel
Date: 03/03/2001 13:31:06
christos@zoulas.com (Christos Zoulas) writes:
> >On Fri, Mar 02, 2001 at 10:58:48PM -0800, Chris G. Demetriou wrote:
> >> christos@zoulas.com (Christos Zoulas) writes:
> >> > How about an allocating version like asprintf()? That would cut down your
> >> > example significantly.
> >> 
> >> I can imagine some trivial cases where that'd help, when _all_ of your
> >> arguments are coming from the argc/argv given to shquotev().
> 
> Yes, so? I would give all my arguments to shquotev... Remember 'command' gets
> eval'ed too. I would think that most of the time I would want to quote all the
> arguments.

Yes.  Command gets eval'd.  That's exactly the point!  You do _not_
want to quote it.

If in fact you have all of your arguments in their final forms, in
separate strings, and do not want any further evaluation or expansion,
you should be using exec*() of some variety.
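
To make the exec*() case concrete: when every argument is already a
final, separate string, something like the following runs the program
with no shell in between, so spaces and metacharacters in the arguments
are passed through untouched. This is only a sketch; the helper name
sketch_run_argv() is made up for illustration, and it assumes a POSIX
system with fork(), execvp() and waitpid().

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the proposal): run argv[0] with the
 * given argument vector, exactly as supplied, without invoking a shell.
 * Returns the child's exit status, or -1 on failure or abnormal exit.
 */
static int
sketch_run_argv(char *const argv[])
{
	pid_t pid;
	int status;

	assert(argv != NULL && argv[0] != NULL);

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* only reached if execvp() failed */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would hand it a vector such as { "cc", "-c", "my file.c", NULL };
the file name reaches the compiler as a single argument because nothing
re-parses it.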

The purpose of this is, basically, to allow code which uses
environment variables for programs to be able to use multi-word,
evaluated, environment variables for programs.

For instance, 'mkdep' can use the 'CC' environment variable.
In some instances, it's awfully useful to be able to say:

	CC='compiler -option1 -option2'

It can get worse, e.g.:

	CC='compiler -Dfoo="foo bar"'

and have the programs that use the environment variable Do The Right
Thing.

At least to me, not only is it useful behaviour but it's also the most
intuitive behaviour.


There are two paths to take to get this behaviour:

(1) do something like I've proposed in the examples, where you pass a
    variable reference to be interpreted by the shell to the shell.

(2) in the program, split up the environment variable according to
    shell rules, stitch the resulting args into the args that you're
    going to pass to the program, and do it.
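
A rough sketch of the idea behind path (1), for illustration only. The
command text (e.g. the value of CC) goes into the shell command line
unquoted, so the shell evaluates it; the other arguments are quoted so
the shell passes them through as single words. sketch_shquote() here is
a hypothetical stand-in written for this example, not the proposed libc
function, and sketch_build_command() is likewise a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a shell-quoting routine: write arg into buf
 * as one single-quoted shell word.  Returns the length the quoted form
 * needs (excluding the NUL), so a result >= bufsize means truncation.
 */
static size_t
sketch_shquote(const char *arg, char *buf, size_t bufsize)
{
	size_t n = 0;

#define PUT(c) do { if (n + 1 < bufsize) buf[n] = (c); n++; } while (0)
	PUT('\'');
	for (; *arg != '\0'; arg++) {
		if (*arg == '\'') {
			/* close the quote, emit \', reopen the quote */
			PUT('\''); PUT('\\'); PUT('\''); PUT('\'');
		} else
			PUT(*arg);
	}
	PUT('\'');
#undef PUT
	if (bufsize > 0)
		buf[n < bufsize ? n : bufsize - 1] = '\0';
	return n;
}

/*
 * Build "<command> <quoted file>" in buf.  The command is deliberately
 * NOT quoted, so the shell will evaluate it.  Returns 0, or -1 if buf
 * is too small.
 */
static int
sketch_build_command(const char *command, const char *file,
    char *buf, size_t bufsize)
{
	size_t clen, room;

	assert(command != NULL && file != NULL && buf != NULL);

	clen = strlen(command);
	if (clen + 1 >= bufsize)
		return -1;
	memcpy(buf, command, clen);
	buf[clen] = ' ';
	room = bufsize - clen - 1;
	return sketch_shquote(file, buf + clen + 1, room) < room ? 0 : -1;
}
```

The resulting string is what would be handed to the shell (via system(),
popen(), or an explicit "sh -c"); with CC='compiler -Dfoo="foo bar"' and
a file named "my file.c" it comes out as
compiler -Dfoo="foo bar" 'my file.c'.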


Each has its plusses and minuses.

(1) + easy to use

    + don't have to replicate shell parsing code

    - means you need to invoke the shell.  (system and popen are kinda
      annoying, but you could avoid them if you had the desire.  You'd
      still need to invoke the shell.)

    - Expands arguments.  If you're near the limit of max arg size
      coming in, it could cause lossage.  (For the cases I looked at,
      this didn't seem to be a big problem.)

(2) - hard to use: need to worry about stitching the new args into the
      list you're going to pass, will need to worry about freeing all
      of them.  (This would be a significant headache, in my opinion,
      for the code I looked at.)

    - have to create code that parses shell quoting and field
      splitting.

    + don't have to invoke the shell.

    + Doesn't expand arguments (over and above what's involved in
      expanding env var references for the program, which is the
      user's problem).
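
To give a feel for the parsing that path (2) requires, here is a
deliberately simplified word splitter. It is a sketch under stated
assumptions, not real shell parsing: it handles only whitespace
separation, '...', "..." and backslash, treats a backslash inside double
quotes as escaping any character (the shell is pickier), and does no
expansion or globbing. The names sketch_split_words() and
sketch_free_words() are invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Free a NULL-terminated vector returned by sketch_split_words(). */
static void
sketch_free_words(char **words)
{
	if (words == NULL)
		return;
	for (char **p = words; *p != NULL; p++)
		free(*p);
	free(words);
}

/*
 * Hypothetical helper: split s into words using a small subset of the
 * shell's quoting rules.  Returns a malloc'd, NULL-terminated vector
 * and stores the word count in *countp; returns NULL on allocation
 * failure.  Unterminated quotes are silently closed at end of input.
 */
static char **
sketch_split_words(const char *s, int *countp)
{
	size_t len;
	char **words;
	int count = 0;

	assert(s != NULL && countp != NULL);

	len = strlen(s);
	/* At most len/2 + 1 words, plus one slot for the NULL. */
	words = calloc(len / 2 + 2, sizeof(*words));
	if (words == NULL)
		return NULL;

	while (*s != '\0') {
		while (isspace((unsigned char)*s))
			s++;
		if (*s == '\0')
			break;

		/* A word is never longer than the whole input. */
		char *w = malloc(len + 1);
		size_t n = 0;
		char quote = '\0';

		if (w == NULL) {
			sketch_free_words(words);
			return NULL;
		}
		for (; *s != '\0'; s++) {
			if (quote == '\0' && isspace((unsigned char)*s))
				break;			/* end of word */
			if (quote == '\0' && (*s == '\'' || *s == '"'))
				quote = *s;		/* open quote */
			else if (quote != '\0' && *s == quote)
				quote = '\0';		/* close quote */
			else if (*s == '\\' && quote != '\'' && s[1] != '\0')
				w[n++] = *++s;		/* escaped character */
			else
				w[n++] = *s;
		}
		w[n] = '\0';
		words[count++] = w;
	}
	*countp = count;
	return words;
}
```

Given CC='compiler -Dfoo="foo bar"', this yields the two words
"compiler" and "-Dfoo=foo bar". The caller still has to stitch them into
its own argument list, exec the result, and free everything afterwards,
which is the bookkeeping described above.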


I considered both ways.

I think both have their points, and indeed, their existence is not
mutually exclusive (though if we're going to do something like this,
we should try to advocate a 'better' one).

Based on the issues around actual implementation, and the fact that I
don't think it has significant disadvantages for the places where I'm
interested in using it, I suggested the former.

It would be fairly difficult to convert some programs to use the
latter method, since their argument handling is ... interesting.
Perhaps it should be done anyway, but it creates a lot more potential
for lossage.


cgd
-- 
Chris Demetriou - cgd@netbsd.org - http://www.netbsd.org/People/Pages/cgd.html
Disclaimer: Not speaking for NetBSD, just expressing my own opinion.