Subject: a proposal for two new libc functions: shquote() and shquotev()
To: None <tech-userlevel@netbsd.org>
From: Chris G. Demetriou <cgd@netbsd.org>
List: tech-userlevel
Date: 03/02/2001 18:06:11
The following is a proposal for two new libc functions.  (Thanks to
Bill Sommerfeld for substantial comments which seriously improved
their definition.)

I keep running into the need for functions like these.  Programs that
take the name of a program to run from an environment variable should
tolerate that variable holding several words.  Others appear to run
into similar issues time and again, so...


Thoughts?


cgd
============
This text Copyright 2001 Christopher G. Demetriou.  All rights reserved.
Feel free to reply to this message with quoted text, but don't try to
pull the text out of this message for your own manual page.  Once this
code is committed, it'll have a manual page, with normal license terms.

SHQUOTE(3)                NetBSD Programmer's Manual                SHQUOTE(3)

NAME
     shquote, shquotev - quote argument strings for use with the shell

LIBRARY
     Standard C Library (libc, -lc)

SYNOPSIS
     #include <stdlib.h>

     size_t
     shquote(const char *arg, char *buf, size_t bufsize);

     size_t
     shquotev(int arg_count, char * const *args, char *buf, size_t bufsize);

DESCRIPTION
     The shquote() and shquotev() functions copy strings and transform the
     copies by adding shell escape and quoting characters.  They are used to
     encapsulate arguments to be included in command strings passed to the
     system() and popen() functions, so that the arguments will have the cor-
     rect values after being evaluated by the shell.

     The exact method of quoting and escaping may vary, and is intended to
     match the conventions of the shell used by system() and popen().  It may
     not match the conventions used by other shells.  In this implementation,
     the following transformation is applied to each input string:

     o       dollar sign ($), backquote (`), double quote ("), and backslash
             (\) characters in the input are escaped by placing a backslash
             before them in the output,

     o       the result is then surrounded by double quotes ("), and

     o       any exclamation point (!) characters are replaced with the four-
             character sequence "\!" (double quote, backslash, exclamation
             point, double quote).  This is done to appease bash, which
             treats unescaped exclamation point characters within double
             quotes as an invocation of its command history mechanism.  The
             sequence closes the previous double-quoted string, provides an
             escaped exclamation point, then opens a new double-quoted
             string.
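
     The rules listed above can be sketched in C roughly as follows.  This is
     an illustration only, not the proposed libc code: the names
     sketch_shquote() and emit() are hypothetical, and the sketch assumes the
     buffer and return-value conventions described elsewhere on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count one output character; store it only while room remains. */
static void
emit(char c, char *buf, size_t bufsize, size_t *len)
{
	if (bufsize != 0 && *len < bufsize - 1)
		buf[*len] = c;
	(*len)++;
}

/*
 * sketch_shquote() is a hypothetical name used for illustration.
 * It returns the full quoted length, even if the output was truncated.
 */
size_t
sketch_shquote(const char *arg, char *buf, size_t bufsize)
{
	size_t len = 0;

	assert(arg != NULL);
	emit('"', buf, bufsize, &len);		/* opening double quote */
	for (; *arg != '\0'; arg++) {
		switch (*arg) {
		case '!':
			/* Close the quoted string, escape '!', reopen. */
			emit('"', buf, bufsize, &len);
			emit('\\', buf, bufsize, &len);
			emit('!', buf, bufsize, &len);
			emit('"', buf, bufsize, &len);
			break;
		case '$':
		case '`':
		case '"':
		case '\\':
			emit('\\', buf, bufsize, &len);
			/* FALLTHROUGH */
		default:
			emit(*arg, buf, bufsize, &len);
			break;
		}
	}
	emit('"', buf, bufsize, &len);		/* closing double quote */

	/* Terminate the output unless no buffer was supplied. */
	if (bufsize != 0)
		buf[len < bufsize ? len : bufsize - 1] = '\0';
	return len;
}
```

     With the input a$b!c, this sketch produces the 11-character string
     "a\$b"\!"c" (including the double quotes shown).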

     The shquote() function transforms the string specified by its arg argu-
     ment, and places the result into the memory pointed to by buf.

     The shquotev() function transforms each of the arg_count strings speci-
     fied by the array args independently.  The transformed strings are placed
     in the memory pointed to by buf, separated by spaces.  It does not modify
     the pointer array specified by args or the strings pointed to by the
     pointers in the array.

     Both functions write up to bufsize - 1 characters of output into the
     buffer pointed to by buf, then add a NUL character to terminate the out-
     put string.  If bufsize is given as zero, the buf parameter is ignored
     and no output is written.

RETURN VALUES
     The shquote() and shquotev() functions return the number of characters
     necessary to hold the result from operating on their input strings, not
     including the terminating NUL.  That is, they return the length of the
     string that would have been written to the output buffer, if it were
     large enough.
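
     Because the return value is the full length of the result, a caller can
     detect truncation or size a buffer exactly.  The sketch below shows both
     idioms.  It is only an illustration: demo_quote(), quote_truncated() and
     quote_alloc() are hypothetical names, and demo_quote() is a stand-in that
     follows the same size and return conventions but escapes nothing.  It
     also assumes a C99 snprintf() that accepts a zero-length buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * demo_quote() is a hypothetical stand-in with the size/return contract
 * described above.  It only wraps its input in double quotes and escapes
 * nothing, so it must not be used to build real shell commands.
 */
static size_t
demo_quote(const char *arg, char *buf, size_t bufsize)
{
	int n = snprintf(buf, bufsize, "\"%s\"", arg);

	return n < 0 ? 0 : (size_t)n;
}

/* Return nonzero if the quoted form of arg did not fit in buf. */
int
quote_truncated(const char *arg, char *buf, size_t bufsize)
{
	return demo_quote(arg, buf, bufsize) >= bufsize;
}

/* Two-pass idiom: measure with a zero-size buffer, then allocate and fill. */
char *
quote_alloc(const char *arg)
{
	size_t need;
	char *buf;

	assert(arg != NULL);
	need = demo_quote(arg, NULL, 0);	/* length without the NUL */
	buf = malloc(need + 1);			/* one extra byte for the NUL */
	if (buf == NULL)
		return NULL;
	(void)demo_quote(arg, buf, need + 1);
	assert(strlen(buf) == need);
	return buf;				/* caller must free() */
}
```

     A return value greater than or equal to bufsize means the output was
     truncated, which is the same test commonly applied to snprintf().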

EXAMPLES
     The following code fragment demonstrates how you might use shquotev() to
     construct a command string to be used with system().  The command uses an
     environment variable (which will be expanded by the shell) to determine
     the actual program to run.  Note that the environment variable may be ex-
     panded by the shell into multiple words.  The first word of the expansion
     will be used by the shell as the name of the program to run, and the rest
     will be passed as arguments to the program.

           char **args, c, *cmd;
           size_t cmdlen, len;
           int arg_count;

           ...

           /*
            * Size buffer to hold the command string, and allocate it.
            * Buffer of length one given to snprintf() for portability.
            */
           cmdlen = snprintf(&c, 1, "${PROG-%s} ", PROG_DEFAULT);
           cmdlen += shquotev(arg_count, args, NULL, 0) + 1;
           cmd = malloc(cmdlen);
           if (cmd == NULL) {
                   ...
           }

           /* Create the command string. */
           len = snprintf(cmd, cmdlen, "${PROG-%s} ", PROG_DEFAULT);
           len += shquotev(arg_count, args, cmd + len, cmdlen - len);

           /* "cmd" can now be passed to system(). */

     The following example shows how you would implement the same functionali-
     ty using the shquote() function directly.

           char **args, c, *cmd;
           size_t cmdlen, len;
           int arg_count, i;

           ...

           /*
            * Size buffer to hold the command string, and allocate it.
            * Buffer of length one given to snprintf() for portability.
            */
           cmdlen = snprintf(&c, 1, "${PROG-%s} ", PROG_DEFAULT);
           for (i = 0; i < arg_count; i++)
                   cmdlen += shquote(args[i], NULL, 0) + 1;
           cmd = malloc(cmdlen);
           if (cmd == NULL) {
                   ...
           }

           /* Start the command string with the env var reference. */
           len = snprintf(cmd, cmdlen, "${PROG-%s} ", PROG_DEFAULT);

           /* Quote all of the arguments when copying them. */
           for (i = 0; i < arg_count; i++) {
                   len += shquote(args[i], cmd + len, cmdlen - len);
                   cmd[len++] = ' ';
           }
           cmd[--len] = '\0';

           /* "cmd" can now be passed to system(). */

SEE ALSO
     sh(1), popen(3), system(3)

NetBSD 1.5                       March 1, 2001                               2