Subject: Re: a proposal for two new libc functions: shquote() and shquotev()
To: Chris G. Demetriou <cgd@netbsd.org>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: tech-userlevel
Date: 03/04/2001 10:37:09
> (2) despite the additional difficulties, splitting is better than
>     quoting, if you can reasonably demand that the values of the
>     relevant environment variables (used for command names w/ possible
>     options) be in ASCII, or

Splitting has the problem that you need to know a lot more about shell
syntax.  (A "shsplit()" function would help, but you then get into the
never-ending question of "how much shell do you need to implement".)

> (3) you've gotta bite the bullet and do this multibyte...
> 
> If (3), splitting probably better than quote-and-hand-to-/bin/sh,
> because /bin/sh isn't multibyte-char aware!

> Thoughts?

The point of shquote() (i.e., the "contract" with the programmer using
it) is to match the conventions of the shell used by popen() and
system(); if we get a multibyte-aware /bin/sh, shquote will need to be
multibyte aware.  A shquote() portable among systems with and without
multibyte-aware shells will need to behave in a way which matches the
system it's running on.
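
To make that contract concrete, here is a minimal sketch of the
byte-oriented case, i.e. what quoting looks like when the shell behind
popen() and system() is not multibyte-aware.  The name sketch_shquote()
and its signature are assumptions made for illustration only; this is
not the proposed libc interface.  It wraps the argument in single quotes
and rewrites each embedded ' as '\''.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Store c at *pos if it fits (leaving room for the NUL); always count it. */
static void
put(char c, char *buf, size_t bufsize, size_t *pos)
{
    if (*pos + 1 < bufsize)
        buf[*pos] = c;
    (*pos)++;
}

/*
 * sketch_shquote: hypothetical helper, NOT the proposed shquote().
 * Byte-oriented, so it only matches a shell that also works on bytes.
 * Returns the length the full result needs (excluding the NUL); the
 * output is truncated but still NUL-terminated if bufsize is too small.
 */
size_t
sketch_shquote(const char *arg, char *buf, size_t bufsize)
{
    size_t pos = 0;

    assert(arg != NULL);
    put('\'', buf, bufsize, &pos);          /* opening quote */
    for (; *arg != '\0'; arg++) {
        if (*arg == '\'') {
            /* close quote, escaped quote, reopen quote: '\'' */
            put('\'', buf, bufsize, &pos);
            put('\\', buf, bufsize, &pos);
            put('\'', buf, bufsize, &pos);
            put('\'', buf, bufsize, &pos);
        } else {
            put(*arg, buf, bufsize, &pos);
        }
    }
    put('\'', buf, bufsize, &pos);          /* closing quote */
    if (bufsize > 0)
        buf[pos < bufsize ? pos : bufsize - 1] = '\0';
    return pos;
}
```

A caller can compare the return value against the buffer size to detect
truncation before handing the result to system().  A multibyte-aware
shell would need a different loop, which is exactly the portability
problem described above.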

> 	if we do not use mbrtowc() over localized string, we will make mistakes
> 	because some of stateful encodings include "$" and "\" in multibyte
> 	character streams (they are part of multibyte stream, so they should 
> 	not be escaped).
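
The scan being described can be sketched roughly as follows: walk the
string one character at a time with mbrtowc() and treat "$" or "\" as
shell-special only when it is a complete decoded character, not a byte
inside a longer sequence.  The function name count_shell_special() is
hypothetical, and the sketch assumes the basic characters have the same
value as wide characters, which holds on common implementations but is
not guaranteed everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * count_shell_special: hypothetical helper for illustration.
 * Counts '$' and '\' that appear as whole characters in the current
 * locale's encoding.  Returns (size_t)-1 if the string contains an
 * invalid or truncated multibyte sequence.
 */
size_t
count_shell_special(const char *s)
{
    mbstate_t st;
    size_t left, count = 0;

    assert(s != NULL);
    left = strlen(s);
    memset(&st, 0, sizeof(st));     /* start in the initial shift state */

    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, left, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            return (size_t)-1;      /* invalid or incomplete sequence */
        if (n == 0)
            break;                  /* decoded a NUL; nothing left to scan */
        /* Only a fully decoded '$' or '\' counts as shell-special. */
        if (wc == L'$' || wc == L'\\')
            count++;
        s += n;
        left -= n;
    }
    return count;
}
```

The conversion follows whatever locale is in effect, so a program would
normally call setlocale(LC_CTYPE, "") first; without that call the "C"
locale applies.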

I'll note in passing that, on a -current system with mbrtowc() in
wchar.h, there is no man page installed for mbrtowc() (and
possibly not for any of the other wchar.h functions); this makes it
difficult for someone unfamiliar with these APIs to learn them on
NetBSD and start writing multibyte-aware code.

					- Bill