Subject: bin/1880: mail has a too-limited implementation of string quoting
To: None <gnats-bugs@gnats.netbsd.org>
From: James E. Bernard <jbernard@geek.mines.edu>
List: netbsd-bugs
Date: 12/31/1995 18:07:37
>Number:         1880
>Category:       bin
>Synopsis:       mail has a too-limited implementation of string quoting
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people (Utility Bug People)
>State:          open
>Class:          change-request
>Submitter-Id:   net
>Arrival-Date:   Sun Dec 31 20:20:03 1995
>Last-Modified:
>Originator:     Jim Bernard
>Organization:
	Speaking for myself
>Release:        1.1
>Environment:
System: NetBSD zoo 1.1 NetBSD 1.1 (ZOO) #0: Sun Dec 3 12:56:42 MST 1995 local@zoo:/home/local/netbsd-1.1/usr/src/sys/arch/i386/compile/ZOO i386


>Description:
	Strings in ~/.mailrc cannot be protected from metacharacter interpretation.
	For example, I find it useful to pass egrep-style regular expressions to
	the PAGER, such as:
		set PAGER="less -i -c -p'^Message |^To:|^From:|^Subject:'"
	but mail insists on interpreting ^M, ^T, ^F, and ^S as control characters
	unless excessive amounts of escaping are used, e.g.:
		set PAGER="less -i -c -p'\\\^Message |\\\^To:|\\\^From:|\\\^Subject:'"
	(which achieves the desired effect, but is horrible and prevents sharing
	of ~/.mailrc with SunOS mail).
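	The need for three backslashes follows from the string being unquoted
	twice: once when ~/.mailrc is read and once before PAGER is run.  The
	sketch below only illustrates that double pass; it is not the code in
	list.c.  The name old_unquote is hypothetical, word splitting is left
	out, every backslash escape is reduced to "emit the next character",
	and the caret rule (^ plus a letter becomes a control character) is
	assumed from the behaviour described above.

```c
#include <ctype.h>

/*
 * Hypothetical model of the unpatched behaviour: single and double
 * quotes are treated alike, and metacharacters inside either kind of
 * quote are always processed.  "out" must be at least as large as "in".
 */
static void
old_unquote(const char *in, char *out)
{
	char quotec = '\0';	/* active quote character, or '\0' */
	char c;

	while ((c = *in++) != '\0') {
		if (quotec == '\0') {
			if (c == '"' || c == '\'')
				quotec = c;		/* open a quote */
			else
				*out++ = c;
		} else if (c == quotec)
			quotec = '\0';			/* close the quote */
		else if (c == '\\' && *in != '\0')
			*out++ = *in++;			/* simplified escape */
		else if (c == '^' && isalpha((unsigned char)*in))
			*out++ = *in++ & 037;		/* assumed: ^X to control-X */
		else
			*out++ = c;
	}
	*out = '\0';
}
```

	Under these assumptions, the ~/.mailrc text "-p'\\\^To:'" becomes
	-p'\^To:' after the first pass and -p^To: after the second, whereas
	the unescaped "-p'^To:'" already has ^T turned into a control
	character by the first pass.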
>How-To-Repeat:
	Install less-290, patched to accept extended regular expressions:

--- search.c-dist	Thu Mar  9 23:04:24 1995
+++ search.c	Wed Dec 27 13:30:42 1995
@@ -248,7 +248,7 @@
 {
 #if HAVE_POSIX_REGCOMP
 	regex_t *s = (regex_t *) ecalloc(1, sizeof(regex_t));
-	if (regcomp(s, pattern, 0))
+	if (regcomp(s, pattern, REG_EXTENDED))
 	{
 		free(s);
 		error("Invalid pattern", NULL_PARG);

	Add to your ~/.mailrc the following lines:
		set crt=22
		set PAGER="less -i -c -p'^Message |^To:|^From:|^Subject:'"
	and try to read a mail message longer than 22 lines.  Exciting things
	will happen, including creation of a log file whose name begins "sage ".

	Alternatively, more can be used, also patched to support extended
	regular expressions (a PR for this was submitted previously):

--- prim.c-dist	Fri Oct 13 21:19:06 1995
+++ prim.c	Thu Dec 28 18:28:02 1995
@@ -640,7 +640,7 @@
 		}
 		else
 			regfree(cpattern);
-		if (regcomp(cpattern, pattern, 0))
+		if (regcomp(cpattern, pattern, REG_EXTENDED))
 		{
 			error("Invalid pattern");
 			return(0);
>Fix:
	The problem is that getrawlist (used both to split strings read from
	the ~/.mailrc file and to split the argument lists of external
	programs before they are exec'd) always processes metacharacters,
	regardless of whether single or double quotes are used.  The patch
	below adds support for literal
	processing of text between single quotes, whether they appear as
	outer quotes, or embedded within text between double quotes.  If
	the single quotes are embedded in text between double quotes, they
	are preserved in the output; that is, exactly one layer of quoting
	is stripped when getrawlist processes a string.  (In the above
	example, the double quotes are stripped off when ~/.mailrc is read,
	and the single quotes are stripped off just prior to invoking PAGER.)

--- list.c-dist	Fri Oct 13 21:16:01 1995
+++ list.c	Thu Dec 28 16:28:19 1995
@@ -391,6 +391,7 @@
 	int  argc;
 {
 	register char c, *cp, *cp2, quotec;
+	int quotel;
 	int argn;
 	char linebuf[BUFSIZ];
 
@@ -408,12 +409,22 @@
 		}
 		cp2 = linebuf;
 		quotec = '\0';
+		quotel = -1;
 		while ((c = *cp) != '\0') {
 			cp++;
-			if (quotec != '\0') {
-				if (c == quotec)
-					quotec = '\0';
-				else if (c == '\\')
+			if (quotel != -1) {
+				if (c == quotec) {
+					if (quotel-- != 0) {
+						*cp2++ = c;
+						quotec = '"';
+					} else
+						quotec = '\0';
+				} else if (quotec == '\'')
+					*cp2++ = c;
+				else if (c == '\'') {
+					*cp2++ = quotec = c;
+					quotel = 1;
+				} else if (c == '\\')
 					switch (c = *cp++) {
 					case '\0':
 						*cp2++ = '\\';
@@ -463,9 +474,10 @@
 					}
 				} else
 					*cp2++ = c;
-			} else if (c == '"' || c == '\'')
+			} else if (c == '"' || c == '\'') {
 				quotec = c;
-			else if (c == ' ' || c == '\t')
+				quotel = 0;
+			} else if (c == ' ' || c == '\t')
 				break;
 			else
 				*cp2++ = c;
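
	The quoting rule described above can also be seen in isolation in
	the following sketch.  It is a simplified restatement, not the patch
	itself: the name strip_one_layer is hypothetical, word splitting is
	left out, every backslash escape is reduced to "emit the next
	character", and the caret handling is assumed from the Description.

```c
#include <ctype.h>

/*
 * Hypothetical helper: strip exactly one layer of quoting.
 *   '...'            copied literally, quotes removed
 *   "..."            metacharacters processed, quotes removed
 *   '...' in "..."   copied literally, single quotes kept
 * "out" must be at least as large as "in".
 */
static void
strip_one_layer(const char *in, char *out)
{
	char outer = '\0';	/* outermost active quote, or '\0' */
	int inner = 0;		/* nonzero inside '...' nested in "..." */
	char c;

	while ((c = *in++) != '\0') {
		if (outer == '\0') {
			if (c == '"' || c == '\'')
				outer = c;		/* open outer quote */
			else
				*out++ = c;
		} else if (outer == '\'') {
			if (c == '\'')
				outer = '\0';		/* close, drop the quote */
			else
				*out++ = c;		/* literal text */
		} else if (inner) {
			*out++ = c;			/* literal, quote kept */
			if (c == '\'')
				inner = 0;
		} else if (c == '"')
			outer = '\0';			/* close, drop the quote */
		else if (c == '\'') {
			*out++ = c;			/* keep embedded quote */
			inner = 1;
		} else if (c == '\\' && *in != '\0')
			*out++ = *in++;			/* simplified escape */
		else if (c == '^' && isalpha((unsigned char)*in))
			*out++ = *in++ & 037;		/* assumed: ^X to control-X */
		else
			*out++ = c;
	}
	*out = '\0';
}
```

	Applied to "less -p'^Message |^To:'" (double quotes included), the
	first call yields less -p'^Message |^To:' and a second call on that
	result yields less -p^Message |^To:, with the carets intact in both
	steps.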

>Audit-Trail:
>Unformatted: