NetBSD-Bugs archive


Re: bin/55979: sh single quotes removes nul characters



The following reply was made to PR bin/55979; it has been noted by GNATS.

From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: bin/55979: sh single quotes removes nul characters
Date: Sun, 07 Feb 2021 01:03:16 +0700

     Date:        Sat,  6 Feb 2021 12:10:00 +0000 (UTC)
     From:        jtunney%gmail.com@localhost
     Message-ID:  <20210206121000.CFF391A923D%mollari.NetBSD.org@localhost>
 
   | /bin/sh and /bin/ksh remove ASCII NUL characters embedded in single
   | quoted strings. This is inconsistent with the behavior of shells on
   | other platforms. POSIX requires this content be preserved:
 
 I doubt that.  I will look later to see whether I can find where it
 explicitly says otherwise.
 
   | This use case is supported by POSIX.
   |
   |     "The input file may be of any type, but the initial portion of the
   |      file intended to be parsed according to the shell grammar (XREF to
   |      XSH 2.10.2 Shell Grammar Rules) shall consist of characters and
   |      shall not contain the NUL character. The shell shall not enforce
   |      any line length limits."
 
 Appending data to the end of a script is supported (or should be).  If
 we're not doing that correctly (which is possible, since it is an unusual
 usage) then that should be fixed.  But note, from what you just quoted
 (with the words that are unimportant for this purpose elided):
 
 		the initial portion of the file intended to be parsed
 		according to the shell grammar [...] shall not contain
 		the NUL character.
 
 That is, if a NUL is part of the script itself, then it is non-conforming
 (whereas whatever follows the script and is never parsed or executed does
 not have that requirement).
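
 As a rough sketch of that distinction (the temporary file, the payload
 bytes and the use of "exit" to end the parsed portion are all assumptions
 made for illustration, not taken from the PR), a script can be followed by
 arbitrary data as long as the shell never reaches it:

```shell
#!/bin/sh
# Sketch: a script followed by arbitrary appended bytes.
# The payload (including the NUL) is illustrative, not from the PR.
script=$(mktemp) || exit 1

# The portion meant to be parsed: plain text, no NUL, ends with exit.
printf 'echo parsed-part\nexit 0\n' > "$script"

# Appended data that is never parsed: a NUL byte and invalid syntax.
printf '\001\000\001 ((( not shell syntax\n' >> "$script"

out=$(sh "$script")
status=$?
rm -f "$script"

printf 'output=%s status=%s\n' "$out" "$status"
```

 Only the lines before "exit" have to satisfy the shell grammar; the bytes
 after it are never handed to the parser.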
 
   |     http://austingroupbugs.net/view.php?id=1250
   |     http://austingroupbugs.net/view.php?id=1226#c4394
 
 Yes, I know those two, and neither has anything to do with (at least
 what I perceive to be) the issue raised by this PR.
 
   | FreeBSD /bin/sh was recently updated to incorporate this change:
 
 I will take a look at what they did.
 
   | Could NetBSD update its /bin/sh shell?
 
 It depends on just what is really required.
 
   | printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | /bin/sh | hexdump -C
   | 00000000  01 01                                             |..|
   |
   | printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | /bin/ksh | hexdump -C
   | 00000000  01 01                                             |..|
 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | ksh93 | hexdump -C
 ksh93: syntax error at line 1: `zero byte' unexpected
 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | bosh | hexdump -C
 bash5 $ 
 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | dash | hexdump -C
 00000000  01 01                                             |..|
 00000002
 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | mksh | hexdump -C
 00000000  01 01                                             |..|
 00000002
 bash5 $ printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' | yash | hexdump -C
 syntax error: the single quotation is not closed
 
 Which shell (apart from zsh, which is decidedly a non-POSIX shell)
 supports that?
 
 I don't have a (current) FreeBSD sh to test at the minute.
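
 A comparison like the one above could be scripted roughly as follows.
 This is only a sketch: the list of shell names is an assumption, shells
 that are not installed are skipped, and od is used in place of hexdump
 simply because it is more widely available.

```shell
#!/bin/sh
# Sketch: feed the PR's test input to each shell that happens to be installed.
# The candidate list is an assumption; adjust it for the local system.
tested=0
for shell in sh dash bash ksh mksh yash; do
    command -v "$shell" >/dev/null 2>&1 || continue
    tested=$((tested + 1))
    # od -An -tx1 prints the bytes in hex without the offset column.
    bytes=$(printf "x='\1\0\1'\nprintf '%%s'"' "$x"\n' |
        "$shell" 2>/dev/null | od -An -tx1)
    printf '%s:%s\n' "$shell" "${bytes:- (no output)}"
done
```

 Error messages from the tested shells are discarded here, so a shell that
 rejects the input outright just shows up with no output.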
 
   | >Fix:
   | Possibly changing something to do with `sqsyntax` or `readtoken1`
   | in your Almquist Shell fork in bin/sh/parse.c
 
 It would be much more than that.  The shell uses standard C strings
 (char *, terminated by \0) everywhere internally, so it would be major
 work to make it handle a \0 embedded in a variable value, or similar.
 
 It is simply impossible to embed \0 in a command arg or environment
 variable: the formats of those things are defined to be \0-terminated
 strings.
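
 The limitation can be seen from the shell itself.  In the sketch below
 (the sample bytes and variable names are made up for illustration) a NUL
 passes through a pipe untouched, but does not survive being stored in a
 variable.  Whether the NUL is dropped or the value is cut short at that
 point depends on the shell.

```shell
#!/bin/sh
# Sketch: NUL survives in a byte stream but not in a shell variable.
# The sample bytes 'a', NUL, 'b' are illustrative.

# Through a pipe: all three bytes arrive (tr strips any padding from wc).
piped=$(printf 'a\000b' | wc -c | tr -d ' ')

# Through a variable: the value is a C string, so the NUL cannot be stored.
# Some shells warn about the discarded byte; hide that warning.
{ value=$(printf 'a\000b'); } 2>/dev/null
stored=${#value}

printf 'bytes through pipe: %s, characters in variable: %s\n' "$piped" "$stored"
```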
 
 Since bash does not actually allow \0 in sh input (as I recall, it discards
 NUL chars, just as we do), and yet your script works with bash, I assume
 that the actual issue is something different.   If I can work out what
 that is, and a fix is reasonable to implement, I will see what I can do.
 
 kre
 

