NetBSD-Bugs archive


Re: bin/55979: sh single quotes removes nul characters



The following reply was made to PR bin/55979; it has been noted by GNATS.

From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: 
Subject: Re: bin/55979: sh single quotes removes nul characters
Date: Sun, 07 Feb 2021 02:20:18 +0700

 OK, I see the issue now, and as I suspected, it has nothing whatever
 to do with NUL characters in single quoted strings.
 
 One of the issues that shells need to deal with is what to do when
 they see an executable file (ie: 'x' permission set, and not a directory)
 that the system cannot actually execute (execl() fails with ENOEXEC).
 
 The traditional behaviour was simply to assume that the file is a
 shell script, and attempt to parse and execute it.   That's what
 the Thompson shell did, and early versions of the Bourne shell,
 and is what allowed shell scripts to "pretend" to be commands in
 the days before #! support was added.   For the rest of this we
 will forget about #!, as while it makes it simpler to make scripts
 (sh, awk, perl, ...) work, such that they can be executed by any other
 program using execl() (or one of its variants), the existence of
 this facility didn't change the way that shells work at all, it just
 made it less likely that the execl() would fail.
 
 Any random file with 'x' permission, which wasn't an actual executable
 binary, was run as a script - which is fine when it was a sh script,
 but irritated users with (typically many) error messages when it was
 not.   So, shells grew some "smarts" and attempted to detect which files
 were scripts, and which were not, using heuristics to tell the difference.
 
 A common method, the one dealt with in the Austin Group POSIX defect reports
 you cited, was simply to look for a \0 in the initial part of the file
 (the first buffer read, of whatever size the shell reads chunks of files).
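 
 As an illustration only, that check can be sketched in a few lines of C.
 The function name block_has_nul is hypothetical, and the caller is assumed
 to have already read the first chunk of the file into buf; this is not
 taken from any particular shell's source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the first block read from a
 * file contains a NUL byte anywhere, in which case the file would be
 * treated as binary rather than as a script.
 */
static bool
block_has_nul(const char *buf, size_t len)
{
	return memchr(buf, '\0', len) != NULL;
}
```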
 
 That works for detecting binary files, usually, but doesn't allow the
 leading script, followed by other data, that we actually want to allow,
 so the heuristic was changed to look for a \0 in the first line of the
 file (that is, a \0 before a \n).
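 
 A minimal sketch of that refinement, again with a hypothetical function
 name and assuming buf holds the first block read from the file: only a
 \0 that appears before the first \n counts against the file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if a NUL byte appears in the first
 * line of the block, that is, before the first newline.  If the block
 * contains no newline at all, the whole block is taken as the first line.
 */
static bool
first_line_has_nul(const char *buf, size_t len)
{
	const char *nl = memchr(buf, '\n', len);
	size_t line_len = (nl != NULL) ? (size_t)(nl - buf) : len;

	return memchr(buf, '\0', line_len) != NULL;
}
```

 With this version a script followed by arbitrary data (NULs included)
 after its first line is still accepted.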
 
 That's what the FreeBSD change you mention does - though it is actually
 more restrictive than that, if there is a \0 anywhere in the first
 block of the file, it requires there be (at least one) lower case
 ascii alpha, or a '$' or a '`', and a subsequent \n, before the first \0
 is found.   Previously (before that change) they treated any file
 containing a \0 in the first block of the file as binary.
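 
 One possible reading of the rule as described above, written as a sketch.
 This is not the FreeBSD code; the function name looks_like_script and the
 exact handling of edge cases are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch of the rule described above.  If the block has no
 * NUL, accept it.  Otherwise require, before the first NUL, at least one
 * lower case ASCII letter, '$' or '`', followed later by a newline.
 */
static bool
looks_like_script(const char *buf, size_t len)
{
	const char *nul = memchr(buf, '\0', len);
	bool seen_token = false;

	if (nul == NULL)
		return true;	/* no NUL in the first block */

	for (const char *p = buf; p < nul; p++) {
		if ((*p >= 'a' && *p <= 'z') || *p == '$' || *p == '`')
			seen_token = true;
		else if (*p == '\n' && seen_token)
			return true;	/* token, then newline, then NUL */
	}
	return false;
}
```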
 
 The NetBSD /bin/sh (I haven't looked at what /bin/ksh does) is actually
 far more permissive in this area than most other shells.   It forbids
 just one thing from being treated as a shell script, which is an ELF
 binary file, as that, we have found, is the most common kind of file
 to have 'x' permission, not be a script, and not actually be executable.
 That is, usually, ELF binaries for some other OS or architecture, that
 the kernel cannot simply run.
 
 Your hello.com is an ELF binary (it starts "\177ELF") which is exactly
 what we look for when deciding to reject the file.   We don't look for \0
 characters at all for this purpose, they're irrelevant.
 
 The code in sh is:
 
                         if (memcmp(magic, "\177ELF", 4) == 0) {
                                 (void)close(fd);
                                 error("Cannot execute ELF binary %s", fname);
                         }
 
 which is exactly what happens when sh is used to run your file.  That's
 our only check.
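 
 For anyone wanting to try the same test outside the shell, here is a
 self-contained sketch.  The function name is_elf_magic and the explicit
 length check are assumptions added for illustration; only the memcmp()
 against "\177ELF" comes from the sh code quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the first bytes of a file are
 * the ELF magic number.  NUL bytes elsewhere in the data play no part
 * in the decision.
 */
static bool
is_elf_magic(const unsigned char *magic, size_t len)
{
	return len >= 4 && memcmp(magic, "\177ELF", 4) == 0;
}
```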
 
 Unless there turns out to be considerable support for altering that test
 (it is, I believe, the current heuristic after several previous attempts
 were less successful) I do not plan on doing so.
 
 kre
 
 

