NetBSD-Bugs archive


bin/50993: /bin/sh heredoc parsing done incorrectly

>Number:         50993
>Category:       bin
>Synopsis:       /bin/sh heredoc parsing done incorrectly
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Mar 23 06:05:00 +0000 2016
>Originator:     Robert Elz
>Release:        NetBSD 7.99.26 (all versions to current as of date of PR)
System: NetBSD 7.99.26 NetBSD 7.99.26 (VBOX64-1.1-20160128) #43: Thu Jan 28 16:09:08 ICT 2016 amd64
Architecture: x86_64
Machine: amd64
	The following script ...

	cat <<- EOF
		$(
			cat << STOP
				$(
					echo help
				)
			STOP
		)
	EOF

	literally as it appears here (including all leading tabs)
	should simply say "help" on stdout.  It doesn't; instead it
	generates a syntax error.

	Extract the script above (from the "cat <<- EOF" line down to the
	line containing just (tabs)EOF), being careful to keep all leading
	whitespace as is.  It is all exclusively tab characters, and that
	is important: the only spaces in any line occur after a
	non-whitespace character.  It doesn't actually matter how many tabs
	appear on each line, as long as there is at least 1.  Newlines
	must follow immediately after the last non-whitespace character on
	every line: no lines have trailing spaces or tabs (this only really
	matters on the lines that contain only STOP and EOF).

	Put it in a file called "filename" (i.e. you pick the name).
	Run it as a script: "sh filename".  If that results in "help"
	on stdout, and an exit code of 0 from the shell, all is good.
	If there are errors on stderr, an exit code != 0, or no "help"
	(any one of those three), then it is broken.
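
	Since leading tabs are easily mangled by mail clients and by
	copy and paste, a safer way to build the file is to emit the
	tabs explicitly with printf.  What follows is only a sketch:
	make_repro and check_shell are made-up helper names, and the
	script body (two nested $( ) substitutions with STOP and EOF
	terminator lines) is assumed from the description in this PR.

```shell
#!/bin/sh

# make_repro FILE  (hypothetical helper)
# Write the test script to FILE.  Every line begins with at least one
# tab and no line has trailing whitespace.  The body is an assumed
# reconstruction based on the description in the PR.
make_repro() {
	{
		printf '\tcat <<- EOF\n'
		printf '\t\t$(\n'
		printf '\t\t\tcat << STOP\n'
		printf '\t\t\t\t$(\n'
		printf '\t\t\t\t\techo help\n'
		printf '\t\t\t\t)\n'
		printf '\t\t\tSTOP\n'
		printf '\t\t)\n'
		printf '\tEOF\n'
	} > "$1"
}

# check_shell SHELL FILE  (hypothetical helper)
# Succeed only if SHELL runs FILE with exit status 0, writes exactly
# "help" to stdout, and writes nothing to stderr.
check_shell() {
	errfile=$(mktemp) || return 2
	if out=$("$1" "$2" 2>"$errfile"); then
		status=0
	else
		status=$?
	fi
	err=$(cat "$errfile")
	rm -f "$errfile"
	[ "$status" -eq 0 ] && [ "$out" = "help" ] && [ -z "$err" ]
}
```

	For example, "make_repro filename" followed by
	"check_shell /bin/sh filename && echo good" applies the three
	criteria listed above to the shell being tested.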

	The underlying cause is the way that the shell parses just about
	everything: as soon as it sees something (like the command
	substitution $( ) inside the outer here doc), it starts parsing that.
	Leading tabs on lines do not get stripped while parsing command
	lists for a command substitution.  Then when the inner here doc
	(which does not specify tab stripping) is encountered, the STOP
	line is never located because of the leading tabs.  That results in
	a syntax error, as the ')' that terminates the outer command
	substitution is also not found, having been bypassed while searching
	for STOP, nor is the "EOF".  Until it finds "STOP" at the beginning
	of a line, that inner here doc will keep being read, all the way to
	end of file.
	The script is easy to make work: either move the STOP to
	the left margin (which will result in output that contains tabs
	before "help", which is not really correct, but is better), or
	make the "cat << STOP" be "cat <<- STOP" so the inner here doc
	also strips tabs.
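
	Both workarounds can be generated with explicit tabs for
	experimenting.  This is a sketch only: make_workaround is a
	hypothetical helper, and the script body is assumed from the
	description in this PR, with just the inner here doc operator
	or the position of the STOP line changed.

```shell
#!/bin/sh

# make_workaround FILE MODE  (hypothetical helper)
#   margin: keep "cat << STOP" but put STOP at the left margin
#   strip:  use "cat <<- STOP" and leave STOP indented with tabs
make_workaround() {
	case $2 in
	margin)	inner='<<';  stop='STOP' ;;
	strip)	inner='<<-'; stop='\t\t\tSTOP' ;;
	*)	echo "usage: make_workaround file margin|strip" >&2
		return 1 ;;
	esac
	{
		printf '\tcat <<- EOF\n'
		printf '\t\t$(\n'
		printf '\t\t\tcat %s STOP\n' "$inner"
		printf '\t\t\t\t$(\n'
		printf '\t\t\t\t\techo help\n'
		printf '\t\t\t\t)\n'
		printf "$stop\n"	# $stop is used as a format so \t becomes a tab
		printf '\t\t)\n'
		printf '\tEOF\n'
	} > "$1"
}
```

	Running "make_workaround filename strip" and then "sh filename"
	is one way to try the second variant.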

	Neither should be necessary, as the outer here doc processing is
	supposed to have stripped all the leading tabs from lines down to,
	and including, the EOF line.  So the STOP should already be at the
	left margin when it is needed.
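
	What the outer <<- is expected to do to its body can be imitated
	with sed, as a rough sketch.  strip_tabs is a made-up name, the
	body text is assumed from the description in this PR, and sed
	only stands in for the shell's own tab stripping.

```shell
#!/bin/sh

# strip_tabs: remove all leading tabs from each input line, which is
# what '<<-' does to the here doc body and to its delimiter line.
strip_tabs() {
	tab=$(printf '\t')
	sed "s/^${tab}*//"
}

# Body of the outer here doc (assumed), down to and including EOF.
printf '\t\t$(\n\t\t\tcat << STOP\n\t\t\t\t$(\n\t\t\t\t\techo help\n\t\t\t\t)\n\t\t\tSTOP\n\t\t)\n\tEOF\n' |
	strip_tabs
```

	In the stripped text every line, STOP included, starts at the
	left margin, which is the form the inner << needs in order to
	find its delimiter.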

	Other shells (bash, ksh, zsh) do this correctly.  FreeBSD's sh
	(which is, of course, very similar to NetBSD's) does not, nor
	does dash.  (Some other less common shells do even worse...)

	To fix this, the way here doc processing is performed is going to
	need radical surgery inside the parser, which is not going to
	happen overnight.  I do have some ideas to try, however.

	It is possible there are related problems with "..." strings,
	and perhaps more (those are also parsed more than they should
	be early in the processing) but if so, I have not yet found the
	test case to exhibit a bug (that is, that the code does not work
	quite the way the spec (posix) designates, may be benign in that
