Subject: Re: make release abends with "*** Error code 1" SOLVED
To: None <netbsd-help@netbsd.org>
From: Woodchuck <djv@bedford.net>
List: netbsd-help
Date: 01/22/2007 02:39:33
The problem turns out to have been due to the shell.

In the command referenced upthread, sh is executed with the -e
switch, which means it exits if any program it runs (or *any internal
command*) returns a non-zero (error) status.
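
A minimal sketch of that -e behaviour, not taken from the build
scripts themselves: the child sh below is a stand-in for the one make
starts, and the variable names out and status are only for
illustration.

```shell
# Run a child sh with -e; "false" stands in for any failing command.
status=0
out=$(sh -e -c 'echo before; false; echo after') || status=$?

# The child stops at the first failure, so "after" is never printed
# and the failing command's status becomes the shell's exit status.
echo "output:      $out"
echo "exit status: $status"
```

The same thing happens whether the failing command is an external
program or a shell builtin.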

At the time, ksh was root's shell.  Root had a .profile, which
duly set and exported the ENV environment variable to ${HOME}/.kshrc.

The .kshrc also existed, and contained this line:
	. /etc/ksh.kshrc

and a couple of aliases and exports of its own.

In /etc/ksh.kshrc (not distributed by default with NetBSD, but it
is distributed as part of the basic pdksh distribution, and is
default on OpenBSD, from whence I moved it to Net after installing
Net on a former OpenBSD box), are a bunch of rubbishy commands and
even function definitions.

Long ago, when changing from bash to ksh (ten years?) I discovered
I did not like the semantics of ksh's login "internal"; ksh defines
a default alias login='exec login', i.e. it terminates an existing
shell when you login to some other username.  I prefer to have it
nest that new login shell, so that when ending it with ^D, I am
returned to the parent login session.  -- SO -- in that /etc/ksh.kshrc,
I had the innocent line
        "unalias login"

Normally, this works fine.

However, sh has its own ideas.  It does not define a default alias
for login, so that when sh is invoked, login is unaliased.  Mirabile
dictu, unalias returns an error code if its operand is not aliased.
This caused the "sh -e" in the make release suite to abend, as it
should have.  "unalias <name of something not an alias>" emits
no error message, but does set a pseudo-return value to 1.  ("Pseudo"
because it does not execute as a real process.) (Perry -- this is
a minor (un-fixworthy) example of information swallowing.)
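
A small sketch of the failure and one possible guard, assuming (as
above) that login is not aliased in a fresh sh.  The "|| true" guard
is a suggestion of mine, not what was actually in the rc file, and
the rc_plain / rc_guarded names are hypothetical.

```shell
# Unguarded: unalias of a non-alias returns non-zero, so sh -e exits
# before it ever reaches the echo.
rc_plain=0
sh -e -c 'unalias login; echo survived' >/dev/null 2>&1 || rc_plain=$?

# Guarded: "|| true" swallows the failure, so sh -e carries on.
rc_guarded=0
out_guarded=$(sh -e -c 'unalias login 2>/dev/null || true; echo survived') \
    || rc_guarded=$?

echo "unguarded exit status: $rc_plain"
echo "guarded exit status:   $rc_guarded ($out_guarded)"
```

Some shells print a "not found" message for the unguarded case and
some stay silent; the redirection to /dev/null hides that difference.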

Removing /etc/ksh.kshrc, /root/.kshrc and /root/.profile, setting
root's default shell to /bin/sh, and repeating the build.sh release
drill led to success.

There was one other glitch/failure, unrelated.  This was due to my
having included the cgd devices in /usr/src/..../GENERIC.local and,
zealously, in .../INSTALL.local.  This resulted in ramdisk images that
"would not fit on 2 floppies" (for future googlers).  Restoring those
files to their original state (empty save for comments) resulted in
a successful "build.sh release".

The final run:
         build.sh started: Sun Jan 21 12:02:19 EST 2007
         build.sh ended:   Mon Jan 22 02:11:34 EST 2007
This did not involve recompiling tools or userland, just the make
overhead and compiling fourteen kernels.  The XEN kernel makes swapped
up to 150MB.  Something is very tricksy in them.

My sincere thanks to those who prodded me along.

Astoundingly, just last week I made a bug report on OpenBSD about
ksh.kshrc causing make problems if ksh.kshrc writes to standard output.

I think it's time I did something about that.

The first thing to do is bracket the interior of .kshrc with

	if [ -o interactive ]; then
		. /etc/ksh.kshrc
		<other stuffs>
	fi
	<stuff that just has to be present for every shell>

and to weed out the loathsome cruft in /etc/ksh.kshrc.
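
The "[ -o interactive ]" test is a ksh-ism.  A sketch of a more
portable form of the same bracket, assuming only that the shell puts
"i" in $- when it is interactive; is_interactive and rc_mode are
hypothetical names used here for illustration.

```shell
# Portable stand-in for "[ -o interactive ]": look for "i" in $-.
is_interactive() {
    case $- in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

if is_interactive; then
    # Interactive-only setup (aliases, prompt, sourcing /etc/ksh.kshrc)
    # would go here.
    rc_mode=interactive
else
    # Keep this branch silent and free of commands that can fail.
    rc_mode=batch
fi
echo "rc file ran in $rc_mode mode"
```

The echo is only there to show which branch ran; a real rc file
should print nothing in the non-interactive branch.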

The second thing to do is to systematically figure out what sh and ksh
think "interactive shells" are; the documents being inferior to the source,
and the interaction with the ENV environment variable being an issue.
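
One rough way to start that investigation from the outside, before
reading source: a probe that asks a given sh what flags it thinks it
has and whether it sources the file named by ENV when run
non-interactively.  This is a sketch only; the probe file and the
variable names flags and env_out are made up for the example, and the
answer will differ from shell to shell.

```shell
# Temporary file that announces itself if a shell sources it via ENV.
probe=$(mktemp)
echo 'echo ENV-file-was-read' > "$probe"

# What does a non-interactive "sh -c" report in $-?
flags=$(sh -c 'echo "$-"')

# Does that same kind of shell read the ENV file?
env_out=$(ENV="$probe" sh -c 'true')
rm -f "$probe"

echo "flags for sh -c: '$flags'"
if [ -n "$env_out" ]; then
    echo "this sh reads ENV even when non-interactive"
else
    echo "this sh ignores ENV when non-interactive"
fi
```

Replacing sh with ksh in the two child invocations gives the
corresponding answers for ksh.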

Again, thanks, and Good Night!

Dave