Subject: sh weirdness with subshells
To: None <tech-userlevel@NetBSD.ORG>
From: Ken Hornstein <kenh@cmf.nrl.navy.mil>
List: tech-userlevel
Date: 02/16/1995 15:13:25
I just ported the new version of HylaFAX (Sam Leffler's fax package) to
NetBSD 1.0.  Everything went fine, except that the script to add a new modem
behaved a little strangely.

At one point the script (called faxaddmodem) probes the capabilities of the
fax modem by talking to it over the serial port.  While doing so, it spawns a
subshell that warns the user if the modem appears to be hung.  Once the script
gets a response back from the modem, it kills the subshell.  But on NetBSD
with the default /bin/sh, the kill doesn't stop the warning loop: it keeps
running, over and over, even after the parent script has exited.
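
For anyone who hasn't looked at faxaddmodem, the pattern is roughly the
following.  This is only a minimal sketch of the idea, not the actual
HylaFAX code; the variable name "watchdog", the warning text, and the
"sleep 2" that stands in for the modem exchange are all made up for
illustration.

```sh
#!/bin/sh
# Hypothetical sketch of the watchdog pattern; NOT the real faxaddmodem code.

# Background subshell that nags the user while we wait on the modem.
(while true; do sleep 1; echo "still waiting for the modem..." >&2; done) &
watchdog=$!          # PID the shell reports for the background job

sleep 2              # stand-in for the real exchange with the modem

# The modem answered: stop the watchdog and reap it.
kill "$watchdog"
wait "$watchdog" 2>/dev/null
status=$?            # exit status of the killed job, as reported by wait
echo "watchdog $watchdog stopped"
```

The script assumes that killing the PID in $! is enough to stop the loop.
That assumption is exactly what fails with the NetBSD /bin/sh.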

I tracked it down to the following.  Try running this command from within
/bin/sh:

(while true; do sleep 10; echo ping; done) & echo $!

You'll get a PID back, like 11385.  ps shows:

kenh     11384  0.0  0.0   300  172 p2  S     3:08PM    0:00.06 sh
kenh     11385  0.0  0.0   292   72 p2  I     3:08PM    0:00.01 sh
kenh     11386  0.0  0.0   296  128 p2  S     3:08PM    0:00.03 sh

11384 is the parent, and 11385 is obviously the subshell.  But what's this
extra shell with a pid of 11386?  If you kill 11385, you get:

[1] 11385 Terminated          (while true; do sleep 10; echo ping; done)

But process 11386 stays around, and it is actually the process that's
doing the work.  You'll still get a "ping" every 10 seconds until you finally
kill process 11386.
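
One way a script could sidestep the question of which PID to kill is to
stop the loop with a flag file instead of a signal.  The sketch below is
only an illustration under that assumption; the flag file name and the
"sleep 2" standing in for the real work are hypothetical, and I have not
checked how faxaddmodem would need to change to use it.

```sh
#!/bin/sh
# Hypothetical workaround sketch: stop the loop via a flag file, not a signal.
flag="${TMPDIR:-/tmp}/watchdog.$$"
: > "$flag"          # create the flag; the loop runs while it exists

# The loop re-checks the flag on every pass, so whichever process actually
# runs it exits on its own once the flag is removed.
(while [ -f "$flag" ]; do sleep 1; [ -f "$flag" ] && echo ping; done) &

sleep 2              # stand-in for the real work
rm -f "$flag"        # ask the loop to stop
wait                 # returns once the background job has exited
echo "loop stopped"
```

The cost is that the loop only notices the request after its current sleep
finishes, so shutdown can lag by up to one sleep interval.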

So, is this a bug in the NetBSD Bourne shell, or perhaps a feature?  This
doesn't happen when using /bin/sh under SunOS, for example.

(I admit, it's rather picky, but I don't feel justified in going back to Sam
Leffler with this if it's a bug in the NetBSD shell).

--Ken