Re: Behaviour in shell script with job control enabled

To: Silas Silva <silasdb%gmail.com@localhost>
Subject: Re: Behaviour in shell script with job control enabled
From: Robert Elz <kre%munnari.OZ.AU@localhost>
Date: Thu, 07 Jul 2016 06:39:29 +0700

    Date:        Wed, 6 Jul 2016 16:28:28 -0300
    From:        Silas Silva <silasdb%gmail.com@localhost>
    Message-ID:  <20160706192828.GA8086%auron.ufabc.int.br@localhost>

  | Why enabling (or not) job control (-m) should change the script
  | behaviour?

I think you're seeing a weird interaction between ping6 and the way
the shell detects SIGINT when job control is enabled ...

When job control is enabled, child processes (sets of processes in
general, but here just ping6) are run in a separate process group, and
if the job (process) is in the foreground (as here) the foreground
process group of the controlling terminal is set to the process group
of the child job (process).
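
To make that concrete, here is a rough C sketch of the setup. It is an
illustration only, not the actual sh source: launch_foreground is a
made-up name, and using stdin as the terminal descriptor is an
assumption.

```c
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from any real shell: start argv[0] as
 * a foreground job in its own process group.  Returns the child's pid
 * (which is also its process group id), or -1 if fork() fails.
 */
pid_t
launch_foreground(char *const argv[])
{
	pid_t pid = fork();

	if (pid == -1)
		return -1;

	if (pid == 0) {
		/* Child: become the leader of a new process group. */
		setpgid(0, 0);
		execvp(argv[0], argv);
		_exit(127);	/* exec failed */
	}

	/*
	 * Parent: repeat the setpgid() so the group exists whichever
	 * side runs first (an error here after the child has exec'd is
	 * harmless), then hand the terminal to the new group.
	 */
	setpgid(pid, pid);
	if (isatty(STDIN_FILENO)) {
		/* Ignore SIGTTOU in case we are not in the foreground. */
		void (*old)(int) = signal(SIGTTOU, SIG_IGN);

		tcsetpgrp(STDIN_FILENO, pid);
		signal(SIGTTOU, old);
	}
	return pid;
}
```

A real shell would also take the terminal back with tcsetpgrp() once the
job exits or stops, and would check a lot more errors than this does.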

The effect of this is that the shell does not receive any terminal
generated signals while the process is running (it is not in the correct
process group).   In general that would be fine, but if you were to ^C the
ping, you'd normally expect the shell to receive the ^C as well, and stop
whatever it is doing.   In a script the shell exits (assuming there's no
SIGINT trap; I mention that just to be complete, it is not relevant
here).   In an interactive shell, it flushes pending commands and prints
a new prompt (including a \n in case the command ended after printing a
half line of output; that's the cause of the blank line).

To offset this effect, the shell looks to see if the process it ran exited
because of a received SIGINT, and if so, simply assumes that you typed ^C
and acts just as if you had.
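
In C terms the check amounts to something like the following. This is a
minimal sketch and not the shell's real code; child_died_of_sigint is a
name invented for the example.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: given a status from waitpid(), report whether
 * the child was terminated by SIGINT.  It cannot tell a typed ^C from
 * a process that sent SIGINT to itself.
 */
bool
child_died_of_sigint(int status)
{
	return WIFSIGNALED(status) && WTERMSIG(status) == SIGINT;
}
```

The shell would call this on the status of the foreground job it just
waited for, and behave as if it had been interrupted when it returns
true.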

Without job control, the shell and the processes it runs are all in the
same process group, so the shell receives the same terminal generated
signals as the process, and does not need that hack.

The interaction with ping6 (most other commands you could run would not
act the way you are observing) is that when ping6 exits after a caught signal
(any signal), it resets its SIGINT handler to SIG_DFL, and then sends itself
a SIGINT.   That makes its exit status be identical to what would happen
if it never trapped SIGINT in the first place, and was killed by ^C.

If the signal that caused it to exit was actually a SIGINT, all is fine,
and the right things happen.    But in this case the signal that is killing
ping6 is a SIGALRM that goes off to terminate ping (because of the -c1)
when there is no reply.

So, ping6 gets SIGALRM and sends itself SIGINT, which causes it to exit
in a way that looks just like it received ^C.   The shell with job
control enabled observes that exit status from ping6, concludes a ^C
must have happened, and aborts what it is doing (just printing the \n).

You should probably file 2 PRs about this .. one on the shell that I will
look into -- but as you saw, the NetBSD shell is not the only one to act
this way - determining the difference between a received ^C and a process
that suicides by doing "kill(getpid(), SIGINT);" is not easy - and when
using job control it is difficult (at best) to detect a typed ^C any other
way.   That is, this might not get fixed any time soon.  It might be possible
to have the shell enter and remain in the process group of a foreground child
while it is being created and then while waiting for it to exit (as long as
it does not stop) and so detect the signal itself.   But that might cause
other problems so would need some careful analysis before making a change
like that.   (If I had to guess, I'd assume something like that is what
bash is doing - but bash is much more complex internally, and bigger and
slower...)

The other PR should be about ping6 - unless it is actually exiting because
it received a terminal generated SIGINT it should not really be sending
itself SIGINT to kill itself; a simple non-zero exit status would do for
all the other cases.    This one should be easy to fix, and that's likely
to fix your immediate problem.

kre
