Subject: Re: signal confusion
To: Paul B Dokas <dokas@cs.umn.edu>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: current-users
Date: 11/05/1998 15:10:03
> I'm having an interesting problem WRT signal handling on NetBSD.  What
> I wanted to do was have a program that spawns off N children and then
> continues other work.  When a child dies, I want to receive a
> SIGCHLD, wait() for the child and then continue on.

The basic idea is sound and can be made to work.

> This works fine in about 95% of all cases, but for that remaining 5%,
> the signals are disappearing.

> What appears to be happening is this:  if two children die at almost
> the same instant then the first child to die will queue up a SIGCHLD,
> but when the second dies, the kernel notices that a SIGCHLD is
> already pending and then throws out the second SIGCHLD.

Right.  This is the way signals have always worked, I believe.

What you need to do is something like this:

static volatile sig_atomic_t gotsigchld = 0;

static void sigchld_handler(int sig) { gotsigchld = 1; }

static void waitforkids(void)
{
	pid_t pid;
	int status;

	while ((pid=wait3(&status,WNOHANG,0)) > 0) {
		...process death, using pid and status...
	}
}

... in your main loop ...
	if (gotsigchld) {
		gotsigchld = 0; /* *before* doing any waits! */
		waitforkids();
	}
... plus, make sure the arrival of a SIGCHLD causes that "if" to be reached "soon".

> Isn't this exactly the case that POSIX signals were designed to avoid
> (the surprising loss of signals)?

As I indicated above, AFAIK signals have always worked this way.  If
you want to generate thingies that never get lost even in the presence
of multiple thingies being generated in quick succession, signals
simply are not an appropriate way to implement thingies.

					der Mouse

			       mouse@rodents.montreal.qc.ca
		     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B