Subject: signal confusion
To: None <current-users@netbsd.org>
From: Paul B Dokas <dokas@cs.umn.edu>
List: current-users
Date: 11/05/1998 13:09:16
I'm having an interesting problem WRT signal handling on NetBSD.  What
I wanted to do was have a program that spawns off N children and then
continues other work.  When a child dies, I want to receive a SIGCHLD,
wait() for the child and then continue on.  This works fine in about 95%
of all cases, but for that remaining 5%, the signals are disappearing.

What appears to be happening is this:  if two children die at almost
the same instant then the first child to die will queue up a SIGCHLD,
but when the second dies, the kernel notices that a SIGCHLD is already
pending and then throws out the second SIGCHLD.

Isn't this exactly the case that POSIX signals were designed to avoid
(the surprising loss of signals)?


I'm attaching a quick hack of a program to illustrate this "problem".
What I want is for this program to return and say that 3 interrupts were
caught.  It *always* says that only 1 was delivered.


Is something really wrong here, or am I just crazy and misinterpreting
what POSIX signals are designed to do?


BTW, I've otherwise solved this problem in my program by rewriting it.
Expect a psh type program (see the thread of about 2 weeks ago) soon!

Paul



#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>	/* fork(), sleep() */


int		nprocs = 3;

volatile int	child_waiting = 0;
volatile int	interrupts = 0;


void	_reap_child(int sig);


int main(void)
{
  struct sigaction	act, oact;
  sigset_t		sig_mask, osig_mask;

  pid_t			pid;

  int			i;


  /* setup a signal handler for reaping children */
  act.sa_handler = _reap_child;
  sigemptyset(&act.sa_mask);
  act.sa_flags = SA_NOCLDSTOP;	/* only exits matter, not stopped children */
  if (sigaction(SIGCHLD, &act, &oact) != 0)
    {
      perror("sigaction(SIGCHLD)");
      exit(errno);
    }


  /* treat the entire program as a critical section */
  sigemptyset(&sig_mask);
  sigaddset(&sig_mask, SIGCHLD);
  sigemptyset(&osig_mask);
  if (sigprocmask(SIG_BLOCK, &sig_mask, &osig_mask) != 0)
    {
      perror("sigprocmask(SIG_BLOCK, SIGCHLD)");
      exit(-1);
    }


  /* fork off nprocs but only allow nprocs to run at a time */
  for (i = 0; i < nprocs; i++)
    {
      /* fire off a child */
      printf("\nspawning...\n");
      printf("\tchild_waiting = %d\n", child_waiting);
      printf("\tinterrupts = %d\n", interrupts);
      pid = fork();
      if (pid > 0)
        {
          printf("spawned child #%d as pid %d\n", i, (int) pid);
        }
      else if (pid == 0)
        {
          printf("child #%d is about to sleep\n", i);
          sleep(1);
          printf("child #%d is exiting\n", i);
          exit(0);
        }
      else
        {
          perror("fork()");
          exit(-1);
        }
    }

  printf("\nall children have been forked\n");

  printf("\n\nwaiting...\n");
  sleep(5);
  printf("\n\nOk, where are we?\n");


  /* done with the critical section, unmask interrupts */
  if (sigprocmask(SIG_SETMASK, &osig_mask, (sigset_t *) NULL) != 0)
    {
      perror("sigprocmask(SIG_SETMASK, osig_mask)");
      exit(-1);
    }


  sleep(1);
  printf("child_waiting = %d\n", child_waiting);
  printf("interrupts = %d\n", interrupts);


  /* done */
  exit(0);
}


void _reap_child(int sig)
{
  /* bump the counters */
  child_waiting++;
  interrupts++;

  /* done */
  return;
}

--
Paul Dokas                                            dokas@cs.umn.edu
======================================================================
Don Juan Matus:  "an enigma wrapped in mystery wrapped in a tortilla."