Subject: possible bug in NetBSD asynchronous I/O
To: None <tech-kern@netbsd.org>
From: Emmanuel Dreyfus <p99dreyf@criens.u-psud.fr>
List: tech-kern
Date: 04/28/2001 12:55:16
Hello everybody

With the help of Kevin B. Hendricks, I'm tracking down a bug in Linux
emulation that hangs the JDK on PowerPC during some rare operations (so
far, the bug shows up when building Ant, and probably also when running
Tomcat).

Here is the problem: we use async I/O to send data from one process to
another through a pipe. It seems that when the receiving process reads
some data from the pipe, the sending process gets an unexpected SIGIO.
This confuses the JDK.

There is a test program that reveals the problem at the end of this
message. I wonder whether this could be more general than just an
emulation problem. I ran it as a native binary on Linux, NetBSD, FreeBSD,
OpenBSD, and Solaris, and NetBSD is the only one of them with this
behavior. All the other OSes pass the test.

I don't know what the standard behavior is here, or even whether the
behavior is standardized, but it seems we are not behaving like the other
UNIXes. Could we have a standards conformance problem? I mean, is it a
bug or a feature? Should this be fixed only for Linux binaries (because
it is a bug for Linux binaries), or should it be fixed at a lower level,
for all emulation packages?

/* sigio.c -- tests for a possible NetBSD async I/O bug */
#include <signal.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <sys/fcntl.h>
#include <errno.h>

static void 
io_sighandler(int sig)
{

  fprintf(stderr, "##### %d: We got sigio .. in handler\nTEST FAILED\n",
          (int)getpid());
  fflush(stderr);
  exit (-1);
}


int 
main(int argc, char** argv)
{
    struct sigaction chld;
    struct sigaction aio;
    pid_t pid;
    int   status;
    int fdsync[2];
    int err;
    char c;
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set,SIGIO);

    fdsync[0] = fdsync[1] = -1;
    if (pipe(fdsync) < 0) { /* fd for synchronization */
      fprintf(stderr, "Error: bad pipe call\n");
      exit(1);
    }

    aio.sa_flags = SA_RESTART;
    aio.sa_handler = io_sighandler;
    sigemptyset(&aio.sa_mask);
    if (sigaction(SIGIO, &aio, 0) == -1) {
        fprintf(stdout,"Error: Bad return value from sigaction call\n");
        exit(1);
    }

    /* now set the pipe write end to be non-blocking async */
    if (fcntl(fdsync[1], F_SETFL, O_NONBLOCK | FASYNC) == -1 ||
        fcntl(fdsync[1], F_SETOWN, getpid()) == -1) {
        perror("fcntl");
        exit(1);
    }

    fprintf(stdout,"going to fork\n");
    fflush(stdout);
    if ((pid = fork()) == 0) {
        /* Child process XXX */

        /* wait until parent is ready before proceeding */

        printf("doing err = read(fdsync[0], &c, 1);\n");
        err = read(fdsync[0], &c, 1);

        exit (0);
    } else {
       /* the parent */

       fprintf(stdout,"parent: pid=%d\n", getpid());
       fprintf(stdout,"going to wait\n");
       fflush(stdout);

       /* tell child we are ready to go */
       err = write(fdsync[1], "AAAA", 4);
       if (err < 0) {
          fprintf(stderr,"write failed: %s\n", strerror(errno));
          fflush(stderr);
       }

            /* pause(); */
       sleep (10);
       fprintf(stdout,"reaped pid %d\nTEST SUCCESSFUL\n",pid);
       fflush(stdout);

       exit(0);
       }
}
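
The test above only says whether a SIGIO arrived at all. To see when it
arrives, a variant along the following lines might help. This is only a
rough sketch: probe_sigio() and struct sigio_counts are names made up for
this example, O_ASYNC is used as the other spelling of FASYNC, and I am
assuming the handler only needs to count signals.

```c
#define _DEFAULT_SOURCE 1   /* glibc: expose O_ASYNC under strict -std= modes */
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result type for this sketch. */
struct sigio_counts {
    int after_write;   /* SIGIOs counted once the parent's write() returned */
    int after_read;    /* SIGIOs counted once the child has read and exited */
};

static volatile sig_atomic_t sigio_seen;

/* Async-signal-safe handler: it only bumps a counter. */
static void
count_sigio(int sig)
{
    (void)sig;
    sigio_seen++;
}

/*
 * Returns 0 and fills *out on success, -1 if a setup call fails
 * (the error paths do not close the pipe, to keep the sketch short).
 */
int
probe_sigio(struct sigio_counts *out)
{
    struct sigaction sa;
    int fds[2];
    pid_t pid;
    char c;

    sa.sa_flags = SA_RESTART;
    sa.sa_handler = count_sigio;
    sigemptyset(&sa.sa_mask);
    sigio_seen = 0;
    if (sigaction(SIGIO, &sa, NULL) == -1 || pipe(fds) == -1)
        return -1;

    /* Same setup and order as in sigio.c, with return values checked. */
    if (fcntl(fds[1], F_SETFL, O_NONBLOCK | O_ASYNC) == -1 ||
        fcntl(fds[1], F_SETOWN, getpid()) == -1)
        return -1;

    /* Fill the pipe before forking, so the child's read cannot block. */
    if (write(fds[1], "AAAA", 4) != 4)
        return -1;
    out->after_write = sigio_seen;

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: read one byte, as in the original test, then leave. */
        _exit(read(fds[0], &c, 1) == 1 ? 0 : 1);
    }

    /* Wait for the child; retry if a signal interrupts the wait. */
    while (waitpid(pid, NULL, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    out->after_read = sigio_seen;

    close(fds[0]);
    close(fds[1]);
    return 0;
}
```

Calling probe_sigio() from a small main() and printing the two counters
should show the timing: if after_read is larger than after_write, the
extra SIGIO was delivered while the child was reading from the pipe.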


-- 
Emmanuel Dreyfus
p99dreyf@criens.u-psud.fr