Subject: Re: pipes from FreeBSD and the NetBSD async I/O bug
To: None <khendricks@ivey.uwo.ca, thorpej@zembu.com,>
From: Christos Zoulas <christos@zoulas.com>
List: tech-kern
Date: 05/01/2001 12:46:47
On Apr 30, 11:16pm, khendricks@ivey.uwo.ca (Kevin B.Hendricks) wrote:
-- Subject: Re: pipes from FreeBSD and the NetBSD async I/O bug

| I am confused.  I have never seen async io used alone, only with non-blocking 
| flag also set does it make sense.  IMHO You should not send a sigio to a 
| process that asked to write n bytes to an fd when set up for non-blocking 
| asynchronous io unless the write returned an error with errno set to 
| EAGAIN.
| 
| The sequence I expect under async and nonblocking write roughly looks 
| something like this:
| 
|    n = total number of bytes remaining to write
|    here:
|    c = attempted write of n bytes
|    if c > 0 then n = n -c;  goto here
|    if (c = -1 and errno = EAGAIN) then go do something else until sigio
|            comes in indicating you are ready to write again
| 
| With async and nonblocking I do not expect to get a sigio every time 
| something is read from the pipe unless the pipe is full causing the last 
| write to return with EAGAIN.
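
For reference, the loop quoted above could be sketched in C roughly as
follows. The names write_some() and set_nonblocking() are made up for
illustration and appear in neither program in this thread; the sketch
only covers the "write until EAGAIN" half and leaves waiting for SIGIO
to the caller.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to fd, keeping its other flags. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Hypothetical helper: push as much of buf as fd will take right now.
 * Returns the number of bytes written, or -1 on an error other than
 * EAGAIN/EINTR (or on an unexpected zero-length write). *blocked is
 * set to 1 when the loop stopped because write() reported EAGAIN.
 */
ssize_t write_some(int fd, const char *buf, size_t n, int *blocked)
{
    size_t done = 0;

    *blocked = 0;
    while (done < n) {
        ssize_t c = write(fd, buf + done, n - done);

        if (c > 0) {
            done += (size_t)c;          /* partial progress, try again */
        } else if (c == -1 && errno == EINTR) {
            continue;                   /* interrupted, just retry */
        } else if (c == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            *blocked = 1;               /* full: caller waits for SIGIO */
            break;
        } else {
            return -1;
        }
    }
    return (ssize_t)done;
}
```

A caller would keep the unwritten tail of the buffer around and call
write_some() again once it has been told the descriptor is writable.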

But then it is really tough for the kernel to determine when to send
sigio to the process. Assuming the size of the pipe buffer is 4096:

Scenario 1:
    write 4096 bytes returns 4096 and the buffer is full.
    read 1 byte.
    Does the process get SIGIO? It did not ask to write more than
    the pipe buffer holds, and it might never want to write again, so
    one can claim that sending SIGIO is superfluous in this case, but
    I think it should get one.

Scenario 2:
    write 8192 bytes returns 4096 and the buffer is full.
    read 1 byte.
    Does the process get SIGIO? I think so. There is space in the buffer
    now to write more data.

Scenario 3:
    write 8192 bytes returns 4096 and the buffer is full.
    try writing the remaining 4096, returns -1 and errno = EAGAIN
    read 1 byte.
    Does the process get SIGIO? I think so. There is space in the buffer
    now to write more data.

Scenario 4:
    write 4 bytes, returns 4 and there is space in the buffer.
    read 1 byte.
    Does the process get SIGIO? The BSD sockets version thinks so.
    The other implementations don't.

Things get more complicated if you involve process groups instead
of single processes...
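
The scenarios above assume a 4096-byte pipe buffer, which is not the
same everywhere. A rough way to see what a given system accepts is to
fill a non-blocking pipe until write() fails with EAGAIN. This is only
a sketch: pipe_capacity() is a hypothetical name, and the figure it
reports is what this particular byte-at-a-time write pattern managed to
fit, which need not match any documented buffer size.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many single-byte writes a fresh
 * non-blocking pipe accepts before write() fails with EAGAIN.
 * Returns the count, or -1 on any other error.
 */
long pipe_capacity(void)
{
    int fds[2];
    long total = 0;
    char byte = 'A';

    if (pipe(fds) == -1)
        return -1;
    if (fcntl(fds[1], F_SETFL, O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    for (;;) {
        ssize_t c = write(fds[1], &byte, 1);

        if (c == 1) {
            total++;                    /* one more byte fit */
        } else if (c == -1 && errno == EINTR) {
            continue;                   /* interrupted, retry */
        } else if (c == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            break;                      /* pipe is full */
        } else {
            total = -1;                 /* unexpected failure */
            break;
        }
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

Passing the reported value, and twice that value, as the size argument
of a test program is one way to line a run up with scenarios 1 and 2.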

Consider the following variation on the program that Emmanuel posted.
Running it with 8192 [on Linux] prints:

% ./a.out 8192
written 4096 bytes
reading from the pipe
read 1 bytes
wrote 1 bytes
TEST SUCCESSFUL
%

Even if I duplicate the write so that the second one returns EAGAIN
[it is commented out right now], I still don't get a signal on Linux
or Solaris. I expect a signal because after the read there is space
in the pipe for me to write more data. What am I missing?

christos

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <fcntl.h>
#include <string.h>	/* memset() */

void io_sighandler (int sig) {
  printf ("pid=%d got sigio\n", getpid ());
  printf ("TEST FAILED\n");
  exit (-1);
}

int main(int argc, char** argv) {
    struct sigaction aio;
    int fdsync[2];
    char c, *buf;
    sigset_t set;
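    size_t size;
    int err;

    /* the write size is required; bail out instead of passing NULL to atoi */
    if (argc < 2) {
        printf("usage: %s size\n", argv[0]);
        exit(1);
    }
    size = atoi(argv[1]);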

    sigemptyset(&set);
    sigaddset(&set,SIGIO);


    aio.sa_flags = SA_RESTART;
    aio.sa_handler = io_sighandler;
    sigemptyset(&aio.sa_mask);
    if (sigaction(SIGIO, &aio, 0) == -1) {
        printf("Error: Bad return value from sigaction call\n");
        exit(1);
    }

    if (pipe(fdsync) < 0) { /* fd for synchronization */
      printf("Error: bad pipe call\n");
      exit(1);
    }

    /* now set the pipe write end to be non-blocking async */
    fcntl(fdsync[1],F_SETFL, O_NONBLOCK | FASYNC);
    fcntl(fdsync[1],F_SETOWN, getpid());

    buf = malloc(size);
    if (buf == NULL) {
        printf("Error: malloc failed\n");
        exit(1);
    }
    memset(buf, 'A', size);

    err = write(fdsync[1], buf, size);
/*
    err = write(fdsync[1], buf, size);
*/
    if (err < 0) {
       printf("write() got err=%d\n", err);
    }
    printf ("written %d bytes\n", err);

    sleep (1);

    printf ("reading from the pipe\n");
    err = read(fdsync[0], &c, 1);
    printf ("read %d bytes\n", err);
    err = write(fdsync[1], &c, 1);
    printf ("wrote %d bytes\n", err);

    printf("TEST SUCCESSFUL\n");
    exit(0);
}