tech-net archive

Re: A possible bug with non-blocking sockets and SIGIO



>> [...possible SIGIO issue...]

The analysis you give has some validity, I think.

I tried onc.c on 1.4T and 4.0.1 (though with 1.4T I had to hack on
CLOG()'s syntax slightly).  1.4T works correctly with or without
-DV_ASYNC; 4.0.1 works only without, same as you see on 5.0.2.  I tried
a 5.1 machine I have guest access to and see the same failure as 4.0.1.

onc.c does indeed have a race condition; given the interface design of
SIGIO, there is an unavoidable race between setting up SIGIO on a new
connection and data arriving.  Real servers using SIGIO need to take
care to check, after setting up SIGIO, for any data that may have
arrived before SIGIO was set up.  onc.c does not do this, but with the
given test method it does not matter: the delay introduced by having a
human initiate the test connection gives the server plenty of time to
establish SIGIO before the data arrives.

> The conclusion is not to use SIGIO.

If NetBSD no longer supports use of SIGIO, NetBSD should stop
pretending otherwise: NetBSD should remove the interfaces that appear
to support it (O_ASYNC, SIGIO, and the associated documentation), to
prevent this kind of confusion in the future.

If, on the other hand, NetBSD does still support SIGIO, this kind of
response is of negative utility.

I would much prefer to see the bug fixed.  I don't think Dmitry's fix
is right (see below), but I am convinced there is a real bug here -
unless, of course, NetBSD no longer supports SIGIO, which would be a
pretty crippling regression (and in which case this is still a bug; the
relevant interfaces and their documentation should be removed).

I'm not sure whether O_ASYNC is supposed to be inherited from the
accepting socket to the accepted socket.  If it is, the relevant
process group setting needs to be copied as well; if not, the flags
copied need to have, at least, O_ASYNC deleted.  Adding additional
logging, I find that 1.4T does not copy O_ASYNC from the parent socket
to the child, so I think the correct change is probably to return to
the historical behaviour.  This is why I think Dmitry's change is not
the best: it, loosely put, copies O_ASYNC from socket to socket as well
as from file descriptor to file descriptor, and I think it shouldn't be
copied.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
