Subject: Re: race in select() ?
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Charles M. Hannum <abuse@spamalicious.com>
List: tech-kern
Date: 10/09/2003 21:29:59
On Thursday 09 October 2003 08:56 pm, Manuel Bouyer wrote:
> On Thu, Oct 09, 2003 at 08:19:26PM +0000, Charles M. Hannum wrote:
> > This does not, in fact, solve the problem.  Nor does it quite work as you
> > describe.
> >
> > First of all, your "false positive" detection will cause the return from
> > select(2) to actually be ignored in the race condition case.  In this
> > case, all your patch is really doing is causing us to loop around again
> > and pick up the new socket.
>
> Yes, this is what I meant. We'll select() again, and handle the socket
> this time.
>
> > Secondly, there is a race condition in the creation of "allsock_select". 
> > If we read the part of "allsock" containing the file descriptor in
> > question, but were interrupted before writing it to allsock_select, then
> > we will lose the setting of the bit that was done in the signal handler.
>
> Ha, right.
> Well, I tried to avoid this but we can mask signals in this section.

Which would probably work, but is slower.  *shrug*

I have a different, somewhat wackier suggestion...

Pass an actual timeval to select(), with a Really Large Value.  In the SIGCHLD 
handler, when we restore a socket in the mask, set the timeout to 0.  Handle 
EWOULDBLOCK by resetting the timeval to the Really Large Value and looping 
around again.
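
To make that concrete, here is a rough, untested-in-anger sketch of what I have in mind.  The names restore_socket(), prepare_select(), do_select(), select_set and REALLY_LARGE_SECS are made up for illustration and are not from the existing code; "allsock" is the master set discussed above.  Note that from userland select() reports a timeout by returning 0, so that is what the loop sees.  The timeval is reset *before* the mask is copied, so a signal landing anywhere after the reset leaves a zero timeout behind and select() falls straight through.

```c
#include <errno.h>
#include <signal.h>
#include <stdatomic.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Arbitrary stand-in for the "Really Large Value" (one day). */
#define REALLY_LARGE_SECS (24L * 60 * 60)

static fd_set allsock;                 /* master set; the handler sets bits here */
static fd_set select_set;              /* per-pass copy handed to select() */
static struct timeval select_timeout;  /* deliberately shared with the handler */

/*
 * Hypothetical helper, meant to be called from the SIGCHLD handler when a
 * socket goes back into the mask.  Zeroing the timeout turns the next (or
 * a not-yet-entered) select() into a poll.  Only tv_sec ever changes value,
 * so this assumes a time_t store is not torn on the target platform.
 */
static void restore_socket(int fd)
{
    FD_SET(fd, &allsock);
    select_timeout.tv_sec = 0;
    select_timeout.tv_usec = 0;
}

/* Re-arm the timeout first, then snapshot the mask. */
static void prepare_select(void)
{
    select_timeout.tv_sec = REALLY_LARGE_SECS;
    select_timeout.tv_usec = 0;
    /* Keep the compiler from moving the reset past the copy below. */
    atomic_signal_fence(memory_order_seq_cst);
    select_set = allsock;
}

/*
 * One pass of the main loop.  Returns the number of ready descriptors,
 * or 0 when the caller should simply loop around again (timeout, or the
 * call was interrupted by the signal itself).
 */
static int do_select(int maxfd)
{
    int n = select(maxfd + 1, &select_set, NULL, NULL, &select_timeout);

    if (n < 0 && errno == EINTR)
        return 0;
    return n;
}
```

The main loop then just calls prepare_select() and do_select() in turn and goes around again whenever do_select() returns 0.  Resetting the timeval on every pass also covers systems where select() rewrites it with the time remaining.  If the handler fires in the middle of the mask copy, the copied set may still miss the bit, but the zeroed timeout means we do not sleep on the stale set; the next pass picks the socket up.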