Subject: Re: race in select() ?
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Charles M. Hannum <abuse@spamalicious.com>
List: tech-kern
Date: 10/09/2003 20:19:26
On Thursday 09 October 2003 07:43 pm, Manuel Bouyer wrote:
> On Thu, Oct 09, 2003 at 06:01:53PM +0100, David Laight wrote:
> > On Thu, Oct 09, 2003 at 05:20:10PM +0200, Manuel Bouyer wrote:
> > > On Thu, Oct 09, 2003 at 02:13:51PM +0100, David Laight wrote:
> > > > I presume inetd takes the fd out of its select list until the
> > > > rpc.rstatd process exits.  Otherwise there would be a nasty loop.
> > >
> > > Yes it does.
> >
> > There is a timing bug in inetd itself.  rev 1.76 reads:
> >
> > 	readable = allsock;
> > 	if ((n = select(maxsock + 1, &readable, (fd_set *)0,
> > 		...
> >
> > and reapchild does:
> > 	FD_SET(sep->se_fd, &allsock);
> >
> > So if a child exits after the copy is made, but before the system call
> > the select won't be looking for the correct fdset.
> > If reapchild set the bit in readable then there could be a false
> > positive if the SIGCHLD happened in the system call return path,
> > that might be easier to check though (eg detect it happening, and
> > call select again).
> >
> > Of course, you may not be hitting this one...
>
> Yes, I'm almost sur this is it. I may have got it off by one when I checked
> the fd_sets values in the kernel (the rpc.statd socket is the last one).
> I'll try to check this when the bug shows up again (it happens once in a
> few days, but I couldn't reproduce it at will).
>
> About this race: I think the attached patch should fix it.
> To avoid the race before select, the sigchld handler adds the bit in
> allsock and the fd_set we will select on (allsock_select).

This does not, in fact, solve the problem.  Nor does it quite work as you 
describe.

First of all, your "false positive" detection will cause the return from 
select(2) to actually be ignored in the race condition case.  In this case, 
all your patch is really doing is causing us to loop around again and pick up 
the new soeckt.

Secondly, there is a race condition in the creation of "allsock_select".  If 
we read the part of "allsock" containing the file descriptor in question, but 
were interrupted before writing it to allsock_select, then we will lose the 
setting of the bit that was done in the signal handler.