Subject: Re: race in select() ?
To: Charles M. Hannum <abuse@spamalicious.com>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 10/11/2003 16:51:23
On Thu, Oct 09, 2003 at 10:07:44PM +0000, Charles M. Hannum wrote:
> On Thursday 09 October 2003 09:55 pm, David Laight wrote:
> > > I have a different, somewhat wackier suggestion...
> > >
> > > Pass an actual timeval to select(), with a Really Large Value. In the
> > > SIGCHLD handler, when we restore a socket in the mask, set the timeout to
> > > 0. Handle EWOULDBLOCK by resetting the timeval to the Really Large Value
> > > and looping around again.
> >
> > That isn't the problem. select should (surely) fail EINTR?
> > The existing code can only work if the select fails.
>
> Which is exactly my point. Setting the timeval to 0 forces select(2) to
> return immediately.
>
> > The problems happen when the signal handler runs while the process
> > is in the system call shim.
>
> Why is that a problem? The system call is fetching the timeout inside the
> kernel. Once you're inside the kernel, it doesn't matter -- the signal would
> force it to return EINTR immediately. My hack handles all of the cases where
> we get the signal before we enter the system call.
>
> Yet another solution is to use a sig_atomic_t flag to signal that we need to
> recopy the mask. I.e. (using Manuel's variable names):
>
> volatile sig_atomic_t whoops;
> volatile fd_set readable;
>
> ...
> do {
> whoops = 0;
> readable = allsock_select = allsock;
> } while (whoops);
> ...
> select(..., &allsock_select, ...);
>
> ...
> FD_SET(new_descriptor, &allsock);
> FD_SET(new_descriptor, &readable);
> whoops = 1;
>
> This makes the copy atomic WRT the signal handler, without having to mask
> signals.
I like this. It's cleaner than the timeval hack.
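
For the archive, here is a minimal sketch of the flag approach as I read it, not tested code from any daemon. whoops, allsock and allsock_select are the names from the quoted mail; pending_fd, on_signal() and snapshot_allsock() are hypothetical names made up for the illustration, and the atomic_signal_fence() calls (C11 <stdatomic.h>) are my own addition, assumed here as a way to keep the compiler from moving the copy across the flag accesses in place of declaring the set volatile.

```c
#include <signal.h>
#include <stdatomic.h>
#include <sys/select.h>

/* whoops, allsock and allsock_select follow the quoted mail. */
static volatile sig_atomic_t whoops;  /* raised by the handler after it edits allsock */
static fd_set allsock;                /* master set, also modified by the handler */
static fd_set allsock_select;         /* private copy that is handed to select(2) */

/* Hypothetical: the descriptor the handler should put back into the mask. */
static volatile sig_atomic_t pending_fd = -1;

/* Hypothetical handler: restore one descriptor, then raise the flag. */
static void on_signal(int signo)
{
    int fd = pending_fd;

    (void)signo;
    if (fd >= 0 && fd < FD_SETSIZE) {
        FD_SET(fd, &allsock);
        whoops = 1;
    }
}

/*
 * Copy allsock into allsock_select, repeating until one whole copy has
 * completed without the handler running in the middle of it.
 */
static void snapshot_allsock(void)
{
    do {
        whoops = 0;
        /* Assumption: compiler-only fences, so the copy stays between
         * the store to whoops and the later load of it. */
        atomic_signal_fence(memory_order_seq_cst);
        allsock_select = allsock;
        atomic_signal_fence(memory_order_seq_cst);
    } while (whoops);
}
```

The main loop would call snapshot_allsock() and then pass &allsock_select to select(). Note that this only protects the copy itself: a signal delivered after the loop exits but before select() enters the kernel is a separate window.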
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 24 years of experience will always make the difference