Subject: Re: Strange behavior with build.sh -j 4
To: David Laight <david@l8s.co.uk>
From: Andreas Gustafsson <gson@gson.org>
List: current-users
Date: 11/02/2002 09:32:12
David Laight writes:
> > My current theory is that it is a race condition in make.  Make relies
> > on a SIGCHLD to wake up select() when a job exits, but what if the job
> > exits *before* the parent make has entered select()?
> 
> If it is doing that, then it is truly horrid :-)
> 
> Having a SIGCHLD signal handler write a message to a pipe
> which is included in the select would fix it...

Right, that's the solution I was thinking of, too.

> Alternatively you need to implement something that makes signals
> only happen when they actually abort a kernel sleep.
> (Actually 'useful' to fix a few other lurking buglets I've
> seen in the past.)

It might be possible to use kqueue to notify make when children exit
(the CVS log for job.c in FreeBSD's make mentions that it has some
code for this but that it's "disabled because it panics the kernel"),
but we should still fix the select/poll case for the benefit of cross
builds on platforms that don't have kqueue.

There is also the possibility that kern/17517 is involved after all. When
I first read the code I thought the select() only involved the job
output pipes (which are only selected on by a single process) but on a
more careful reading I found that it can also involve the "job token"
pipe which is shared among multiple make processes when recursive
makes are involved.  Thus, there could in fact be three or more make
processes selecting on the same pipe.

Since no one else seems to have submitted a PR on this yet, I will.
-- 
Andreas Gustafsson, gson@gson.org