Subject: bin/18895: make -j pauses between jobs
List: netbsd-bugs
Date: 11/02/2002 10:22:42
>Number:         18895
>Category:       bin
>Synopsis:       make -j pauses between jobs
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Nov 02 10:23:00 PST 2002
>Originator:     Andreas Gustafsson
>Release:        NetBSD 1.6I
Speaking for myself
System: NetBSD 1.6I NetBSD 1.6I (GUAVAMP) #1: Sun Oct 6 19:38:50 PDT 2002 i386
Architecture: i386
Machine: i386

When running make with -j N where N >= 2, the build sometimes pauses
for several seconds, leaving the CPU(s) idle.

I am seeing this behavior on a dual AMD Athlon 1800 with "-j 2", and
Julio Merino reported seeing similar behavior on a
uniprocessor with "-j 4".  See the discussion under the subject
"Strange behavior with -j 4" on current-users.

When this happens, the make process is sleeping in the poll() call in
/usr/src/usr.bin/make/job.c, and the poll() only returns when its
timeout expires after five seconds.

My theory is that make has a race condition: it fails to detect a job
exiting if the job exits so quickly that the SIGCHLD is delivered
before make enters poll().  It is also possible that kern/17517 has
something to do with it.
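
The suspected lost wakeup can be illustrated outside of make.  The
following is only a sketch of the interleaving described above, not
code from job.c: the function name lost_wakeup_ms() and the flag-only
handler are made up for illustration, and the poll() watches no
descriptors, standing in for a poll() whose descriptors happen to be
idle.  The child exits and its SIGCHLD is handled before the parent
enters poll(), so nothing is left to wake the poll() and it sleeps
until the timeout expires.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t child_exited = 0;

/* Hypothetical handler: it only records the event.  Nothing here can
 * wake up a poll() that is entered after the handler has returned. */
static void
on_sigchld(int signo)
{
	(void)signo;
	child_exited = 1;
}

/* Returns how many milliseconds poll() slept although the child had
 * already exited, or -1 on a setup error. */
long
lost_wakeup_ms(int timeout_ms)
{
	struct sigaction sa, old;
	struct timespec t0, t1, nap = { 0, 1000000L };
	pid_t pid;

	assert(timeout_ms >= 0);
	child_exited = 0;
	sa.sa_handler = on_sigchld;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	if (sigaction(SIGCHLD, &sa, &old) == -1)
		return -1;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(0);		/* a job that finishes at once */

	/* Force the bad ordering: wait until the handler has run. */
	while (!child_exited)
		nanosleep(&nap, NULL);

	clock_gettime(CLOCK_MONOTONIC, &t0);
	poll(NULL, 0, timeout_ms);	/* the signal is already gone */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	waitpid(pid, NULL, 0);
	sigaction(SIGCHLD, &old, NULL);
	return (t1.tv_sec - t0.tv_sec) * 1000L +
	    (t1.tv_nsec - t0.tv_nsec) / 1000000L;
}
```

Calling lost_wakeup_ms(5000) should block for roughly the full five
seconds, which matches the length of the pauses seen in the build.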


Run "make -j 2" on a fast machine.  Observe the pauses in the make
output, or run "vmstat 1" in a different window and observe periods
of 100% idle.  Optionally, first increase the value of POLL_MSEC in
/usr/src/usr.bin/make/job.h to 60000 or so to make the pauses
even more noticeable.


Make the SIGCHLD signal handler write a message to a pipe which is
included in the poll(), and/or fix kern/17517.
 (-current as of Oct 20, 2002)
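
The pipe-based fix suggested above might look roughly like the
following.  This is a minimal sketch under stated assumptions, not a
patch against job.c: wake_pipe, on_sigchld_pipe() and
selfpipe_wakeup_ms() are hypothetical names, and a real make would
poll the job output descriptors together with the read end of the
pipe.  Because the handler leaves a byte in the pipe, a SIGCHLD that
arrives before poll() is entered still makes the poll() return at
once.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static int wake_pipe[2] = { -1, -1 };	/* hypothetical wakeup pipe */

/* write() is async-signal-safe.  If the pipe is full the write fails,
 * which is harmless because a wakeup is already pending. */
static void
on_sigchld_pipe(int signo)
{
	int saved_errno = errno;
	ssize_t n = write(wake_pipe[1], "c", 1);

	(void)signo;
	(void)n;
	errno = saved_errno;
}

/* Returns how many milliseconds poll() slept after the child had
 * exited, or -1 on a setup error. */
long
selfpipe_wakeup_ms(int timeout_ms)
{
	struct sigaction sa, old;
	struct timespec t0, t1, nap = { 0, 50 * 1000000L };
	struct pollfd pfd;
	char buf[16];
	pid_t pid;
	int i;

	assert(timeout_ms >= 0);
	if (pipe(wake_pipe) == -1)
		return -1;
	for (i = 0; i < 2; i++)
		fcntl(wake_pipe[i], F_SETFL, O_NONBLOCK);

	sa.sa_handler = on_sigchld_pipe;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	if (sigaction(SIGCHLD, &sa, &old) == -1)
		return -1;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(0);		/* a job that finishes at once */

	/* Let the child exit first; SIGCHLD cuts this sleep short. */
	nanosleep(&nap, NULL);

	pfd.fd = wake_pipe[0];
	pfd.events = POLLIN;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	poll(&pfd, 1, timeout_ms);	/* returns at once: byte pending */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	while (read(wake_pipe[0], buf, sizeof(buf)) > 0)
		continue;		/* drain the wakeup bytes */
	waitpid(pid, NULL, 0);
	sigaction(SIGCHLD, &old, NULL);
	close(wake_pipe[0]);
	close(wake_pipe[1]);
	return (t1.tv_sec - t0.tv_sec) * 1000L +
	    (t1.tv_nsec - t0.tv_nsec) / 1000000L;
}
```

With this arrangement selfpipe_wakeup_ms(5000) should return after a
few milliseconds at most instead of sleeping for the whole timeout.
Both ends of the pipe are made non-blocking so that neither the
handler nor the drain loop can hang.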