tech-kern archive


SCHED_M2 livelock



Setup: dual-core i386, NetBSD-current, SCHED_M2.

Running a build of pkgsrc packages, the system seemed to stop making
any further progress at one point, but was still (sluggishly)
responsive.

top(1) showed all the cpu time going to 4 gmake processes, each
spinning at ~50% cpu (in sys).  It looked like some of the other
processes they had spawned weren't getting any time to make progress,
and the gmakes were somehow busy-waiting on them.

Renicing the gmakes down didn't seem to help.  Stopping them all with
pkill -STOP, and then selectively kill -CONT'ing them, allowed further
progress.
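
For anyone wanting to repeat the stop-then-selectively-continue
workaround in a scripted way, here is a minimal sketch.  It is not what
was actually run (that was plain pkill and kill from a shell); the
function name stop_then_resume, the injectable send parameter, and the
settle delay are all assumptions made for illustration.

```python
import os
import signal
import time


def stop_then_resume(pids, resume, send=os.kill, settle=0.0):
    """Stop every pid, then continue only those in `resume`, in order.

    Hypothetical helper: `send` defaults to os.kill, but can be replaced
    so the signal ordering can be checked without touching real processes.
    """
    # Equivalent of `pkill -STOP` over the chosen processes.
    for pid in pids:
        send(pid, signal.SIGSTOP)
    # Equivalent of selective `kill -CONT`; pids not listed stay stopped.
    for pid in resume:
        if pid not in pids:
            raise ValueError(f"{pid} was not stopped by this call")
        send(pid, signal.SIGCONT)
        time.sleep(settle)  # optional gap to watch top(1) between resumes
```

The pids would come from something like pgrep gmake; passing all four
as pids and only three of them as resume leaves one gmake stopped.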

It looks to me like the scheduler is repeatedly picking "the other
gmake", which is itself only waiting on something, as the next process
to run, and never running the things they're actually waiting for.

It may be exacerbated by something in the way gmake waits for children
or pipes (every so often, top shows one of them in pipe rather than CPU
state).  Maybe there's even something else at play (from vmlocking?)
that causes gmake to use so much cpu spinning -- but whatever that is,
there's clearly a fair-scheduling problem as well.

I'll experiment further (say, with more or less MAKE_JOBS) to see how
the problem moves around.  With 3 of the 4 gmakes running and one
stopped, I have 3 gmakes taking 50% of a cpu, and the rest of the cpu
used by cc and others making some progress.

--
Dan.



