Current-Users archive

Re: Debugging kernel memory allocation

Rich Neswold writes:

> So, right now, I've only seen the stalls on my two 8-core machines. I
> haven't been able to stall the 4-core machine.

Maybe you're not loading the 4-core one down enough.  In my experience,
the hangs only happen after the system has worked its way down to just
a couple of MB of free RAM (as shown by 'top', for instance).  Then it
seems that "the right things happening at the right time" will cause
memory exhaustion and a hang.  After such events, 'vmstat -s' shows
that the page daemon thread has been active.  At the moment it also
shows this:

       29 faults with no memory
        0 faults with no anons
        0 faults had to wait on pages
        0 faults found released page
    12237 faults relock (12228 ok)

So since the last boot, I've had 29 occurrences of page faults with no
RAM available to resolve them.  This seems unnecessary on an 8GB machine
that uses 6GB for the file cache...
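
For anyone who wants to watch this counter over time rather than eyeball
it, here is a minimal sketch that pulls the numbers out of text in the
format quoted above.  The function names (parse_vmstat_s,
no_memory_faults) are made up for illustration, and the parsing assumes
every counter line looks like "<count> <label>", as in the excerpt; the
rest of the real 'vmstat -s' output may need more care.

```python
import re

# Sample copied from the 'vmstat -s' excerpt quoted in this message.
SAMPLE = """\
       29 faults with no memory
        0 faults with no anons
        0 faults had to wait on pages
        0 faults found released page
    12237 faults relock (12228 ok)
"""

# Assumed line shape: optional leading spaces, an integer, then the label.
_LINE = re.compile(r"^\s*(\d+)\s+(.+?)\s*$")


def parse_vmstat_s(text):
    """Return a dict mapping each counter label to its integer value."""
    counters = {}
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters


def no_memory_faults(text):
    """Return the 'faults with no memory' count, or 0 if it is absent."""
    return parse_vmstat_s(text).get("faults with no memory", 0)
```

Feeding it the live output (for instance, captured with Python's
subprocess module) at intervals and comparing successive values of
no_memory_faults() would show whether the count grows around the time
of a stall.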

I may be reading too much into what I'm observing, and there may be
coincidences involved (after all, I have rather few data points yet),
but tomorrow I'll be physically back with the machine, and I'll be
trying to modify the kernel to actively keep a bit more RAM free, so as
to lower the chance of falling into this hole.

It doesn't matter how beautiful your theory is, it doesn't matter how smart
you are. If it doesn't agree with experiment, it's wrong.  -Richard Feynman
