Current-Users archive

Re: non-automated test failure report! :)

On Tue, Nov 15, 2011 at 05:51:20PM +0000, Eduardo Horvath wrote:
 > A while back I looked into preventing overcommit by tracking overall 
 > address space allocation and comparing it to total swap space.  This would 
 > allow the kernel to return errors through the system call interface 
 > instead of just killing off processes.  However, page loaning made the 
 > accounting extremely difficult and I was unable to design something that 
 > could keep an accurate account of address space allocations.

IMO, it's not worth the trouble. Pessimistic allocation of swap tends
to require heroic amounts of swap for even fairly modest workloads.
With the sizes and prices of disks being what they are, this is no
longer a total non-starter, but if you ever actually ended up using
that much swap you'd grind to a halt thrashing anyhow.
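
For concreteness, here is a toy sketch of the kind of strict accounting
described in the quoted text: every reservation is charged up front
against a fixed amount of backing store, and a request that cannot be
covered is refused instead of being granted and paid for later by the
OOM-killer. The class name, method names, and the single
`backing_pages` limit are all made up for illustration; this is not
how UVM or any real kernel does it, and it ignores page loaning
entirely, which is exactly the part reported to make the real
accounting hard.

```python
class CommitAccountant:
    """Toy model of strict (no-overcommit) address-space accounting.

    Hypothetical illustration only; all names here are assumptions.
    """

    def __init__(self, backing_pages):
        # Assumed limit: total pages of backing store (e.g. swap).
        self.limit = backing_pages
        self.committed = 0

    def reserve(self, pages):
        """Charge a reservation up front; refuse it if it cannot be backed."""
        if pages < 0:
            raise ValueError("pages must be non-negative")
        if self.committed + pages > self.limit:
            # In a kernel, the system call would fail here (e.g. ENOMEM).
            return False
        self.committed += pages
        return True

    def release(self, pages):
        """Return a previously charged reservation."""
        if pages < 0 or pages > self.committed:
            raise ValueError("releasing more than is committed")
        self.committed -= pages


acct = CommitAccountant(backing_pages=100)
first = acct.reserve(60)    # fits within the limit, so it is charged
second = acct.reserve(50)   # 60 + 50 > 100, so it is refused
```

Note that the first reservation is charged in full whether or not the
process ever touches those pages, which is where the pessimism comes
from.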

Now, we *could* use a better OOM-killer...

I wonder if machine learning techniques could model the predicted swap
requirements of processes well enough to be able to identify when one
goes berserk. Or for cases where the problem isn't one or two things
going crazy but too many total things, maybe even to be able to warn
about or deny running the job that's likely to put the system over.

David A. Holland
