Subject: Re: vm.bufmem_hiwater not honored (Re: failing to keep a process from swapping)
To: Arto Selonen <email@example.com>
From: Thor Lancelot Simon <firstname.lastname@example.org>
Date: 11/15/2004 18:44:31
On Mon, Nov 15, 2004 at 10:11:31PM +0200, Arto Selonen wrote:
> Well, page daemon is asking for something back, with the buf_drain() call
> after page scanning etc. However, in my case bufmem was >17,000 *pages*
> (I chose to talk in pages instead of bytes, to have smaller numbers)
> over bufmem_hiwater, and that buf_drain() call from page daemon only
> asks for 20-80 pages per page daemon invocation (freetarg-free). It'll
> take a while to get below hiwater mark that way. Of course, I could be wrong,
> there could be other ways to free buffer cache memory, etc.
I happen to believe that freetarg should be considerably higher on modern
large-memory systems. Others may disagree; we seem to discuss it here
from time to time without ever reaching a firm conclusion.
I am curious as to _how_ bufpages got to be so high. Do you have,
perhaps, a huge number of directories on a very-large-block filesystem?
Or are you accessing the buffer cache through a block device node (this
is a pretty bad idea)?
The buffer cache growth algorithm is _extremely_ conservative. Once it
gets to bufmem_hiwater, it should _always_ recycle an existing buffer
rather than allocating a new one. The algorithm is from an old lecture
of Kirk McKusick's: the probability of allocating a new buffer is
inversely related to the amount of space already used. So aside from
a very small overshoot, possible when the block size exceeds the page
size, I don't understand how your system got into the situation it is
in, at all.
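To make the policy concrete, here is a sketch of the idea as I describe it above; this is an illustration, not the actual kernel code, and all names are mine. The chance of growing the cache falls linearly as bufmem approaches bufmem_hiwater, and at or above the mark an existing buffer is always recycled:

```c
#include <stdlib.h>

/* Hypothetical sketch of a McKusick-style growth policy: decide whether
 * to allocate a new buffer (return 1) or recycle an existing one
 * (return 0).  The probability of growing is proportional to the
 * remaining headroom, (hiwater - bufmem) / hiwater. */
static int should_grow(long bufmem, long hiwater)
{
    if (bufmem >= hiwater)
        return 0;       /* at or over the mark: always recycle */

    /* Draw a uniform value in [0, hiwater) and compare it against the
     * headroom; the closer bufmem is to hiwater, the less likely this
     * test is to succeed. */
    return (random() % hiwater) < (hiwater - bufmem);
}
```

Under this scheme bufmem should never cross hiwater by more than one allocation's worth of slop, which is why a 17,000-page overshoot is so surprising.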
And I would very much like to.