Subject: Re: buffer cache memory management revision
To: Paul Kranenburg <pk@cs.few.eur.nl>
From: Jason Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 11/20/2003 07:45:53

On Nov 20, 2003, at 12:44 AM, Paul Kranenburg wrote:

> Therefore, I propose to revisit the age old memory management still
> employed by the buffer cache. In particular, I'd like to get rid of
> the MAXBSIZE reservation of virtual memory per buffer which is sparsely
> mapped by a privately managed pool of physical pages. Currently, this
> scheme stresses MMU resources on some platforms like sun4 & sun4c.
> It also wastes a large amount of kernel VM space on machines with lots of
> physical memory when the default buffer cache parameters are in use.

This is *excellent*.  Thor has been talking about doing something like 
this for a while, and even had a sample patch that used malloc().  (I 
objected to the use of malloc() because it would put undue pressure on 
kmem_map.)

One of Thor's motivations was the fact that, to cache metadata 
effectively in some configurations (cvs.netbsd.org, in particular), you 
need a LOT of *really small* buffers.  Your patch will help achieve 
that.  This is just terrific.

> Other things to consider: use the already existing `bufpool' for buffer
> allocation possibly inserting a poolcache on top of it and then drop
> the EMPTY & AGE queues..

Yah, I think that would be a great idea.  However, if we did that, I 
think we should put an upper limit on the number of bufs that can be 
allocated via the traditional bio interface, so that the traditional 
"limit the size of the buffer cache" semantics can be preserved (i.e. 
you don't want all of your memory and KVM being chewed up by in-core 
copies of directories).  That limit could be run-time tunable using a 
sysctl.
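
To make the limit idea concrete, here is a minimal user-space sketch in C.
It is not taken from any patch: buf_limit, buf_inuse, buf_reserve(),
buf_release() and buf_set_limit() are all hypothetical names, and
buf_set_limit() only stands in for whatever a real sysctl handler would do.
Locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; nothing here exists in the tree. */
static size_t buf_limit = 4;	/* upper bound on bufs (the tunable) */
static size_t buf_inuse = 0;	/* bufs currently handed out via bio */

/* Returns true if the caller may allocate one more buf. */
static bool
buf_reserve(void)
{
	if (buf_inuse >= buf_limit)
		return false;	/* at the limit: recycle a buf instead */
	buf_inuse++;
	return true;
}

/* Gives back one reservation when a buf is destroyed. */
static void
buf_release(void)
{
	assert(buf_inuse > 0);
	buf_inuse--;
}

/* Stand-in for a sysctl handler: change the limit at run time. */
static void
buf_set_limit(size_t new_limit)
{
	buf_limit = new_limit;
}
```

Lowering the limit below the current count in this sketch does not reclaim
anything by itself; it only stops further growth until enough bufs have been
released.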

Also something to consider there... Since you're using kernel_map, you 
can't actually free pages back from an interrupt context.  Can any of 
the functions that free pages back actually be called from interrupt 
context?  I can't really remember, but perhaps we need some assertions 
in there to catch stuff like this.
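
The kind of assertion meant here could look roughly like the following
user-space sketch in C.  The in_interrupt flag and buf_free_pages() are
hypothetical stand-ins; how the kernel would actually detect interrupt
context is platform-specific and not shown, and in the kernel the check
would presumably be a KASSERT() rather than assert().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for "are we in interrupt context?" */
static bool in_interrupt = false;

/* Running total, only so the sketch has an observable effect. */
static int pages_freed = 0;

/* Hypothetical page-freeing path for buffer memory. */
static void
buf_free_pages(int npages)
{
	/* Freeing back to kernel_map is not allowed from an interrupt. */
	assert(!in_interrupt);
	pages_freed += npages;
}
```

With the assertion in place, a call made while in_interrupt is set aborts
immediately instead of corrupting state later.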

         -- Jason R. Thorpe <thorpej@wasabisystems.com>