Subject: Re: Limitations of current buffer cache on 32-bit ports
To: Thor Lancelot Simon <tls@rek.tjls.com>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 07/23/2002 22:34:45
On Wed, Jul 24, 2002 at 12:47:49AM -0400, Thor Lancelot Simon wrote:
> You can't have an FFS filesystem with a 32K blocksize and a 1K frag size.  
> The best you can do is a 4K frag size, if the disk geometry forces you to
> use 32K blocks.

can't you just lie to mkfs and use a better geometry?

even if you do have a 4k frag size, a 1-frag buffer could still
use virtual space optimally on an x86, since a 4k frag is exactly
one page (if the buffer cache were efficient in its use of virtual
space).  I don't think the larger block or fragment size is a
problem in and of itself.


> The buffer cache statistics on nbanoncvs are pretty instructive here.  We
> reduced MAXBSIZE to 32K, but those buffers are running about 77% utilized.
> We know there's not much but directories in them; though I get confused when
> trying to read the code, the statistics certainly seem to show that, at
> present, we do in fact cache directories in full blocks (since otherwise, how
> would the vast majority of our 400MB of buffer cache actually be in use)?

the way I read the code, our FFS will create directory buffers that are
smaller than a block.  ufs_lookup() -> VOP_BLKATOFF() -> ffs_blkatoff()
-> bread().  the bsize passed to bread() is computed using blksize(),
which rounds the file size up to a fragment (if the file is small enough
to have fragments).
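
to make that concrete, the effect of that blksize() call is roughly
the following.  this is a simplified sketch from memory, not the real
macro: dir_bread_size() is a made-up name, it assumes <ufs/ffs/fs.h>
for struct fs, blkoff() and fragroundup(), and it ignores the
indirect-block case the real blksize() also handles.

	static long
	dir_bread_size(struct fs *fs, u_int64_t filesize, daddr_t lbn)
	{
		/* any block the file completely covers is full-sized */
		if (((u_int64_t)(lbn + 1) << fs->fs_bshift) <= filesize)
			return (fs->fs_bsize);

		/* last, partial block: leftover bytes rounded up to a frag */
		return (fragroundup(fs, blkoff(fs, filesize)));
	}

so a directory that's only a frag or two long gets a buffer of just
that many frags, not a full block.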

but just thinking about it, it has to work that way.  directories
can have fragments, and we can only read or write whole buffers,
so if we always used whole blocks for directory buffers, then we
wouldn't be able to write to just the fragment for the directory
that we wanted without clobbering whatever else happened to be
stored in the rest of that block on disk.

on the other hand, each buffer can contain only one block (or fragment)
from one file, so you can cache at most as many directories as you have
buffers, regardless of how big each buffer is.
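
to put a number on that: if your 400MB of buffer cache is carved
into 32k buffers, that's 400MB / 32k = 12800 buffers, so at most
about 12800 directories cached at once, no matter how well or badly
the space inside each buffer is used.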

I'm not sure how to interpret your observed statistics.
what do you mean by 77% utilized?


> Couldn't the caching of directory data be decoupled from the actual physical
> structure in which it lived on the disk?  That would seem to offer the most
> hope for efficient use of cache, to me, even in the presence of stupid
> filesystems.

I'm not sure what you mean by the caching being coupled to the physical
structure.  what behaviour are you referring to?

maybe you're thinking of using 8k (or 4k) buffers for everything,
even with a 32k block size?  that would allow for more efficient use
of virtual space at the cost of using more buffer headers to describe
the same amount of data.  that seems like an improvement over your
current situation, but it seems even better (and probably easier to
implement) to just allow different buffers to use different amounts
of virtual space.
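
just to sketch what I mean by that last part (purely hypothetical,
not real code: buf_mapin()/buf_mapout() and allocate_buffer_va()/
free_buffer_va() are made-up names for whatever KVA allocator this
would sit on top of; it assumes <sys/param.h> and <sys/buf.h>):

	static void
	buf_mapin(struct buf *bp, vsize_t size)
	{
		vsize_t va_size = round_page(size);

		/* map only as much kernel VA as this buffer's data needs */
		bp->b_data = allocate_buffer_va(va_size);
		bp->b_bufsize = va_size;
		bp->b_bcount = size;
	}

	static void
	buf_mapout(struct buf *bp)
	{
		/* give the VA back so buffers of other sizes can reuse it */
		free_buffer_va(bp->b_data, bp->b_bufsize);
		bp->b_data = NULL;
		bp->b_bufsize = 0;
	}

that way a 4k directory frag costs one page of kernel VA while a 32k
data block still gets eight, instead of every buffer header reserving
MAXBSIZE worth of virtual space up front.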

-Chuck