Subject: Re: disk caching
To: Eduardo Horvath <eeh@turbolinux.com>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 04/12/2000 23:09:01
the two issues here are (1) selective reclamation of pages based on
what the pages are currently being used for, and (2) flush-behind or the
lack thereof.  the former is handled by what sun calls "priority paging",
and I'm going to do something similar for netbsd.  the latter is
addressed by just implementing flush-behind, which I'm also going to do.
don't worry, performance under load is a major design concern.
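
to make (1) concrete, here is a minimal sketch of the general idea behind
priority-paging-style reclamation.  it is not code from solaris or from
the UBC branch: the page classes, the struct layout, the function name
pp_may_reclaim() and the two thresholds are all assumptions chosen for
illustration.  the point is only that cached file data is given up first,
and application pages are touched only once memory is really short.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page classes; not the names used in any real kernel. */
enum page_kind {
	PAGE_ANON,	/* application heap/stack data */
	PAGE_EXEC,	/* application and library text */
	PAGE_FILE	/* cached file data (the "disk cache") */
};

struct page {
	enum page_kind kind;
	bool referenced;	/* touched since the last scan */
};

/* Hypothetical thresholds, in pages.  Assumes lotsfree < cachefree. */
struct pp_limits {
	size_t lotsfree;	/* below this, any page class may be taken */
	size_t cachefree;	/* below this, only file cache pages are taken */
};

/*
 * Decide whether the page scanner may reclaim this page, given the
 * current number of free pages.
 */
static bool
pp_may_reclaim(const struct page *pg, size_t free_pages,
    const struct pp_limits *lim)
{
	/* Recently used pages are always left alone. */
	if (pg->referenced)
		return false;

	/* Plenty of memory: nothing needs to be reclaimed. */
	if (free_pages >= lim->cachefree)
		return false;

	/* Mild pressure: sacrifice cached file data only. */
	if (free_pages >= lim->lotsfree)
		return pg->kind == PAGE_FILE;

	/* Severe pressure: application pages become candidates too. */
	return true;
}
```

with a policy like this, a long sequential read mostly evicts other cached
file pages rather than the text and data of an idle X application, which
is the failure mode described in the quoted message below.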

you're right that the chs-ubc2 branch is pretty stale at this point.
it doesn't do softdep yet and I need to plan the renaming game
because the softdep code used the name VOP_BALLOC() for something
other than what I used it for.  lately I've been merging all the
not-really-UBC-related stuff from the branch to -current so that
there's less for me to maintain in the branch, and once I'm all the
way done with the usenix paper I'll do the softdep stuff so that
people might be willing to try it again.  the plan at this point
is to make a chs-ubc3 but only branching the files that I change
instead of the whole kernel.  that way it'll be much easier for me
to keep the branch in sync.

-Chuck


On Wed, Apr 12, 2000 at 09:54:21AM -0700, Eduardo Horvath wrote:
> On Wed, 12 Apr 2000, Frank van der Linden wrote:
> 
> > On Wed, Apr 12, 2000 at 11:01:17AM -0400, Mirian Crzig Lennox wrote:
> > > So I've noticed that linux seems to be able to fudge a heck of a lot
> > > of disk io performance from just throwing huge amounts of cache at the
> > > problem.  Although inelegant, this approach really does seem to move
> > > disk io right along.  I'm wondering if the same effect can be duplicated
> > > in NetBSD, or if NetBSD has an even better approach to the problem.
> >
> > Linux' buffer cache grows and shrinks with the availability of memory,
> > i.e. it's dynamically sized.
> > 
> > NetBSD's buffer cache is currently still a fixed size.  Post-1.5, however,
> > Chuck Silvers' UBC work will be brought into the tree, which makes
> > our buffer cache also use more memory if it's available. And it
> > solves the "mmap() doesn't see what write() does and vice versa" problem
> > (hence its name, Unified Buffer Cache).
> 
> I really hope I don't end up triggering a flame war with this.
> 
> I have recently had the opportunity to run both NetBSD and Linux under
> heavy load on the same machine (no, not at the same time.)  The
> interactive performance of NetBSD under those conditions is undeniably
> superior.  This experience is making me very uncomfortable with the future
> move to UBC.
> 
> The box is a 500MHz PIII with 128MB RAM and an IDE disk.  What I did was
> fire up a parallel compile to the point where the CPU idle time goes to
> 0%, while using X applications on the console.  Under NetBSD you can
> barely notice the machine is under load.  Under Linux the machine becomes
> unbearably sluggish.  
> 
> I believe what happens under Linux is that since the buffer cache is
> competing with the VM system, application code and data pages are being
> reclaimed by the buffer cache and swapped out.  Then when you try to move
> the mouse or hit a key in your terminal window, the appropriate pages need
> to be paged back in.  Since we're talking about interactive applications
> here, the critical code paths are only being traversed when there's I/O
> going on, which tends to be sporadic, thus there is a tendency for these
> critical pages to be reclaimed.
> 
> As I've stated before, I think we need some way to limit the growth of the
> buffer cache before it starts replacing application data.  There are a
> number of pathological cases with UBC systems where linear disk traversal
> (say using `find / -exec grep .....') or generating and writing large
> amounts of data tend to replace all application pages with cached disk
> pages that are unlikely to ever be re-referenced.
> 
> There's another issue with the way Linux seems to handle the buffer
> cache.  It seems to like to delay writes until the last possible moment,
> which makes the machine seem to run really fast until it needs to flush
> everything to disk, at which point the machine seizes up.
> 
> I suppose I'll have to grab hold of UBC and do some performance testing
> one of these days.  But the UBC branch looks a bit ancient and doesn't
> seem to handle softdep yet.
> 
> > The static size of the buffer cache can currently be controlled
> > by the "BUFCACHE" option, which is a percentage of available memory.
> 
> I believe there are also some issues with the amount of kernel address
> space available for use in the buffer cache.  
> 
> Eduardo Horvath				   
>