Subject: Re: Do some disk accesses miss the UVM?
To: David Laight <david@l8s.co.uk>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 01/24/2002 20:22:03
On Thu, Jan 24, 2002 at 06:43:43PM +0000, David Laight wrote:
> > 
> > Today's experiments have shown this is caused by exceeding the size of
> > the 'buffer cache' - ie increasing the buffer cache size makes it
> > possible to scan a larger directory tree without wearing the disk out.
> > 
> 
> Things seem to be somewhat worse, I've been grovelling (find . -name
> '*.[ch]' | xargs grep) through the kernel sources.  du -s gives a size of
> 128Mb (including some files which won't be grepped), top reports that my
> system (x86 pc) has 180Mb of 'free' memory, but the searches rattle the
> disk.  If the UBC is working properly it ought to find all the pages it
> wants sat in memory from the previous run.
> 
> The buffer cache (currently the default 13Mb) is large enough that 'find
> . | wc' leaves all the data it wants cached,  however 'du -s .' has to
> access the disk.  I can't see how the latter requires access to any more
> data.
> 
> Anyone know what is really going on here?

"find . | wc" doesn't need to call stat() on non-directory files, it can
tell the file type from data stored in the directory entries.  thus it
avoids reading the inode data for non-directory files into the buffer
cache.  "du -s .", on the other hand, needs to stat() every file since
it needs to know the size of each, so it has to read every inode and
will use more cache memory.
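
a rough sketch of the difference, in Python rather than anything taken from
the real find or du sources.  the names count_entries and total_size are
made up for illustration, and the sketch assumes the filesystem reports
file types in its directory entries (d_type); where it does not,
is_dir() falls back to a stat call and the two walks cost about the same.

```python
import os
import stat


def count_entries(root):
    """Roughly 'find . | wc -l': needs only names and file types."""
    count = 1  # the root directory itself
    stack = [root]
    while stack:
        with os.scandir(stack.pop()) as entries:
            for entry in entries:
                count += 1
                # The type normally comes from the directory entry,
                # so no inode has to be read for plain files.
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
    return count


def total_size(root):
    """Roughly 'du -s': every entry must be stat()ed to learn its size."""
    total = 0
    stack = [root]
    while stack:
        with os.scandir(stack.pop()) as entries:
            for entry in entries:
                # lstat() reads the inode of every entry, file or not.
                st = os.lstat(entry.path)
                if stat.S_ISDIR(st.st_mode):
                    stack.append(entry.path)
                elif stat.S_ISREG(st.st_mode):
                    total += st.st_size
    return total
```

note that this sums apparent file sizes (st_size), whereas du reports
allocated blocks and counts hard-linked files once, so the totals will
not match du exactly.  the point is only that the second walk touches
the inode of every file while the first does not.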

-Chuck