Subject: Re: LFS (was Thank you NetBSD)
To: Jonathan Stone <jonathan@dsg.stanford.edu>
From: Thor Lancelot Simon <tls@rek.tjls.com>
List: tech-kern
Date: 02/23/2005 14:52:28

On Wed, Feb 23, 2005 at 11:25:38AM -0800, Jonathan Stone wrote:
> 
> Hmmm. but that won't scale so well for those multi-terabyte
> filesystems we were hypothesizing, will it?

I guess that depends on how much metadata you're willing to keep in
RAM, and its size in comparison to the data.  Read "cache" for "incore"
and recall the very small size of the metadata caches used in the 4.4
kernel that Seltzer was measuring with, if you prefer (we still have
the problem that we effectively cache inode data twice: once at the
vnode layer, in the old namei cache, and once at the block device layer,
in the buffer cache.  We could do better, and thus cache more metadata.)

Seltzer is good enough to keep copies of Ousterhout's critiques on
her own web site, though some of the links between the pages are
broken.  Here is a short list:

http://www.eecs.harvard.edu/~margo/usenix.195/ouster_critique1.html
http://www.eecs.harvard.edu/~margo/usenix.195/rebuttal.html
http://www.eecs.harvard.edu/~margo/usenix.195/ouster_critique2.html

We do in fact keep the atime in the ifile, not the inode itself, as
Ousterhout recommends.  That was one major change from LFSv1 to LFSv2
in our code.

I cannot find the referenced second response of Ousterhout to
Seltzer's second attempt at rebuttal.  Rereading this argument now,
I sympathize with Ousterhout's frustration: the line of commentary
about the poor quality of the BSD-LFS implementation and the tendency
of that problem to obscure fundamental measurements of filesystem
performance that are at issue rings very true given our experience
with BSD-LFS in NetBSD.

The lfs.tar.gz offered on the page
http://www.eecs.harvard.edu/~margo/usenix.195/ is not the same one
Seltzer sent me in 1996 or thereabouts when I asked for a copy of
the code actually used to run the benchmarks.  I note that it
contains RCS files, so it may be possible to check out the code
actually benchmarked, though reproducing the results at this far
remove would be a task for a true masochist.

-- 
 Thor Lancelot Simon	                                      tls@rek.tjls.com

"The inconsistency is startling, though admittedly, if consistency is to be
 abandoned or transcended, there is no problem."		- Noam Chomsky