Subject: unintended consequences of ffs_dirpref()?
To: None <tech-kern@netbsd.org>
From: Bill Sommerfeld <sommerfeld@netbsd.org>
List: tech-kern
Date: 01/16/2001 08:00:02
For those of you who haven't followed the story so far over on
current-users, Herb Peyerl discovered really poor performance on a
large mirrored filesystem (two 45G disks mirrored using raidframe
raid1).
mkdir was really really slow:
> As far as reading the source, 3687+7io is 3687 read and 7 write.
The filesystem started off containing a large pile of mp3 files;
untarring pkgsrc was taking ~forever.
my analysis so far:
So, there's one thing which mkdir() does in ffs which is different
from other sorts of file creation... it tries to put the directory in
a different cylinder group from the one its parent lives in.
I think what's going on here is that ffs_dirpref() may be screwing up
and always picking an initial cylinder group with few directories and
lots of free inodes, but no free blocks, so it winds up hunting all
over the disk for free blocks before it finds one for the directory.
I'm willing to bet that the extra level of indirection required for
mirroring is causing the "hunt" for free blocks to no longer fit into
the buffer cache.
So, the core of ffs_dirpref() in sys/ufs/ffs/ffs_alloc.c is:
	for (cg = 0; cg < fs->fs_ncg; cg++)
		if (fs->fs_cs(fs, cg).cs_ndir < minndir &&
		    fs->fs_cs(fs, cg).cs_nifree >= avgifree) {
			mincg = cg;
			minndir = fs->fs_cs(fs, cg).cs_ndir;
		}
maybe it should be something more like:
	for (cg = 0; cg < fs->fs_ncg; cg++)
		if (fs->fs_cs(fs, cg).cs_ndir < minndir &&
		    fs->fs_cs(fs, cg).cs_nbfree > 0 &&
		    fs->fs_cs(fs, cg).cs_nifree >= avgifree) {
			mincg = cg;
			minndir = fs->fs_cs(fs, cg).cs_ndir;
		}
.. but I must admit I'm not an expert on ffs guts..
Herb reports that this fix indeed causes his problem to disappear.
now, there's one possible bad consequence of this: cs_nbfree only
counts whole free blocks, so when a filesystem gets very full it may
only have fragments left. In that case no cylinder group passes the
test and ffs_dirpref will end up returning zero all the time.
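
For anyone who wants to poke at the selection logic outside the kernel,
here is a rough userland sketch of both variants of the loop. It is not
the kernel code: struct cg_summary, pick_dir_cg() and the need_blocks
flag are made-up names for illustration, the summary struct is cut down
to the three counters the loop looks at, and I'm assuming avgifree is
the mean free-inode count per group and that minndir starts out at the
inodes-per-group value.

```c
#include <assert.h>

/* Hypothetical, cut-down stand-in for the per-cylinder-group summary. */
struct cg_summary {
	int cs_ndir;	/* directories in this group */
	int cs_nbfree;	/* free whole blocks (fragments not counted) */
	int cs_nifree;	/* free inodes */
};

/*
 * Pick the group with the fewest directories among those holding at
 * least the average number of free inodes.  If need_blocks is nonzero,
 * also insist on at least one free whole block (the proposed change).
 * Falls back to group 0 when no group qualifies.
 */
static int
pick_dir_cg(const struct cg_summary *cs, int ncg, int ipg, int need_blocks)
{
	int cg, mincg = 0, minndir = ipg, avgifree;
	long total_ifree = 0;

	assert(ncg > 0);
	for (cg = 0; cg < ncg; cg++)
		total_ifree += cs[cg].cs_nifree;
	avgifree = (int)(total_ifree / ncg);	/* assumed definition */

	for (cg = 0; cg < ncg; cg++)
		if (cs[cg].cs_ndir < minndir &&
		    (!need_blocks || cs[cg].cs_nbfree > 0) &&
		    cs[cg].cs_nifree >= avgifree) {
			mincg = cg;
			minndir = cs[cg].cs_ndir;
		}
	return mincg;
}
```

Feeding it a table where the emptiest group has plenty of inodes but no
free blocks shows the two loops disagreeing, and a table where every
group has cs_nbfree == 0 shows the need_blocks variant dropping back to
group 0, which is the degenerate case I'm worried about above.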
Thoughts?
- Bill