Subject: Re: raidframe consumes cpu like a terminally addicted
To: Greg Oster <oster@cs.usask.ca>
From: Matthias Buelow <mkb@mukappabeta.de>
List: netbsd-users
Date: 05/05/2001 05:06:46
Greg Oster writes:

>> >>   | cylinders: 139970

>Basically pick a tracks/cylinder value larger than 1 (8 or larger is probably 
>fine.)  Even 32/8/133970 is probably better than the current 256/1/133970.
>64/32/16746 might be even more reasonable.

>> >> # sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
>> >> 32 1 1 5
>> >That's only 16K per component in this case, which is probably too small 
>> >for the best performance... 32K per component may perform better.

I have improved all of the things mentioned (bumped up sectPerSU to 64,
changed the geometry to 64/32/17496 (yes, you made a quoting mistake,
Greg :)), and rebuilt the filesystem.
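
For the record, the arithmetic behind those numbers can be sketched
roughly as below.  This is only an illustration: the function names are
made up for this example, and the 512-byte sector size is an assumption
(it matches the "32 sectors = 16K" figure quoted above).

```python
SECTOR_BYTES = 512  # assumed sector size; consistent with 32 sectors = 16K


def cylinders_for(total_sectors, sectors_per_track, tracks_per_cylinder):
    """Whole cylinders that fit into total_sectors for a given geometry."""
    return total_sectors // (sectors_per_track * tracks_per_cylinder)


def stripe_unit_kib(sect_per_su):
    """Size of one stripe unit per component, in KiB."""
    return sect_per_su * SECTOR_BYTES // 1024


# Old geometry: 256 sectors/track, 1 track/cylinder, 139970 cylinders.
total = 256 * 1 * 139970

print(cylinders_for(total, 64, 32))            # 17496
print(stripe_unit_kib(32), stripe_unit_kib(64))  # 16 32
```

Starting from 133970 cylinders instead of 139970 gives 16746, which is
where the other figure in the quoted text comes from.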

Bonnie's results were ok, so I generated a directory with 1000 subdirs
and got a whopping 0.5s wall clock time for the "ls -l" instead of the
previously observed 30s for a directory with 833 subdirs.

However... after I untarred the pop3 spool (maybe it is already dawning
on you...), I reran the "time ls -l" test on the particular directory
and the time was basically the same 30s as with the old layout.

It also dawned on me, of course, after a while.  I am not completely
stupid.

The directory hosts pop3 spool dirs.
...
Each directory has a different owner.
...!
Resolution of uids vs. symbolic names is done via NIS.
...!!
ls -l displays symbolic names
...!!!

==> The delay has never been RAIDframe's fault (afaik) but it's
simply that ls has to do 833 yp rpc queries to get the fscking
user names!  I feel highly embarrassed!

Of course the old layout was rather deficient as well, and I'm quite
happy about this optimizing side effect of my originally false assumption
(in the course of this discussion).  The RAID is measurably performing
better now with the new values and I'm glad that the bug is already
fixed in -current.  I have to apologize, though: the pessimizing
behaviour I assumed to be RAIDframe's fault was in fact caused by
the RPC network delays of doing YP lookups.

So, to sum it up, next time someone moans "ls -l in a dir with ~1000
entries takes 30s to complete!" the first thing you ought to ask is:
do the entries each have a different owner, and are the user names
served via NIS?
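
A rough way to check this without guessing is sketched below.  It
counts the distinct owners in a directory and times one name lookup per
uid, which is approximately the work ls -l has to do on top of the
stat calls.  The helper names are invented for this example; the
resolver is passed in as a parameter, so on a Unix box you would hand
it pwd.getpwuid, which goes through whatever name service is
configured.

```python
import os
import time


def distinct_owner_uids(path):
    """Return the set of numeric owner uids of the entries directly in path."""
    uids = set()
    with os.scandir(path) as entries:
        for entry in entries:
            uids.add(entry.stat(follow_symlinks=False).st_uid)
    return uids


def time_name_lookups(uids, resolve):
    """Call resolve(uid) once per uid and return the elapsed seconds."""
    start = time.perf_counter()
    for uid in uids:
        try:
            resolve(uid)
        except KeyError:
            pass  # no passwd entry for this uid; ls would show the number
    return time.perf_counter() - start


# Example use (the pwd module is Unix-only):
#   import pwd
#   uids = distinct_owner_uids(".")
#   print(len(uids), "owners,", time_name_lookups(uids, pwd.getpwuid), "s")
```

If the lookup time accounts for most of the wall clock time of ls -l,
the disks are not the problem.  Where ls supports -n (numeric uids and
gids), comparing "time ls -l" against "time ls -ln" should show the
same thing.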

Thanks again. :)

--mkb