Subject: Re: raidframe consumes cpu like a terminally addicted
To: Matthias Buelow <mkb@mukappabeta.de>
From: Robert Elz <kre@munnari.OZ.AU>
List: netbsd-users
Date: 04/30/2001 20:23:19
    Date:        Mon, 30 Apr 2001 00:46:02 +0200
    From:        Matthias Buelow <mkb@mukappabeta.de>
    Message-ID:  <20010430004602.A1934@altair.mayn.de>

Greg, I'm not sure if you read netbsd-users, so I have added you
explicitly - I think you are the one person who knows enough about
raidframe to answer this quickly...

  | and that's the label of the raid5 set on top of them:
  | 
  | # /dev/rraid1d:

  | bytes/sector: 512
  | sectors/track: 256
  | tracks/cylinder: 1
  | sectors/cylinder: 256
  | cylinders: 139970

Yes, that's certainly not a good layout.   If you don't have the kernel
patch mentioned on the list, then allocating new blocks is (sometimes)
likely to be very slow (CPU intensive).

You'll also most likely find that you're wasting more filesys space in
overheads than you would really like - I'll bet that when you newfs'd
that filesys it was printing "alternate superblock numbers" until you
thought they would never stop...   That's because this layout will
cause way too many cylinder groups (with all their headers, etc) with
way too few blocks in them.
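
Just to put rough numbers on that (a back-of-the-envelope sketch only,
and it assumes the old newfs default of 16 cylinders per cylinder group,
which may not be what was actually used here):

    #include <stdio.h>

    int
    main(void)
    {
        /* Geometry from the disklabel quoted above. */
        long bytes_per_sector = 512;
        long sectors_per_cyl = 256;
        long cylinders = 139970;
        long cyls_per_group = 16;       /* assumed old newfs default */
        long bytes_per_cyl, ngroups;

        bytes_per_cyl = bytes_per_sector * sectors_per_cyl;    /* 128 KB */
        ngroups = cylinders / cyls_per_group;                   /* ~8748 */

        printf("%ld bytes per cylinder, roughly %ld cylinder groups\n",
            bytes_per_cyl, ngroups);
        return 0;
    }

With only 128KB per "cylinder", even grouping 16 of them gives you
thousands of ~2MB cylinder groups on a filesys this size, each carrying
its own superblock copy and allocation bookkeeping - hence the wasted
space and the endless list of alternate superblocks.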

But I doubt this is the cause of your ls -l problems.

  | During the test (the ls -l on a directory with ~1000 entries) I have
  | looked on disk i/o with systat iostat and there was almost nothing
  | (most of it was in the buffer cache anyways) so insufficient bandwidth
  | shouldn't be the problem in that case, imho.

I wonder if perhaps raid5 is requiring that the parity be recomputed
(checked) every time the blocks are accessed.   Most likely the 1000
entries will be overflowing the vnode cache, meaning each stat() will
require the inode to be fetched from the buffer cache (and of course,
certainly the first time they're referenced).   1000 inodes from one directory
isn't much of a buffer cache load though, so quite likely all the inode
blocks are in memory - hence little or no actual I/O.
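
For reference, the access pattern that produces those ~1000 inode
fetches is just the usual readdir/stat loop; a simplified sketch of the
relevant part of an "ls -l" (not the actual ls source):

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <dirent.h>
    #include <stdio.h>

    int
    main(int argc, char **argv)
    {
        const char *dir = argc > 1 ? argv[1] : ".";
        char path[1024];
        struct dirent *de;
        struct stat st;
        DIR *d;

        if ((d = opendir(dir)) == NULL)
            return 1;
        while ((de = readdir(d)) != NULL) {
            /* one stat()-family call (one inode fetch) per entry */
            snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
            if (lstat(path, &st) == 0)
                printf("%10lld %s\n",
                    (long long)st.st_size, de->d_name);
        }
        closedir(d);
        return 0;
    }

Each of those lstat()s has to get at the inode, which (as above) is
most likely already sitting in the buffer cache.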

But if doing a read from the buffer cache requires the raid code to
validate the raid5 parity each time, then there's likely to be a lot of
CPU overhead involved there (1000 times 3 times 2K word reads (ie, 6M RAM
accesses), plus the computations).
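
Just to show where the cycles would go if the read path really did
re-verify parity on every access (purely a sketch of that hypothesis -
whether RAIDframe's buffer-cache read path does anything of the sort is
exactly what Greg can say):

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Hypothetical RAID5 parity verification for one stripe unit:
     * XOR the corresponding words of every column (data + parity);
     * the result must be all zeroes.  With 3 columns of 2K words
     * each that's ~6K word reads per check, so ~1000 checks means
     * millions of memory accesses plus the XORs - all CPU, no disk.
     */
    int
    stripe_parity_ok(uint32_t *const col[], int ncols, size_t nwords)
    {
        uint32_t x;
        size_t w;
        int c;

        for (w = 0; w < nwords; w++) {
            x = 0;
            for (c = 0; c < ncols; c++)
                x ^= col[c][w];
            if (x != 0)
                return 0;       /* parity mismatch */
        }
        return 1;
    }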

However, I am only speculating; Greg will know if anything like that might
possibly be causing high CPU usage while doing an "ls -l" on a raid5
directory containing lots of files (which doesn't happen doing a similar
access on a non-raid filesys).

kre