Subject: Re: comparing raid-like filesystems
To: Antti Kantee, Jukka Marin <jmarin@pyy.jmp.fi>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: current-users
Date: 02/01/2003 08:05:23
On Sat, Feb 01, 2003 at 11:08:32AM +0200, Antti Kantee wrote:

 > > I don't know why, because when the RAID is 100% busy, the CPU load is
 > > almost zero and the disk load is 25% or so.  It's as if RAIDframe had
 > > some usleep() calls in the code to make things go slower...
 > 
 > What's your stripe unit size?  I was also suffering from inexplicable
 > slowness until I dropped the stripe unit size to 16 sectors.  Before
 > that my RAID5 gave something like 5MB/s write speeds; now it's giving
 > more than 25MB/s.

Right, this is a deficiency in how we handle layered disk I/O.  Currently
we are limited to MAXPHYS (64k) per I/O for each "disk".  If this "disk"
is a RAID volume, then that 64k must be divided up among the underlying
components, so each component sees at most 64k / width per transfer.
This has two problems:

	* Since you can't write an entire stripe in one I/O, RAID5
	  has to perform extra I/O to update the parity.

	* You end up issuing small transfers to the underlying component
	  disks.

Reducing your stripe unit size will mitigate the former.  We need
infrastructure changes in the kernel to fix the latter.
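
To make that concrete with a hypothetical geometry (the numbers below are
purely for illustration): take a 5-component RAID5 set, i.e. 4 data
columns plus parity, with 512-byte sectors.

	stripe unit = 64 sectors = 32k -> full data stripe = 4 x 32k = 128k
	  A 64k write can never cover a whole stripe, so the parity has to
	  be updated with read-modify-write cycles.

	stripe unit = 16 sectors = 8k -> full data stripe = 4 x 8k = 32k
	  A 64k write covers two full stripes, so the parity can be
	  computed entirely from the data being written.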
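
If you want to experiment with a smaller stripe unit, the knob is the
sectPerSU field in the RAIDframe configuration file, roughly like this
(an untested sketch; the device names and column count are placeholders):

	START array
	# numRow numCol numSpare
	1 5 0

	START disks
	/dev/sd0e
	/dev/sd1e
	/dev/sd2e
	/dev/sd3e
	/dev/sd4e

	START layout
	# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
	16 1 1 5

	START queue
	fifo 100

Then reconfigure with raidctl(8); note that changing the layout means
recreating the set and restoring the data.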

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>