Subject: Re: Low AAC performance but only when tested through the file system
To: Olaf Seibert <rhialto@polderland.nl>
From: Jason Thorpe <thorpej@wasabisystems.com>
List: port-i386
Date: 12/02/2003 09:06:20


On Dec 2, 2003, at 7:45 AM, Olaf Seibert wrote:

> Isn't that simply always the case? I'm not sure how write size and
> stripe size influence this unless the RAID controller is exceedingly
> stupid and re-creates the parity for the whole stripe if only a single
> sector of it changes. And I've got the write cache turned on so it
> should delay writing a bit until it has an optimal quantity to write
> anyway.

To review: New Parity = Old Parity ^ Old Data ^ New Data.  This holds 
for any block (and its corresponding parity block) within a stripe 
chunk.

That is the general case.  If the entire stripe happens to be in the 
stripe cache, then obviously it can avoid the read, but for each block 
within the I/O, it must do this computation.
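
Roughly, in C, that per-block small-write update looks something like
this (a sketch only -- the block size and names are made up, and a real
controller does this in its XOR engine on the IOP, not in a C loop):

    /*
     * Small-write ("read-modify-write") parity update for one block.
     * parity[] holds Old Parity on entry and New Parity on return.
     * BLOCK_SIZE is just a stand-in for the array's block size.
     */
    #include <stddef.h>
    #include <stdint.h>

    #define BLOCK_SIZE	512

    static void
    small_write_parity(uint8_t *parity, const uint8_t *old_data,
        const uint8_t *new_data)
    {
    	size_t i;

    	for (i = 0; i < BLOCK_SIZE; i++)
    		parity[i] ^= old_data[i] ^ new_data[i];
    }

The point is that the controller has to read Old Data and Old Parity
back from the disks before it can compute the new parity -- that is the
r/m/w cycle.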

However, if the block starts on a stripe boundary and the I/O size is 
an even multiple of the stripe size, then for each stripe contained in 
the I/O, you can compute parity like this: New Parity = New Data0 ^ New 
Data1 ^ ... ^ New DataN.  Because you don't have to do this 
block-by-block, but instead can do it chunk-by-chunk, the XOR engines 
in the IOPs on the cards can work more efficiently.
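
By contrast, a full-stripe write needs no reads at all.  Another rough
sketch, again with made-up names (ndata and chunk_size just stand in
for the array's geometry):

    /*
     * Full-stripe parity generation: when the write covers every data
     * chunk in the stripe, parity is simply the XOR of all the new
     * data chunks; nothing has to be read back from the disks.
     */
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    static void
    full_stripe_parity(uint8_t *parity, const uint8_t *const *new_data,
        size_t ndata, size_t chunk_size)
    {
    	size_t d, i;

    	memset(parity, 0, chunk_size);
    	for (d = 0; d < ndata; d++)
    		for (i = 0; i < chunk_size; i++)
    			parity[i] ^= new_data[d][i];
    }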

> Of course bonnie++ does its tests through the file system, so even its
> sequential writes are not necessarily sequential on the disk. But since
> the partition was almost empty, most of it should really be sequential.

Sure, it might be sequential, but the timing of the writes being issued 
can also make a difference (e.g. some cards will stall a write of a 
chunk within a stripe until the rest of the stripe is written, bounded 
by a timer).

Also remember that a sequential write of your data through the file 
system causes bitmaps, etc. to be updated as you write, and those 
updates may cause r/m/w cycles at the RAID controller as well.

I suggest you try a test with LFS to see if that changes anything.

> I plan to do so, indeed. I am sure there will be a performance
> difference. But I can't believe it would make a difference between 4
> and (say) 40 MB/s. There must be more to it than that.

I can believe it.  The fact that you can get 40MB/s using plain RawIO 
tells me there is nothing wrong with the card.  It seems obvious that 
the problem is the I/O pattern that the file system is generating.

         -- Jason R. Thorpe <thorpej@wasabisystems.com>

