
Re: RAIDframe performance (RAID-5)




On 28 Aug 2008, at 21:09, Greg Oster wrote:

I'm using 16384 and 2048 (it's the raid0g partition below that I'm
testing with).

-bash-3.2# disklabel raid0 | grep Cyl
a:   1048576        63     4.2BSD      0     0     0  # (Cyl.      0*-   1024*)
b:   4194304   1048639       swap                     # (Cyl.   1024*-   5120*)
d: 327679744         0     unused      0     0        # (Cyl.      0 - 319999*)
e:  16777216   5242943     4.2BSD      0     0     0  # (Cyl.   5120*-  21504*)
f:   4194304  22020159     4.2BSD      0     0     0  # (Cyl.  21504*-  25600*)
g: 301465281  26214463     4.2BSD   2048 16384     0  # (Cyl.  25600*- 319999*)
              ^^^^^^^^

That offset is not a multiple of your stripe size (128 blocks)...

Umm.

That means that all filesystem blocks won't be stripe-aligned, and
that's going to REALLY hurt you for write performance... (you'll
almost always be doing "small writes", as even a 64K write will get
split over two different stripes :( )

This is somewhat old now, but here is a benchmark I did a few years
back with a 64K block and an 8K frag size on a RAIDframe RAID 5 set of 5 disks:

   -------Sequential Output-------- ---Sequential Input-- --Random--
   -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
1500 86487 76.4 108263 70.5 10128 8.0 90153 74.4 141263 44.4 356.7 4.3

I don't have a copy of the disklabel, but I suspect the filesystem
began at block 0 of the RAID set...  But it is possible to get good
write speeds on RAID 5 with a bit of tuning :)
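
The misalignment above is indeed easy to confirm with sh arithmetic (26214463 is the raid0g offset from the label; a non-zero remainder means the partition start misses the stripe boundary):

-bash-3.2# echo $((26214463 % 128))
63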

OK, assume that I basically always partition my disks (and RAID sets) the same way for convenience, and that this means starting at offset 63 because of PC MBRs and bootability.

Then I read what you're saying as "add ($stripesize - 63) extra blocks at the end of your 'a' partition, before all the others". That way I ought to get all the remaining partitions stripe-aligned.
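
In sh arithmetic (assuming a stripe size of 128 blocks and the usual 63-block start):

-bash-3.2# stripe=128; start=63
-bash-3.2# echo $(( stripe - start % stripe ))        # extra blocks to append to 'a'
65
-bash-3.2# echo $(( (start + 1048576 + 65) % 128 ))   # offset of 'b' after padding
0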

Will try that ASAP.

With this disklabel fragment (and 26214528 is a multiple of the stripe size of 128), I get:

-bash-3.2# disklabel raid0 | grep g:
g: 301465216  26214528     4.2BSD   2048 16384     0  # (Cyl.  25600*- 319999*)
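
And the arithmetic check from above now comes out clean:

-bash-3.2# echo $((26214528 % 128))
0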

Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                   -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pear          300M 11804   5 11790   4  9011   2 91340  64 105476  18 548.5   2

Better, but not much. Now with a block size of 64K and a fragment size of 8K:
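
(The filesystem was recreated with something along these lines; the exact newfs(8) invocation is my reconstruction, with -b as the block size and -f as the fragment size in bytes:)

-bash-3.2# newfs -b 65536 -f 8192 /dev/rraid0g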

Version 1.03       ------Sequential Output------ --Sequential Input- --Random-
                   -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine       Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pear          300M 16012   7 15333   5 12643   4 104565 74 106378  15 443.4   2

Again, an improvement, but not really good.

Just to understand this better: you're arguing that the partition should be aligned to a stripe-size boundary relative to the beginning of the RAID set. But the RAID set itself (in my case) sits at a 63-block offset on the physical disk, so it is not itself aligned to the 128-block stripe size:

#        size    offset     fstype [fsize bsize cpg/sgs]
a:   1048576        63     4.2BSD      0     0     0  # (Cyl.      0*-   1040*)
b:   1048576   1048639       swap                     # (Cyl.   1040*-   2080*)
c: 312581745        63     unused      0     0        # (Cyl.      0*- 310100)
d: 312581808         0     unused      0     0        # (Cyl.      0 - 310100)
e:  16777216   2097215     4.2BSD      0     0     0  # (Cyl.   2080*-  18724*)
f:   4194304  18874431     4.2BSD      0     0     0  # (Cyl.  18724*-  22885*)
g: 163840000  23068735       RAID                     # (Cyl.  22885*- 185425*)
h:   2097152 186908735     4.2BSD      0     0     0  # (Cyl. 185425*- 187505*)
i: 123575856 189005952       RAID                     # (Cyl. 187505*- 310100)

Isn't the right criterion that the partition (raid0g in my case) should be stripe-aligned relative to the physical beginning of the actual disk, rather than relative to the RAID set?

I.e., in my case, with two 63-block offsets, the raid0g partition should be nudged 2 blocks towards the end of the disk to achieve "alignment". But this is just me guessing.
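
A back-of-the-envelope version of that guess, using the offsets from the labels above and ignoring whatever RAIDframe may reserve for its own label at the front of each component (an assumption on my part):

-bash-3.2# comp=23068735; part=26214463; stripe=128
-bash-3.2# echo $(( (comp + part) % stripe ))            # skew from the physical disk start
126
-bash-3.2# echo $(( stripe - (comp + part) % stripe ))   # blocks to nudge raid0g
2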

So I did that (actually, I moved all the wdNg partitions that are the components of the RAID set), but as it will take about 80 minutes to recompute the parity for the new RAID set, I won't have any numbers for that theory until tomorrow.
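
(The parity rewrite itself is nothing fancier than raidctl(8)'s initialize-parity mode, assuming the set is still configured as raid0:)

-bash-3.2# raidctl -iv raid0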

Johan

