Subject: Re: tuning hardware RAID arrays (was: Bad NFS performance)
To: Dominik Westner <westner@absurd.dnsalias.org>
From: Greg A. Woods <woods@weird.com>
List: tech-perform
Date: 05/10/2001 03:07:59
[ On Wednesday, May 9, 2001 at 20:55:30 (+0200), Manuel Bouyer wrote: ]
> Subject: Re: Bad NFS performance
>
> Maybe you should try to run more extensive tests (eventually by writing to
> the raw device with different block sizes ) ? Maybe the write buffer is
> disabled ?

I've found that the pkgsrc/benchmarks/postmark filesystem benchmark can
do a very good overall throughput test on a filesystem, regardless of
whether it's a RAID array, NFS mount, MFS mount, etc.

You do have to tune it to make sure you're really testing the
filesystem.  On my development system with 192MB of RAM, and with my
RAID arrays having either 24MB or 32MB of cache RAM right now, I've
found that the following parameters give them a good workout:

# printf 'set bias read 5\nset bias create 5\nset size 10240 20480\nset transactions 100000\nset subdirectories 10\nset number 10000\nshow\nrun\n' | postmark 

PostMark v1.13 : 5/18/00
pm>pm>pm>pm>pm>pm>pm>
Current configuration is:
Transactions: 100000
Files range between 10.00 kilobytes and 20.00 kilobytes in size
Random number generator seed is 42
The base number of files is 10000
Working directory: /mnt2
10 subdirectories will be used
Block sizes are: read=512 bytes, write=512 bytes
Biases are: read/append=5, create/delete=5
Using Unix buffered file I/O
Report format is verbose.
pm>
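
When varying these settings between runs, it may be easier to generate
the command script than to retype the printf string.  The following is
only a minimal Python sketch: the helper name postmark_script and its
keyword arguments are made up for illustration, and it emits nothing
but the PostMark commands already used in the pipeline above.

```python
def postmark_script(bias_read=5, bias_create=5, size=(10240, 20480),
                    transactions=100000, subdirectories=10, number=10000):
    """Return the text to feed to postmark on stdin (hypothetical helper)."""
    lines = [
        f"set bias read {bias_read}",
        f"set bias create {bias_create}",
        f"set size {size[0]} {size[1]}",      # min and max file size in bytes
        f"set transactions {transactions}",
        f"set subdirectories {subdirectories}",
        f"set number {number}",               # base number of files
        "show",                               # echo the configuration
        "run",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    import sys
    sys.stdout.write(postmark_script())
```

Saved under some name such as pmscript.py (hypothetical), running
"python3 pmscript.py | postmark" should feed postmark the same commands
as the printf pipeline, with the defaults matching the values shown.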

Those were the only parameters that seemed to keep the disk as busy as
possible (according to 'iostat' and/or 'systat vmstat') while at the
same time mirroring the usage pattern I expect in production.  I tried
varying the read/write biases to more accurately match what I'll be
using the filesystems for, but with most other settings postmark would
end up in an endless loop consuming CPU time, or would sometimes crash.
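
As a rough sanity check on why these settings give the disks a workout,
the initial file set can be compared with the caches involved.  The
sketch below assumes file sizes are spread evenly over the configured
range, so that the mean is the midpoint; that distribution is an
assumption, not something the PostMark output states.  The RAM and
cache sizes are the ones mentioned earlier, treated as binary megabytes.

```python
MIB = 1024 * 1024

def initial_data_set_bytes(num_files, min_size, max_size):
    # Assumption: sizes are uniform over [min_size, max_size],
    # so the mean file size is the midpoint of the range.
    return num_files * (min_size + max_size) // 2

data_set = initial_data_set_bytes(10000, 10240, 20480)
print(f"initial data set: {data_set / MIB:.1f} MiB")
for name, mib in (("array cache", 24), ("array cache", 32), ("host RAM", 192)):
    print(f"  {data_set / (mib * MIB):.2f}x the {mib} MiB {name}")
```

Under that assumption the base set of 10000 files comes to roughly
146 MiB: several times larger than either array cache, though somewhat
smaller than the host's RAM.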


Just for fun, here are my results from four runs done with those
parameters on the RAID array with the 32MB cache, varying the stripe
(chunk) size and turning the array's read-ahead cache on and off.  The
RAID array probes as:

ahc0 at pci1 dev 4 function 0
ahc0: interrupting at irq 14
ahc0: aic7880 Wide Channel A, SCSI Id=7, 16/255 SCBs
scsibus0 at ahc0 channel 0: 16 targets, 8 luns per target
[[....]]
ahc0: target 14 using 16bit transfers
ahc0: target 14 synchronous at 10.0MHz, offset = 0x8
ahc0: target 14 using tagged queuing
sd4 at scsibus0 target 14 lun 0: <CMD TECH, CRD-5500, C1-A> SCSI2 0/direct fixed
sd4: 12285 MB, 24570 cyl, 16 head, 64 sec, 512 bytes/sect x 25159680 sectors

It currently contains five 8-bit disk channels with one Seagate
ST15230N drive each: one drive is a spare, and the other four are
configured as a single RAID Level 5 logical disk (which we see above
as sd4).

I finally copied my /cvs and /work filesystems onto these arrays today.
Part of the CVS stuff came via NFS from my aging Sparc-2.  It was fun
watching the array use its cache.  The host bus activity light would
flicker almost constantly as small writes were done, but the disk lights
would only blast full on for short periods of time every little while
and otherwise remain inactive.

OK, here are the postmark results:

with 32 sector chunk-size, read-ahead turned on:

Time:
        3480 seconds total
        3341 seconds of transactions (29 per second)

Files:
        59965 created (17 per second)
                Creation alone: 10000 files (95 per second)
                Mixed with transactions: 49965 files (14 per second)
        49866 read (14 per second)
        49830 appended (14 per second)
        59965 deleted (17 per second)
                Deletion alone: 9930 files (292 per second)
                Mixed with transactions: 50035 files (14 per second)

Data:
        807.52 megabytes read (232.05 kilobytes per second)
        974.76 megabytes written (280.10 kilobytes per second)


with 32 sector chunk-size, read-ahead turned off:

Time:
        3526 seconds total
        3363 seconds of transactions (29 per second)

Files:
        59965 created (17 per second)
                Creation alone: 10000 files (80 per second)
                Mixed with transactions: 49965 files (14 per second)
        49866 read (14 per second)
        49830 appended (14 per second)
        59965 deleted (17 per second)
                Deletion alone: 9930 files (254 per second)
                Mixed with transactions: 50035 files (14 per second)

Data:
        807.52 megabytes read (229.02 kilobytes per second)
        974.76 megabytes written (276.45 kilobytes per second)


with 64 sector chunk-size, read-ahead turned on:

Time:
        3609 seconds total
        3490 seconds of transactions (28 per second)

Files:
        59965 created (16 per second)
                Creation alone: 10000 files (107 per second)
                Mixed with transactions: 49965 files (14 per second)
        49866 read (14 per second)
        49830 appended (14 per second)
        59965 deleted (16 per second)
                Deletion alone: 9930 files (381 per second)
                Mixed with transactions: 50035 files (14 per second)

Data:
        807.52 megabytes read (223.75 kilobytes per second)
        974.76 megabytes written (270.09 kilobytes per second)

with 64 sector chunk-size, read-ahead turned off:

Time:
        3506 seconds total
        3357 seconds of transactions (29 per second)

Files:
        59965 created (17 per second)
                Creation alone: 10000 files (84 per second)
                Mixed with transactions: 49965 files (14 per second)
        49866 read (14 per second)
        49830 appended (14 per second)
        59965 deleted (17 per second)
                Deletion alone: 9930 files (320 per second)
                Mixed with transactions: 50035 files (14 per second)

Data:
        807.52 megabytes read (230.33 kilobytes per second)
        974.76 megabytes written (278.03 kilobytes per second)
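
To put the four runs side by side, the sketch below recomputes from
the totals reported above how far each configuration is from the
fastest one.  The labels are shorthand introduced here; the numbers
are copied from the reports.

```python
# (seconds total, transactions per second), copied from the four reports.
runs = {
    "32-sector, read-ahead on": (3480, 29),
    "32-sector, read-ahead off": (3526, 29),
    "64-sector, read-ahead on": (3609, 28),
    "64-sector, read-ahead off": (3506, 29),
}

fastest = min(total for total, _tps in runs.values())
# Percentage by which each run's total time exceeds the fastest run.
slower_pct = {label: 100.0 * (total - fastest) / fastest
              for label, (total, _tps) in runs.items()}

for label, (total, tps) in runs.items():
    print(f"{label:<26} {total:5d} s  {tps:3d}/s  +{slower_pct[label]:.1f}%")
```

On these single runs all four configurations finish within about 4% of
each other in total time, so for this workload neither the chunk size
nor the read-ahead setting appears to make a large difference.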



For comparison purposes, here's a run on a single-spindle UltraFAST &
WIDE disk on the same Adaptec controller:

ahc0: target 4 using 16bit transfers
ahc0: target 4 synchronous at 20.0MHz, offset = 0x8
ahc0: target 4 using tagged queuing
sd3 at scsibus0 target 4 lun 0: <FUJITSU, MAB3045SC, 0109> SCSI2 0/direct fixed
sd3: 4343 MB, 8491 cyl, 5 head, 209 sec, 512 bytes/sect x 8895370 sectors

Time:
        6488 seconds total
        5838 seconds of transactions (17 per second)

Files:
        59965 created (9 per second)
                Creation alone: 10000 files (30 per second)
                Mixed with transactions: 49965 files (8 per second)
        49866 read (8 per second)
        49830 appended (8 per second)
        59965 deleted (9 per second)
                Deletion alone: 9930 files (30 per second)
                Mixed with transactions: 50035 files (8 per second)

Data:
        807.52 megabytes read (124.46 kilobytes per second)
        974.76 megabytes written (150.24 kilobytes per second)
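
Since every run moved the same amount of data (807.52 megabytes read,
974.76 megabytes written), the total times can be compared directly.
A minimal sketch of that arithmetic, with variable names chosen here
for illustration and the figures copied from the reports above:

```python
# Seconds total for the same PostMark workload, from the reports above.
raid5_fastest = 3480   # 32 sector chunk-size, read-ahead on
raid5_slowest = 3609   # 64 sector chunk-size, read-ahead on
single_disk = 6488     # sd3, the Fujitsu MAB3045SC

def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate finished the same workload."""
    return baseline_seconds / candidate_seconds

low = speedup(single_disk, raid5_slowest)
high = speedup(single_disk, raid5_fastest)
print(f"array vs. single disk: {low:.2f}x to {high:.2f}x faster")
```

That works out to roughly 1.8x to 1.9x, i.e. the array completes the
workload in a little over half the time the single disk needs.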




-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>     <woods@robohack.ca>
Planix, Inc. <woods@planix.com>;   Secrets of the Weird <woods@weird.com>