Subject: is something wrong with either ccd(4) or isp(4), maybe only on alpha?
To: NetBSD Kernel Technical Discussion List <tech-kern@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 10/25/2004 17:24:04
I've just upgraded my alpha with all new(-to-me) 9GB drives.  Since four
of the drives were now connected to a Qlogic 1020UW controller I wanted
to make use of CCD for a fast striped filesystem for my objdir/destdir
filesystem.

However there's something weird going on with read rates on the CCD
driver on my alpha.  Averaged over all the tests I've done, the read rate
is about half the write rate (never better than 2/3, and only better
than 1/2 when the read rate is lower than expected), and thus it's
about 1/4 of what I expected it to be given my experience with CCD on
i386.

For example:

2x9 GB disks, without write cache, on alpha's isp0:
WRITE: 1744830464 bytes transferred in 179.828 secs (9702774 bytes/sec)
READ:  1749024768 bytes transferred in 296.009 secs (5908687 bytes/sec)

2x9 GB disks, with write cache, on alpha's isp0:
WRITE: 1946157056 bytes transferred in 106.866 secs (18211190 bytes/sec)
READ:  1950351360 bytes transferred in 322.141 secs ( 6054340 bytes/sec)


However on i386 a 2-drive CCD works much better, and much more as
expected:

2x9 GB disks, without write cache, on i386's ahc0:
WRITE: 1098907648 bytes transferred in 72.372 secs (15184154 bytes/sec)
READ:  1103101952 bytes transferred in 54.479 secs (20248204 bytes/sec)

2x9 GB disks, with write cache, on i386's ahc0:
WRITE: 1157627904 bytes transferred in 80.549 secs (14371722 bytes/sec)
READ:  1161822208 bytes transferred in 56.889 secs (20422616 bytes/sec)
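
To put the read/write comparison in one place, here is a small sketch that just divides the read rate by the write rate for the four 2-drive runs above.  The numbers are copied from the dd output; the labels and the helper name read_write_ratio are only illustrative.

```python
# Rates in bytes/sec, copied from the dd output above as (write, read).
rates = {
    "alpha isp0, no write cache": (9702774, 5908687),
    "alpha isp0, write cache": (18211190, 6054340),
    "i386 ahc0, no write cache": (15184154, 20248204),
    "i386 ahc0, write cache": (14371722, 20422616),
}


def read_write_ratio(write_rate, read_rate):
    """Return read throughput as a fraction of write throughput."""
    return read_rate / write_rate


ratios = {name: read_write_ratio(w, r) for name, (w, r) in rates.items()}

for name, ratio in ratios.items():
    print(f"{name}: read/write = {ratio:.2f}")
```

This gives roughly 0.61 and 0.33 for the alpha and 1.33 and 1.42 for the i386, so with write caching the alpha's ratio is about a quarter of the i386's.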

This is writing and reading a file on a filesystem using /dev/zero and
/dev/null as source and sink with "dd bs=4m".  The disks are somewhat
different (DS-RZ1DD-VW containing Compaq BD009635C3 relabeled Fujitsu
MAJ3091MC with 8MB cache on the alpha, and Seagate ST39103LC with
probably only 1MB cache and no more than 4MB cache on the i386, which
explains the significant gains of write-back caching on the alpha), and
the host adapter is an ahc(4) on the i386, but I've used the same
filesystem parameters, etc., and a CCD interleave of 64 sectors on both.
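
For anyone who wants to repeat the measurement without dd, the same kind of test can be roughly sketched in Python.  This is only an approximation of the dd runs above, not what was actually used: the function names are made up, and unlike dd the write side calls fsync before stopping the clock, so the write figures would not be directly comparable.

```python
import os
import time


def timed_write(path, total_bytes, block_size):
    """Write total_bytes of zeros in block_size chunks; return bytes/sec."""
    block = bytes(block_size)  # a buffer of zeros, like reading /dev/zero
    written = 0
    start = time.monotonic()
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            n = min(block_size, total_bytes - written)
            written += f.write(block[:n])
        os.fsync(f.fileno())  # dd does not do this by default
    elapsed = time.monotonic() - start
    return written / max(elapsed, 1e-9)


def timed_read(path, block_size):
    """Read the file in block_size chunks, discarding data; return bytes/sec."""
    total = 0
    start = time.monotonic()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.monotonic() - start
    return total / max(elapsed, 1e-9)
```

A call such as timed_write("/objdir/ddtest", 1 << 30, 4 << 20) followed by timed_read("/objdir/ddtest", 4 << 20) would correspond to a 1GB file with 4m buffers (the path is hypothetical).  The file should be larger than RAM, otherwise the read is likely to be satisfied from the buffer cache instead of the disks.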

Note that with dd using bs=4m and with an interleave of 128 the isp(4)
driver kicks out a few "adapter resource shortage" warnings on the console
initially when writing (but they stop after about 150MB have been
written).  However an interleave of 256 results in an almost continuous
stream of warnings.  An almost continuous stream also results when a
single-disk filesystem is written to directly: even with bs=2m a
regular flow of warnings ensues, and writes have to be reduced to 8k to
eliminate the warnings.

For the single disk, first with 4m buffers, which cause a regular stream
of "adapter resource shortage" warnings:

WRITE: 1245708288 bytes transferred in 117.766 secs (10577826 bytes/sec)
READ:  1247805440 bytes transferred in 106.334 secs (11734773 bytes/sec)

Then with 8k buffers, which are the largest power-of-2 buffer size
that does not cause "adapter resource shortage" warnings:

WRITE: 1841569792 bytes transferred in 126.973 secs (14503632 bytes/sec)
READ:  1841577984 bytes transferred in 162.716 secs (11317743 bytes/sec)

(those are both with full read/write caching enabled on the drive)


I've tried all the old tricks.  Odd interleave values (non-powers of 2)
really suck worse than the values I found ideal on i386 and sometimes
also cause even worse "adapter resource shortage" warnings (e.g. even at
ileave=63).  (I never did really believe any of that non-powers of 2
interleave stuff had any scientific basis, at least not for the current
version of CCD).
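
For reference, here is a simplified sketch of how an interleave splits one contiguous request across the components.  It assumes equal-sized components (so the stripes simply rotate) and 512-byte sectors, and it ignores whatever the filesystem and buffer cache do to a request before it reaches ccd(4).  The function names and the 64KB request size are only illustrative.

```python
SECTOR_SIZE = 512  # bytes per sector; assumed


def map_block(blk, ileave, ndisks):
    """Map a logical sector to (component index, sector on that component).

    Assumes equal-sized components, so stripes rotate round-robin.
    """
    stripe, within = divmod(blk, ileave)
    disk = stripe % ndisks
    component_blk = (stripe // ndisks) * ileave + within
    return disk, component_blk


def split_request(start_blk, nblks, ileave, ndisks):
    """Split a contiguous request into (disk, component sector, count) pieces."""
    pieces = []
    blk, remaining = start_blk, nblks
    while remaining > 0:
        disk, cblk = map_block(blk, ileave, ndisks)
        # A piece ends at the next interleave boundary or the end of the request.
        count = min(remaining, ileave - blk % ileave)
        pieces.append((disk, cblk, count))
        blk += count
        remaining -= count
    return pieces


# An illustrative 64KB request (128 sectors) on a 2-drive ccd.
request = 64 * 1024 // SECTOR_SIZE
for ileave in (64, 128):
    print(ileave, split_request(0, request, ileave, ndisks=2))
```

With ileave=64 the 64KB request becomes two 32KB pieces, one per drive; with ileave=128 it stays as a single piece on the first drive.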

The point is I can never seem to get the CCD read performance even to
equal that of a single drive, let alone exceed it, on the alpha; while
on the i386 that achievement was trivial (at least for big bulk files
with dd -- interleave tuning was only necessary to get mixed file sizes
and simultaneous mixed accesses working optimally).

Despite this I'm still more inclined to believe the problem is with
interleaving in the ccd(4) driver on the alpha than it being a problem
in the isp(4) driver.  I say this because a single drive doesn't seem to
exhibit the same problem, nor does a plain concatenation of the four
drives with ileave=0.  However I'd be happy to hear differently.

Maybe the EEROM host adapter settings are not "optimal"; however, I don't
yet have a copy of the "eeromcfg.exe" program (I do know where to get it).

In any case I wish I did have a spare 2940UW to throw into the alpha in
place of the QLogic 1020UW for a more direct comparison (the i386 one is
actually an aic7880 on the motherboard so I can't swap it either....)


Note also that a 4-drive CCD does no better.  Optimal write rates are
achieved with an interleave of 128 but that causes lots of "adapter
resource shortage" messages.  Regardless of interleave the read rate is
always way less than 1/2 the write rate.

4x9 GB disks (read/write cache) on alpha's isp0 in ccd with interleave 64:
WRITE: 1971322880 bytes transferred in 107.481 secs (18341128 bytes/sec)
READ:  1975517184 bytes transferred in 333.314 secs ( 5926895 bytes/sec)

That really sucks!

-- 
						Greg A. Woods

+1 416 218-0098                  VE3TCP            RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>          Secrets of the Weird <woods@weird.com>