current-users: Re: nfs tuninng with raid

Subject: Re: nfs tuninng with raid
To: None <current-users@netbsd.org>
From: Brian Buhrow <buhrow@lothlorien.nfbcal.org>
List: current-users
Date: 07/13/2001 03:49:19
	Hello folks.  I wanted to follow up on this problem so others might 
learn from my mistakes. :)
	It turns out that much of our recent performance problem was due to a bad
IDE cable on one of the IDE busses.  It was generating so many CRC retries
that no one could get a good write going.  We didn't notice because we had
been seeing frequent CRC errors on the same bus for months, and assumed
that the error rate hadn't changed when, in fact, an earlier cable swap had
made the problem worse by a couple of orders of magnitude.  Looking back
at previous performance, I realized we had regularly been pushing 5-7
Mbits/sec in writes to the array, as opposed to the reported 600Kbits/sec.
Swapping in a new IDE cable, again, has returned us to the previous
performance benchmark, with, incidentally, the CRC errors returning to the
original drive on the bus.  Thus, I now suspect that there is something
weird about this drive, and, if we can swap it out, we'll get even better
performance.  I really don't know what the theoretical maximum data
transfer rate is here, or even the realistic maximum, as I'm not sure how
close we are to saturating the PCI bus controller.  I've probably
shot myself in the foot by configuring the array to use consecutive drives
on consecutive busses for its stripes.  Perhaps if I had alternated busses
and controllers, I would see even better performance.
	Greg points out that I might have achieved better stripe writing
performance by using a stripe size of 32, or something large enough to
allow me to write entire stripes to one spindle.  Manuel pointed out at the
beginning of this process that there is a penalty to be paid for small
stripe widths due to the DMA programming overhead on the PCI IDE
controllers.  Using a stripe width of 1008, as opposed to the 63 I ended
up using, caused the PCI bus to wedge hard very quickly.  Consequently,
while I don't consider myself an expert on these matters, it seemed like I
had to choose my stripe width based on a number of conflicting constraints
and come up with a working compromise.
	By the way, what is the formula for figuring out exactly how to 
stripe your disks so that writes only write to one disk at a time?
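
A rough sketch of the arithmetic behind that question, not a definitive
answer.  It assumes that the stripe figures above (32, 63, 1008) are sectors
per stripe unit, that sectors are 512 bytes, that the file system starts on
a stripe boundary, and that writes arrive as aligned 8 KB blocks.  None of
these assumptions is confirmed in this thread, and the function names are
made up for illustration.

```python
SECTOR_BYTES = 512  # assumed sector size


def stripe_unit_bytes(sect_per_su):
    """Bytes stored on one disk before the layout moves to the next disk."""
    return sect_per_su * SECTOR_BYTES


def full_stripe_bytes(sect_per_su, n_disks):
    """Data bytes in one full RAID 5 stripe (one unit per stripe is parity)."""
    return stripe_unit_bytes(sect_per_su) * (n_disks - 1)


def data_units_touched(offset, length, sect_per_su):
    """Number of data stripe units a write of `length` bytes at `offset` spans.

    For writes no larger than one full stripe this equals the number of
    data disks written.  RAID 5 also updates the parity unit in each case.
    """
    su = stripe_unit_bytes(sect_per_su)
    first = offset // su
    last = (offset + length - 1) // su
    return last - first + 1


def aligned_writes_stay_on_one_unit(sect_per_su, write_bytes):
    """True if every write_bytes-aligned write fits inside one stripe unit."""
    return stripe_unit_bytes(sect_per_su) % write_bytes == 0


if __name__ == "__main__":
    write = 8192  # hypothetical write size
    for spsu in (32, 63, 1008):
        print(spsu,
              stripe_unit_bytes(spsu),
              full_stripe_bytes(spsu, 15),
              aligned_writes_stay_on_one_unit(spsu, write))
```

Under those assumptions, a stripe unit that is a whole multiple of the write
size keeps each aligned write on a single data disk (plus the parity disk),
while a unit of 63 sectors lets some 8 KB writes straddle two data disks.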

-thanks
-Brian

On Jul 9, 11:20pm, Brian Buhrow wrote:
} Subject: nfs tuninng with raid
} 	Hello folks.  I've been trying to increase the performance of the 
} box I'm using as a large RAID NFS server and have a few questions.  
} I seem to be able to serve up about 160Kbytes/sec to about 8 clients
} simultaneously for reading, and about 50Kbytes/sec for writing.  I've tried
} increasing the number of nfsd's running, from 4 to 12, and the number of
} kern.nfs.iothreads from 1 to 12.  This made things much worse.  Knocking
} the number of iothreads down to 4, while leaving the number of nfsd's
} running unchanged, made things better, but still not very fast, it seems.
} 	Running ps -lpid on the various nfsd processes shows that they're 
} spending a lot of time waiting on vnlock or uvn_fp2.  I tried increasing
} kern.maxvnodes from 6,700 to 50,000, but this seems to have had little to
} no effect.
} 	Any rules of thumb on how many iothreads for NFS are optimal, versus
} the number of nfsd's running?  Are there rules of thumb on how to tune
} vnodes and other parameters to help streamline the system?  This is
} running on an i386 box with a 1.5R kernel and 1.5 userland programs.  The
} machine has a RAID 5 array of 15 75GB IDE disks on it.  It's using an
} Intel on-board 10/100Mbps ethernet adapter with the fxp driver in
} 100Mbps/full duplex operation.
} Any suggestions/guides/things to look at would be greatly appreciated.
} -Brian
} 
} iothreads
>-- End of excerpt from Brian Buhrow