Re: Severe netbsd-6 NFS server-side performance issues

At 9:46 -0700 on 04.06.2012, Brian Buhrow wrote:
>From the description, it sounds like the amr(4) driver is
>really getting wedged somewhere and this is what's causing your problem.  The
>question is whether the driver is the problem, the firmware on the raid
>device or just the combination of the two.

You missed one question: Is the "amr0: bad status (not active; 0x040)"
message a cause or an effect of the wedging? As I said, I get the occasional

dumping to dev 19,1 offset 313501
dump 109 amr0: bad status (not active; 0x0416)
amr0: bad status (not active; 0x0412)

from another machine that has two RAID-1 pairs off a MegaRAID 320-1. The
amr(4) driver seems to get confused easily when the kernel is in a bad state.

>  The amr driver puts a number of
>transactions in flight between itself and the raid controller depending on
>what it thinks the raid controller can handle.  It sounds like what's
>happening is that, over time, the capacity of the raid controller is
>leaking away due to bugs in the firmware or some interaction between the
>firmware and the amr driver itself.

Funny, though, that even a DEBUG|DIAGNOSTIC kernel never came up with
anything but the lone message above.
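
For illustration, the leak hypothesis quoted above can be sketched as a
toy model in C. This is not the amr(4) source: struct ctlr, ctlr_submit(),
ctlr_complete() and simulate() are made-up names, and the idea that a lost
completion simply never frees its slot is an assumption about how such a
leak might look.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a driver's in-flight command window.
 * None of these names come from the real amr(4) driver. */
struct ctlr {
    int max_cmds;   /* commands the controller is believed to handle */
    int inflight;   /* commands submitted but not yet completed */
};

/* Try to submit one command; refuse when the window is full. */
static bool ctlr_submit(struct ctlr *c)
{
    if (c->inflight >= c->max_cmds)
        return false;           /* caller would have to queue and retry */
    c->inflight++;
    return true;
}

/* Completion path: gives one slot back to the window. */
static void ctlr_complete(struct ctlr *c)
{
    assert(c->inflight > 0);
    c->inflight--;
}

/* Issue `rounds` commands one after another. If lose_every is non-zero,
 * the completion of every lose_every-th command is assumed to be lost,
 * so its slot is never freed. Returns the number of free slots left. */
static int simulate(struct ctlr *c, int rounds, int lose_every)
{
    for (int i = 1; i <= rounds; i++) {
        if (!ctlr_submit(c))
            continue;           /* window full: command cannot be issued */
        if (lose_every != 0 && i % lose_every == 0)
            continue;           /* completion lost: slot leaks */
        ctlr_complete(c);
    }
    return c->max_cmds - c->inflight;
}
```

With a window of four commands and every tenth completion lost, the
model has no free slots left after forty commands, and every later
submit is refused. That would look like a wedge without any further
diagnostic output.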

[Encouraging words about porting the FreeBSD changes]

>  If nothing else, you may be able to enable some
>debugging that tells you what's going wrong and gives you an idea of how to
>fix it.  I take it you have no test raid array to work on between
>maintenance windows?

I do actually have a second 320-4X controller, but no PCI-X board to plug
it into. Plus, the card is full-height PCI, and all the machines but the
fileserver are 2U. So, no, I am not able to come up with a reasonably
similar test bed.


     The ASCII Ribbon Campaign                    Hauke Fath
()     No HTML/RTF in email            Institut für Nachrichtentechnik
/\     No Word docs in email                     TU Darmstadt
     Respect for open standards             Phone +49-6151-16-3281

