Current-Users archive
Re: Re: Re: problems with netbsd-10 and PERC controllers
Hello. Knowing that a warm reset doesn't come up clean is useful.
In looking at mfii.c, it looks like there are two possible sources of the problem. The first
is the one I mentioned earlier: interrupt handling somehow gets mangled during
operation and interrupts stop being received from the Perc controller.
The second is that the Perc controller itself gets into a weird state that causes its
firmware to stop completing requests.
I'm not sure which source to look at first, so here are some suggestions.
1. Before the problem occurs, can you capture some dmesg output showing how the mfii devices
attach and what interrupts they're using?
2. What does the output of vmstat -i look like when things are working?
3. Have you brought up the Perc's RAID configuration menu to confirm the RAID sets are healthy
and that you're not getting any disk errors which might be masked from NetBSD itself? I've
seen this sort of behavior when a disk is throwing errors; the Perc firmware is so busy dealing
with the problem disk it stops responding to the mfii(4) driver. Unfortunately, the NetBSD
driver isn't very good about reporting these kinds of errors; I'm not sure if it's a problem
with the mfii(4) driver or the firmware on the Perc itself.
Because the errors happen at random intervals after the machine boots, it's possible the issue
is a good old-fashioned failing disk.
I do realize you see the errors on two separate controllers, which is why I'm leaning
toward an interrupt issue, but it would be good to verify your disks are good.
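For the vmstat -i comparison, here is a minimal Python sketch that diffs two saved
copies of the output, one taken while things are working and one taken after the
hang. It assumes each data line ends with a total column and a rate column, and the
function names are made up for illustration. Which interrupt source belongs to the
mfii devices has to come from the dmesg attach lines.

```python
def parse_vmstat_i(text):
    """Parse saved `vmstat -i` output into {source name: total count}.

    Assumption: each data line is "<source name> <total> <rate>", where the
    source name may itself contain spaces. Lines that don't fit are skipped.
    """
    counts = {}
    for line in text.splitlines():
        parts = line.rsplit(None, 2)  # split off the last two columns
        if len(parts) != 3:
            continue
        name, total, _rate = parts
        if not total.isdigit():
            continue  # header line or anything else non-numeric
        counts[name.strip()] = int(total)
    return counts


def interrupt_deltas(before, after):
    """Return {source name: count increase} for sources present in both samples."""
    a = parse_vmstat_i(before)
    b = parse_vmstat_i(after)
    return {name: b[name] - a[name] for name in a if name in b}
```

Usage would be something like interrupt_deltas(open("before.txt").read(),
open("after.txt").read()), then looking up the source the controller attached to.
A delta of zero shows that interrupts stopped arriving from that source, but by
itself it can't tell a lost interrupt from firmware that stopped completing requests.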
Hope that helps.
-Brian