current-users: Detecting failures with logical disks

Subject: Detecting failures with logical disks
To: None <current-users@netbsd.org>
From: Martti Kuparinen <martti.kuparinen@iki.fi>
List: current-users
Date: 11/18/2005 15:58:15

Hi!

One of our servers (NetBSD/i386 3.0_RC1) had a disk failure, and after the
reboot I noticed the "degraded" keyword on ld0:


amr0 at pci8 dev 8 function 0: AMI RAID <PERC 4/Di>
amr0: interrupting at irq 11
amr0: firmware 251S, BIOS 1.07, 128MB RAM
ld0 at amr0 unit 0: RAID 1, degraded
ld0: 136 GB, 17834 cyl, 255 head, 63 sec, 512 bytes/sect x 286511104 sectors


Any ideas on how to get the RAID controller driver to report errors when they
happen? I'm willing to hack on and test the (amr and aac) drivers if I can just
get something to start with. Currently the amr driver (at least) doesn't print
anything when something bad happens...

It seems to me that the amr_thread() function (in sys/dev/pci/amr.c) is the
place where these things should be noticed, but that kernel thread is not
started if the AT_QUARTZ flag is set. Any ideas why?
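
To make it concrete, here is a minimal sketch of the kind of check I have in
mind for a periodic polling thread: remember the last known state of each
logical drive and print a message whenever it changes. This is plain userland
C, not driver code. The enum values and the names ld_state_name() and
ld_report_changes() are made up for illustration and are not taken from
amr.c, and how the current state would actually be read from the controller
is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical logical-drive states; NOT the values used by the amr driver. */
enum ld_state { LD_OFFLINE, LD_DEGRADED, LD_OPTIMAL };

/* Map a (hypothetical) state to a printable name. */
const char *
ld_state_name(enum ld_state s)
{
	switch (s) {
	case LD_OFFLINE:
		return "offline";
	case LD_DEGRADED:
		return "degraded";
	case LD_OPTIMAL:
		return "optimal";
	}
	return "unknown";
}

/*
 * Compare the last known states (prev) with freshly polled ones (cur),
 * print one line per transition to `out`, and store the new state in prev.
 * Returns the number of drives whose state changed.
 */
int
ld_report_changes(enum ld_state *prev, const enum ld_state *cur, size_t n,
    FILE *out)
{
	int changes = 0;

	assert(prev != NULL && cur != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		if (prev[i] == cur[i])
			continue;
		fprintf(out, "ld%zu: state changed: %s -> %s\n", i,
		    ld_state_name(prev[i]), ld_state_name(cur[i]));
		prev[i] = cur[i];
		changes++;
	}
	return changes;
}
```

A polling loop would call ld_report_changes() once per interval with the newly
read states, so a message shows up only on a transition rather than on every
poll.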

Martti