Subject: kern/22505: twe driver doesn't probe right with set in degraded mode.
To: None <gnats-bugs@gnats.netbsd.org>
From: None <tls@netbsd.org>
List: netbsd-bugs
Date: 08/16/2003 20:23:09
>Number:         22505
>Category:       kern
>Synopsis:       With a RAID set in degraded mode, the twe driver fails to reset the controller at boot.
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Aug 16 20:24:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Thor Lancelot Simon
>Release:        NetBSD/i386 1.6W as of 2003-08-10
>Organization:
	The NetBSD Project
>Environment:
NetBSD enola-gay 1.6W NetBSD 1.6W (ENOLA-GAY) #8: Sun Aug 10 17:07:21 EDT 2003  tls@rekusant:/usr/src/sys/arch/i386/compile/ENOLA-GAY i386
Architecture: i386
Machine: i386
>Description:
	My 3ware Escalade 6410 controller often decides after a sudden
power failure that one of the components of my RAID 10 set has failed.
It does not automatically initiate a rebuild -- it's necessary to go into
the BIOS and reassign the disk to the set in order to cause this.  Until
then, the set is in DEGRADED mode (as shown by the BIOS) but is functional
(the BIOS can use it to boot the NetBSD kernel).

Marking a good disk as failed is obviously a controller firmware bug.
However, it interacts in an extremely bad way with a bug in the NetBSD
twe driver, which fails to reset the controller while attaching:

twe0 at pci2 dev 5 function 0: 3ware Escalade
twe0: interrupting at apic 0 int 11 (irq 11)
twe0: no attention interrupt
twe0: reset failed

As a result, no logical disk (ld) devices attach, and the kernel fails to
mount its root filesystem:

boot device: <unknown>
device ld0 (0x1300) not configured

>How-To-Repeat:
Repeatedly yank the power cord out of a machine with a 6410 card and four
disks.  Eventually, the controller will decide one of the disks has failed;
on the next boot with the set still degraded, the condition described here
will occur.
>Fix:
Unknown.
>Release-Note:
>Audit-Trail:
>Unformatted: