NetBSD-Bugs archive

kern/40099: device_t/softc split broke cac(4)/ld(4): panic: iostat_unbusy



>Number:         40099
>Category:       kern
>Synopsis:       device_t/softc split broke cac(4)/ld(4): panic: iostat_unbusy
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Dec 03 23:25:00 +0000 2008
>Originator:     Michael L. Hitch
>Release:        NetBSD 4.99.72
>Organization:
        Montana State University
>Environment:
        
        
System: NetBSD net3.msu.montana.edu 4.99.72 NetBSD 4.99.72 (GENERIC) #0: Tue Dec 2 23:18:05 MST 2008 mhitch%net3.msu.montana.edu@localhost:/home/mhitch/NetBSD-current/OBJ/i386/home/mhitch/NetBSD-current/src/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
        The device_t/softc changes made to ld(4) and cac(4) cause a 'panic:
        iostat_unbusy' when booting on an HP server (DL360 in my case).
>How-To-Repeat:
        The usual:  build a kernel from sources after September 9 (the date the
        device_t/softc changes were made) and try to boot the kernel.

cac0 at pci0 dev 1 function 0: Compaq Integrated Array
cac0: interrupting at ioapic0 pin 3     
cac0: 2 channels, firmware <1.50>   
ld0 at cac0 unit 0: RAID1 array
ld0: 17359 MB, 8817 cyl, 64 head, 63 sec, 512 bytes/sect x 35553120 sectors
...
scsibus0: waiting 2 seconds for devices to settle...
fd0 at fdc0: busy < 0
panic: iostat_unbusy
fatal breakpoint trap in supervisor mode
trap type 1 code 0 eip c052e42c cs 8 eflags 246 cr2 0 ilevel 6
Stopped in pid 0.5 (system) at  netbsd:breakpoint+0x4:  popl    %ebp
db{0}> bt
breakpoint(c0a344c4,cccace78,c0a5c140,c0479f35,6,5,0,0,cccace7c,0) at netbsd:breakpoint+0x4
panic(c09f8b47,cccfaeb0,0,0,0,0,0,0,c3107ec8,cccefd10) at netbsd:panic+0x1b8
iostat_unbusy(cccfaeb0,200,100000,c0450007,cccfaf36,0,0,0,0,c3107ec8) at netbsd:iostat_unbusy+0xc4
lddone(cccefd10,c3107ec8,0,c05160f8,c0a59d14,0,cccefd10,c3107ec8,cccefd10,cdecd000) at netbsd:lddone+0x3c
ld_cac_done(cccefd10,c3107ec8,0,200,2,cccfaeb0,0,cccfaeb0,0,cccfaf34) at netbsd:ld_cac_done+0x6a
cac_ccb_done(cccfaeb0,cc3c74c0,c0a5c140,c316ea40,2,6,cccacf6c,c051acbf,cccfaeb0,0) at netbsd:cac_ccb_done+0xb1
cac_intr(cccfaeb0,0,0,0,c0100cbf,0,c31b1f00,c010734d,c316ea40,cc835cf0) at netbsd:cac_intr+0x39
intr_biglock_wrapper(c316ea40,cc835cf0,0,0,0,0,0,0,0,0) at netbsd:intr_biglock_wrapper+0x1f
DDB lost frame for netbsd:Xintr_ioapic_level3+0xad, trying 0xcccacf74
Xintr_ioapic_level3() at netbsd:Xintr_ioapic_level3+0xad
--- interrupt ---
--- switch to interrupt stack ---
Xspllower(2,0,0,0,0,0,0,3,0,0) at netbsd:Xspllower+0xf
softint_dispatch(cc3d5520,2,0,0,0,0,cc835d90,cc835d28,cc3c74c0,28) at netbsd:softint_dispatch+0x67
DDB lost frame for netbsd:Xsoftintr+0x3d, trying 0xcc835d88
Xsoftintr() at netbsd:Xsoftintr+0x3d
--- interrupt ---
fatal page fault in supervisor mode
trap type 6 code 0 eip c0530907 cs 8 eflags 10206 cr2 3a ilevel 8
kernel: supervisor trap page fault, code=0
Faulted in DDB; continuing...
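
The "busy < 0" message followed by "panic: iostat_unbusy" suggests that a
busy counter was decremented when it was already zero, which is what one
would expect if the statistics pointer handed to iostat_unbusy() did not
refer to the structure that was marked busy earlier. The following is a
minimal user-space sketch of that invariant, not the kernel code: the names
model_iostat, model_busy and model_unbusy are hypothetical, and the model
returns -1 where the kernel would panic.

```c
#include <stdio.h>

/* Hypothetical stand-in for a per-device I/O statistics record. */
struct model_iostat {
	const char *io_name;	/* device name used in diagnostics */
	int io_busy;		/* number of outstanding transfers */
};

/* Mark one more transfer as outstanding. */
static void
model_busy(struct model_iostat *stats)
{
	stats->io_busy++;
}

/*
 * Mark one transfer as completed.  Returns 0 on success, or -1 if the
 * counter would drop below zero (the condition reported as "busy < 0").
 */
static int
model_unbusy(struct model_iostat *stats)
{
	if (stats->io_busy == 0) {
		fprintf(stderr, "%s: busy < 0\n", stats->io_name);
		return -1;
	}
	stats->io_busy--;
	return 0;
}
```

Calling model_unbusy() on a record that never saw a matching model_busy()
fails immediately, which is the situation a mixed-up structure pointer would
produce.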

>Fix:
        My workaround was to boot with -ca (enter userconf, then prompt for
        the root device), disable 'cac', and specify a root on a different
        controller;  the DL360 also has a Smart Array 5300.

        The fix will be to identify where the device_t/softc split went wrong
        [probably in sys/dev/ic/cac.c or sys/dev/ic/ld_cac.c] and fix it.

        I've started looking at this, but quickly got lost trying to follow
        which softc and which device_t each piece of code refers to.  I'm
        going to add some debug code that prints the addresses of the
        associated structures, to make some sense of them and to see which
        structure is actually being passed to iostat_unbusy(), unless someone
        more familiar with the device_t/softc split can easily spot the error.
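
        The kind of debug code described above can be sketched in user space
        as follows. This is only an illustration of the idea, under the
        assumption that the bug is a device_t being used where a softc is
        expected (or the reverse); it is not taken from cac.c or ld_cac.c.
        All model_* names are hypothetical stand-ins, with
        model_device_private() playing the role that device_private() plays
        in the kernel.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the autoconf device structure. */
struct model_device {
	char dv_xname[16];	/* e.g. "ld0" */
	void *dv_private;	/* separately allocated driver softc */
};

/* Hypothetical stand-in for a driver softc after the split. */
struct model_ld_softc {
	struct model_device *sc_dv;	/* back pointer to the device */
	int sc_busy;
};

/* Link a device and its softc, as attach code would. */
static void
model_attach(struct model_device *dv, struct model_ld_softc *sc,
    const char *name)
{
	snprintf(dv->dv_xname, sizeof(dv->dv_xname), "%s", name);
	dv->dv_private = sc;
	sc->sc_dv = dv;
	sc->sc_busy = 0;
}

/* Return the softc that belongs to a device. */
static void *
model_device_private(struct model_device *dv)
{
	return dv->dv_private;
}

/* Format both addresses so they can be compared against a backtrace. */
static int
model_describe(const struct model_device *dv, char *buf, size_t len)
{
	return snprintf(buf, len, "%s: device_t=%p softc=%p",
	    dv->dv_xname, (const void *)dv, dv->dv_private);
}

/* Return 1 if ptr really is the softc of dv, 0 otherwise. */
static int
model_is_softc_of(const struct model_device *dv, const void *ptr)
{
	return ptr != NULL && dv->dv_private == ptr;
}
```

        Printing the model_describe() line at attach time and checking the
        pointer with model_is_softc_of() in the completion path would show
        whether a callback received the device or its softc. In the trace
        above, lddone() and ld_cac_done() are shown with the same first
        argument, which may be worth checking this way, although DDB argument
        values on i386 are not always reliable.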

>Unformatted:
                Source date October 7
        
        

