Subject: Re: kern/32162: [netbsd-3.0] kernel dead-lock in MP system
To: Jason Thorpe <thorpej@shagadelic.org>
From: Andreas Wrede <andreas@planix.com>
List: netbsd-bugs
Date: 01/11/2006 22:07:55


On Dec 7, 2005, at 11:42 , Jason Thorpe wrote:

>
> On Dec 7, 2005, at 8:08 AM, Andreas Wrede wrote:
>
>> Running with a kernel with DIAGNOSTIC, LOCKDEBUG and DEBUG turned  
>> on produced two panics over the last week:
>>
>> Nov 30
>>
>> panic: kernel debugging assertion "(v == __SIMPLELOCK_LOCKED) ||
>> (v == __SIMPLELOCK_UNLOCKED)" failed: file
>> "/u1/netbsd-3.0/src/sys/arch/x86/x86/lock_machdep.c",
>
> This very likely means that something is smashing memory.

I experienced a number of different panics and assertion failures,
which probably confirms your assessment that something is corrupting
memory.

The system has now been moved to new hardware: a TYAN S2892 K8SE
motherboard with two Opteron 246 processors (see dmesg below). There
are two changes on the software side: the 1TB filesystem, which was
UFS2 on the previous hardware, is UFS1 again, and the other
filesystems are now on a RAID1 set. I am still running a kernel with
DIAGNOSTIC, LOCKDEBUG and DEBUG, and after an uptime of almost 4 days
I get 'cpu0: spinout'.

Any idea what this is and what to do next?

cpu0: spinout
Stopped in pid 18.1 (aiodoned) at       netbsd:cpu_Debugger+0x4:        leave
db{0}> bt
cpu_Debugger(c067db95,0,c2bf7cdc,cd262b7c,c0713354) at netbsd:cpu_Debugger+0x4
__cpu_simple_lock(c0713a1c,0,c2993df4,c36b6100,cd262b7c) at netbsd:__cpu_simple_lock+0x93
printf_nolog(c06769f5,cd262b7c,c069ace4,cd262c30,a) at netbsd:printf_nolog+0x32
lock_printf(c069ace4,a080101,cd262c54,c033d966,c3874200) at netbsd:lock_printf+0x4c
_simple_lock(c0713354,c06cb1c0,440,c,c2b360c0) at netbsd:_simple_lock+0x235
selwakeup(c2b360c8,989680,0,66,cc2016c0) at netbsd:selwakeup+0x99
ptsstart(cc2016c0,0,c06cb840,7,cc2016c0) at netbsd:ptsstart+0x85
ttstart(cc2016c0,cc2016c0,9ac,1,0) at netbsd:ttstart+0x1e
tputchar(66,7,cc2016c0,cc2016c0,cbfbb024) at netbsd:tputchar+0x79
putchar(66,5,0,6,0) at netbsd:putchar+0x49
kprintf(c067f495,5,0,0,cd262de0) at netbsd:kprintf+0x5c
printf(c067f495,c067f385,cd262e60,1,1ebd4) at netbsd:printf+0x46
trap() at netbsd:trap+0x106
--- trap (number 6) ---
pmap_activate(cc20b8c4,cc2101d0,4c,0,c034cbf9) at netbsd:pmap_activate+0x39
mpidle(cc20b8c4,0,1d9,c0786ff0,c0789608) at netbsd:mpidle+0xcb
ltsleep(c0789600,204,c067744e,0,c0789608) at netbsd:ltsleep+0x4d0
uvm_aiodone_daemon(cc20b8c4,842000,84b000,0,c0100321) at netbsd:uvm_aiodone_daemon+0x15f
db{0}> machine cpu 1
using CPU 1
db{0}> bt
__cpu_simple_lock(c0713354,c035241b,c07620a8,297,10b) at netbsd:__cpu_simple_lock+0x6f
_simple_lock(c0713354,c06c8d00,47b,c,c21a303c) at netbsd:_simple_lock+0x7a
schedclock(cf1859e8,c2266a00,c355a788,c21c4838,ce453c78) at netbsd:schedclock+0x58
statclock(ce453cbc,c01f7f52,c21c4800,80,c066ea25) at netbsd:statclock+0xeb
hardclock(ce453cbc,3,c03902b8,ce453cb4,0) at netbsd:hardclock+0x5f3
lapic_clockintr(0,0,c0330010,30,1310010) at netbsd:lapic_clockintr+0x48
Xresume_lapic_ltimer() at netbsd:Xresume_lapic_ltimer+0x1b
--- interrupt ---
Xspllower(0,c06c6880,390,206,0) at netbsd:Xspllower+0xe
_lockmgr(c0764180,400006,0,c06d69c0,da4) at netbsd:_lockmgr+0x250
pmap_enter(cfcabc28,81fe000,37762000,3,22) at netbsd:pmap_enter+0x4d6
uvm_fault(ce8ec2b4,81fe000,0,2,2) at netbsd:uvm_fault+0x976
trap() at netbsd:trap+0x36f
--- trap (number 6) ---
0xbdb685fc:
db{0}> sync
cpu1: spinout while in debugger
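
For readers unfamiliar with the message: "spinout" in the traces above
appears to be what a DEBUG kernel reports when a CPU has spun on a
simple lock past some limit without acquiring it. A minimal user-space
sketch of that idea in C follows. It is not the NetBSD code;
demo_lock_t, demo_lock(), demo_unlock() and SPIN_LIMIT are hypothetical
names and values chosen for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not the NetBSD source. */
#define SPIN_LIMIT 1000000UL	/* assumed bound, picked for illustration */

typedef struct {
	atomic_flag held;
} demo_lock_t;

#define DEMO_LOCK_INIT { ATOMIC_FLAG_INIT }

/*
 * Spin until the lock is acquired.  Returns true on success, or false
 * after SPIN_LIMIT failed attempts (the "spinout" case).  A debug
 * kernel would enter the debugger at that point instead of returning.
 */
static bool
demo_lock(demo_lock_t *lk, int cpu)
{
	unsigned long spins = 0;

	while (atomic_flag_test_and_set_explicit(&lk->held,
	    memory_order_acquire)) {
		if (++spins == SPIN_LIMIT) {
			fprintf(stderr, "cpu%d: spinout\n", cpu);
			return false;
		}
	}
	return true;
}

/* Release the lock so that another caller can acquire it. */
static void
demo_unlock(demo_lock_t *lk)
{
	atomic_flag_clear_explicit(&lk->held, memory_order_release);
}
```

In this sketch, a second demo_lock() call on a lock that is never
released can only end by hitting the limit, so it reports a spinout
rather than hanging silently.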



Dmesg output:

NetBSD 3.0_STABLE (PLANIX.MP) #0: Sat Jan  7 10:19:22 EST 2006
         root@whome.planix.com:/u1/netbsd-3.0/obj.i386/sys/arch/i386/compile/PLANIX.MP
total memory = 1022 MB
avail memory = 982 MB
BIOS32 rev. 0 found at 0xfd5c0
mainbus0 (root)
mainbus0: Intel MP Specification (Version 1.4) (AMD      HAMMER      )
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Unknown K7 (Athlon) (686-class), 2009.32 MHz, id 0xf5a
cpu0: features 78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features 78bfbff<PGE,MCA,CMOV,PAT,PSE36,MPC,MMX>
cpu0: features 78bfbff<FXSR,SSE,SSE2>
cpu0: "AMD Opteron(tm) Processor 246"
cpu0: calibrating local timer
cpu0: apic clock running at 200 MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: starting
cpu1: AMD Unknown K7 (Athlon) (686-class), 2009.26 MHz, id 0xf5a
cpu1: features 78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu1: features 78bfbff<PGE,MCA,CMOV,PAT,PSE36,MPC,MMX>
cpu1: features 78bfbff<FXSR,SSE,SSE2>
cpu1: "AMD Opteron(tm) Processor 246"
mpbios: bus 0 is type PCI
mpbios: bus 1 is type PCI
mpbios: bus 2 is type PCI
mpbios: bus 3 is type PCI
mpbios: bus 8 is type PCI
mpbios: bus 9 is type PCI
mpbios: bus 10 is type PCI
mpbios: bus 11 is type ISA
ioapic0 at mainbus0 apid 2 (I/O APIC)
ioapic0: pa 0xfec00000, version 11, 24 pins
ioapic1 at mainbus0 apid 3 (I/O APIC)
ioapic1: pa 0xdf200000, version 11, 4 pins
ioapic2 at mainbus0 apid 4 (I/O APIC)
ioapic2: pa 0xdf201000, version 11, 4 pins
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
Nvidia product 0x005e (miscellaneous memory, revision 0xa3) at pci0 dev 0 function 0 not configured
pcib0 at pci0 dev 1 function 0
pcib0: Nvidia product 0x0051 (rev. 0xa3)
Nvidia nForce4 SMBus (SMBus serial bus, revision 0xa2) at pci0 dev 1 function 1 not configured
ohci0 at pci0 dev 2 function 0: Nvidia product 0x005a (rev. 0xa2)
ohci0: interrupting at ioapic0 pin 10 (irq 10)
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
uhub0 at usb0
uhub0: Nvidia OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 10 ports with 10 removable, self powered
ehci0 at pci0 dev 2 function 1: Nvidia product 0x005b (rev. 0xa3)
ehci0: interrupting at ioapic0 pin 11 (irq 11)
ehci0: BIOS has given up ownership
ehci0: EHCI version 1.0
ehci0: companion controller, 4 ports each: ohci0
usb1 at ehci0: USB revision 2.0
uhub1 at usb1
uhub1: Nvidia EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub1: single transaction translator
uhub1: 10 ports with 10 removable, self powered
viaide0 at pci0 dev 6 function 0
viaide0: NVIDIA nForce4 IDE Controller (rev. 0xf2)
viaide0: bus-master DMA support present
viaide0: primary channel configured to compatibility mode
viaide0: primary channel ignored (disabled)
viaide0: secondary channel configured to compatibility mode
viaide0: secondary channel interrupting at ioapic0 pin 15 (irq 15)
atabus0 at viaide0 channel 1
viaide1 at pci0 dev 7 function 0
viaide1: NVIDIA nForce4 Serial ATA Controller (rev. 0xf3)
viaide1: bus-master DMA support present
viaide1: primary channel wired to native-PCI mode
viaide1: using ioapic0 pin 10 (irq 10) for native-PCI interrupt
atabus1 at viaide1 channel 0
viaide1: secondary channel wired to native-PCI mode
atabus2 at viaide1 channel 1
viaide2 at pci0 dev 8 function 0
viaide2: NVIDIA nForce4 Serial ATA Controller (rev. 0xf3)
viaide2: bus-master DMA support present
viaide2: primary channel wired to native-PCI mode
viaide2: using ioapic0 pin 11 (irq 11) for native-PCI interrupt
atabus3 at viaide2 channel 0
viaide2: secondary channel wired to native-PCI mode
atabus4 at viaide2 channel 1
ppb0 at pci0 dev 9 function 0: Nvidia product 0x005c (rev. 0xa2)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
isp0 at pci1 dev 4 function 0: QLogic FC-AL and Fabric HBA
isp0: interrupting at ioapic0 pin 11 (irq 11)
scsibus0 at isp0: 256 targets, 8 luns per target
vga1 at pci1 dev 6 function 0: ATI Technologies Rage XL (rev. 0x27)
wsdisplay0 at vga1 kbdmux 1
wsmux1: connecting to wsdisplay0
fxp0 at pci1 dev 8 function 0: i82550 Ethernet, rev 16
fxp0: interrupting at ioapic0 pin 10 (irq 10)
fxp0: Ethernet address 00:e0:81:30:d6:0a
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ppb1 at pci0 dev 13 function 0: Nvidia product 0x005d (rev. 0xa3)
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
ppb2 at pci0 dev 14 function 0: Nvidia product 0x005d (rev. 0xa3)
pci3 at ppb2 bus 3
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
pchb0 at pci0 dev 24 function 0
pchb0: Advanced Micro Devices AMD64 HyperTransport configuration (rev. 0x00)
pchb1 at pci0 dev 24 function 1
pchb1: Advanced Micro Devices AMD64 Address Map configuration (rev. 0x00)
pchb2 at pci0 dev 24 function 2
pchb2: Advanced Micro Devices AMD64 DRAM configuration (rev. 0x00)
pchb3 at pci0 dev 24 function 3
pchb3: Advanced Micro Devices AMD64 Miscellaneous configuration (rev. 0x00)
pchb4 at pci0 dev 25 function 0
pchb4: Advanced Micro Devices AMD64 HyperTransport configuration (rev. 0x00)
pchb5 at pci0 dev 25 function 1
pchb5: Advanced Micro Devices AMD64 Address Map configuration (rev. 0x00)
pchb6 at pci0 dev 25 function 2
pchb6: Advanced Micro Devices AMD64 DRAM configuration (rev. 0x00)
pchb7 at pci0 dev 25 function 3
pchb7: Advanced Micro Devices AMD64 Miscellaneous configuration (rev. 0x00)
isa0 at pcib0
lpt0 at isa0 port 0x378-0x37b irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
com1: console
pckbc0 at isa0 port 0x60-0x64
pckbdprobe: reset error 5
pmsprobe: reset error 5
lm0 at isa0 port 0x290-0x297: W83627HF
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
isapnp0: no ISA Plug 'n Play devices found
pci4 at mainbus0 bus 8
pci4: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
ppb3 at pci4 dev 10 function 0: Advanced Micro Devices AMD8131 PCI-X Tunnel (rev. 0x12)
pci5 at ppb3 bus 9
pci5: i/o space, memory space enabled
Advanced Micro Devices AMD8131 IO Apic (interrupt system, interface 0x10, revision 0x01) at pci4 dev 10 function 1 not configured
ppb4 at pci4 dev 11 function 0: Advanced Micro Devices AMD8131 PCI-X Tunnel (rev. 0x12)
pci6 at ppb4 bus 10
pci6: i/o space, memory space enabled
isp1 at pci6 dev 3 function 0: QLogic FC-AL and Fabric HBA
isp1: interrupting at ioapic2 pin 2 (irq 10)
scsibus1 at isp1: 256 targets, 8 luns per target
bge0 at pci6 dev 9 function 0: Broadcom BCM5704C Dual Gigabit Ethernet
bge0: interrupting at ioapic2 pin 0 (irq 11)
bge0: ASIC BCM5704 A3 (0x2003), Ethernet address 00:e0:81:30:d6:7c
brgphy0 at bge0 phy 1: BCM5704 1000BASE-T media interface, rev. 0
brgphy0: using BCM5704 DSP patch
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge1 at pci6 dev 9 function 1: Broadcom BCM5704C Dual Gigabit Ethernet
bge1: interrupting at ioapic2 pin 1 (irq 10)
bge1: ASIC BCM5704 A3 (0x2003), Ethernet address 00:e0:81:30:d6:7d
brgphy1 at bge1 phy 1: BCM5704 1000BASE-T media interface, rev. 0
brgphy1: using BCM5704 DSP patch
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
Advanced Micro Devices AMD8131 IO Apic (interrupt system, interface 0x10, revision 0x01) at pci4 dev 11 function 1 not configured
ioapic0: enabling
ioapic1: enabling
ioapic2: enabling
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
raidattach: Asked for 8 units
Kernelized RAIDframe activated
IPsec: Initialized Security Association Processing.
scsibus0: waiting 2 seconds for devices to settle...
scsibus1: waiting 2 seconds for devices to settle...
atapibus0 at atabus0: 2 targets
cd0 at atapibus0 drive 0: <HL-DT-STDVD-ROM GDR8164B, , 0L06> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd0(viaide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
sd0 at scsibus0 target 0 lun 0: <APPLE, Xserve RAID, 1.26> disk fixed
sd0: 1035 GB, 132522 cyl, 128 head, 128 sec, 512 bytes/sect x 2171240448 sectors
sd1 at scsibus1 target 0 lun 0: <APPLE, Xserve RAID, 1.26> disk fixed
sd1: 1035 GB, 132522 cyl, 128 head, 128 sec, 512 bytes/sect x 2171240448 sectors
wd0 at atabus1 drive 0: <WDC WD1600JS-22MHB0>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 149 GB, 310101 cyl, 16 head, 63 sec, 512 bytes/sect x 312581808 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd0(viaide1:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
wd1 at atabus3 drive 0: <WDC WD1600JS-00MHB0>
wd1: drive supports 16-sector PIO transfers, LBA48 addressing
wd1: 149 GB, 310101 cyl, 16 head, 63 sec, 512 bytes/sect x 312581808 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd1(viaide2:0:0): using PIO mode 4, Ultra-DMA mode 6 (Ultra/133) (using DMA)
Searching for RAID components...
Component on: wd0a: 312581745
    Row: 0 Column: 0 Num Rows: 1 Num Columns: 2
    Version: 2 Serial Number: 20051218 Mod Counter: 156
    Clean: No Status: 0
    sectPerSU: 128 SUsPerPU: 1 SUsPerRU: 1
    RAID Level: 1  blocksize: 512 numBlocks: 312581632
    Autoconfig: Yes
    Contains root partition: Yes
    Last configured as: raid0
Component on: wd1a: 312581745
    Row: 0 Column: 1 Num Rows: 1 Num Columns: 2
    Version: 2 Serial Number: 20051218 Mod Counter: 156
    Clean: No Status: 0
    sectPerSU: 128 SUsPerPU: 1 SUsPerRU: 1
    RAID Level: 1  blocksize: 512 numBlocks: 312581632
    Autoconfig: Yes
    Contains root partition: Yes
    Last configured as: raid0
Found: wd0a at 0
Found: wd1a at 1
RAID autoconfigure
Configuring raid0:
Starting autoconfiguration of RAID set...
Looking for 0 in autoconfig
Found: wd0a at 0
Looking for 1 in autoconfig
Found: wd1a at 1
raid0: allocating 20 buffers of 65536 bytes.
raid0: RAID Level 1
raid0: Components: /dev/wd0a /dev/wd1a
raid0: Total Sectors: 312581632 (152627 MB)
boot device: raid0
root on raid0a dumps on raid0b
mountroot: trying smbfs...
mountroot: trying msdos...
mountroot: trying cd9660...
mountroot: trying nfs...
mountroot: trying lfs...
mountroot: trying ext2fs...
mountroot: trying ffs...
root file system type: ffs
cpu1: CPU 1 running
init: copying out path `/sbin/init' 11
mag 0 21:1
mag 1 2e:2
mag 2 72:3
mag 3 65:4
mag 4 73:5
mag 5 65:6
mag 6 74:7
mag 7 2d:8
mag 8 78:7f
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)


# df
Filesystem  1K-blocks      Used     Avail Capacity  Mounted on
/dev/raid0a   3810588   1466130   2153930    40%    /
/dev/raid0f   4129638   1130268   2792890    28%    /var
/dev/raid0g  65381338   7480986  54631286    12%    /u1
/dev/raid0e   2064990      9928   1951814     0%    /lhome
/dev/raid0h  76927856   3612282  69469182     4%    /u2
/dev/sd0a   1057093094 619642356 384596084    61%    /u5


-- 
     aew

