NetBSD-Bugs archive


kern/38183: LOCKDEBUG causes INSTALL_LARGE kernel to fail with "simple_lock: lock held" during boot



>Number:         38183
>Category:       kern
>Synopsis:       LOCKDEBUG causes INSTALL_LARGE kernel to fail with "simple_lock: lock held" during boot
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Mar 06 17:15:04 +0000 2008
>Originator:     Greg A. Woods
>Release:        NetBSD 4.0_STABLE 2008/03/03
>Organization:
Planix, Inc.; Toronto, Ontario; Canada
>Environment:
System: NetBSD 4.0_STABLE GENERIC.MP
Architecture: i386
Machine: i386
>Description:

        In an effort to gather more information about the MP locking
        bugs that have been plaguing my systems of late, I've been
        compiling the whole system with -DLOCKDEBUG (so that I don't
        lose runtime features and facilities necessary for production
        use).

        Yesterday I tested an install CD built in this way, with the
        following result on two quite different machines: first an Asus
        PSCH-SR/SATA system, and then the system shown below, booted
        via serial console ("consdev com0" at the bootloader prompt):


> boot
booting cd0a:netbsd
4238076+5321508+191148 [286864+271089]=0x9d64a8
kenter: 0x00001000
acpi: wakecode is installed at 0x1000, size=376
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 4.0_STABLE (INSTALL_LARGE) #0: Tue Mar  4 21:08:47 EST 2008
      woods@once:/rest/build/woods/once/netbsd-4-i386-i386-ppro-obj/rest/work/woods/m-NetBSD-4/sys/arch/i386/compile/INSTALL_LARGE
total memory = 3839 MB
rbus: rbus_min_start set to 0xc0000000
avail memory = 3740 MB
rnd: initialised (4096) with counter
timecounter: Timecounters tick every 10.000 msec
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
BIOS32 rev. 0 found at 0xffe90
mainbus0 (root)
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Xeon (686-class), 2384.46 MHz, id 0xf27
cpu0: "Intel(R) Xeon(TM) CPU 2.40GHz"
cpu0: enabling thermal monitor 1 ... enabled.
cpu0: calibrating local timer
cpu0: apic clock running at 99 MHz
cpu1 at mainbus0: apid 6 (application processor)
cpu1: not started
cpu2 at mainbus0: apid 1 (application processor)
cpu2: not started
cpu3 at mainbus0: apid 7 (application processor)
cpu3: not started
ioapic0 at mainbus0 apid 8 (I/O APIC)
ioapic0: pa 0xfec00000, version 11, 16 pins
ioapic0: misconfigured as apic 0
ioapic0: remapped to apic 8
ioapic1 at mainbus0 apid 9 (I/O APIC)
ioapic1: pa 0xfec01000, version 11, 16 pins
ioapic1: misconfigured as apic 0
ioapic1: remapped to apic 9
ioapic2 at mainbus0 apid 10 (I/O APIC)
ioapic2: pa 0xfec02000, version 11, 16 pins
ioapic2: misconfigured as apic 0
ioapic2: remapped to apic 10
acpi0 at mainbus0: Advanced Configuration and Power Interface
acpi0: fixed-feature power button present
timecounter: Timecounter "ACPI-Safe" frequency 3579545 Hz quality 900
ACPI-Safe 32-bit timer
ACPI Object Type 'Processor' (0x0c) at acpi0 not configured
ACPI Object Type 'Processor' (0x0c) at acpi0 not configured
ACPI Object Type 'Processor' (0x0c) at acpi0 not configured
ACPI Object Type 'Processor' (0x0c) at acpi0 not configured
PNP0A03 [PCI/PCI-X Host Bridge] at acpi0 not configured
PNP0200 [AT DMA Controller] at acpi0 not configured
PNP0C04 [Math Coprocessor] at acpi0 not configured
PNP0000 [AT Interrupt Controller] at acpi0 not configured
PNP0800 [AT-style speaker sound] at acpi0 not configured
PNP0100 [AT Timer] at acpi0 not configured
PNP0700 [PC standard floppy disk controller] at acpi0 not configured
PNP0303 [IBM Enhanced (101/102-key, PS/2 mouse support)] at acpi0 not configured
PNP0F13 [PS/2 Port for PS/2-style Mice] at acpi0 not configured
PNP0501 [16550A-compatible COM port] at acpi0 not configured
PNP0501 [16550A-compatible COM port] at acpi0 not configured
PNP0B00 [AT Real-Time Clock] at acpi0 not configured
PNP0C01 [System Board] at acpi0 not configured
PNP0A03 [PCI/PCI-X Host Bridge] at acpi0 not configured
PNP0A03 [PCI/PCI-X Host Bridge] at acpi0 not configured
PNP0A03 [PCI/PCI-X Host Bridge] at acpi0 not configured
PNP0A03 [PCI/PCI-X Host Bridge] at acpi0 not configured
PNP0C0F [PCI interrupt link device] at acpi0 not configured
PNP0C0F [PCI interrupt link device] at acpi0 not configured
PNP0C0F [PCI interrupt link device] at acpi0 not configured
PNP0C0F [PCI interrupt link device] at acpi0 not configured
PNP0C0F [PCI interrupt link device] at acpi0 not configured
PNP0C0F [PCI interrupt link device] at acpi0 not configured
PNP0C0F [PCI interrupt link device] at acpi0 not configured
PNP0C0F [PCI interrupt link device] at acpi0 not configured
PNP0C0F [PCI interrupt link device] at acpi0 not configured
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: vendor 0x1166 product 0x0012 (rev. 0x13)
pchb1 at pci0 dev 0 function 1
pchb1: vendor 0x1166 product 0x0012 (rev. 0x00)
pchb2 at pci0 dev 0 function 2
pchb2: vendor 0x1166 product 0x0000 (rev. 0x00)
pchb2: unknown ServerWorks chip ID 0x0000; trying to attach PCI buses behind it
vendor 0x1028 product 0x000c (undefined subclass 0x00) at pci0 dev 4 function 0 not configured
vendor 0x1028 product 0x0008 (undefined subclass 0x00) at pci0 dev 4 function 1 not configured
vendor 0x1028 product 0x000d (undefined subclass 0x00) at pci0 dev 4 function 2 not configured
vga1 at pci0 dev 14 function 0: vendor 0x1002 product 0x4752 (rev. 0x27)
wsdisplay0 at vga1 kbdmux 1
wsmux1: connecting to wsdisplay0
pchb3 at pci0 dev 15 function 0
pchb3: vendor 0x1166 product 0x0201 (rev. 0x93)
rccide0 at pci0 dev 15 function 1
rccide0: ServerWorks CSB5 IDE Controller (rev. 0x93)
rccide0: bus-master DMA support present
rccide0: primary channel configured to compatibility mode
rccide0: primary channel interrupting at ioapic0 pin 14 (irq 14)
atabus0 at rccide0 channel 0
rccide0: secondary channel wired to compatibility mode
rccide0: secondary channel interrupting at ioapic0 pin 15 (irq 15)
atabus1 at rccide0 channel 1
ohci0 at pci0 dev 15 function 2: vendor 0x1166 product 0x0220 (rev. 0x05)
ohci0: interrupting at ioapic0 pin 5 (irq 5)
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
uhub0 at usb0
uhub0: vendor 0x1166 OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
pcib0 at pci0 dev 15 function 3
pcib0: vendor 0x1166 product 0x0225 (rev. 0x00)
pchb4 at pci0 dev 16 function 0
pchb4: vendor 0x1166 product 0x0101 (rev. 0x03)
pci1 at pchb4 bus 3
pci1: i/o space, memory space enabled
bge0 at pci1 dev 6 function 0: Broadcom BCM5701 Gigabit Ethernet
bge0: interrupting at ioapic1 pin 12 (irq 10)
bge0: ASIC BCM5701 B5 (0x0105), Ethernet address 00:06:5b:ed:e2:d1
brgphy0 at bge0 phy 1: BCM5701 1000BASE-T media interface, rev. 0
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge1 at pci1 dev 8 function 0: Broadcom BCM5701 Gigabit Ethernet
bge1: interrupting at ioapic1 pin 13 (irq 7)
bge1: ASIC BCM5701 B5 (0x0105), Ethernet address 00:06:5b:ed:e2:d2
brgphy1 at bge1 phy 1: BCM5701 1000BASE-T media interface, rev. 0
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
pchb5 at pci0 dev 16 function 2
pchb5: vendor 0x1166 product 0x0101 (rev. 0x03)
pci2 at pchb5 bus 4
pci2: i/o space, memory space enabled
ppb0 at pci2 dev 8 function 0: vendor 0x8086 product 0x0309 (rev. 0x01)
pci3 at ppb0 bus 5
pci3: i/o space, memory space enabled
vendor 0x9005 product 0x00c5 (SCSI mass storage, revision 0x01) at pci3 dev 6 function 0 not configured
vendor 0x9005 product 0x00c5 (SCSI mass storage, revision 0x01) at pci3 dev 6 function 1 not configured
aac0 at pci2 dev 8 function 1: Dell PERC 3/Di
aac0: interrupting at ioapic1 pin 14 (irq 11)
aac0: i960RX at 100MHz, 128MB mem (118MB cache), optional battery present
ld0 at aac0 unit 0: RAID 5
ld0: 135 GB, 17700 cyl, 255 head, 63 sec, 512 bytes/sect x 284365824 sectors
rnd: ld0 attached as an entropy source (collecting)
pchb6 at pci0 dev 17 function 0
pchb6: vendor 0x1166 product 0x0101 (rev. 0x03)
pci4 at pchb6 bus 1
pci4: i/o space, memory space enabled
aac1 at pci4 dev 6 function 0: HP NetRAID-4M
aac1: interrupting at ioapic1 pin 0 (irq 11)
aac1: StrongARM SA110 at 233MHz, 144MB mem (128MB cache), required battery present
pchb7 at pci0 dev 17 function 2
pchb7: vendor 0x1166 product 0x0101 (rev. 0x03)
pci5 at pchb7 bus 2
pci5: i/o space, memory space enabled
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0 mux 1
wskbd0: connecting to wsdisplay0
rnd: pckbd0 attached as an entropy source (collecting)
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
isapnp0: no ISA Plug 'n Play devices found
ioapic0: enabling
ioapic1: enabling
ioapic2: enabling
timecounter: Timecounter "TSC" frequency 2384490560 Hz quality 800
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0

simple_lock: lock held
lock: 0xc0519db4, currently at: /rest/work/woods/m-NetBSD-4/sys/kern/kern_synch.c:664
last locked: /rest/work/woods/m-NetBSD-4/sys/kern/kern_synch.c:497
last unlocked: /rest/work/woods/m-NetBSD-4/sys/kern/kern_synch.c:768
kernel: supervisor trap page fault, code=0
Stopped at      netbsd:db_read_bytes+0x12:      movl    0(%eax),%eax
db> call simple_lock_dump
all simple locks:
0xc0519db4 CPU 0 /rest/work/woods/m-NetBSD-4/sys/kern/kern_synch.c:497
0
db> trace
db_read_bytes(4,4,c0ada87c,c0292aa0,c0ada880) at netbsd:db_read_bytes+0x12
db_get_value(4,4,0,0,0) at netbsd:db_get_value+0x17
db_stack_trace_print(c0ada958,1,ffff,c04d7ad6,c0292a58) at netbsd:db_stack_trace_print+0x180
_simple_lock(c0519db4,c04e0778,298,292,c0a48fc0) at netbsd:_simple_lock+0xb6
endtsleep(c0a48fc0,c04e0d65,182,c,c04defdc) at netbsd:endtsleep+0x32
softclock(0,c04e4c76,65,0,0) at netbsd:softclock+0x1e1
softintr_dispatch(0,c0ad0010,c04d0030,10,c0ad0010) at netbsd:softintr_dispatch+0xa9
Xsoftclock() at netbsd:Xsoftclock+0x26
--- interrupt ---
cpu_switch(c0a48fc0,0,1,1,1) at netbsd:cpu_switch+0xa5
ltsleep(c63f1200,20,c04fdf4f,19,0) at netbsd:ltsleep+0x2be
fdprobe(c63f1200,c0510c78,c0adabf4,0,c0510c78) at netbsd:fdprobe+0xc7
mapply(1,ffffffff,c0adab98,c02aea48,c04e0ba8) at netbsd:mapply+0x2a
config_search_loc(0,c63f1200,0,0,c0adabf4) at netbsd:config_search_loc+0x8e
config_found_sm_loc(c63f1200,0,0,c0adabf4,c0398639) at netbsd:config_found_sm_loc+0x26
config_found(c63f1200,c0adabf4,c0398639,c04ace40,c0519b8c) at netbsd:config_found+0x1a
fdcfinishattach(c63f1200,c05197e0,c0adac38,c028b1ce,c05195e0) at netbsd:fdcfinishattach+0x10e
config_process_deferred(0,c0ad7010,c0adac78,c0285b76,0) at netbsd:config_process_deferred+0x42
configure(0,0,0,0,0) at netbsd:configure+0x65
main(fbff,c01002ac,0,0,0) at netbsd:main+0xd3
db> show reg
ds          0x10
es          0x10
fs          0x30
gs          0x10
edi         0xc04e0778  copyright+0x23518
esi         0
ebp         0xc0ada854  _prop_array_pool+0x8cb74
ebx         0x4
edx         0xc0ada87c  _prop_array_pool+0x8cb9c
ecx         0xc0a90000  _prop_array_pool+0x42320
eax         0x4
eip         0xc02efb8e  db_read_bytes+0x12
cs          0x8
eflags      0x10246
esp         0xc0ada850  _prop_array_pool+0x8cb70
ss          0x10
netbsd:db_read_bytes+0x12:      movl    0(%eax),%eax
db> ps
 PID           PPID     PGRP        UID S   FLAGS LWPS          COMMAND    WAIT
 0               -1        0          0 2 0x20200    1          swapper fdprobe
db> reboot
syncing disks... 
simple_lock: lock held
lock: 0xc0519db4, currently at: /rest/work/woods/m-NetBSD-4/sys/kern/kern_synch.c:1237
last locked: /rest/work/woods/m-NetBSD-4/sys/kern/kern_synch.c:497
last unlocked: /rest/work/woods/m-NetBSD-4/sys/kern/kern_synch.c:768
kernel: supervisor trap page fault, code=0
Stopped at      netbsd:db_read_bytes+0x12:      movl    0(%eax),%eax
db> reboot
rebooting...
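
        To illustrate the shape of the diagnostic in the log above, here
        is a minimal user-space sketch in C of a debugging lock that
        remembers where it was last locked and unlocked, and complains
        if it is acquired again while still held.  This is NOT the
        kernel's simple_lock code; the names dbg_lock, dbg_lock_acquire,
        dbg_lock_release and dbg_site are hypothetical and exist only
        for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debug lock: records the last lock and unlock sites. */
struct dbg_lock {
	int held;
	const char *lock_file;
	int lock_line;
	const char *unlock_file;
	int unlock_line;
};

/* Avoid passing NULL to %s when a site has not been recorded yet. */
static const char *
dbg_site(const char *file)
{
	return file != NULL ? file : "(none)";
}

/*
 * Try to take the lock.  Returns 0 on success.  If the lock is already
 * held, prints a report naming the current, last-locked and
 * last-unlocked sites, leaves the lock state untouched, and returns -1.
 */
static int
dbg_lock_acquire(struct dbg_lock *l, const char *file, int line)
{
	assert(l != NULL);
	if (l->held) {
		printf("simple_lock: lock held\n");
		printf("lock: %p, currently at: %s:%d\n",
		    (void *)l, dbg_site(file), line);
		printf("last locked: %s:%d\n",
		    dbg_site(l->lock_file), l->lock_line);
		printf("last unlocked: %s:%d\n",
		    dbg_site(l->unlock_file), l->unlock_line);
		return -1;
	}
	l->held = 1;
	l->lock_file = file;
	l->lock_line = line;
	return 0;
}

/* Release the lock and record where that happened. */
static void
dbg_lock_release(struct dbg_lock *l, const char *file, int line)
{
	assert(l != NULL);
	l->held = 0;
	l->unlock_file = file;
	l->unlock_line = line;
}
```

        With a zero-initialised struct dbg_lock, acquiring once at
        ("kern_synch.c", 497) and then again at ("kern_synch.c", 664)
        without a release in between makes the second call print a
        report and return -1.  One possible reading of the trace above
        is that endtsleep(), called from softclock(), tried to take the
        lock while the interrupted ltsleep()/cpu_switch() path still
        held it; that is an interpretation of the trace, not something
        that has been verified.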


>How-To-Repeat:

        build INSTALL_LARGE with -DLOCKDEBUG and boot?

        My /etc/mk.conf contains:

CFLAGS +=       -DLOCKDEBUG


        This results in the kernel sources being compiled with commands
        like the following:

/rest/build/woods/once/netbsd-4-i386-i386-tools/bin/i386--netbsdelf-gcc -O2  
-DLOCKDEBUG -ffreestanding -fno-zero-initialized-in-bss -march=i486 -mtune=i486 
 -Os -Werror -Wall -Wno-main -Wno-format-zero-length -Wpointer-arith 
-Wmissing-prototypes -Wstrict-prototypes -Wswitch -Wshadow -Wcast-qual 
-Wwrite-strings -Wno-sign-compare -Wno-pointer-sign -Wno-attributes -Wextra 
-Wno-unused-parameter  -fno-strict-aliasing    -Di386 -I. 
-I/rest/work/woods/m-NetBSD-4/sys/contrib/dev/ath/netbsd 
-I/rest/work/woods/m-NetBSD-4/sys/../common/include 
-I/rest/work/woods/m-NetBSD-4/sys/arch  -I/rest/work/woods/m-NetBSD-4/sys 
-nostdinc  -DMAXUSERS=2 -D_KERNEL -D_KERNEL_OPT 
-I/rest/work/woods/m-NetBSD-4/sys/lib/libkern/../../../common/lib/libc/quad 
-I/rest/work/woods/m-NetBSD-4/sys/lib/libkern/../../../common/lib/libc/string 
-I/rest/work/woods/m-NetBSD-4/sys/lib/libkern/../../../common/lib/libc/arch/i386/string


        Note that the GENERIC.MP kernel built from the same tree and
        using the same options works just fine.

>Fix:

        unknown


