NetBSD-Bugs archive


kern/46885: NetBSD 6.0_RC1 spontaneously reboots as kernel starts to load



>Number:         46885
>Category:       kern
>Synopsis:       NetBSD 6.0_RC1 spontaneously reboots as kernel starts to load
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Aug 31 21:05:00 +0000 2012
>Originator:     Dave Tyson
>Release:        NetBSD 6.0_RC1
>Organization:
        Wirral Caving Group
>Environment:
System: NetBSD darkstar.anduin.org.uk 6.0_RC1 NetBSD 6.0_RC1 (GENERICD) #0: Fri Aug 31 12:48:38 BST 2012 root%darkstar.anduin.org.uk@localhost:/usr/obj/sys/arch/i386/compile/GENERICD i386
Architecture: i386
Machine: i386
>Description:
        The problem occurs on one particular desktop system: a standard
Intel D865GLC/D865PE50 motherboard with a Pentium 4 2.6GHz processor. The
system has been stable for many years. It ran NetBSD 5, was upgraded to
NetBSD 6 when it was tagged, has been regularly updated from source since,
and worked fine with the GENERIC kernel.
After updating from CVS to RC1 and building a new GENERIC kernel, the system
reboots just after the kernel loading message, but before the version
announcement.

Investigation showed that a GENERIC kernel with options DIAGNOSTIC boots
successfully. The problem never showed up before because all NetBSD 6
beta/beta2 GENERIC kernels had options DIAGNOSTIC on by default; it was only
removed for RC1.

Testing with a serial console while loading a GENERIC kernel shows the standard
segment size messages followed by:

Loading /stand/i386/6.0/modules/ffs/ffs.kmod

The system reboots immediately after this with no other messages, i.e. there
is no 'Loaded initial symtab...'

A kernel compiled with options DIAGNOSTIC commented out and options DEBUG
uncommented also works perfectly.

As part of a sanity check I noticed that the root fs (128M) was FFSv1, so I
trashed it, recreated it as FFSv2, reloaded it and updated the boot block.
Nothing changed: GENERIC still failed, whereas GENERIC/DEBUG etc. worked.

Dmesg from DEBUG kernel below:

NetBSD 6.0_RC1 (GENERICD) #0: Fri Aug 31 12:48:38 BST 2012
        root%darkstar.anduin.org.uk@localhost:/usr/obj/sys/arch/i386/compile/GENERICD
total memory = 1022 MB
avail memory = 992 MB
timecounter: Timecounters tick every 10.000 msec
cprng kernel: WARNING insufficient entropy at creation.
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
RM plc                                                          (               
        )
mainbus0 (root)
cpu0 at mainbus0 apid 0: Intel(R) Pentium(R) 4 CPU 2.60GHz, id 0xf29
cpu1 at mainbus0 apid 1: Intel(R) Pentium(R) 4 CPU 2.60GHz, id 0xf29
ioapic0 at mainbus0 apid 2: pa 0xfec00000, version 20, 24 pins
acpi0 at mainbus0: Intel ACPICA 20110623
acpi0: X/RSDT: OemId <INTEL ,D865GLC ,20050804>, AslId <MSFT,00000097>
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Safe" frequency 3579545 Hz quality 900
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x43 irq 0
pckbc1 at acpi0 (PS2K, PNP0303) (kbd port): io 0x60,0x64 irq 1
pcppi1 at acpi0 (SPKR, PNP0800): io 0x61
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
npx1 at acpi0 (COPR, PNP0C04): io 0xf0-0xff irq 13
npx1: reported by CPUID; using exception 16
FDC0 (PNP0700) at acpi0 not configured
UAR1 (PNP0501) at acpi0 not configured
LPT (PNP0400) at acpi0 not configured
SYSR (PNP0C02) at acpi0 not configured
FWH (INT0800) at acpi0 not configured
OSYS (PNP0C02) at acpi0 not configured
SYSM (PNP0C01) at acpi0 not configured
acpibut0 at acpi0 (SLPB, PNP0C0E-29): ACPI Sleep Button
apm0 at acpi0: Power Management spec V1.2
pckbd0 at pckbc1 (kbd slot)
pckbc1: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
attimer1: attached to pcppi1
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 0x8086 product 0x2570 (rev. 0x02)
agp0 at pchb0: aperture at 0xf8000000, size 0x4000000
ppb0 at pci0 dev 1 function 0: vendor 0x8086 product 0x2571 (rev. 0x02)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
vga1 at pci1 dev 0 function 0: vendor 0x10de product 0x0170 (rev. 0xa3)
wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation), using wskbd0
wsmux1: connecting to wsdisplay0
drm at vga1 not configured
uhci0 at pci0 dev 29 function 0: vendor 0x8086 product 0x24d2 (rev. 0x02)
uhci0: interrupting at ioapic0 pin 16
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 29 function 1: vendor 0x8086 product 0x24d4 (rev. 0x02)
uhci1: interrupting at ioapic0 pin 19
usb1 at uhci1: USB revision 1.0
uhci2 at pci0 dev 29 function 2: vendor 0x8086 product 0x24d7 (rev. 0x02)
uhci2: interrupting at ioapic0 pin 18
usb2 at uhci2: USB revision 1.0
uhci3 at pci0 dev 29 function 3: vendor 0x8086 product 0x24de (rev. 0x02)
uhci3: interrupting at ioapic0 pin 16
usb3 at uhci3: USB revision 1.0
ehci0 at pci0 dev 29 function 7: vendor 0x8086 product 0x24dd (rev. 0x02)
ehci0: interrupting at ioapic0 pin 23
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2 uhci3
usb4 at ehci0: USB revision 2.0
ppb1 at pci0 dev 30 function 0: vendor 0x8086 product 0x244e (rev. 0xc2)
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled
bktr0 at pci2 dev 0 function 0
bktr0: interrupting at ioapic0 pin 21
bktr0: Warning - card vendor 0x0000 (model 0x0000) unknown.
bktr0: Detected a DPL34-1@-@0 at 0x84
bktr0: Intel Smart Video III/VideoLogic Captivator PCI, <no> tuner, dpl3518a dolby.
vendor 0x109e product 0x0878 (miscellaneous multimedia, revision 0x11) at pci2 dev 0 function 1 not configured
adv1 at pci2 dev 1 function 0: AdvanSys ABP-9xxUA SCSI adapter
adv1: interrupting at ioapic0 pin 22
scsibus0 at adv1: 8 targets, 8 luns per target
fwohci0 at pci2 dev 2 function 0: vendor 0x11c1 product 0x5811 (rev. 0x61)
fwohci0: interrupting at ioapic0 pin 17
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channels is 8.
fwohci0: EUI64 30:bd:05:02:00:00:1a:7e
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 2048 bytes.
ieee1394if0 at fwohci0: IEEE1394 bus
fwip0 at ieee1394if0: IP over IEEE1394
fwohci0: Initiate bus reset
fxp0 at pci2 dev 8 function 0: Intel PRO/100 VM Network Controller with 82562ET/EZ PHY (rev. 0x01)
fxp0: interrupting at ioapic0 pin 20
fxp0: Ethernet address 00:0c:f1:6c:45:cb
inphy0 at fxp0 phy 1: i82562ET 10/100 media interface, rev. 0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ichlpcib0 at pci0 dev 31 function 0: vendor 0x8086 product 0x24d0 (rev. 0x02)
timecounter: Timecounter "ichlpcib0" frequency 3579545 Hz quality 1000
ichlpcib0: 24-bit timer
ichlpcib0: TCO (watchdog) timer configured.
gpio0 at ichlpcib0: 64 pins
fwhrng0 at ichlpcib0: Intel Firmware Hub Random Number Generator
piixide0 at pci0 dev 31 function 1: Intel 82801EB IDE Controller (ICH5) (rev. 0x02)
piixide0: bus-master DMA support present
piixide0: primary channel configured to compatibility mode
piixide0: primary channel interrupting at ioapic0 pin 14
atabus0 at piixide0 channel 0
piixide0: secondary channel configured to compatibility mode
piixide0: secondary channel interrupting at ioapic0 pin 15
atabus1 at piixide0 channel 1
piixide1 at pci0 dev 31 function 2: Intel 82801EB Serial ATA Controller (rev. 0x02)
piixide1: bus-master DMA support present
piixide1: primary channel configured to native-PCI mode
piixide1: using ioapic0 pin 18 for native-PCI interrupt
atabus2 at piixide1 channel 0
piixide1: secondary channel configured to native-PCI mode
atabus3 at piixide1 channel 1
ichsmb0 at pci0 dev 31 function 3: vendor 0x8086 product 0x24d3 (rev. 0x02)
ichsmb0: interrupting at ioapic0 pin 17
iic0 at ichsmb0: I2C bus
auich0 at pci0 dev 31 function 5: i82801EB (ICH5) AC-97 Audio
auich0: interrupting at ioapic0 pin 17
auich0: ac97: Analog Devices AD1985 codec; headphone, 20 bit DAC, no 3D stereo
auich0: ac97: ext id 0x3c7<AMAP,LDAC,SDAC,CDAC,SPDIF,DRA,VRA>
isa0 at ichlpcib0
lpt0 at isa0 port 0x378-0x37b irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
acpicpu0 at cpu0: ACPI CPU
acpicpu0: C1: HLT, lat   0 us, pow     0 mW
acpicpu0: T0: I/O, lat   1 us, pow     0 mW, 100 %
acpicpu0: T1: I/O, lat   1 us, pow     0 mW,  88 %
acpicpu0: T2: I/O, lat   1 us, pow     0 mW,  76 %
acpicpu0: T3: I/O, lat   1 us, pow     0 mW,  64 %
acpicpu0: T4: I/O, lat   1 us, pow     0 mW,  52 %
acpicpu0: T5: I/O, lat   1 us, pow     0 mW,  40 %
acpicpu0: T6: I/O, lat   1 us, pow     0 mW,  28 %
acpicpu0: T7: I/O, lat   1 us, pow     0 mW,  16 %
acpicpu1 at cpu1: ACPI CPU
fwohci0: BUS reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
ieee1394if0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
ieee1394if0: bus manager 0
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
scsibus0: waiting 2 seconds for devices to settle...
auich0: measured ac97 link rate at 48000 Hz
audio0 at auich0: full duplex, playback, capture, mmap, independent
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
uhub0 at usb1: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb0: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub2 at usb3: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhub3 at usb2: vendor 0x8086 UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
wd0 at atabus0 drive 0
uhub4 at usb4: vendor 0x8086 EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
wd0: <Maxtor 6L080L0>
wd0: drive supports 16-sector PIO transfers, LBA addressing
wd0: 78167 MB, 158816 cyl, 16 head, 63 sec, 512 bytes/sect x 160086528 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133)
wd1 at atabus0 drive 1
wd1: <FUJITSU MPE3064AT>
wd1: drive supports 16-sector PIO transfers, LBA addressing
wd1: 6187 MB, 13410 cyl, 15 head, 63 sec, 512 bytes/sect x 12672450 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 4 (Ultra/66)
wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA)
wd1(piixide0:0:1): using PIO mode 4, Ultra-DMA mode 4 (Ultra/66) (using DMA)
atapibus0 at atabus1: 2 targets
cd0 at atapibus0 drive 0: <CD-RW BCE1610IM, , VER A.2> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd1 at atapibus0 drive 1: <HL-DT-ST DVDRAM GSA-4167B, 00DA7A5F5015, DL13> cdrom removable
cd1: 32-bit data port
cd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd0(piixide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
cd1(piixide0:1:1): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
umass0 at uhub4 port 5 configuration 1 interface 0
umass0: Generic Mass Storage Device, rev 2.00/1.00, addr 2
umass0: using SCSI over Bulk-Only
scsibus1 at umass0: 2 targets, 1 lun per target
sd0 at scsibus1 target 0 lun 0: <Generic, Storage Device, 0.00> disk removable
sd0: drive offline
sd0: unable to open device, error = 19
uscanner0 at uhub1 port 2
uscanner0: vendor 0x055f product 0x0006, rev 1.00/1.00, addr 2
Kernelized RAIDframe activated
cprng sysctl: WARNING insufficient entropy at creation.
findroot: unable to read block 80035831 of dev wd1 (22)
opendisk: can't open dev sd0 (19)
opendisk: can't open dev sd0 (19)
boot device: wd0
root on wd0a dumps on wd0b
mountroot: trying smbfs...
mountroot: trying ntfs...
mountroot: trying nfs...
mountroot: trying msdos...
mountroot: trying lfs...
mountroot: trying ext2fs...
mountroot: trying ffs...
root file system type: ffs
init: copying out path `/sbin/init' 11
uhidev0 at uhub2 port 2 configuration 1 interface 0
uhidev0: Logitech USB Receiver, rev 2.00/22.00, addr 2, iclass 3/1
ums0 at uhidev0: 16 buttons, W and Z dirs
wsmouse0 at ums0 mux 0
uhidev1 at uhub2 port 2 configuration 1 interface 1
uhidev1: Logitech USB Receiver, rev 2.00/22.00, addr 2, iclass 3/0
uhidev1: 17 report ids
uhid0 at uhidev1 reportid 3: input=4, output=0, feature=0
uhid1 at uhidev1 reportid 16: input=6, output=6, feature=0
uhid2 at uhidev1 reportid 17: input=19, output=19, feature=0
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)

Two older and slightly slower, but otherwise similar, machines that were
upgraded to RC1 in the same way work fine and are in production as
web servers.

>How-To-Repeat:
Compile a standard RC1 GENERIC kernel on this particular system. Try to boot
it and watch the machine reboot.
>Fix:
Use a kernel with options DIAGNOSTIC or options DEBUG defined.
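
As a minimal sketch of that workaround, a local kernel configuration could
include GENERIC and switch one of the two options back on. The file name
MYKERNEL and its contents are assumptions for illustration only, not the
configuration actually used on this machine:

```
# sys/arch/i386/conf/MYKERNEL (hypothetical file name, contents assumed)
include "arch/i386/conf/GENERIC"

options 	DIAGNOSTIC	# inexpensive kernel consistency checks
#options 	DEBUG		# expensive debugging checks/support; this
				# alternative also booted on this machine
```

Enabling either option was enough to avoid the reboot here, so only one of
the two lines needs to be active.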

I will go back to an early snapshot of 6 and try compiling a GENERIC kernel
with DIAGNOSTIC commented out to see whether that exhibits the same symptoms.


