Current-Users archive

Re: 8.99.9 hangs



Thomas Klausner <tk%giga.or.at@localhost> writes:

> After updating to 8.99.9 I've experienced strange hangs. The keyboard
> and mouse don't work any longer, and it doesn't react to the power
> button, so I have to reset.

Same here.  It was really bad with a kernel from about a week ago.  After
updating on the 19th, which picked up the changes from ozaki-r@ related
to multiprocessor safety, it got much better.  It still happens, though,
on the system I'm attaching dmesg.boot for.

The hang is hard enough that hitting the NMI switch doesn't do anything,
which is interesting.

And while on that topic: the current handling of NMI on the amd64
multiprocessor platform seems not quite right: we get output from each
processor saying that it's responding to the interrupt, and continuing
afterwards doesn't work, either.  I've played with it a bit, and have
something that at least lets just one CPU actually handle the NMI, and
where continuing works right.  A new NMI after resuming doesn't have any
effect, though, so I guess the non-maskable interrupt is.  :)
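
To illustrate the "just one CPU handles the NMI" idea in isolation, here
is a minimal userland sketch in C11.  It is not the kernel code and not
what the attached diff does (there the decision comes from the return
value of db_suspend_others()); the names nmi_try_enter(), nmi_leave(),
debugger_owner and NO_OWNER are made up for this example.  The first CPU
to swap its id into the owner slot wins, everyone else is told to stay
out, and the slot is cleared on "continue" so a later NMI can be taken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical: id of the CPU that currently owns the debugger. */
static atomic_int debugger_owner = NO_OWNER;

/*
 * Called by every CPU that sees the NMI.  Returns true for exactly one
 * caller (the first one to swap its id in); all others get false and
 * should park instead of announcing the interrupt themselves.
 */
static bool
nmi_try_enter(int cpu)
{
	int expected = NO_OWNER;

	return atomic_compare_exchange_strong(&debugger_owner, &expected, cpu);
}

/* Called by the owning CPU on "continue", so a later NMI can be taken. */
static void
nmi_leave(int cpu)
{
	assert(atomic_load(&debugger_owner) == cpu);
	atomic_store(&debugger_owner, NO_OWNER);
}
```

With four CPUs all calling nmi_try_enter() at once, only the winner would
print the "going to debugger" message and enter the debugger; it calls
nmi_leave() when the session ends.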

If someone who knows how this stuff actually works would like to look at
the code, and what I've done with it, I'm attaching my current diff.
I won't be surprised if I'm doing this all wrong...

-tih
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 8.99.9 (BARSOOM) #29: Sun Dec 24 11:05:05 CET 2017
	root%barsoom.hamartun.priv.no@localhost:/usr/obj/sys/arch/amd64/compile.amd64/BARSOOM
total memory = 8191 MB
avail memory = 7931 MB
timecounter: Timecounters tick every 10.000 msec
Kernelized RAIDframe activated
running cgd selftest aes-xts-256 aes-xts-512 done
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Dell Computer Corporation PowerEdge 2850
mainbus0 (root)
ACPI: RSDP 0x00000000000FD5B0 000014 (v00 DELL  )
ACPI: RSDT 0x00000000000FD5C4 000038 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: FACP 0x00000000000FD620 000074 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: DSDT 0x00000000BFFC0000 003CCD (v01 DELL   PE BKC   00000001 MSFT 0100000E)
ACPI: FACS 0x00000000BFFCFC00 000040
ACPI: APIC 0x00000000000FD694 0000E0 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: SPCR 0x00000000000FD774 000050 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: HPET 0x00000000000FD7C4 000038 (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: MCFG 0x00000000000FD7FC 00003C (v01 DELL   PE BKC   00000001 MSFT 0100000A)
ACPI: 1 ACPI AML tables successfully acquired and loaded
ioapic0 at mainbus0 apid 8: pa 0xfec00000, version 0x20, 24 pins
ioapic1 at mainbus0 apid 9: pa 0xfec80000, version 0x20, 24 pins
ioapic2 at mainbus0 apid 10: pa 0xfec83000, version 0x20, 24 pins
ioapic3 at mainbus0 apid 11: pa 0xfec84000, version 0x20, 24 pins
cpu0 at mainbus0 apid 0
cpu0: Intel(R) Xeon(TM) CPU 3.00GHz, id 0xf43
cpu0: package 0, core 0, smt 0
cpu1 at mainbus0 apid 6
cpu1: Intel(R) Xeon(TM) CPU 3.00GHz, id 0xf43
cpu1: package 3, core 0, smt 0
cpu2 at mainbus0 apid 1
cpu2: Intel(R) Xeon(TM) CPU 3.00GHz, id 0xf43
cpu2: package 0, core 0, smt 1
cpu3 at mainbus0 apid 7
cpu3: Intel(R) Xeon(TM) CPU 3.00GHz, id 0xf43
cpu3: package 3, core 0, smt 1
acpi0 at mainbus0: Intel ACPICA 20171110
acpi0: X/RSDT: OemId <DELL  ,PE BKC  ,00000001>, AslId <MSFT,0100000a>
acpi0: MCFG: segment 0, bus 0-255, address 0x00000000e0000000
acpi0: SCI interrupting at int 9
timecounter: Timecounter "ACPI-Fast" frequency 3579545 Hz quality 1000
hpet0 at acpi0: high precision event timer (mem 0xfed00000-0xfed00400)
timecounter: Timecounter "hpet0" frequency 14318180 Hz quality 2000
pcppi1 at acpi0 (SPK, PNP0800): io 0x61
spkr0 at pcppi1: PC Speaker
wsbell at spkr0 not configured
midi0 at pcppi1: PC speaker
sysbeep0 at pcppi1
attimer1 at acpi0 (TMR, PNP0100): io 0x40-0x5f irq 0
FDC (PNP0700) at acpi0 not configured
COMA (PNP0501) at acpi0 not configured
MBIO (PNP0C01) at acpi0 not configured
NIPM (IPI0001) at acpi0 not configured
acpivga0 at acpi0 (EVGA): ACPI Display Adapter
PEHB (PNP0C02) at acpi0 not configured
ACPI: Enabled 1 GPEs in block 00 to 1F
attimer1: attached to pcppi1
ipmi0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 8086 product 3590 (rev. 0x09)
ppb0 at pci0 dev 2 function 0: vendor 8086 product 3595 (rev. 0x09)
ppb0: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x8 @ 2.5GT/s
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
ppb1 at pci1 dev 0 function 0: vendor 8086 product 0330 (rev. 0x06)
ppb1: PCI Express capability version 1 <PCI-E to PCI/PCI-X Bridge>
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
amr0 at pci2 dev 14 function 0: AMI RAID <PERC 4e/Di>
amr0: interrupting at ioapic1 pin 14
amr0: firmware 5B2D, BIOS H435, 256MB RAM
ld0 at amr0 unit 0: RAID 1, optimal
ld0: 69880 MB, 8908 cyl, 255 head, 63 sec, 512 bytes/sect x 143114240 sectors
ld1 at amr0 unit 1: RAID 1, optimal
ld1: 69880 MB, 8908 cyl, 255 head, 63 sec, 512 bytes/sect x 143114240 sectors
ld2 at amr0 unit 2: RAID 1, optimal
ld2: 136 GB, 17834 cyl, 255 head, 63 sec, 512 bytes/sect x 286515200 sectors
ppb2 at pci1 dev 0 function 2: vendor 8086 product 0332 (rev. 0x06)
ppb2: PCI Express capability version 1 <PCI-E to PCI/PCI-X Bridge>
pci3 at ppb2 bus 3
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
ppb3 at pci0 dev 4 function 0: vendor 8086 product 3597 (rev. 0x09)
ppb3: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x8 @ 2.5GT/s
pci4 at ppb3 bus 4
pci4: i/o space, memory space enabled, rd/line, wr/inv ok
ppb4 at pci0 dev 5 function 0: vendor 8086 product 3598 (rev. 0x09)
ppb4: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x4 @ 2.5GT/s
pci5 at ppb4 bus 5
pci5: i/o space, memory space enabled, rd/line, wr/inv ok
ppb5 at pci5 dev 0 function 0: vendor 8086 product 0329 (rev. 0x09)
ppb5: PCI Express capability version 1 <PCI-E to PCI/PCI-X Bridge>
pci6 at ppb5 bus 6
pci6: i/o space, memory space enabled, rd/line, wr/inv ok
wm0 at pci6 dev 7 function 0: Intel i82541GI 1000BASE-T Ethernet (rev. 0x05)
wm0: interrupting at ioapic2 pin 0
wm0: 32-bit 66MHz PCI bus
wm0: 512 words (16 address bits) SPI EEPROM
wm0: Ethernet address 00:13:72:f7:00:06
wm0: 0x20442<LOCK_EECD,SPI,IOH_VALID,ASF_FIRM>
igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ppb6 at pci5 dev 0 function 2: vendor 8086 product 032a (rev. 0x09)
ppb6: PCI Express capability version 1 <PCI-E to PCI/PCI-X Bridge>
pci7 at ppb6 bus 7
pci7: i/o space, memory space enabled, rd/line, wr/inv ok
wm1 at pci7 dev 8 function 0: Intel i82541GI 1000BASE-T Ethernet (rev. 0x05)
wm1: interrupting at ioapic2 pin 1
wm1: 32-bit 66MHz PCI bus
wm1: 256 words (16 address bits) SPI EEPROM
wm1: Ethernet address 00:13:72:f7:00:07
wm1: 0x20442<LOCK_EECD,SPI,IOH_VALID,ASF_FIRM>
igphy1 at wm1 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
igphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
ppb7 at pci0 dev 6 function 0: vendor 8086 product 3599 (rev. 0x09)
ppb7: PCI Express capability version 1 <Root Port of PCI-E Root Complex> x8 @ 2.5GT/s
pci8 at ppb7 bus 8
pci8: i/o space, memory space enabled, rd/line, wr/inv ok
ppb8 at pci8 dev 0 function 0: vendor 8086 product 0329 (rev. 0x09)
ppb8: PCI Express capability version 1 <PCI-E to PCI/PCI-X Bridge>
pci9 at ppb8 bus 9
pci9: i/o space, memory space enabled, rd/line, wr/inv ok
ppb9 at pci8 dev 0 function 2: vendor 8086 product 032a (rev. 0x09)
ppb9: PCI Express capability version 1 <PCI-E to PCI/PCI-X Bridge>
pci10 at ppb9 bus 10
pci10: i/o space, memory space enabled, rd/line, wr/inv ok
fxp0 at pci10 dev 2 function 0: i82559 Ethernet (rev. 0x08)
fxp0: interrupting at ioapic3 pin 0
fxp0: Ethernet address 00:90:27:44:4e:d0
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
uhci0 at pci0 dev 29 function 0: vendor 8086 product 24d2 (rev. 0x02)
uhci0: interrupting at ioapic0 pin 16
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 29 function 1: vendor 8086 product 24d4 (rev. 0x02)
uhci1: interrupting at ioapic0 pin 19
usb1 at uhci1: USB revision 1.0
uhci2 at pci0 dev 29 function 2: vendor 8086 product 24d7 (rev. 0x02)
uhci2: interrupting at ioapic0 pin 18
usb2 at uhci2: USB revision 1.0
ehci0 at pci0 dev 29 function 7: vendor 8086 product 24dd (rev. 0x02)
ehci0: interrupting at ioapic0 pin 23
ehci0: EHCI version 1.0
ehci0: 3 companion controllers, 2 ports each: uhci0 uhci1 uhci2
usb3 at ehci0: USB revision 2.0
ppb10 at pci0 dev 30 function 0: vendor 8086 product 244e (rev. 0xc2)
pci11 at ppb10 bus 11
pci11: i/o space, memory space enabled
vendor 1028 product 0011 (undefined, subclass 0x00) at pci11 dev 5 function 0 not configured
vendor 1028 product 0012 (undefined, subclass 0x00) at pci11 dev 5 function 1 not configured
vendor 1028 product 0014 (undefined, subclass 0x00) at pci11 dev 5 function 2 not configured
cmdide0 at pci11 dev 6 function 0: Silicon Image 0680 (rev. 0x02)
cmdide0: bus-master DMA support present
cmdide0: primary channel wired to native-PCI mode
cmdide0: using ioapic0 pin 23 for native-PCI interrupt
atabus0 at cmdide0 channel 0
cmdide0: secondary channel wired to native-PCI mode
atabus1 at cmdide0 channel 1
vga0 at pci11 dev 13 function 0: vendor 1002 product 5159 (rev. 0x00)
wsdisplay0 at vga0 kbdmux 1
wsmux1: connecting to wsdisplay0
drm at vga0 not configured
ichlpcib0 at pci0 dev 31 function 0: vendor 8086 product 24d0 (rev. 0x02)
timecounter: Timecounter "ichlpcib0" frequency 3579545 Hz quality 1000
ichlpcib0: 24-bit timer
tco0 at ichlpcib0: TCO (watchdog) timer configured.
tco0: Min/Max interval 2/37 seconds
piixide0 at pci0 dev 31 function 1: Intel 82801EB IDE Controller (ICH5) (rev. 0x02)
piixide0: bus-master DMA support present
piixide0: primary channel configured to compatibility mode
piixide0: primary channel interrupting at ioapic0 pin 14
atabus2 at piixide0 channel 0
piixide0: secondary channel configured to compatibility mode
piixide0: secondary channel interrupting at ioapic0 pin 15
atabus3 at piixide0 channel 1
isa0 at ichlpcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
pckbc0 at isa0 port 0x60-0x64
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
acpicpu0 at cpu0: ACPI CPU
acpicpu0: C1: HLT, lat   0 us, pow     0 mW
acpicpu1 at cpu1: ACPI CPU
acpicpu2 at cpu2: ACPI CPU
acpicpu3 at cpu3: ACPI CPU
timecounter: Timecounter "clockinterrupt" frequency 100 Hz quality 0
IPsec: Initialized Security Association Processing.
uhub0 at usb1: vendor 8086 (0x8086) UHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub1 at usb2: vendor 8086 (0x8086) UHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub2 at usb0: vendor 8086 (0x8086) UHCI root hub (0000), class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhub3 at usb3: vendor 8086 (0x8086) EHCI root hub (0000), class 9/0, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
atapibus0 at atabus0: 2 targets
sd0 at atapibus0 drive 0: <VIRTUALFLOPPY DRIVE               Floppy, , > disk removable
sd0(cmdide0:0:0): preposterous sector size: 0x0.  Defaulting to 512 bytes.
sd0: fabricating a geometry
sd0: 512, 0 cyl, 64 head, 32 sec, 512 bytes/sect x 1 sectors
sd0(cmdide0:0:0): preposterous sector size: 0x0.  Defaulting to 512 bytes.
sd0: fabricating a geometry
sd0: 32-bit data port
sd0: drive supports PIO mode 3cd0 at atapibus0 drive 1: <VIRTUALCDROM DRIVE, , > cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 3sd0(cmdide0:0:0): using PIO mode 3
cd0(cmdide0:0:1): using PIO mode 3
atapibus1 at atabus2: 2 targets
cd1 at atapibus1 drive 0: <HL-DT-ST  GCR-8240N, , 1.10> cdrom removable
cd1: 32-bit data port
cd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2 (Ultra/33)
cd1(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA)
ehci0: handing over full speed device on port 1 to uhci0
uhub4 at uhub3 port 3: vendor 413c (0x413c) product a001 (0xa001), class 9/0, rev 2.00/0.00, addr 2
uhub4: multiple transaction translators
uhub4: 2 ports with 2 removable, self powered
uhidev0 at uhub2 port 1 configuration 1 interface 0
uhidev0: Dell (0x413c) DRAC4 (0x2500), rev 1.10/0.00, addr 2, iclass 3/1
ukbd0 at uhidev0: 8 Variable keys, 6 Array codes
ehci0: handing over full speed device on port 5 to uhci2
wskbd0 at ukbd0 mux 1
wskbd0: connecting to wsdisplay0
uhidev1 at uhub2 port 1 configuration 1 interface 1
uhidev1: Dell (0x413c) DRAC4 (0x2500), rev 1.10/0.00, addr 2, iclass 3/1
ums0 at uhidev1: 3 buttons and Z dir
wsmouse0 at ums0 mux 0
boot device: ld0
root on ld0a dumps on ld0b
root file system type: ffs
kern.module.path=/stand/amd64/8.99.9/modules
/var: replaying log to disk
/usr: replaying log to disk
/var/pgsql/data: replaying log to disk
uplcom0 at uhub1 port 1
uplcom0: ATEN International (0x557) Serial adapter (0x2008), rev 1.10/0.01, addr 2
/usr/local: replaying log to disk
ucom0 at uplcom0
ipmi0: version 1.5 interface KCS iobase 0xca8/0x8 spacing 4
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)
/u: replaying log to disk
RCS file: /cvsroot/src/sys/arch/amd64/amd64/db_interface.c,v
retrieving revision 1.27
diff -u -u -r1.27 db_interface.c
--- sys/arch/amd64/amd64/db_interface.c	15 Aug 2017 09:08:39 -0000	1.27
+++ sys/arch/amd64/amd64/db_interface.c	14 Dec 2017 19:10:39 -0000
@@ -189,12 +189,12 @@
 kdb_trap(int type, int code, db_regs_t *regs)
 {
 	int s;
+#ifdef MULTIPROCESSOR
 	db_regs_t dbreg;
+#endif
 
 	switch (type) {
 	case T_NMI:	/* NMI */
-		printf("NMI ... going to debugger\n");
-		/*FALLTHROUGH*/
 	case T_BPTFLT:	/* breakpoint */
 	case T_TRCTRAP:	/* single_step */
 	case -1:	/* keyboard interrupt */
@@ -214,34 +214,35 @@
 	if (!db_suspend_others()) {
 		ddb_suspend(regs);
 	} else {
-	curcpu()->ci_ddb_regs = &dbreg;
-	ddb_regp = &dbreg;
+		curcpu()->ci_ddb_regs = &dbreg;
+		ddb_regp = &dbreg;
 #endif
+		if (type == T_NMI)
+			printf("NMI received; going to debugger\n");
+
+		ddb_regs = *regs;
+		ddb_regs.tf_cs &= 0xffff;
+		ddb_regs.tf_ds &= 0xffff;
+		ddb_regs.tf_es &= 0xffff;
+		ddb_regs.tf_fs &= 0xffff;
+		ddb_regs.tf_gs &= 0xffff;
+		ddb_regs.tf_ss &= 0xffff;
+
+		s = splhigh();
+		db_active++;
+		cnpollc(true);
+		db_trap(type, code);
+		cnpollc(false);
+		db_active--;
+		splx(s);
 
-	ddb_regs = *regs;
+		*regs = ddb_regs;
 
-	ddb_regs.tf_cs &= 0xffff;
-	ddb_regs.tf_ds &= 0xffff;
-	ddb_regs.tf_es &= 0xffff;
-	ddb_regs.tf_fs &= 0xffff;
-	ddb_regs.tf_gs &= 0xffff;
-	ddb_regs.tf_ss &= 0xffff;
-
-	s = splhigh();
-	db_active++;
-	cnpollc(true);
-	db_trap(type, code);
-	cnpollc(false);
-	db_active--;
-	splx(s);
 #ifdef MULTIPROCESSOR
-	db_resume_others();
+		ddb_regp = &dbreg;
+		db_resume_others();
 	}
-#endif
-	ddb_regp = &dbreg;
-
-	*regs = ddb_regs;
-
+#endif
 	return (1);
 }
 
-- 
Most people who graduate with CS degrees don't understand the significance
of Lisp.  Lisp is the most important idea in computer science.  --Alan Kay

