Subject: strange NMI, apparently from pciide....
To: NetBSD/i386 Discussion List <port-i386@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: port-i386
Date: 07/04/2001 15:03:48
I've got this P-Pro ATX motherboard (designated P6FX1-A, apparently made
by Elitegroup Computer Systems of Taiwan).  It has an Intel 440FX PCIset.

It seems to have trouble probing one of the el-cheapo IDE drives I've
got attached:  at about the time it first accesses wd1 during the boot
it gets a DMA error, and an unspecified NMI drops it into the debugger.
If I "continue" from DDB all is well.

It's also got ECC memory in it, and the SIMMs were scrounged from the
junk bin at a local surplus shop, so even though they pass the
apparently extensive BIOS memory checks I'm not 100% certain of their
reliability.  There haven't been any subsequent NMIs though, and I've
done package builds using wd1 (and it's run /etc/daily now).

Is there any way to get more details on this NMI so that I can know for
sure where it might have come from?  If this NMI is from the IDE
controller shouldn't it be handled cleanly and not cause a drop to DDB?
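
One possible starting point, sketched under the assumption that this
board follows the usual PC/AT-compatible convention:  the NMI source
bits are latched in System Control Port B (I/O port 0x61), where bit 7
reports a memory parity/ECC error or PCI SERR# and bit 6 reports an ISA
I/O channel check.  The boot messages here do not print that register,
so the value 0x80 used below is purely hypothetical, and the function
name is made up for illustration:

```python
# Assumed PC/AT-compatible layout of System Control Port B (port 0x61).
# Only the two NMI source bits are decoded; the rest are ignored.
NMI_PARITY_OR_SERR = 0x80  # bit 7: memory parity/ECC error or PCI SERR#
NMI_IOCHK = 0x40           # bit 6: ISA I/O channel check (IOCHK#)


def decode_nmi_status(port61):
    """Return the NMI sources latched in a port 0x61 value."""
    sources = []
    if port61 & NMI_PARITY_OR_SERR:
        sources.append("memory parity/ECC or PCI SERR#")
    if port61 & NMI_IOCHK:
        sources.append("ISA I/O channel check (IOCHK#)")
    return sources or ["no NMI source latched"]


# Hypothetical value, not taken from the boot messages.
print(decode_nmi_status(0x80))
```

If bit 7 were the one set, the memory or a PCI device asserting SERR#
would be the more likely suspects than the ISA side.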

Here are the boot messages, including the NMI and related wd1 errors:

NetBSD 1.5W (GENERIC) #14: Tue Jul  3 12:59:36 EDT 2001
    woods@proven:/backups/NetBSD-obj.i386/arch/i386/compile/GENERIC
cpu0: Intel Pentium Pro (686-class), 199.34 MHz
cpu0: I-cache 8 KB 32b/line 4-way, D-cache 8 KB 32b/line 2-way
cpu0: L2 cache 256 KB 32b/line 4-way
cpu0: features f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR>
cpu0: features f9ff<PGE,MCA,CMOV>
total memory = 81532 KB
avail memory = 70156 KB
using 1044 buffers containing 4176 KB of memory
BIOS32 rev. 0 found at 0xfb3a0
mainbus0 (root)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled
pchb0 at pci0 dev 0 function 0
pchb0: Intel 82441FX PCI and Memory Controller (PMC) (rev. 0x02)
pcib0 at pci0 dev 7 function 0
pcib0: Intel 82371SB PCI-to-ISA Bridge (PIIX3) (rev. 0x01)
pciide0 at pci0 dev 7 function 1: Intel 82371SB IDE Interface (PIIX3) (rev. 0x00)
pciide0: bus-master DMA support present
pciide0: primary channel wired to compatibility mode
wd0 at pciide0 channel 0 drive 0: <QUANTUM SIROCCO1700A>
wd0: drive supports 8-sector PIO transfers, LBA addressing
wd0: 1628 MB, 3309 cyl, 16 head, 63 sec, 512 bytes/sect x 3335472 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2
pciide0: primary channel interrupting at irq 14
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
pciide0: secondary channel wired to compatibility mode
atapibus0 at pciide0 channel 1: 2 targets
cd0 at atapibus0 drive 1: <NEC                 CD-ROM DRIVE:284, , 3.03> type 5 cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 3, DMA mode 1
wd1 at pciide0 channel 1 drive 0: <SAMSUNG WA32163A>
wd1: drive supports 16-sector PIO transfers, LBA addressing
wd1: 2062 MB, 4190 cyl, 16 head, 63 sec, 512 bytes/sect x 4223520 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2
pciide0: secondary channel interrupting at irq 15
wd1(pciide0:1:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
cd0(pciide0:1:1): using PIO mode 0, DMA mode 1 (using DMA data transfers)
vga1 at pci0 dev 9 function 0: S3 Trio32/64 (rev. 0x00)
wsdisplay0 at vga1
de0 at pci0 dev 11 function 0
de0: interrupting at irq 10
de0: SMC 21041 [10Mb/s] pass 1.1
de0: address 00:00:c0:83:c3:e9
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0
lpt0 at isa0 port 0x378-0x37b irq 7
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
isapnp0: no ISA Plug 'n Play devices found
apm0 at mainbus0: Power Management spec V1.2
APM power mgmt engage (device 1): power management disabled (0x10f)
apm0: A/C state: on
apm0: battery charge state: no battery
biomask fb65 netmask ff65 ttymask ffe7
de0: enabling 10baseT port
NMI ... going to debugger
pciide0:1:0: lost interrupt
	type: ata tc_bcount: 512 tc_skip: 0
pciide0:1:0: bus-master DMA error: missing interrupt, status=0x60
wd1: transfer error, downgrading to PIO mode 4
wd1(pciide0:1:0): using PIO mode 4
cd0(pciide0:1:1): using PIO mode 0, DMA mode 1 (using DMA data transfers)
wd1d: DMA error reading fsbn 0 (wd1 bn 0; cn 0 tn 0 sn 0), retrying
wd1: soft error (corrected)
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
de0: enabling 10baseT port
wsdisplay0: screen 0 added (80x25, vt100 emulation)
wsdisplay0: screen 1 added (80x50, vt100 emulation)
wsdisplay0: screen 2 added (80x50, vt100 emulation)
wsdisplay0: screen 3 added (80x50, vt100 emulation)
wsdisplay0: screen 4 added (80x50, vt100 emulation)
wsdisplay0: screen 5 added (80x50, vt100 emulation)
wsdisplay0: screen 6 added (80x50, vt100 emulation)
wsdisplay0: screen 7 added (80x50, vt100 emulation)
wskbd0: connecting to wsdisplay0
wsmux1: connecting to wsdisplay0
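
For reference, the "status=0x60" in the bus-master DMA error above can
be decoded by hand.  The sketch below assumes the standard bus-master
IDE status register layout (bit 0 active, bit 1 error, bit 2 interrupt,
bits 5 and 6 drive 0/1 DMA capable, bit 7 simplex only); the table and
function names are made up for illustration:

```python
# Assumed standard bus-master IDE status register layout.
BMIDE_STATUS_BITS = [
    (0x01, "bus master active"),
    (0x02, "DMA error"),
    (0x04, "interrupt"),
    (0x20, "drive 0 DMA capable"),
    (0x40, "drive 1 DMA capable"),
    (0x80, "simplex only"),
]


def decode_bmide_status(status):
    """Return the names of the bits set in a bus-master IDE status byte."""
    return [name for mask, name in BMIDE_STATUS_BITS if status & mask]


# 0x60 is the value reported for pciide0:1:0 in the messages above.
print(decode_bmide_status(0x60))
# prints ['drive 0 DMA capable', 'drive 1 DMA capable']
```

Under that assumption neither the error bit nor the interrupt bit was
set, which matches the "missing interrupt" wording:  the controller
itself did not flag a DMA error, it simply never signalled completion.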


(BTW, I think the BIOS32 revision number shown above is wrong)

(Oh, and if anyone locally happens to have an unused P-Pro-200 chip with
a larger L2 cache lying around....  :-)

-- 
							Greg A. Woods

+1 416 218-0098      VE3TCP      <gwoods@acm.org>     <woods@robohack.ca>
Planix, Inc. <woods@planix.com>;   Secrets of the Weird <woods@weird.com>