Subject: Re: kern/13827: fatal page fault in supervisor mode and hang in
To: None <current-users@netbsd.org>
From: Andreas Wrede <andreas@planix.com>
List: current-users
Date: 09/02/2001 10:45:27

Could someone please take a look at the above PR? I tried to send a
followup (see below) to gnats-bugs, but it does not show up in the
database.

Briefly, the problem is that a GENERIC 1.5.1 kernel on a stock Compaq
ProLiant machine will panic in the siop interrupt routine under high
I/O load, such as during the /etc/daily run.

Thanks.
-- 
    - aew

---------- Forwarded message ----------
Date: Fri, 31 Aug 2001 08:27:20 -0400 (EDT)
From: Andreas Wrede <andreas@planix.com>
To: gnats-bugs@netbsd.org
Subject: Re: kern/13827

In order to narrow down the source of the problem, I replaced the
customer kernel with a GENERIC kernel:

NetBSD 1.5.1 (GENERIC) #56: Mon Jul  2 15:54:23 CEST 2001
    he@nsa.uninett.no:/usr/src/sys/arch/i386/compile/GENERIC

The panic recurred while running /etc/daily. This time the machine
hung after printing 'kernel: page fault trap, code=0'; the remainder
of the trap line (i.e. eip xxxxxxxx cs x eflags xxxx...) is missing.
Also missing is the line 'fatal page fault in supervisor mode'.

kernel: page fault trap, code=0
Stopped at      lockmgr+0x78:   movl          0x3c(%ecx),%ecx
db>
db> trace
lockmgr(c0545024,10012,c05450a8) at lockmgr+0x78
uvm_map(c0545020,d25a1a80,1000,c0544fc0,ffffffff) at uvm_map+0x79
uvm_km_valloc(c0545020,1000,c0531da0,c0b12420,c0b09b00) at uvm_km_valloc+0x37
_bus_dmamem_map(c0531da0,d25a1af4,1,1000,c0b1242c) at _bus_dmamem_map+0x2e
siop_morecbd(c0a72c00) at siop_morecbd+0xf9
siop_scsicmd(c0a9065c) at siop_scsicmd+0x52
scsipi_execute_xs(c0a9065c,0,1009,c0a70200,d25a1bac) at scsipi_execute_xs+0x36
scsi_scsipi_cmd(c0a70200,d25a1c00,a,c524a000,2000) at scsi_scsipi_cmd+0xd3
scsipi_command(c0a70200,d25a1c00,a,c524a000,2000) at scsipi_command+0x59
sdstart(c0a8a800,d000fdc4,c501ecc0,d25a1c40,c0308b9f) at sdstart+0x1ea
scsipi_free_xs(c0a9065c,1) at scsipi_free_xs+0x8b
scsipi_done(c0a9065c,c0a72c00,ff00,1,1009) at scsipi_done+0x123
siop_scsicmd_end(c0a8ba00,c0a7ee80,d25924bc,d25924bc,c0a72c00) at siop_scsicmd_end+0x35d
siop_intr(c0a72c00) at siop_intr+0x1370
Xintr10() at Xintr10+0x7c
--- interrupt ---
idle(d25924bc) at idle+0x1c
bpendtsleep(c4fe26e8,11,c041de7a,0,0) at bpendtsleep
biowait(c4fe26e8,40,d25b8b0c,c0a9d000,c04ebc20) at biowait+0x31
bread(d259a8f0,40,2000,ffffffff,d25a1df8) at bread+0x95
ffs_update(d25a1e2c,d25b9288,d25a1ef4,d25aa1a4,0) at ffs_update+0x1bc
ffs_full_fsync(d25a1ef4,d25b9288,d25a1ef4,d25aa1a4,d28b8330) at ffs_full_fsync+0x224
ffs_fsync(d25a1ef4) at ffs_fsync+0x3a
ffs_sync(c0a9f200,3,c0a70f80,d25924bc) at ffs_sync+0xf3
sync_fsync(d25a1f68) at sync_fsync+0x53
db> sync
syncing disks... tl1: receiver ring buffer overrun
tl0: receiver ring buffer overrun
Stopped at      cpu_Debugger+0x4:       leave
db> sync

dumping to dev 4,1 offset 500487
dump
[here the machine locks up hard].

For completeness, here is the boot log:

>> NetBSD/i386 BIOS Boot, Revision 2.7
>> (he@nsa.uninett.no, Mon Jun 18 01:32:10 CEST 2001)
>> Memory: 639/261120 k
Use hd1a:netbsd to boot sd0 when wd0 is also installed
Press return to boot now, any other key for boot menu
booting wd0a:netbsd - starting in 0
4057080+386684+316588 [65+245488+200220]=0x4f823c
[ preserving 446228 bytes of netbsd ELF symbol table ]
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 1.5.1 (GENERIC) #56: Mon Jul  2 15:54:23 CEST 2001
    he@nsa.uninett.no:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel Pentium III (Katmai) (686-class), 498.90 MHz
total memory = 255 MB
avail memory = 231 MB
using 3297 buffers containing 13188 KB of memory
BIOS32 rev. 0 found at 0xf0000
mainbus0 (root)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled
pchb0 at pci0 dev 0 function 0
pchb0: Intel 82443BX Host Bridge/Controller (AGP disabled) (rev. 0x02)
vga1 at pci0 dev 11 function 0: Cirrus Logic CL-GD5446 (rev. 0x45)
wsdisplay0 at vga1
ppb0 at pci0 dev 13 function 0: Digital Equipment DECchip 21150 PCI-PCI Bridge (rev. 0x04)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
tl0 at pci1 dev 7 function 0
tl0: Compaq ProLiant Integrated Netelligent 10/100 TX
tl0: Ethernet address 00:50:8b:2c:e5:94
tl0: interrupting at irq 9
nsphy0 at tl0 phy 1: DP83840 10/100 media interface, rev. 1
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlphy0 at tl0 phy 31: ThunderLAN 10baseT media interface, rev. 5
tlphy0: 10base2
siop0 at pci1 dev 9 function 0: Symbios Logic 53c875 (ultra-wide scsi)
siop0: using on-board RAM
siop0: interrupting at irq 10
scsibus0 at siop0: 16 targets, 8 luns per target
siop1 at pci1 dev 9 function 1: Symbios Logic 53c875 (ultra-wide scsi)
siop1: using on-board RAM
siop1: interrupting at irq 11
scsibus1 at siop1: 16 targets, 8 luns per target
tl1 at pci1 dev 11 function 0
tl1: Compaq Netelligent 10/100 TX
tl1: Ethernet address 00:80:5f:31:2e:81
tl1: interrupting at irq 5
nsphy1 at tl1 phy 1: DP83840 10/100 media interface, rev. 1
nsphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlphy1 at tl1 phy 31: ThunderLAN 10baseT media interface, rev. 5
tlphy1: no media present
Compaq product 0xa0f0 (miscellaneous system) at pci0 dev 14 function 0 not configured
pcib0 at pci0 dev 20 function 0
pcib0: Intel 82371AB PCI-to-ISA Bridge (PIIX4) (rev. 0x02)
pciide0 at pci0 dev 20 function 1: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
pciide0: bus-master DMA support present
pciide0: primary channel wired to compatibility mode
atapibus0 at pciide0 channel 0
cd0 at atapibus0 drive 0: <CD-ROM CDU701-Q, , 1.0r> type 5 cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2
pciide0: primary channel interrupting at irq 14
cd0(pciide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
pciide0: secondary channel wired to compatibility mode
pciide0: secondary channel ignored (disabled)
uhci0 at pci0 dev 20 function 2: Intel 82371AB USB Host Controller (PIIX4) (rev. 0x01)
uhci0: can't map i/o space
Intel 82371AB Power Management Controller (PIIX4) (miscellaneous bridge, revision 0x02) at pci0 dev 20 function 3 not configured
isa0 at pcib0
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
isapnp0: no ISA Plug 'n Play devices found
biomask fdc5 netmask ffe5 ttymask ffe7
scsibus0: waiting 2 seconds for devices to settle...
siop0: target 0 using tagged queuing
sd0 at scsibus0 target 0 lun 0: <COMPAQ, BD009122C6, B016> SCSI2 0/direct fixed
siop0: target 0 using 16bit transfers
siop0: target 0 now synchronous at 20.0Mhz, offset 16
sd0: 8678 MB, 5273 cyl, 20 head, 168 sec, 512 bytes/sect x 17773524 sectors
siop0: target 1 using tagged queuing
sd1 at scsibus0 target 1 lun 0: <COMPAQ, BD009122C6, B016> SCSI2 0/direct fixed
siop0: target 1 using 16bit transfers
siop0: target 1 now synchronous at 20.0Mhz, offset 16
sd1: 8678 MB, 5273 cyl, 20 head, 168 sec, 512 bytes/sect x 17773524 sectors
siop0: target 2 using tagged queuing
sd2 at scsibus0 target 2 lun 0: <COMPAQ, BD009122C6, B016> SCSI2 0/direct fixed
siop0: target 2 using 16bit transfers
siop0: target 2 now synchronous at 20.0Mhz, offset 16
sd2: 8678 MB, 5273 cyl, 20 head, 168 sec, 512 bytes/sect x 17773524 sectors
scsibus1: waiting 2 seconds for devices to settle...
siop1: target 6 using 8bit transfers
siop1: target 6 now synchronous at 10.0Mhz, offset 15
st0 at scsibus1 target 6 lun 0: <COMPAQ, DLT8000, 011A> SCSI2 1/sequential removable
st0: siop1: target 6 using 16bit transfers
siop1: target 6 now synchronous at 10.0Mhz, offset 15
drive empty
boot device: sd0
root on sd0a dumps on sd0b
root file system type: ffs


-- 
    - aew