Subject: kern/28291: panic: fatal page fault in supervisor mode in 2.0_RC5
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <andreas@planix.com>
List: netbsd-bugs
Date: 11/13/2004 16:47:01
>Number:         28291
>Category:       kern
>Synopsis:       panic: fatal page fault in supervisor mode in 2.0_RC5
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Nov 13 16:47:00 +0000 2004
>Originator:     Andreas Wrede <andreas@planix.com>
>Release:        NetBSD 2.0_RC5
>Organization:
Planix, Inc.
>Environment:
	
	
System: NetBSD willy.wrede.pvt 2.0_RC5 NetBSD 2.0_RC5 (PLANIX) #4: Sat Nov 13 10:17:06 EST 2004  root@willy.wrede.pvt:/u1/netbsd-2.0/obj/sys/arch/i386/compile.i386/PLANIX i386
Architecture: i386
Machine: i386
>Description:
	
I get a reproducible 'fatal page fault in supervisor mode' panic within
10 minutes of starting an rsync operation of a large (10 GB) directory
tree from one local Fibre Channel attached Xserve RAID partition to
another.

The machine is a dual-CPU HP LPr. The fatal page fault occurs with either
an MP or a UP kernel. This PR is based on the UP kernel.

Note that a second panic, 'lockmgr: locking against myself', occurs while
'syncing disks...' after the first panic:

(gdb) bt
#0  0x0fef0000 in ?? ()
#1  0xc03d80e3 in cpu_reboot (howto=260, bootstr=0x0)
    at /u1/netbsd-2.0/src/sys/arch/i386/i386/machdep.c:745
#2  0xc0351d18 in panic (fmt=0xc06b3b80 "lockmgr: locking against myself")
    at /u1/netbsd-2.0/src/sys/kern/subr_prf.c:242
#3  0xc033556f in lockmgr (lkp=0xcc4248ec, flags=65554, interlkp=0xcc42487c)
    at /u1/netbsd-2.0/src/sys/kern/kern_lock.c:563
#4  0xc037ee65 in genfs_lock (v=0xc76b3930)
    at /u1/netbsd-2.0/src/sys/miscfs/genfs/genfs_vnops.c:324
#5  0xc037ddb0 in VOP_LOCK (vp=0xcc42487c, flags=65554)
    at /u1/netbsd-2.0/src/sys/kern/vnode_if.c:1082
#6  0xc037d409 in vn_lock (vp=0xcc42487c, flags=65554)
    at /u1/netbsd-2.0/src/sys/kern/vfs_vnops.c:782
#7  0xc0374a77 in vget (vp=0xcc42487c, flags=65554)
    at /u1/netbsd-2.0/src/sys/kern/vfs_subr.c:1247
#8  0xc02ef3d2 in ffs_sync (mp=<incomplete type>, waitfor=2, cred=0xc1285f80, 
    p=0xc7727330) at /u1/netbsd-2.0/src/sys/ufs/ffs/ffs_vfsops.c:1282
#9  0xc0377c2e in sys_sync (l=0xc7550b5c, v=0x0, retval=0x0)
    at /u1/netbsd-2.0/src/sys/kern/vfs_syscalls.c:616
#10 0xc0376333 in vfs_shutdown ()
    at /u1/netbsd-2.0/src/sys/kern/vfs_subr.c:2637
#11 0xc03d80f7 in cpu_reboot (howto=256, bootstr=0x0)
    at /u1/netbsd-2.0/src/sys/arch/i386/i386/machdep.c:731
#12 0xc0351d18 in panic (fmt=0xc0669972 "trap")
    at /u1/netbsd-2.0/src/sys/kern/subr_prf.c:242
#13 0xc03e2b71 in trap (frame=0xc76b3ae4)
    at /u1/netbsd-2.0/src/sys/arch/i386/i386/trap.c:296
#14 0xc0103015 in calltrap ()
#15 0xc037dfbe in VOP_BALLOC (vp=0xcc42487c, startoffset=81264640, size=16384, 
    cred=0xc1285f80, flags=0, bpp=0x0)
    at /u1/netbsd-2.0/src/sys/kern/vnode_if.c:1398
#16 0xc030fa0f in ufs_gop_alloc (vp=0xcc42487c, off=81264640, len=16384, 
    flags=0, cred=0xc1285f80)
    at /u1/netbsd-2.0/src/sys/ufs/ufs/ufs_vnops.c:2152
#17 0xc02f0920 in ffs_write (v=0xc76b3e24)
    at /u1/netbsd-2.0/src/sys/ufs/ufs/ufs_readwrite.c:349
#18 0xc037d968 in VOP_WRITE (vp=0xcc42487c, uio=0xc76b3ec4, ioflag=1, 
    cred=0xc1285f80) at /u1/netbsd-2.0/src/sys/kern/vnode_if.c:428
#19 0xc037cfb8 in vn_write (fp=0xc760849c, offset=0xc76084c4, uio=0xc76b3ec4, 
    cred=0xc1285f80, flags=1) at /u1/netbsd-2.0/src/sys/kern/vfs_vnops.c:564
#20 0xc0355aa9 in dofilewrite (p=0xc7727330, fd=1, fp=0xc760849c, 
    buf=0xa919000, nbyte=262144, offset=0xc76084c4, flags=1, retval=0xc76b3f5c)
    at /u1/netbsd-2.0/src/sys/kern/sys_generic.c:358
#21 0xc0355a0d in sys_write (l=0xc7550b5c, v=0xc76b3f64, retval=0xc76b3f5c)
    at /u1/netbsd-2.0/src/sys/kern/sys_generic.c:314
#22 0xc03e24ce in syscall_plain (frame=0xc76b3fa8)
    at /u1/netbsd-2.0/src/sys/arch/i386/i386/syscall.c:156
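
The backtrace above contains both panics: frames 13 and up are the original
write path that trapped, and frames 1 to 11 are the shutdown path that runs
afterwards. As a rough aid to reading it, the sketch below pulls the vp
argument out of a few frames (first line of each frame, copied from the
trace) and compares the two paths. The helper name vnodes_by_frame and the
frame-number cutoff are illustrative choices, not part of any tool. That the
vnode is still locked by the writing process when ffs_sync tries to lock it
is an assumption that would fit the second panic, not something this report
confirms.

```python
import re

# First line of selected frames, copied verbatim from the backtrace above.
FRAMES = """\
#5  0xc037ddb0 in VOP_LOCK (vp=0xcc42487c, flags=65554)
#6  0xc037d409 in vn_lock (vp=0xcc42487c, flags=65554)
#7  0xc0374a77 in vget (vp=0xcc42487c, flags=65554)
#15 0xc037dfbe in VOP_BALLOC (vp=0xcc42487c, startoffset=81264640, size=16384,
#16 0xc030fa0f in ufs_gop_alloc (vp=0xcc42487c, off=81264640, len=16384,
#18 0xc037d968 in VOP_WRITE (vp=0xcc42487c, uio=0xc76b3ec4, ioflag=1,
"""

# Matches "#N  0xADDR in FUNC (vp=0xPTR" at the start of a line.
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+ in (\w+) \(vp=(0x[0-9a-f]+)", re.M)


def vnodes_by_frame(text):
    """Return {frame_number: (function, vp)} for frames whose first arg is vp."""
    return {int(n): (fn, vp) for n, fn, vp in FRAME_RE.findall(text)}


frames = vnodes_by_frame(FRAMES)

# Frame 12 is panic("trap"). Higher-numbered frames are older (the write
# syscall); lower-numbered frames are newer (the sync during shutdown).
FIRST_PANIC_FRAME = 12
write_path = {vp for n, (_, vp) in frames.items() if n > FIRST_PANIC_FRAME}
sync_path = {vp for n, (_, vp) in frames.items() if n < FIRST_PANIC_FRAME}

print("write path vnodes:", sorted(write_path))
print("sync path vnodes: ", sorted(sync_path))
print("same vnode on both paths:", bool(write_path & sync_path))
```

Both paths reference vnode 0xcc42487c, so the lockmgr panic looks like a
consequence of the first panic rather than a separate problem.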


full boot messages:

NetBSD 2.0_RC5 (PLANIX) #4: Sat Nov 13 10:17:06 EST 2004
        root@willy.wrede.pvt:/u1/netbsd-2.0/obj/sys/arch/i386/compile.i386/PLANIX
total memory = 255 MB
avail memory = 242 MB
BIOS32 rev. 0 found at 0xfd83c
mainbus0 (root)
known mode 1 PCI chipset (71928086)
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel Pentium III (686-class), 698.84 MHz, id 0x681
cpu0: features 387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features 387fbff<PGE,MCA,CMOV,PAT,PSE36,PN,MMX>
cpu0: features 387fbff<FXSR,SSE>
cpu0: I-cache 16 KB 32B/line 4-way, D-cache 16 KB 32B/line 4-way
cpu0: L2 cache 256 KB 32B/line 8-way
cpu0: ITLB 32 4 KB entries 4-way, 2 4 MB entries fully associative
cpu0: DTLB 64 4 KB entries 4-way, 8 4 MB entries 4-way
cpu0: serial number 0000-0681-0003-9C42-8463-DD55
cpu0: 8 page colors
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: Intel 82443BX Host Bridge/Controller (AGP disabled) (rev. 0x03)
pchb0: fixing Idle/Pipeline DRAM Leadoff Timing
pcib0 at pci0 dev 4 function 0
pcib0: Intel 82371AB PCI-to-ISA Bridge (PIIX4) (rev. 0x02)
piixide0 at pci0 dev 4 function 1
piixide0: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
piixide0: bus-master DMA support present
piixide0: primary channel wired to compatibility mode
piixide0: primary channel interrupting at irq 14
atabus0 at piixide0 channel 0
piixide0: secondary channel wired to compatibility mode
piixide0: secondary channel ignored (disabled)
uhci0 at pci0 dev 4 function 2: Intel 82371AB USB Host Controller (PIIX4) (rev. 0x01)
uhci0: interrupting at irq 11
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Intel 82371AB Power Management Controller (PIIX4) (miscellaneous bridge, revision 0x02) at pci0 dev 4 function 3 not configured
ppb0 at pci0 dev 7 function 0: Digital Equipment DECchip 21152 PCI-PCI Bridge (rev. 0x03)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
skc0 at pci1 dev 3 function 0: irq 11
skc0: bad VPD resource id: expected 82 got ff
skc0: (null)
sk0 at skc0 port A: Ethernet address 00:0f:3d:87:e2:79
makphy0 at sk0 phy 0: Marvell 88E1011 Gigabit PHY, rev. 3
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
esiop0 at pci1 dev 4 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
esiop0: using on-board RAM
esiop0: interrupting at irq 15
esiop0: alloc new tag DSA table at PHY addr 0x16d5000
scsibus0 at esiop0: 16 targets, 8 luns per target
isp0 at pci0 dev 8 function 0: QLogic FC-AL and Fabric HBA
isp0: interrupting at irq 10
scsibus1 at isp0: 256 targets, 8 luns per target
isp1 at pci0 dev 9 function 0: QLogic FC-AL and Fabric HBA
isp1: interrupting at irq 5
scsibus2 at isp1: 256 targets, 8 luns per target
vga1 at pci0 dev 13 function 0: Cirrus Logic CL-GD5446 (rev. 0x45)
wsdisplay0 at vga1 kbdmux 1
wsmux1: connecting to wsdisplay0
isa0 at pcib0
lpt0 at isa0 port 0x378-0x37b irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbdprobe: reset error 5
pmsprobe: reset error 5
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
isapnp0: no ISA Plug 'n Play devices found
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
raidattach: Asked for 8 units
Kernelized RAIDframe activated
IPsec: Initialized Security Association Processing.
scsibus0: waiting 2 seconds for devices to settle...
scsibus1: waiting 2 seconds for devices to settle...
scsibus2: waiting 2 seconds for devices to settle...
atapibus0 at atabus0: 2 targets
cd0 at atapibus0 drive 0: <CD-224E, , 1.5A> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2
cd0(piixide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
esiop0: alloc newcdb at PHY addr 0x1824000
sd0 at scsibus0 target 0 lun 0: <HP, 9.10GB A 80-F309, > disk fixed
sd0: 8678 MB, 9827 cyl, 5 head, 361 sec, 512 bytes/sect x 17773524 sectors
sd0: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
sd1 at scsibus0 target 1 lun 0: <HP, 9.10GB A 80-F309, > disk fixed
sd1: 8678 MB, 9827 cyl, 5 head, 361 sec, 512 bytes/sect x 17773524 sectors
sd1: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
sd2 at scsibus1 target 0 lun 0: <APPLE, Xserve RAID, 1.21> disk fixed
sd2: 1035 GB, 132522 cyl, 128 head, 128 sec, 512 bytes/sect x 2171240448 sectors
sd3 at scsibus2 target 0 lun 0: <APPLE, Xserve RAID, 1.21> disk fixed
sd3: 1035 GB, 132522 cyl, 128 head, 128 sec, 512 bytes/sect x 2171240448 sectors
Searching for RAID components...
warning: double match for boot device (sd0, sd1)
boot device: sd0
root on sd0a dumps on sd0b
mountroot: trying msdos...
mountroot: trying cd9660...
mountroot: trying nfs...
mountroot: trying lfs...
mountroot: trying ext2fs...
mountroot: trying ffs...
root file system type: ffs
init: copying out path `/sbin/init' 11
mag 0 21:1
mag 1 2e:2
mag 2 72:3
mag 3 65:4
mag 4 73:5
mag 5 65:6
mag 6 74:7
mag 7 2d:8
mag 8 78:9
mag 9 d:7f
wsdisplay0: screen 1 added (80x25, vt100 emulation)
wsdisplay0: screen 2 added (80x25, vt100 emulation)
wsdisplay0: screen 3 added (80x25, vt100 emulation)
wsdisplay0: screen 4 added (80x25, vt100 emulation)
esiop0: alloc newcdb at PHY addr 0x1606000


>How-To-Repeat:
# df -m /u5 /b1
Filesystem  1M-blocks     Used     Avail Capacity  Mounted on
/dev/sd2a     1043737    97984    893565     9%    /u5
/dev/sd3a     1043737   115438    876111    11%    /b1
# rsync -axH /u5 /b1/u5
uvm_fault(0xc75a6dc4, 0, 0, 1) -> 0xe
fatal page fault in supervisor mode
trap type 6 code 0 eip c02e08e7 cs 8 eflags 10246 cr2 0 ilevel 0
panic: trap
...

Core dump and debugging kernel are available. 
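
For anyone triaging the trap line above, here is a minimal sketch that
decodes it. It assumes the numeric fields after "code" are printed in hex
and that "code" is the standard x86 page-fault error code (bit 0 = page
present, bit 1 = write access, bit 2 = user mode). The function name
decode_trap and the dictionary keys are hypothetical. Mapping eip to a
function still requires the debugging kernel.

```python
import re

# Console line copied from the How-To-Repeat transcript above.
TRAP_LINE = "trap type 6 code 0 eip c02e08e7 cs 8 eflags 10246 cr2 0 ilevel 0"
PAGE_SIZE = 4096  # i386 base page size


def decode_trap(line):
    """Decode the page-fault fields of an i386 'trap type ...' console line."""
    # Collect "name value" pairs; values are assumed to be hex (see lead-in).
    fields = dict(
        re.findall(r"(type|code|eip|cs|eflags|cr2|ilevel) ([0-9a-f]+)", line)
    )
    code = int(fields["code"], 16)
    cr2 = int(fields["cr2"], 16)  # CR2 holds the faulting linear address
    return {
        "eip": int(fields["eip"], 16),
        "fault_addr": cr2,
        "page_present": bool(code & 0x1),
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "supervisor",
        "in_first_page": cr2 < PAGE_SIZE,
    }


info = decode_trap(TRAP_LINE)
for key, value in info.items():
    shown = hex(value) if key in ("eip", "fault_addr") else value
    print(f"{key:14} {shown}")
```

Read this way, the fault is a supervisor-mode read of address 0 from a page
that is not mapped, which would be consistent with a NULL pointer
dereference in the kernel.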

>Fix:
Unknown

>Unformatted: