Subject: Re: Another serious bug in NetBSD-1.6.1
To: Brian Buhrow <buhrow@lothlorien.nfbcal.org>
From: Rafal Boni <rafal@attbi.com>
List: current-users
Date: 03/12/2003 20:53:15
[This was originally on port-i386... I'm adding tech-kern and port-sparc64
to the CCs since I'm seeing panics stemming from a similar code path on
sparc64...]
In message <200303120150.h2C1oIa03881@lothlorien.nfbcal.org>, Brian writes:
[...]
-> I'm getting a double panic which looks like:
-> uvm_fault(0xc05d7300, 0xffc00000, 0, 1) -> e
-> fatal page fault in supervisor mode
-> trap type 6 code 0 eip c0311347 cs 8 eflags 10202 cr2 ffc000c4 cpl 0
-> panic: trap
-> syncing disks... panic: lockmgr: locking against myself
->
-> The first argument in the uvm_fault message varies by 20 bytes or so, but
-> the other two arguments, along with the error code at the end, are always
-> the same. The error code is EFAULT and the 0xffc00000 argument corresponds
-> to this definition in /usr/src/sys/arch/i386/i386/locore.s:
->
-> /*
-> * APTmap, APTD is the alternate recursive pagemap.
-> * It's used when modifying another process's page tables.
-> *
-> * XXX 4 == sizeof pde
-> */
-> .set _C_LABEL(APTmap),(PDSLOT_APTE << PDSHIFT)
-> .set _C_LABEL(APTD),(_C_LABEL(APTmap) + PDSLOT_APTE * NBPG)
-> .set _C_LABEL(APTDpde),(_C_LABEL(PTD) + PDSLOT_APTE * 4)
->
->
-> These panics occur when the syncer kernel thread is running,
-> specifically in genfs_putpages, which is called from ffs_putpages.
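For reference, the arithmetic tying the fault address to APTmap can be sketched as below. This is only a rough check, not kernel code: the values of PDSLOT_APTE, PDSHIFT and NBPG are assumed typical i386 values (they are not shown in the quoted excerpt), and the helper names aptmap_base, apte_index and apte_target_va are made up for illustration.

```c
#include <stdint.h>

/* Assumed i386 values; none of these appear in the quoted locore.s excerpt. */
#define PDSLOT_APTE 1023u   /* assumption: last page-directory slot */
#define PDSHIFT     22      /* assumption: each PDE covers 4 MB */
#define NBPG        4096u   /* assumption: 4 KB pages */
#define PTE_SIZE    4u      /* from the quoted comment: "4 == sizeof pde" */

/* Base address of the alternate recursive page-table window (APTmap). */
uint32_t aptmap_base(void)
{
    return (uint32_t)PDSLOT_APTE << PDSHIFT;
}

/* Index of the PTE inside APTmap that a fault address (cr2) refers to. */
uint32_t apte_index(uint32_t cr2)
{
    return (cr2 - aptmap_base()) / PTE_SIZE;
}

/* Virtual address, in the other address space, mapped by that PTE. */
uint32_t apte_target_va(uint32_t cr2)
{
    return apte_index(cr2) * NBPG;
}
```

Under those assumptions, aptmap_base() comes out to 0xffc00000, which matches the second uvm_fault argument, and the cr2 value from the trap message (0xffc000c4) lands on PTE index 49, i.e. the entry for virtual address 0x31000 as seen through the alternate mapping. If the constants differ on the kernel in question, the numbers change accordingly.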
Interesting, I've been seeing panics on sparc64 in a similar code path.
The two panic messages and backtraces are noted below. However, my machine
is running -current, not 1.6.x.
panic: kernel diagnostic assertion "(data & TLB_NFO) == 0" failed: file "/extra/src-current/sys/arch/sparc64/sparc64/pmap.c", line 2586
With a backtrace of:
pmap_clear_modify(93fb930, ffffffffffffe000, 1, c, 0, 1c09c80) at pmap_clear_modify+0x94
genfs_putpages(0, 11, 92223c0, 0, ffff0002, 11b0000) at genfs_putpages+0x4f8
ffs_putpages(92377d0, 1094ba4, 188, 1e3d000, 1863c00, 1821800) at ffs_putpages+0xdc
VOP_PUTPAGES(9a485b0, 0, 0, 11, 4df0e0, 0) at VOP_PUTPAGES+0x30
ffs_full_fsync(9237a90, 10012, 108, 1e3d000, 8d6000, 0) at ffs_full_fsync+0xa4
ffs_fsync(9237a90, 10945ac, 98, 1e3d000, 0, 8) at ffs_fsync+0x34
VOP_FSYNC(9a485b0, 1e3bf80, 0, 0, 0, 8a0e9c0) at VOP_FSYNC+0x38
ffs_sync(0, 3, 1e3bf80, 8a0e9c0, 1093160, 1866a70) at ffs_sync+0xf0
sync_fsync(9237d10, 10fe3ec, 98, 1e3d200, 11318cc, 1c09c80) at sync_fsync+0x6c
VOP_FSYNC(926ba40, 1e3bf80, 8, 0, 0, 8a0e9c0) at VOP_FSYNC+0x38
sched_sync(180c800, 1808c00, 1806c00, 11b0800, 1863c00, 1821800) at sched_sync+0xf8
and:
trap type 0x34: pc=100a9b8 npc=100a9bc pstate=ffffffff90820006<PRIV,IE>
kernel trap 34: mem address not aligned
Stopped in pid 5.1 (ioflush) at pseg_get+0x3c: ldxa [%o2 + %g0] 20, %o2
db> tr
genfs_putpages(0, 11, 92223c0, 0, ffff0002, 11b0000) at genfs_putpages+0x4f8
ffs_putpages(92377d0, 1094ba4, 188, 1e3d000, 1863c00, 1821800) at ffs_putpages+0xdc
VOP_PUTPAGES(9228a00, 0, 0, 11, 0, ffffffffffffbf80) at VOP_PUTPAGES+0x30
ffs_full_fsync(9237a90, 10012, 108, 1e3ce00, ffff0002, 0) at ffs_full_fsync+0xa4
ffs_fsync(9237a90, 10945ac, 98, 1e3d000, 0, ffffffffffffc1d0) at ffs_fsync+0x33
VOP_FSYNC(9228a00, 1e3bf80, 0, 0, 0, 8a0e9c0) at VOP_FSYNC+0x38
ffs_sync(0, 3, 1e3bf80, 8a0e9c0, 1093160, 0) at ffs_sync+0xf0
sync_fsync(9237d10, 10fe3ec, 98, 1e3d200, 10ca2e4, 1c09c80) at sync_fsync+0x6c
VOP_FSYNC(93f2c40, 1e3bf80, 8, 0, 0, 8a0e9c0) at VOP_FSYNC+0x38
sched_sync(180c800, 1808c00, 1806c00, 11b0800, 1863c00, 1821800) at sched_sync+0xf8
--rafal
----
Rafal Boni rafal@attbi.com
We are all worms. But I do believe I am a glowworm. -- Winston Churchill