Subject: Possible serious bug in NetBSD-1.6.1_RC2
To: None <current-users@netbsd.org>
From: Brian Buhrow <buhrow@lothlorien.nfbcal.org>
List: current-users
Date: 03/10/2003 16:16:23
	Hello folks.  I've got a machine running NetBSD-1.6.1_RC2 with sources
as of February 28, 2003.  This machine, unlike others I have running the
same code, consistently either hangs or panics every 24-48 hours.  The
primary difference between this machine and the rest of the ones I have
running the same code is that it is using a raidframe raid5 device for all
of its disk storage.  When it hangs, it remains pingable on the net, but
cannot be interrogated via the serial console and must be reset.
	I was able to capture the latest panic dump, and it looks like it is
taking an illegal page fault while trying to run the syncer kernel thread.
Specifically, it faulted in genfs_putpages() as a result of an
ffs_full_sync().  In the excerpt from the dmesg of the crash below, it
double panics because the lockmgr can't get a lock to sync the disks.
	Does anyone have any ideas?  This is an i386 machine, and it is almost
unusable as a server in its current state.  I have a full panic core file,
if that would help.  I'm also willing to try things if folks have
suggestions.
-thanks
-Brian

NetBSD 1.6.1_RC2 (NFBNETBSD) #0: Fri Mar  7 08:23:54 PST 2003
    buhrow@lothlorien.nfbcal.org:/usr/local/netbsd/src/sys/arch/i386/compile/NFBNETBSD
cpu0: Intel Pentium III (Coppermine) (686-class), 756.83 MHz
cpu0: I-cache 16 KB 32b/line 4-way, D-cache 16 KB 32b/line 2-way
cpu0: L2 cache 256 KB 32b/line 8-way
cpu0: features 383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR>
cpu0: features 383f9ff<PGE,MCA,CMOV,FGPAT,PSE36,MMX>
cpu0: features 383f9ff<FXSR,SSE>
total memory = 126 MB
avail memory = 112 MB
[...]
Kernelized RAIDframe activated
RAID autoconfigure
Configuring raid0:
RAIDFRAME: protectedSectors is 64
RAIDFRAME: Configure (RAID Level 5): total number of sectors is 304213760 (148541 MB)
RAIDFRAME(RAID Level 5): Using 20 floating recon bufs with head sep limit 10
boot device: raid0
root on raid0a dumps on wd0b
root file system type: ffs
raid0: Device already configured!
uvm_fault(0xc05d7320, 0xffc00000, 0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 eip c0311347 cs 8 eflags 10202 cr2 ffc000c4 cpl 0
panic: trap
syncing disks... panic: lockmgr: locking against myself

dumping to dev 0,1 offset 1837871
dump 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1