Subject: panic: bad dir
To: None <port-alpha@NetBSD.ORG>
From: Martin Grossman <grossman@BBN.COM>
List: port-alpha
Date: 05/27/1998 13:28:44
We are getting a lot of these panics on one system, and a few on other systems.
All systems are exactly the same except the user load!
All are DEC PC164 motherboards with 512MB of memory and an NCR SCSI
controller attached to a 10GB (WIDE) disk.
It seems to happen more often under high user load and high NFS (client)
traffic.
It has happened on both local and NFS directories.
OUTPUT on console (and in /var/log/messages) (and in kernel dumps)
1) First bad
2) /usr: bad dir ino 7772 at offset 0: mangled entry
3) panic: bad dir
#1 is coming from ufs_dirbadentry() in ufs_lookup.c because ep->d_reclen
is not a multiple of 4.
#2 is coming from ufs_dirbad() in ufs_lookup.c.
a) I've seen "/", "/var", "/usr", and "/nfs/XXX/u1" (first 3 are UFS)
b) various inodes (7772 is 4 levels deep below /usr)
c) it's always at offset 0
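For reference, the condition behind message #1 can be sketched as below.
This is a simplified illustration, not the kernel's code: the helper name
reclen_is_aligned() is made up here, and the real ufs_dirbadentry() makes
further checks that this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from ufs_lookup.c): a directory record
 * length is expected to be a multiple of 4 bytes, so the two low
 * bits must be clear.
 */
static bool reclen_is_aligned(uint16_t reclen)
{
    return (reclen & 0x3) == 0;
}
```

A record length of 12 passes this test, while a value such as 258 does not.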
From running gdb -k /netbsd.1 /netbsd.1.core
I've figured out this much so far.....
1) we are in ufs_lookup() from an access() call
(ie backtrace is syscall,sys_access,namei,lookup,ufs_lookup,ufs_dirbad,panic)
2) 8 lines after label searchloop: in call to
VOP_BLKATOFF(vdp,dp->i_offset,NULL,&bp)
I do a print *vdp (vnode) and everything looks right
dp->i_offset is zero (which is fine)
I do a print *dp (inode) and everything looks right
I do a print *bp (buf) and everything looks right
I do a print *ep (dirent) (ie bp->b_data) and it's nothing like a
directory entry!
It should contain an inode #, reclen, type, namelen, and a name
BUT
it contains 0x464c457f
0x00010102
0x00000000
0x00000000
0x90260002
0x00000001
0x00230000
0xfffffc00
0x00000040
0x00000000
This is the beginning of some ELF executable file!!!!!
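A small stand-alone sketch of how the dump above can be checked. It assumes
the words are little-endian (as on Alpha) and that a directory entry starts
with a 32-bit inode number followed by a 16-bit record length. The function
names are hypothetical and only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* The ten 32-bit words of bp->b_data as printed by gdb. */
static const uint32_t dump[10] = {
    0x464c457f, 0x00010102, 0x00000000, 0x00000000, 0x90260002,
    0x00000001, 0x00230000, 0xfffffc00, 0x00000040, 0x00000000,
};

/* Lay the words out as bytes in little-endian order, whatever the host is. */
static void dump_bytes(uint8_t out[40])
{
    for (int i = 0; i < 10; i++)
        for (int b = 0; b < 4; b++)
            out[4 * i + b] = (uint8_t)(dump[i] >> (8 * b));
}

/* An ELF file starts with the four magic bytes 0x7f 'E' 'L' 'F'. */
static int looks_like_elf(const uint8_t *p)
{
    return memcmp(p, "\177ELF", 4) == 0;
}

/* Bytes 4..5 are what a directory entry would read as its record length. */
static uint16_t as_dirent_reclen(const uint8_t *p)
{
    return (uint16_t)(p[4] | (p[5] << 8));
}
```

With this data looks_like_elf() is true, and the bytes that would be read as
the record length come out as 0x0102 (258), which is not a multiple of 4.
That matches the "First bad" message.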
Is there any known bug (fixed or not) in or around the disk buffer cache?
PS We are running NetBSD 1.2G (November 1997).