Subject: Re: panic message ?
To: Patrick Welche <prlw1@cam.ac.uk>
From: Frank van der Linden <frank@wins.uva.nl>
List: current-users
Date: 06/22/1997 16:08:11
Quoting Patrick Welche,
[ffs_valloc: dup alloc panic]
> Actually, fsck does find and correct the error every time. It most
> often is UNKNOWN FILE TYPE I=7840. The question is why is it happening
> so often? Starting from a clean, checked disk, suddenly a panic
> occurs. It happened again just now while running a script to create a
> bunch of directories. Everything is OK as long as disk access is
> low. This is via an Adaptec 1542B, on a P133, running a sup from 2
> days ago.
Your problem might be a race condition that is present in
several filesystems. Naofumi Honda pointed it out to me for NFS
a while ago, but it is also present in FFS. I've sent this patch
to a few other people, and it seems to fix their problems. I
hope to fix all filesystems that have this problem in the same
way soon (before 1.3). The eventual solution will look a little
different, but the principle is the same.
The problem is that allocating a new inode (or rather, any FS-specific
data pointed to by a vnode) can block when the system is busy, so that
another process can grab the same inode in the meantime, and boom.
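To illustrate the principle, here is a rough user-space sketch of the
"look up, take the lock, look up again" pattern. It is not the kernel
code: table, hashlock, lookup(), lock_hash(), unlock_hash() and
get_or_create() are made-up names for illustration only, the table is
a toy stand-in for the inode hash, and the point where the kernel
would sleep is only marked by a comment.

```c
#include <assert.h>

#define HASH_WANT 0x01          /* somebody is waiting for the lock */
#define HASH_LOCK 0x02          /* the lock is held */

/* Hypothetical stand-in for the inode hash: one slot per inode number. */
#define NSLOTS 16
static int table[NSLOTS];       /* 0 = not in core, else a fake vnode id */
static int hashlock;
static int next_id = 1;

static int
lookup(int ino)
{
	assert(ino >= 0 && ino < NSLOTS);
	return table[ino];
}

/* Try to take the lock. Non-zero means "it was busy, re-check the hash". */
static int
lock_hash(int *lockp)
{
	if (*lockp & HASH_LOCK) {
		*lockp |= HASH_WANT;
		/* A kernel would sleep here until the holder wakes us up. */
		return 1;
	}
	*lockp |= HASH_LOCK;
	return 0;
}

static void
unlock_hash(int *lockp)
{
	assert(*lockp & HASH_LOCK);
	/* A kernel would wake up sleepers here if HASH_WANT is set. */
	*lockp &= ~(HASH_LOCK | HASH_WANT);
}

/* Return the existing entry for ino, or create exactly one. */
static int
get_or_create(int ino)
{
	int id;

	/*
	 * Every failed attempt to take the lock sends us back to the
	 * lookup, because another process may have inserted the entry
	 * while we were waiting.
	 */
	do {
		if ((id = lookup(ino)) != 0)
			return id;
	} while (lock_hash(&hashlock));

	/* Allocation may block; holding the lock keeps others out. */
	id = next_id++;
	table[ino] = id;
	unlock_hash(&hashlock);
	return id;
}
```

Calling get_or_create() twice for the same number gives the same id
back, since the second call finds the entry in the table. The lock is
only held between the failed lookup and the insertion.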
Could you try this patch and see if it helps you? It's for
ufs/ffs/ffs_vfsops.c
- Frank
*** ffs_vfsops.c.orig Fri Jun 13 13:26:46 1997
--- ffs_vfsops.c Mon Jun 16 21:40:30 1997
***************
*** 759,764 ****
--- 759,794 ----
  	return (allerror);
  }
+ static int ffs_hashlock;
+ #define IHASH_WANT 0x01
+ #define IHASH_LOCK 0x02
+
+ int ffs_lockhash __P((int *));
+ int ffs_unlockhash __P((int *));
+
+ int
+ ffs_lockhash(lockp)
+ 	int *lockp;
+ {
+ 	if (*lockp & IHASH_LOCK) {
+ 		*lockp |= IHASH_WANT;
+ 		tsleep((caddr_t)lockp, PINOD, "ffs_hashlock", 0);
+ 		return (EBUSY);
+ 	}
+ 	*lockp |= IHASH_LOCK;
+ 	return (0);
+ }
+
+ int
+ ffs_unlockhash(lockp)
+ 	int *lockp;
+ {
+ 	if (*lockp & IHASH_WANT)
+ 		wakeup((caddr_t)lockp);
+ 	*lockp &= ~(IHASH_LOCK | IHASH_WANT);
+ 	return (0);
+ }
+
  /*
   * Look up a FFS dinode number to find its incore vnode, otherwise read it
   * in from disk. If it is in core, wait for the lock bit to clear, then
***************
*** 781,792 ****
  	ump = VFSTOUFS(mp);
  	dev = ump->um_dev;
! 	if ((*vpp = ufs_ihashget(dev, ino)) != NULL)
! 		return (0);
  	/* Allocate a new vnode/inode. */
  	if ((error = getnewvnode(VT_UFS, mp, ffs_vnodeop_p, &vp)) != 0) {
  		*vpp = NULL;
  		return (error);
  	}
  	type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
--- 811,826 ----
  	ump = VFSTOUFS(mp);
  	dev = ump->um_dev;
!
! 	do {
! 		if ((*vpp = ufs_ihashget(dev, ino)) != NULL)
! 			return (0);
! 	} while (ffs_lockhash(&ffs_hashlock));
  	/* Allocate a new vnode/inode. */
  	if ((error = getnewvnode(VT_UFS, mp, ffs_vnodeop_p, &vp)) != 0) {
  		*vpp = NULL;
+ 		ffs_unlockhash(&ffs_hashlock);
  		return (error);
  	}
  	type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
***************
*** 812,817 ****
--- 846,852 ----
  	 * disk portion of this inode to be read.
  	 */
  	ufs_ihashins(ip);
+ 	ffs_unlockhash(&ffs_hashlock);
  	/* Read in the disk contents for the inode, copy into the inode. */
  	error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, ino)),