Subject: Re: panic message ?
To: Patrick Welche <prlw1@cam.ac.uk>
From: Frank van der Linden <frank@wins.uva.nl>
List: current-users
Date: 06/22/1997 16:08:11
Quoting Patrick Welche,

[ffs_valloc: dup alloc panic]

> Actually, fsck does find and correct the error every time. It most
> often is UNKNOWN FILE TYPE I=7840. The question is why is it happening
> so often? Starting from a clean, checked disk, suddenly a panic
> occurs. It happened again just now while running a script to create a
> bunch of directories. Everything is OK as long as disk access is
> low. This is via an Adaptec 1542B, on a P133, running a sup from 2
> days ago.

What may be causing your problem is a race condition present in
several filesystems. Naofumi Honda pointed it out to me for NFS
a while ago, but it is also present in FFS. I've sent this patch
to a few other people, and it seems to fix their problems. I was
hoping to fix all FSs that have this problem this way soon
(before 1.3). The eventual solution will look a little different,
but the principle is the same.

The problem is that allocating a new inode (or rather, any
FS-specific data pointed to by a vnode) can block under busy
circumstances, so another process can grab the same inode before
the first one has inserted it into the hash, and boom.

Could you try this patch and see if it helps you? It's for
ufs/ffs/ffs_vfsops.c

- Frank


*** ffs_vfsops.c.orig	Fri Jun 13 13:26:46 1997
--- ffs_vfsops.c	Mon Jun 16 21:40:30 1997
***************
*** 759,764 ****
--- 759,794 ----
  	return (allerror);
  }
  
+ static int ffs_hashlock;
+ #define IHASH_WANT 0x01
+ #define IHASH_LOCK 0x02
+ 
+ int ffs_lockhash __P((int *));
+ int ffs_unlockhash __P((int *));
+ 
+ int
+ ffs_lockhash(lockp)
+ 	int *lockp;
+ {
+ 	if (*lockp & IHASH_LOCK) {
+ 		*lockp |= IHASH_WANT;
+ 		tsleep((caddr_t)lockp, PINOD, "ffs_hashlock", 0);
+ 		return (EBUSY);
+ 	}
+ 	*lockp |= IHASH_LOCK;
+ 	return (0);
+ }
+ 
+ int
+ ffs_unlockhash(lockp)
+ 	int *lockp;
+ {
+ 	if (*lockp & IHASH_WANT)
+ 		wakeup((caddr_t)lockp);
+ 	*lockp &= ~(IHASH_LOCK | IHASH_WANT);
+ 	return (0);
+ }
+ 
  /*
   * Look up a FFS dinode number to find its incore vnode, otherwise read it
   * in from disk.  If it is in core, wait for the lock bit to clear, then
***************
*** 781,792 ****
  
  	ump = VFSTOUFS(mp);
  	dev = ump->um_dev;
! 	if ((*vpp = ufs_ihashget(dev, ino)) != NULL)
! 		return (0);
  
  	/* Allocate a new vnode/inode. */
  	if ((error = getnewvnode(VT_UFS, mp, ffs_vnodeop_p, &vp)) != 0) {
  		*vpp = NULL;
  		return (error);
  	}
  	type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
--- 811,826 ----
  
  	ump = VFSTOUFS(mp);
  	dev = ump->um_dev;
! 
! 	do {
! 		if ((*vpp = ufs_ihashget(dev, ino)) != NULL)
! 			return (0);
! 	} while (ffs_lockhash(&ffs_hashlock));
  
  	/* Allocate a new vnode/inode. */
  	if ((error = getnewvnode(VT_UFS, mp, ffs_vnodeop_p, &vp)) != 0) {
  		*vpp = NULL;
+ 		ffs_unlockhash(&ffs_hashlock);
  		return (error);
  	}
  	type = ump->um_devvp->v_tag == VT_MFS ? M_MFSNODE : M_FFSNODE; /* XXX */
***************
*** 812,817 ****
--- 846,852 ----
  	 * disk portion of this inode to be read.
  	 */
  	ufs_ihashins(ip);
+ 	ffs_unlockhash(&ffs_hashlock);
  
  	/* Read in the disk contents for the inode, copy into the inode. */
  	error = bread(ump->um_devvp, fsbtodb(fs, ino_to_fsba(fs, ino)),