NetBSD-Bugs archive


Re: kern/53096: netbsd-8 crash on heavy disk I/O



The following reply was made to PR kern/53096; it has been noted by GNATS.

From: "J. Hannken-Illjes" <hannken%eis.cs.tu-bs.de@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: kern/53096: netbsd-8 crash on heavy disk I/O
Date: Sun, 18 Mar 2018 17:45:41 +0100

 The backtrace is a bit misleading; the actual call chain is:
 
 sys_chdir() -> vrele() -> vrelel() -> vstate_assert_change() -> vnpanic()
 
 This matches the panic from dmesg:
 
 ...
 cpu 0: ucode 0x1a->0x29
 cpu 1: ucode 0x1a->0x29
 cpu 2: ucode 0x1a->0x29
 cpu 3: ucode 0x1a->0x29
 vnode 0xfffffe82137bde70 flags 0x30<MPSAFE,LOCKSWORK>
   tag VT_UFS(1) type VDIR(2) mount 0xfffffe823dbb2008 typedata 0x0
   usecount 1 writecount 0 holdcount 1
   size 200 writesize 200 numoutput 0
   data 0xfffffe8213cce900 lock 0xfffffe82137bdfa0
   state BLOCKED key(0xfffffe823dbb2008 8) b1 c8 3a 00 00 00 00 00
   lrulisthd 0xffffffff814c6400
   tag VT_UFS, ino 3852465, on dev 0, 0 flags 0x0, nlink 3
   mode 040755, owner 1001, group 0, size 512
 panic: BLOCKED to LOADED with usecount 2 at vrelel:783
 
 Here vrelel() is:
 
 767  VSTATE_CHANGE(vp, VS_LOADED, VS_BLOCKED);
 768  mutex_exit(vp->v_interlock);
 ...
 778  recycle = false;
 779  VOP_INACTIVE(vp, &recycle);
 780  if (!recycle)
 781          VOP_UNLOCK(vp);
 782  mutex_enter(vp->v_interlock);
 783  VSTATE_CHANGE(vp, VS_BLOCKED, VS_LOADED);
 
 and VSTATE_CHANGE() expands to vstate_assert_change(), which is:
 
 315  KASSERTMSG(mutex_owned(vp->v_interlock), "at %s:%d", func, line);
 
 328  if ((from == VS_BLOCKED || to == VS_BLOCKED) && vp->v_usecount != 1)
 329          vnpanic(vp, "%s to %s with usecount %d at %s:%d",
 
 So the usecount of a blocked vnode, with its interlock held, differed from 1:
 it was "2" at the call to vnpanic() and "1" when vnpanic() printed
 the vnode.
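 
 As a rough illustration only, the check can be modelled in user space.
 The struct and the model_* names below are hypothetical stand-ins, not
 kernel code, and the extra reference is simply assumed to appear while
 the vnode is blocked:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's vnode states. */
enum vstate { VS_LOADED, VS_BLOCKED };

/* Hypothetical model of a vnode: only the fields the check reads. */
struct model_vnode {
	enum vstate state;
	unsigned int usecount;
};

/*
 * Model of the test at line 328 of vstate_assert_change(): a transition
 * into or out of VS_BLOCKED is only accepted while usecount is exactly 1.
 * Returns true if the transition passes, false where the kernel would
 * call vnpanic().
 */
static bool
model_transition_ok(const struct model_vnode *vp, enum vstate from,
    enum vstate to)
{
	if ((from == VS_BLOCKED || to == VS_BLOCKED) && vp->usecount != 1)
		return false;
	return true;
}

/* Replays the sequence from the panic; returns true if it would panic. */
static bool
model_replay_panic(void)
{
	struct model_vnode vn = { VS_LOADED, 1 };

	/* Line 767: LOADED -> BLOCKED with usecount 1 is accepted. */
	assert(model_transition_ok(&vn, VS_LOADED, VS_BLOCKED));
	vn.state = VS_BLOCKED;

	/* Assumption: an extra reference appears while the vnode is blocked. */
	vn.usecount++;

	/* Line 783: BLOCKED -> LOADED now sees usecount 2 and is rejected. */
	return !model_transition_ok(&vn, VS_BLOCKED, VS_LOADED);
}
```

 This model is single-threaded; in the kernel the count is changed
 concurrently, which is why the value seen by the check and the value
 printed by vnpanic() can differ.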
 
 As vcache_vget() and vcache_tryvget() either error out or wait if the current
 state is BLOCKED, the extra reference could come from a vref() without a
 prior reference.
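 
 A minimal user-space sketch of that difference, under stated assumptions:
 the model_* functions are hypothetical, the EBUSY return value is chosen
 for illustration, and model_vref() assumes that vref() increments the
 count without looking at the state:

```c
#include <assert.h>
#include <errno.h>

/* Simplified stand-ins for the kernel's vnode states. */
enum vstate { VS_LOADED, VS_BLOCKED };

/* Hypothetical model of a vnode: state plus reference count. */
struct model_vnode {
	enum vstate state;
	unsigned int usecount;
};

/*
 * Model of a vcache_tryvget()-style lookup: it refuses to hand out a
 * reference while the vnode is BLOCKED. Returns 0 on success; the
 * EBUSY error value is an assumption made for this sketch.
 */
static int
model_tryvget(struct model_vnode *vp)
{
	if (vp->state == VS_BLOCKED)
		return EBUSY;
	vp->usecount++;
	return 0;
}

/*
 * Model of vref(): assumed to bump the count unconditionally, which is
 * only safe when the caller already holds a reference of its own.
 */
static void
model_vref(struct model_vnode *vp)
{
	assert(vp->usecount != 0);	/* mirrors the KASSERT in vref() */
	vp->usecount++;
}
```

 With a vnode in state VS_BLOCKED and usecount 1, model_tryvget() leaves
 the count alone, while model_vref() raises it to 2, the value reported
 in the panic.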
 
 Please try the attached patch to see if one of these assertions fires.
 
 diff -r 13173af16202 -r 0a76936d2ed0 sys/kern/vfs_vnode.c
 --- sys/kern/vfs_vnode.c
 +++ sys/kern/vfs_vnode.c
 @@ -670,11 +670,22 @@ static inline bool
  vtryrele(vnode_t *vp)
  {
  	u_int use, next;
 +	vnode_impl_t *vip = VNODE_TO_VIMPL(vp);
  
  	for (use = vp->v_usecount;; use = next) {
  		if (use == 1) {
  			return false;
  		}
 +
 +		membar_enter();
 +		if (vip->vi_state == VS_BLOCKED) {
 +			mutex_enter(vp->v_interlock);
 +			if (vip->vi_state == VS_BLOCKED) {
 +				vnpanic(vp, "vtryrele on BLOCKED vnode");
 +			}
 +			mutex_exit(vp->v_interlock);
 +		}
 +
  		KASSERT(use > 1);
  		next = atomic_cas_uint(&vp->v_usecount, use, use - 1);
  		if (__predict_true(next == use)) {
 @@ -865,6 +876,16 @@ vrele_async(vnode_t *vp)
  void
  vref(vnode_t *vp)
  {
 +	vnode_impl_t *vip = VNODE_TO_VIMPL(vp);
 +
 +	membar_enter();
 +	if (vip->vi_state == VS_BLOCKED) {
 +		mutex_enter(vp->v_interlock);
 +		if (vip->vi_state == VS_BLOCKED) {
 +			vnpanic(vp, "vref on BLOCKED vnode");
 +		}
 +		mutex_exit(vp->v_interlock);
 +	}
  
  	KASSERT(vp->v_usecount != 0);
  
 

