Subject: kern/5681: nfs client lock during heavy nfs activity
To: None <gnats-bugs@gnats.netbsd.org>
From: None <laine@MorningStar.Com>
List: netbsd-bugs
Date: 06/30/1998 15:26:26
>Number:         5681
>Category:       kern
>Synopsis:       during heavy nfs activity, processes lock in nfsvinval, requiring reboot to clear
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jun 30 12:35:01 1998
>Last-Modified:
>Originator:     Laine Stump
>Organization:
Ascend Communications
>Release:        1.3.1
>Environment:
	
System: NetBSD bass.morningstar.com 1.3.1 NetBSD 1.3.1 (GENERICASCEND) #0: Tue Apr 14 14:47:08 EDT 1998 laine@bass.morningstar.com:/amd/estrela/n/estrela/1/archive/OS/NetBSD/1.3.1/source/usr/src/sys/arch/i386/compile/GENERICASCEND i386


>Description:

We have a farm of NetBSD machines that share in a parallel compile
via pmake-customs and a gmake patch; individual compiles are handed
out to 7 machines, each of which has the same directory NFS-mounted
across 100Mbit ethernet. Under extreme load, one or more of the
machines can get a bunch of processes locked in nfsvinval. When this
occurs, there always seems to be one other process locked in getblk,
and another in vinval.

We dropped into the debugger and found that all of these processes are
sleeping indefinitely. ddb gives the stack for the getblk process as:

  tsleep(f0fe37ec, 11, f015df19, 0)
  getblk(f0a82a00, 1, 2000, 0, 0)
  nfs_getcache(f0a82a00, 1, 2000, f0b07200, 1000)

and the stack for one of the nfsvinvals as:

  tsleep(f0aae27a, 12, f01a8636, 0)
  vinvalbuf+63
  nfsopen+10a

(Sorry for the varying formats; those are just the notes we took down on paper).

According to ktrace, the processes that hang are always stuck
attempting an open() on the directory containing the source files of
the build (the preceding stat() of the same directory works just
fine, and accesses to any other directories on the same disk work
fine too).

>How-To-Repeat:

Create some sort of test that exercises an NFS server from lots of
clients, and run many (up to 15 or 20) copies of that test program on
each client machine. Eventually you should get into this state; a
rough sketch of such a client loop is below.
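(What actually triggers it for us is the pmake-customs build described
above; the following is only an illustrative sketch of the kind of
client loop we mean, and the mount point and file names in it are
made up.)

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Illustrative NFS hammer: stat(), open(), and read through a handful
 * of files in an NFS-mounted directory, forever.  Run 15 or 20 copies
 * of this on each client against the same mount.
 */
int
main(int argc, char **argv)
{
	char path[1024], buf[8192];
	char *dir;
	struct stat st;
	int i, fd;

	dir = (argc > 1) ? argv[1] : "/n/server/src";	/* made-up default */

	for (;;) {
		for (i = 0; i < 64; i++) {
			(void)sprintf(path, "%s/file%d.c", dir, i);
			if (stat(path, &st) == -1)
				continue;
			if ((fd = open(path, O_RDONLY)) == -1)
				continue;
			while (read(fd, buf, sizeof(buf)) > 0)
				;
			(void)close(fd);
		}
	}
	/* NOTREACHED */
}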

>Fix:

We don't know whether this helps or not, but we noticed several
places where B_WANTED is modified without being bracketed by splbio()
and splx(). We've changed all of those in our kernels and are running
that way. Here are our diffs against 1.3.1 (I notice that at least
the kern_physio.c changes have been deprecated in -current); a sketch
of the general pattern follows the diffs:

*** kern_physio.c~      Mon May 19 07:22:27 1997
--- kern_physio.c       Tue Jun 30 11:33:53 1998
***************
*** 290,296 ****
--- 290,298 ----
  putphysbuf(bp)
        struct buf *bp;
  {
+       int s;
  
+       s = splbio();
          bp->b_actf = bswlist.b_actf;
          bswlist.b_actf = bp;
          if (bp->b_vp)
***************
*** 299,304 ****
--- 301,307 ----
                  bswlist.b_flags &= ~B_WANTED;
                  wakeup(&bswlist);
          }
+         splx(s);
  }
  
  /*
*** vfs_bio.c~  Wed Jul  9 15:50:18 1997
--- vfs_bio.c   Tue Jun 30 11:35:05 1998
***************
*** 437,451 ****
                wakeup(&needbuffer);
        }
  
        /* Wake up any proceeses waiting for _this_ buffer to become free. */
        if (ISSET(bp->b_flags, B_WANTED)) {
                CLR(bp->b_flags, B_WANTED);
                wakeup(bp);
        }
  
-       /* Block disk interrupts. */
-       s = splbio();
- 
        /*
         * Determine which queue the buffer should be on, then put it there.
         */
--- 437,451 ----
                wakeup(&needbuffer);
        }
  
+       /* Block disk interrupts. */
+       s = splbio();
+ 
        /* Wake up any proceeses waiting for _this_ buffer to become free. */
        if (ISSET(bp->b_flags, B_WANTED)) {
                CLR(bp->b_flags, B_WANTED);
                wakeup(bp);
        }
  
        /*
         * Determine which queue the buffer should be on, then put it there.
         */
***************
*** 854,861 ****
--- 854,863 ----
        } else if (ISSET(bp->b_flags, B_ASYNC)) /* if async, release it */
                brelse(bp);
        else {                                  /* or just wakeup the buffer */
+               int s = splbio();
                CLR(bp->b_flags, B_WANTED);
                wakeup(bp);
+               splx(s);
        }
  }
  
*** vfs_cluster.c~      Tue Apr 14 12:37:28 1998
--- vfs_cluster.c       Tue Jun 30 11:35:13 1998
***************
*** 486,493 ****
--- 486,495 ----
        if (bp->b_flags & B_ASYNC)
                brelse(bp);
        else {
+               int s = splbio();
                bp->b_flags &= ~B_WANTED;
                wakeup((caddr_t)bp);
+               splx(s);
        }
  }
  

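For clarity, the general pattern all of the hunks above apply is
sketched below. This is an illustration only, not code from the tree,
and buf_clear_wanted() is a made-up name: the point is simply that
clearing B_WANTED and issuing the matching wakeup() happen with disk
interrupts blocked at splbio().

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/buf.h>

/*
 * Hypothetical helper illustrating the idiom used in the diffs above:
 * test and clear B_WANTED and wake any sleepers while disk interrupts
 * are blocked, so an interrupt-level biodone()/brelse() cannot race
 * with the flag update.
 */
static void
buf_clear_wanted(bp)
	struct buf *bp;
{
	int s;

	s = splbio();			/* block disk interrupts */
	if (bp->b_flags & B_WANTED) {
		bp->b_flags &= ~B_WANTED;
		wakeup((caddr_t)bp);	/* wake sleepers in getblk() et al. */
	}
	splx(s);			/* restore previous interrupt level */
}

Each hunk just arranges for the existing B_WANTED test-and-clear to
run inside such an splbio() section.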
>Audit-Trail:
>Unformatted: