Subject: Re: uvm deadlock against nfs w/ loopback-interface mounts?
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 12/09/2002 13:20:36
On Mon, 9 Dec 2002, Jonathan Stone wrote:

> I believe I've run into a deadlock between uvm and NFS.  It's
> reproducible on an i386, with several kernels, from yesterday's
> anoncvs back to 1.5ZC.
>
> The scenario: I have an app which uses stdio to read or write large
> files (e.g., twice as big as RAM in the machine).  The app will
> usually run over NFS.  To measure performance (at home, with a laptop)
> I tried exporting my /usr to 127.0.0.1, starting rpcbind, mountd,
> nfsds, and setting vfs.nfs.iothreads to the same value (16, iirc).
> The app runs happily until its file-size hits roughly half available
> memory (just over 100Mbytes, for a 256M laptop). Then all processes
> quickly deadlock.
>
> Doing 'ps' as the app progresses shows that first the app, then another
> process, are waiting on uvn_fp1.
>
> Jumping into ddb shows that there are ~no free pages (show uvmexp says
> 2 free pages).  The pagedaemon is stuck waiting on nfsaio; the one
> nfsd that is waiting is also waiting on uvn_fp1.  (It's easily
> reproducible, at the cost of waiting for a slow laptop disk to fsck; I
> can find more data if that'd help.)
>
> Looks like uvm is trying to cache the file contents twice: both on the
> client-side file and the server-side.  So far so good.  What's not
> good is that we end up with the pagedaemon waiting for an NFS
> operation, but the nfsd is waiting for the pagedaemon to free up
> pages.
>
> I'm guessing I could retune uvm to try and keep a bigger reserve of free
> pages, so as to avoid the problem; but it'd be nice to fix it
> properly.  Anyone care to offer a clue on what *is* a proper fix?

Unfortunately, the proper fix at the moment is not to NFS-mount a
filesystem served by the same system.

This deadlock issue has been known about for a while (like longer than
I've been a NetBSD developer), and there's no easy fix.

The problem isn't specific to NFS. msdosfs can do it too (as Roland found out
a few years ago). The problem is that to free pages, sometimes we need
pages. NFS needs mbufs to send data to clean cache contents, msdosfs needs
to read blocks to figure out where to write blocks, etc.
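
To make the circular wait concrete, here is a rough sketch in Python of the
wait-for relationships described in the report. The process names and wait
channels come from the report; the graph layout and the find_cycle helper
are illustrative assumptions of mine, not anything taken from the kernel.

```python
# Toy wait-for graph: an edge A -> B means "A cannot proceed until B does".
# Names come from the report; the graph itself is an illustrative assumption.
WAITS_FOR = {
    "app": ["pagedaemon"],         # app sleeps on uvn_fp1: needs free pages
    "pagedaemon": ["nfsd"],        # pagedaemon sleeps on nfsaio: needs the NFS write done
    "nfsd": ["pagedaemon"],        # nfsd sleeps on uvn_fp1: needs free pages
}


def find_cycle(graph, start):
    """Return the nodes of a wait cycle reachable from start, or None."""
    path = []          # current depth-first path
    on_path = set()    # members of path, for quick lookup
    visited = set()    # nodes already fully or partly explored

    def visit(node):
        if node in on_path:
            # Reached a node already on the path: that closes a cycle.
            return path[path.index(node):] + [node]
        if node in visited:
            return None
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        return None

    return visit(start)


if __name__ == "__main__":
    cycle = find_cycle(WAITS_FOR, "app")
    print(" -> ".join(cycle) if cycle else "no cycle")
```

Any cycle in a graph like this means nothing on it can make progress unless
one of the waiters can get its resource from somewhere outside the cycle,
which is what a dedicated page reserve would provide.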

While I'd love for us to fix it, I'm not 100% sure how to do it.

Take care,

Bill