Subject: Re: nfsd: locking botch in op %d
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Frank van der Linden <firstname.lastname@example.org>
Date: 03/08/2001 14:43:07

On Thu, Mar 08, 2001 at 06:40:16AM -0500, der Mouse wrote:
> The NFS server on my house LAN's NFS subnet fell over with "nfsd:
> locking botch in op 3". Investigating, I find this comes from
> nfs_syscalls.c, where there's a recommendation to audit the relevant
> entry in nfsrv3_procs, which in this case is nfsrv_lookup.

Yes, this has been seen before. The case that was reported before
was a netbsd-1-5 branch kernel as a server, and a Linux client,
running 'du -a'. It also crashed when doing a lookup for a device
node ("sd0a" in your case), curiously enough, so there may be a problem
there.

Data collected from the other report showed that the locking problem
was not inside the NFS server code (nfs_namei() in nfs_subs.c) itself.
There was a lock mismatch already when lookup() called from there
returned. So there must be a deeper problem somewhere, possibly
related to looking up a device node.

Unfortunately, tracking this down basically means either reading
through a lot of code, or changing every vnode lock call into
a debug statement, saving the current line and file, as well
as maintaining a linked list of locked vnodes for each process.
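
To make the second option concrete, here is a rough userland sketch of
what I have in mind. It is not kernel code: struct vnode and struct proc
are cut-down stand-ins, and dbg_lock(), dbg_unlock(), DBG_LOCK(),
DBG_UNLOCK(), dbg_report_held() and lock_debug_demo() are names made up
for this example. In the kernel, the wrappers would go around the real
vnode lock and unlock calls instead.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's struct vnode and struct proc. */
struct vnode { const char *name; int locked; };

/* One entry per vnode lock currently held, with the call site that took it. */
struct lock_record {
    struct vnode *vp;
    const char *file;
    int line;
    struct lock_record *next;
};

struct proc { struct lock_record *held; };

/* Take the lock and push a record onto the process's list of held locks. */
static void dbg_lock(struct proc *p, struct vnode *vp, const char *file, int line)
{
    struct lock_record *r = malloc(sizeof(*r));

    if (r == NULL)
        abort();
    if (vp->locked)
        fprintf(stderr, "%s:%d: %s already locked\n", file, line, vp->name);
    vp->locked = 1;
    r->vp = vp;
    r->file = file;
    r->line = line;
    r->next = p->held;
    p->held = r;
}

/* Release the lock and drop its record; complain if this process never took it. */
static void dbg_unlock(struct proc *p, struct vnode *vp, const char *file, int line)
{
    struct lock_record **rp;

    for (rp = &p->held; *rp != NULL; rp = &(*rp)->next) {
        if ((*rp)->vp == vp) {
            struct lock_record *r = *rp;

            *rp = r->next;
            free(r);
            vp->locked = 0;
            return;
        }
    }
    fprintf(stderr, "%s:%d: unlock of %s not held by this process\n",
        file, line, vp->name);
}

/* Every lock call site becomes one of these, recording file and line. */
#define DBG_LOCK(p, vp)   dbg_lock((p), (vp), __FILE__, __LINE__)
#define DBG_UNLOCK(p, vp) dbg_unlock((p), (vp), __FILE__, __LINE__)

/* Print and count the locks still held, e.g. where nfsd does its check. */
static int dbg_report_held(const struct proc *p)
{
    const struct lock_record *r;
    int n = 0;

    for (r = p->held; r != NULL; r = r->next, n++)
        fprintf(stderr, "still held: %s (locked at %s:%d)\n",
            r->vp->name, r->file, r->line);
    return n;
}

/* Lock a directory and a device node, then "forget" to unlock the node. */
int lock_debug_demo(void)
{
    struct proc p = { NULL };
    struct vnode dir = { "dir", 0 }, dev = { "sd0a", 0 };

    DBG_LOCK(&p, &dir);
    DBG_LOCK(&p, &dev);
    DBG_UNLOCK(&p, &dir);
    assert(dev.locked && !dir.locked);
    return dbg_report_held(&p);   /* the record for dev is left on the list */
}
```

Calling lock_debug_demo() reports the one lock that was left held, along
with the file and line where it was taken, which is exactly the piece of
information missing when the "locking botch" message fires.
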
If you could look into this one as well, that'd be great. Are
you using softdeps on the server, btw?