Subject: Re: nfsd: locking botch in op %d
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Tracy J. Di Marco White <gendalia@iastate.edu>
List: tech-kern
Date: 03/10/2001 12:07:13
}>> The NFS server on my house LAN's NFS subnet fell over with "nfsd:
}>> locking botch in op 3".
}
}> Yes, this has been seen before.  The case that was reported before
}> was a netbsd-1-5 branch kernel as a server, and a Linux client,
}> running 'du -a'.  It also crashed when doing a lookup for a device
}> node ("sd0a" in your case), curiously enough, so there may be a
}> problem there.

I've got the other case where this was reported.  The fileserver I've
crashed is running a rather large RAID5, and since I'd rather not
crash it, I've tried as best I can to create a second system that I
can crash.  The hardware is not identical, but the NFS-exported fs is
a RAID5 of six 1GB (instead of 18GB) disks, mounted using softdeps.
Using scp & tar instead of NFS & cp, I copied just over 500MB of the
filesystem tree that was consistently crashing the original machine
to the test system, NFS-mounted it on the Linux machine, and utterly
failed to have any problems.  The test system was upgraded to the
same version of 1.5.1_ALPHA as the main fileserver from the same
source tree.  I'm about to try to recopy as much of the tree as
I can fit, to give 'du -a' more to go through.

}> Are you using softdeps on the server, btw?
}
}No.  I've never even tried to use softdeps, anywhere.

I am using softdeps, which was where Frank thought the problem
might be.

Tracy J. Di Marco White
Project Vincent Systems Manager
gendalia@iastate.edu