netbsd-help: NFS...pilot error?

Subject: NFS...pilot error?
To: None <netbsd-help@netbsd.org>
From: Richard Rauch <rauch@rice.edu>
List: netbsd-help
Date: 12/09/2001 19:25:41

I've set up an NFS mount for sharing a subdirectory.  For the sake of
concreteness, the situation is:

 hermes (tower with lots of disk space); exports /usr/nfs-export via NFS.

 odysseus (laptop); mounts hermes:/usr/nfs-export onto /hermes.


Since fiddling with my TCP settings and MTU to cope with the MSS/PPPoE
problems, I find that odysseus tends to lock up waiting on NFS requests
(at least when reading large files over NFS).  The only way out of a
stalled NFS access is to cycle odysseus's power and wait on an fsck.
(I know there's a mount option somewhere that lets me kill processes
waiting on NFS.  If I enable it and use it, the only danger is that
application data may be corrupted if the client is in the middle of
writing data, yes?)

The problems tend to be associated with buffer overruns.  (These do not
seem to be inherently fatal; I can get lots of overruns moving files with
scp, without real mishap.)  I assume that I can sort out the immediate
problem by adjusting my LAN's MTU.  (As a quick fix to the MSS/MTU problems
on PPPoE, I lowered the interface MTU instead of setting a route MTU on the
default route.)
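
For concreteness, the MSS/MTU arithmetic involved can be sketched roughly
as below.  This is only an illustration under stated assumptions: IPv4 and
TCP headers without options, the usual 8 bytes of PPPoE overhead on a
1500-byte Ethernet MTU, and an arbitrary 8192-byte UDP payload chosen purely
as an example (it is not a measured value from my setup).  The function
names are made up for this sketch.

```python
import math

IPV4_HEADER = 20     # bytes, assuming no IP options
TCP_HEADER = 20      # bytes, assuming no TCP options
UDP_HEADER = 8       # bytes
PPPOE_OVERHEAD = 8   # bytes: 6-byte PPPoE header + 2-byte PPP protocol ID


def tcp_mss(mtu):
    """Largest TCP payload that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def udp_fragments(payload, mtu):
    """Number of IPv4 fragments needed to carry one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil((payload + UDP_HEADER) / per_fragment)


ethernet_mtu = 1500
pppoe_mtu = ethernet_mtu - PPPOE_OVERHEAD

print("PPPoE MTU:", pppoe_mtu)
print("TCP MSS over PPPoE:", tcp_mss(pppoe_mtu))
# 8192 is an illustrative payload size, not a value taken from my setup.
print("Fragments per 8192-byte UDP datagram:", udp_fragments(8192, pppoe_mtu))
```

A TCP sender can keep each segment within the MSS, whereas a UDP datagram
larger than the MTU has to be split into IP fragments, and losing any one
fragment loses the whole datagram.  Whether that is what is biting me here,
I don't know.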

However...this seems like a serious problem, and it seems to me that
something as heavily used as NFS should (if itself properly configured)
not suffer such serious side effects from something as minor as an MTU
problem.  (Especially since scp, etc., can cope quite well.)

Am I right?  Do I necessarily have a poorly configured NFS?  Or is NFS
just very touchy about being in an _environment_ that's properly
configured?


  ``I probably don't know what I'm talking about.'' --rauch@math.rice.edu