Subject: Re: NFS problems
To: None <rick@snowhite.cis.uoguelph.ca>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 07/23/2002 10:38:21
On Tue, 23 Jul 2002 rick@snowhite.cis.uoguelph.ca wrote:
> Well, I'm not directly involved in the NetBSD code, but here are a few
> random comments that might be useful:
>
> - If you have too large a blocksize when using NFS over UDP, you'll
> see "IP Fragments dropped due to timeout" when you do "netstat -s".
> (These indicate that fragments of the large UDP datagram aren't
> making it through the network interconnect and cause serious
> performance degradation. The "fix" is to either reduce the read/write
> data size or switch to TCP. See "man mount_nfs" to find out how to
> do either of these.)
>
> - For NFS Version 2, the spec. (RFC 1094) stipulated a maximum of 8,192 bytes,
> so using a larger blocksize for V2 violates the spec. (I'm sure
> implementations do it and I'm sure some work, but if you want to be
> technically correct, you should only allow block sizes > 8192 for V3.)
>
> - The NFS code is much more sensitive to buffer cache race conditions
> than local file systems. One reason is that local
> file systems almost never take several seconds to do an I/O operation.
> (Mounting a really slow NFS server, like one with debugging printfs
> turned on, is the best way to "find" these. Been there, have the T-shirt.
> Now, once you "find" them, fixing them can be great fun.) In the past,
> when I saw mysterious intermittent hangs, it usually turned out to be
> buffer cache bugs. "ps axl" should give you a hint, based on what the
> processes are waiting on.
I would tend to think something like this is the problem. Remember that
Steve pointed out he's seeing the problem over the loopback, so I don't
think that packet fragmentation is the issue. :-)
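
For reference, the fragmentation arithmetic behind Rick's first point can be
sketched roughly as below. This is only an illustration, not anything taken
from the NetBSD sources: the function names, the 1500-byte Ethernet MTU, the
option-free 20-byte IPv4 header, and the rpc_overhead parameter (defaulting
to 0, although real NFS requests carry RPC headers too) are all assumptions.

```python
# Rough sketch: how many IPv4 fragments one NFS-over-UDP request needs.
# Assumptions (not from the thread): 1500-byte MTU, 20-byte IPv4 header
# without options, 8-byte UDP header, RPC/NFS header size passed in by hand.

def ip_fragments(data_size, mtu=1500, ip_header=20, udp_header=8,
                 rpc_overhead=0):
    """Return the number of IP fragments for one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = ((mtu - ip_header) // 8) * 8
    total = udp_header + rpc_overhead + data_size
    return -(-total // per_fragment)  # ceiling division


def delivery_probability(fragments, loss_rate):
    """Chance that all fragments arrive, assuming independent losses."""
    return (1.0 - loss_rate) ** fragments


if __name__ == "__main__":
    for size in (1024, 8192, 32768):
        n = ip_fragments(size)
        print(size, n, round(delivery_probability(n, 0.01), 3))
```

Losing any single fragment means the whole datagram is eventually discarded
when reassembly times out, so the larger the read/write size, the more
fragments there are and the more likely an entire request has to be resent.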
Take care,
Bill