Subject: Re: NFS locking
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: David Laight <david@l8s.co.uk>
List: netbsd-help
Date: 09/02/2005 19:48:31
On Fri, Sep 02, 2005 at 12:42:12PM +0200, Manuel Bouyer wrote:
> 
> I think you should try TCP mounts, then. The TCP flow control may work around
> the issue.
> The problem here is that, with UDP mounts (the default), if a packet gets
> lost, the system will send again the exact same 32k stream of packets,
> which will get lost again, because the NIC on the other end can't handle
> a burst of 32k.

Actually they are a slightly different stream of 32k packets!
Amongst other things this amounts to a DoS attack, because the receiving
system ends up holding a whole pile of incomplete IP datagrams awaiting
reassembly.

Given that many of our network card drivers (and many of the ones on
Linux) run with a pitifully small number of rx (and tx for that matter)
buffer descriptors - often 8 rx and 2 tx - it is hardly surprising that
they cannot correctly receive the 32k UDP datagrams that NetBSD i386 is
configured to send by default, since each one arrives as around 24
full-length frames.
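
As a rough check of that frame count, here is a minimal sketch in Python.
It assumes a 1500-byte Ethernet MTU, a 20-byte IPv4 header with no options,
and about 128 bytes of RPC/NFS header per write. That last figure is a guess
and varies per request, and the function name is made up for illustration.

```python
import math

ETHERNET_MTU = 1500   # standard Ethernet payload size in bytes
IP_HEADER = 20        # IPv4 header without options
UDP_HEADER = 8
RPC_OVERHEAD = 128    # ASSUMED RPC + NFS header size; varies per request


def fragments_per_write(nfs_block_size, mtu=ETHERNET_MTU):
    """IP fragments needed to carry one NFS write sent as a single UDP datagram."""
    udp_datagram = UDP_HEADER + RPC_OVERHEAD + nfs_block_size
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(udp_datagram / per_fragment)


for size in (8192, 32768):
    print(size, fragments_per_write(size))
```

With these assumptions a 32k write comes to 23 fragments, in line with the
"around 24" above, against 6 for an 8k write.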

Systems capable of saturating the network segment will be ok, but otherwise
a slow receiving system will always lose.

Until, and unless, all the network cards have sufficient rx buffering
for several 32k buffers (look at the traffic pattern for copying a large
file to an nfs mounted filesystem) - so you might want 128 rx buffers -
we really ought to reduce the default NFS block size back to the
industry-wide default of 8192.
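
To put a number on "several", a sketch under stated assumptions: each 32k
write is taken as 24 frames (the figure above), rx rings are assumed to come
in power-of-two sizes, and the number of writes in flight is a hypothetical
parameter, not something measured.

```python
def rx_descriptors_needed(outstanding_writes, frames_per_write=24):
    """Smallest power-of-two rx ring that holds every frame of the writes in flight."""
    frames = outstanding_writes * frames_per_write
    ring = 1
    while ring < frames:
        ring *= 2
    return ring


for writes in (1, 2, 4, 5):
    print(writes, rx_descriptors_needed(writes))
```

Four or five writes in flight both round up to 128 descriptors under these
assumptions, whereas a ring of 8 cannot hold even one.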

I've also seen cases where the disk driver's 'elevator' algorithm will
starve one of the nfs server processes, causing timeouts and retransmissions.
Once this happens the degradation is absolutely catastrophic.
The only solution there is to reduce the number of nfs server processes
from the (usual) default of 4 to 1.

	David

-- 
David Laight: david@l8s.co.uk