Subject: Re: [HACKERS] PostgreSQL, NetBSD and NFS
To: David Laight <david@l8s.co.uk>
From: Andrew Gillham <gillham@vaultron.com>
List: current-users
Date: 02/06/2003 10:27:04
On Wed, Feb 05, 2003 at 09:24:48PM +0000, David Laight wrote:
> > If he is using UDP rather than TCP
> > as the transport layer, another potential issue is that 32K requests will
> > end up as IP packets with a very large number of fragments, potentially
> > exposing some kind of network stack bug in which the last fragment is
> > dropped or corrupted.
> 
> Actually it is worse than that, and IMHO 32k UDP requests are asking for
> trouble.
> 
> A 32k UDP datagram is about 22 ethernet packets.  If ANY of them is
> lost on the network, then the entire datagram is lost.  NFS must
> regenerate the request on a timeout.  The receiving system won't
> report that it is missing a fragment.

As he stated several times, he has tested with TCP mounts and observed
the same issue.  So the above issue shouldn't be related.
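
For reference, the fragment arithmetic in the quoted paragraph can be
sketched roughly as below. This is only an illustration under stated
assumptions: a standard 1500-byte Ethernet MTU, a 20-byte IPv4 header with
no options, RPC/NFS header overhead ignored, and independent per-frame
loss. The function names are made up for this example.

```python
import math

ETHERNET_MTU = 1500   # assumed: standard Ethernet MTU in bytes
IP_HEADER = 20        # assumed: IPv4 header without options
UDP_HEADER = 8        # UDP header, carried once per datagram


def fragment_count(payload_bytes, mtu=ETHERNET_MTU):
    """IP fragments needed for one UDP datagram carrying payload_bytes."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil((payload_bytes + UDP_HEADER) / per_fragment)


def datagram_loss_probability(fragments, frame_loss):
    """Chance that at least one fragment is lost (independent losses assumed)."""
    return 1.0 - (1.0 - frame_loss) ** fragments


if __name__ == "__main__":
    for size in (8 * 1024, 32 * 1024):
        n = fragment_count(size)
        # 0.001 is an arbitrary illustrative per-frame loss rate.
        print(size, n, datagram_loss_probability(n, 0.001))
```

Under these assumptions a 32k request needs 23 fragments, in line with the
"about 22" quoted above, while an 8k request needs 6. Losing any one
fragment loses the whole datagram, so the loss rate grows with the request
size.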

> There are also a lot of ethernet cards out there which don't have
> enough buffer space for 32k of receive data.  Not to mention the
> fact that NFS can easily (at least on some systems) generate
> concurrent requests for different parts of the same file.
> 
> I would suggest reducing the size back to 8k, even that causes
> trouble with some cards.

If NetBSD as an NFS client is this fragile we have problems.  The default
read/write size shouldn't be 32kB if that is not going to work reliably.

> It should also be realised that transmitting 22 full sized, back
> to back frames on the ethernet doesn't do anything for sharing
> the bandwidth between different users.  The MAC layer has to be very
> aggressive in order to get a packet in edgeways (so to speak).

So what?  If it is a switched network, which I assume it is since he was
talking to the NetApp gigabit port earlier, then this is irrelevant.  Even
the $40 Fry's switches are more or less non-blocking. 

Even if he is saturating the local *hub*, it shouldn't cause NetBSD to fail;
it would just be rude. :-)

There could be some packet mangling on the network; checking the number
of retransmissions on either end of the TCP connection should give you an
idea about that.
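
One rough way to turn that check into a number is to sample the TCP
counters twice (on BSD-style systems `netstat -s -p tcp` reports them,
though the exact wording of the counters varies) and compare the deltas.
The sketch below assumes you have already pulled out the "data packets
sent" and "data packets retransmitted" values by hand; the function name
and the sample figures are hypothetical.

```python
def retransmit_ratio(before, after):
    """Fraction of data packets retransmitted between two counter samples.

    before / after: (packets_sent, packets_retransmitted) tuples read
    from the host's TCP statistics at two points in time.
    """
    sent = after[0] - before[0]
    retransmitted = after[1] - before[1]
    if sent <= 0:
        # No traffic in the interval (or counters were reset): nothing to report.
        return 0.0
    return retransmitted / sent


if __name__ == "__main__":
    # Hypothetical samples taken before and after an NFS test run.
    print(retransmit_ratio((1000, 10), (2000, 15)))
```

A ratio that stays near zero during the test run would argue against
packet loss or mangling on the wire; a noticeably higher one would point
back at the network.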

-Andrew