Subject: Re: [HACKERS] PostgreSQL, NetBSD and NFS
To: Greg Copeland <greg@copelandconsulting.net>
From: Michael Hertrick <m.hertrick@neovera.com>
List: current-users
Date: 02/05/2003 20:48:59
I've been watching this thread since the beginning, and now that y'all
brought up networking, I believe I may have some useful suggestions in that
arena.

Tom Lane <tgl@sss.pgh.pa.us> writes:
> I'm thinking maybe one or both LAN cards have a problem with packets
> exceeding a certain size.
>

Are all the intermediate network devices at layer 2 (switches)?  If so, a
quick look at the error counters for the ports involved would confirm or
rule out any problems with those devices.

I'm sure that if you have an MTU of 1500 bytes across the board (on the
hosts and the switch(es)), then you will not have a problem with
fragmentation at that layer on 100 Mbit Ethernet.  Make sure every host and
switch port is running at 100baseTX-FDX.

If you are using hubs, DO NOT use full duplex on your hosts.  A hub cannot
function at full duplex, only half.

If there are any intermediate layer 3 devices (routers), it's possible for
them to fragment your packets.  Verify the MTU on any of these devices as
well as the appropriate duplex setting.
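
To get a feel for how much fragmentation a smaller MTU causes, here is a
rough Python sketch of the IPv4 arithmetic.  The function name is made up
for illustration, and the 8192-byte payload in the usage note is only an
assumed example size, not something taken from this thread.  It assumes a
plain 20-byte IP header with no options.

```python
import math

def ip_fragment_count(ip_payload_bytes, mtu=1500, ip_header_bytes=20):
    """Return how many IPv4 packets are needed to carry ip_payload_bytes.

    ip_payload_bytes is everything after the IP header (for UDP, that is
    the 8-byte UDP header plus the data).
    """
    if ip_payload_bytes <= 0:
        raise ValueError("payload must be positive")
    room = mtu - ip_header_bytes
    if ip_payload_bytes <= room:
        return 1  # fits in a single packet, no fragmentation
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = room // 8 * 8
    return math.ceil(ip_payload_bytes / per_fragment)
```

For example, ip_fragment_count(8192 + 8) works out to 6 packets at an MTU
of 1500, and ip_fragment_count(8192 + 8, mtu=576) to 15, so one lost
fragment costs the whole datagram more often as the MTU shrinks.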

Run netstat -s after passing a good bit of traffic between the hosts in
question.  Don't forget to do the math to determine error percentages: a raw
error count means little until you compare it with the total packet count.
tcpdump could also reveal much about the packets, such as their size and
contents, whether they are fragments, whether the DF bit is set, which host
was the last to communicate, etc.  A tcpdump capture alongside your
application trace may give you just the insight you need.
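
The "math" I mean is nothing fancy.  A minimal Python sketch, assuming you
note down the packet and error counts by hand before and after the test run
(the function name and the tuple layout are my own invention, not anything
netstat produces directly):

```python
def error_percentage(before, after):
    """Percentage of packets in error between two counter snapshots.

    before and after are (packets, errors) tuples read from the same
    counters at the start and end of the test run.
    """
    packets = after[0] - before[0]
    errors = after[1] - before[1]
    if packets <= 0:
        raise ValueError("no traffic passed between the two snapshots")
    return 100.0 * errors / packets
```

Taking the difference between two snapshots matters because the counters
are cumulative since boot; errors from last week should not be blamed on
today's test.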

Do you have any packet filters between the devices?  Make sure they're not
dropping anything you need.  I don't remember if NFS is one of these, but
some protocols talk from high port to high port for certain operations and
from high port to low port for others.

One thing I'd try that is a surefire way to determine whether your network
hardware is to blame, if you don't want to do all that crap above: run your
scenario with your two hosts connected directly via an Ethernet crossover
cable and the NICs hard-coded to 100baseTX-FDX.  That will rule out
everything except the cable and your NICs.

Speaking of NICs, some [really old] NICs may report that they are running at
full duplex when they really are not and cannot.  Incrementing port error
counters (specifically, frame-check-sequence errors and collisions) will
give this away, though.
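
If you would rather not eyeball the counters, the check can be sketched in
a few lines of Python.  The counter names "fcs_errors" and "collisions" are
placeholders I picked; substitute whatever your switch or driver actually
reports.  The idea is simply that on a link that is truly full duplex,
collisions should not be incrementing at all.

```python
def incrementing_counters(before, after, watched=("fcs_errors", "collisions")):
    """Return the watched counters that grew between two snapshots.

    before and after are dicts mapping counter name to its value, read
    from the same port before and after a test run.  Missing counters
    are treated as zero.
    """
    return [name for name in watched
            if after.get(name, 0) > before.get(name, 0)]
```

An empty list does not prove the link is healthy, but a non-empty one on a
port that claims full duplex is a good reason to suspect the NIC or a
duplex mismatch.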


> > Is this purely a diagnostic suggestion?
>
> Well, if it changes anything then it would definitely show there's a
> hardware problem to fix...
>


--peace,
~~Mike.