tech-net: problems with outgoing packets, no mbufs

Subject: problems with outgoing packets, no mbufs
To: None <tech-net@netbsd.org>
From: Michael C. Richardson <mcr@sandelman.ottawa.on.ca>
List: tech-net
Date: 09/11/1998 10:04:52
  I am experiencing problems with a de0 device. Specifically, I get periods
of very slow network performance. This manifests itself as slow NFS, and
very unresponsive SSH/xterms. 
  Looking at the output of netstat -s, I see:

ip:
        2704197 packets sent from this host
        0 packets sent with fabricated ip header
        3207065 output packets dropped due to no bufs, etc.
...
tcp:
                210930595 old duplicate packets
                1568 packets with some dup. data (155168 bytes duped)
                231 out-of-order packets (63603 bytes)
                2565 packets (629267 bytes) of data after window
                0 window probes
                1191227 window update packets
                0 packets received after close
                1222220 discarded for bad checksums
                120549984 discarded for bad header offset fields
	

istari-[~] mcr 832 %netstat -m
40 mbufs in use:
        37 mbufs allocated to data
        2 mbufs allocated to packet headers
        1 mbufs allocated to socket names and addresses
34/124 mapped pages in use
253 Kbytes allocated to network (28% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
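
To put the ip counters quoted above in proportion, here is a minimal sketch that parses `netstat -s` style text and relates dropped output packets to packets sent. The function names `parse_counters` and `drop_ratio` are hypothetical helpers, not part of any NetBSD tool, and the parser assumes each counter line has the form "<number> <description>" as in the output shown.

```python
import re

# Matches counter lines such as
# "        3207065 output packets dropped due to no bufs, etc."
COUNTER_RE = re.compile(r"^\s*(\d+)\s+(.+?)\s*$")


def parse_counters(text):
    """Return {description: count} for every '<number> <description>' line.

    Section headers such as "ip:" do not start with a number and are skipped.
    """
    counters = {}
    for line in text.splitlines():
        m = COUNTER_RE.match(line)
        if m:
            counters[m.group(2)] = int(m.group(1))
    return counters


def drop_ratio(counters):
    """Dropped output packets per packet sent (hypothetical helper)."""
    sent = counters.get("packets sent from this host", 0)
    dropped = counters.get("output packets dropped due to no bufs, etc.", 0)
    return dropped / sent if sent else float("nan")


# The ip section exactly as quoted in the message above.
SAMPLE = """\
ip:
        2704197 packets sent from this host
        0 packets sent with fabricated ip header
        3207065 output packets dropped due to no bufs, etc.
"""

if __name__ == "__main__":
    c = parse_counters(SAMPLE)
    print(f"output drops per packet sent: {drop_ratio(c):.2f}")
```

With the figures quoted, the drop counter is larger than the sent counter, while `netstat -m` reports no denied or delayed memory requests.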

  This is a 10Mb/s 10baseT/10base2 combination network, dumb hub.
  There are 7 hosts on this network: firewall, file server, Xserver, 
sun3, sparc 2, two 486s. All but the sparc 2 (which is running Solaris 2.5.1
right now) are running NetBSD.
  No other machines have difficulty talking to each other.
  This problem seems to have developed recently. I had an ep0 in the machine
and have actually put it back in place, but it wasn't configured for 10baseT,
and I haven't taken the time to run the config utility again.

  I'm looking for suggestions...
  It may be that the problem is really with my Xterminal. The TCP stats 
seem to suggest that in fact the poor response may be due to constant
retransmits by my Xterminal. The Xterminal is running:
NetBSD 1.3F (XTERM) #2: Sat Jul 25 19:26:58 EDT 1998
    mcr@istari.sandelman.ottawa.on.ca:/corp/network/kernels/compile-current/XTERM
cpu0: Cyrix 6x86 (486-class)
real mem  = 33161216
avail mem = 28549120
...
ne2 at pci0 dev 9 function 0: Realtek 8029 Ethernet
ne2: Ethernet address 00:00:b4:58:92:db
ne2: interrupting at irq 10


The file server is:

NetBSD 1.3F (SSW) #1: Thu Aug 13 12:00:01 EDT 1998
    mcr@istari.sandelman.ottawa.on.ca:/corp/network/kernels/compile-current/SSW
cpu0: family 5 model 1 step 4
cpu0: AMD K5 (586-class)
real mem  = 66715648
avail mem = 59699200
...
de0 at pci0 dev 20 function 0
de0: interrupting at irq 11
de0: SMC 21041 [10Mb/s] pass 2.1
de0: address 00:e0:29:11:0e:72
..
de0: enabling 10baseT port
de0: abnormal interrupt: receive process stopped

  This last message has appeared since 1.3_BETA, when I put the card
in place in February, but things were okay until recently. Maybe someone
is spamming me with TCP segments, but I haven't been able to find that
with tcpdump.
  The netstat -s output from my xterminal seems to be somewhat more normal.
  Is this just cabling? If so, any suggestions on diagnosis?
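
Since the `netstat -s` counters are cumulative since boot, one way to narrow this down is to compare two snapshots taken during a slow period and look only at what changed. The sketch below is a hedged illustration: `parse_counters`, `counter_deltas` and `snapshot` are hypothetical names, the parser assumes the "<number> <description>" layout shown earlier, and digits inside a description (such as byte counts in parentheses) are replaced with `N` so that the same counter matches across snapshots.

```python
import re
import subprocess

COUNTER_RE = re.compile(r"^\s*(\d+)\s+(.+?)\s*$")


def parse_counters(text):
    """Return {(section, description): count} from netstat -s style text."""
    counters = {}
    section = ""
    for line in text.splitlines():
        stripped = line.strip()
        # Section headers look like "ip:" or "tcp:".
        if stripped.endswith(":") and not stripped[0].isdigit():
            section = stripped[:-1]
            continue
        m = COUNTER_RE.match(line)
        if m:
            # Normalise embedded numbers, e.g. "(63603 bytes)" -> "(N bytes)".
            desc = re.sub(r"\d+", "N", m.group(2))
            counters[(section, desc)] = int(m.group(1))
    return counters


def counter_deltas(before, after):
    """Return only the counters that changed, with the amount of change."""
    return {
        key: after[key] - before[key]
        for key in after
        if key in before and after[key] != before[key]
    }


def snapshot():
    """Run `netstat -s` and parse it (assumes netstat is on PATH)."""
    out = subprocess.run(
        ["netstat", "-s"], capture_output=True, text=True, check=True
    ).stdout
    return parse_counters(out)
```

A possible usage is to call `snapshot()`, wait some seconds while the slowdown is happening, call it again, and print `counter_deltas(first, second)`. Counters that grow quickly in that window are the ones worth chasing.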

  It tends to go away with a reboot. One possible reason I didn't
see this before is that my NCR SCSI card was too unreliable. It has
been replaced with an Adaptec 2940UW, with great results, except that I
still can't add my SCA drive...

   :!mcr!:            |  Network and security consulting/contract programming
   Michael Richardson |         Firewalls, TCP/IP and Unix administration
 Personal: mcr@sandelman.ottawa.on.ca. PGP key available.
 Corporate: sales@sandelman.ottawa.on.ca. 
	ON HUMILITY: To err is human, to moo bovine.