Subject: ongoing strange network freezes with no error messages....
To: NetBSD Networking Technical Discussion List <tech-net@NetBSD.ORG>
From: Greg A. Woods <firstname.lastname@example.org>
Date: 07/07/2001 02:13:46
Well, I upgraded my wee pentium-150 router to NetBSD/i386 1.5W-20010624
tonight. This time I'd also increased NMBCLUSTERS to 32768 in hopes of
at least masking the problem, if not fixing it.
Since the last few times it got "stuck" were while I was playing a
128k MP3 stream and also doing a bunch of other NFS, FTP, etc. stuff that
kept my LAN really busy at the same time, I thought I'd try
replicating those conditions as best I could to see if the upgrade
and re-config made any difference.
Sure enough, non-local traffic came to a grinding halt not long after I'd
started my little "tests".
Once again there were no errors logged anywhere and no apparent
starvation of mbufs (in fact 'netstat -m' reported only three (3!) in
use at the time). The only apparent clues are the dropped packets
reported by 'netstat -id'.
The only way to find out for sure what's wrong is to log in on the
console and try pinging something on the LAN to see if ENOBUFS is
reported. (And the machine seems quite responsive given what it is....)
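The console check can also be approximated from a script. What follows is only a minimal sketch in Python, not something that was run on this router: it sends a single UDP datagram toward a LAN host and reports the errno name if the kernel refuses the send. The function names `errno_name` and `probe`, the use of UDP port 9 (discard), and the assumption that a full interface send queue surfaces as ENOBUFS on a UDP `sendto` the same way it does for ping are all mine.

```python
import errno
import socket


def errno_name(exc):
    """Map an OSError to its symbolic errno name, e.g. "ENOBUFS"."""
    return errno.errorcode.get(exc.errno, "unknown")


def probe(host, port=9, payload=b"probe"):
    """Send one UDP datagram toward host and report what the kernel said.

    Returns "ok" if the send was accepted, otherwise the errno name.
    Port 9 (discard) is an arbitrary choice; no reply is expected or needed.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.sendto(payload, (host, port))
        except OSError as exc:
            return errno_name(exc)
    return "ok"


# Hypothetical usage from the console, with "server" being a host on the LAN:
#     print(probe("server"))
```

A result of "ENOBUFS" here would point at the output side of the LAN interface rather than at mbuf exhaustion in general, which is consistent with 'netstat -m' showing almost nothing in use.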
After "ifconfig rtk0 down; ping server; ifconfig rtk1 up" all's well
again! (rtk0 is the LAN, the other two are to DSL and cable modems)
Here's what things looked like shortly afterwards:
# netstat -m
2 mbufs in use:
        1 mbufs allocated to data
        1 mbufs allocated to packet headers
0/28 mapped pages in use
80 Kbytes allocated to network (0% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
# netstat -ind
Name  Mtu   Network       Address              Ipkts Ierrs    Opkts Oerrs Colls Drops
rtk0  1500  <Link>        00:48:54:1e:10:e6    71805     0    75638     1  4509   707
rtk0  1500  204.92.254    126.96.36.199        71805     0    75638     1  4509   707
rtk0  1500  fe80::/64     fe80::248:54ff:fe    71805     0    75638     1  4509   707
rtk1  1500  <Link>        00:50:bf:16:94:30    64734     0    55029     0    57     0
rtk1  1500  188.8.131.52  184.108.40.206       64734     0    55029     0    57     0
rtk1  1500  fe80::/64     fe80::250:bfff:fe    64734     0    55029     0    57     0
iy0   1500  <Link>        00:aa:00:cf:42:7c     9504     0    11721     0     4     0
iy0   1500  24.42.191/24  220.127.116.11        9504     0    11721     0     4     0
iy0   1500  fe80::/64     fe80::2aa:ff:fecf     9504     0    11721     0     4     0
lo0   33220 <Link>                                 5     0        5     0     0     0
lo0   33220 fe80::/64     fe80::1                  5     0        5     0     0     0
lo0   33220 ::1/128       ::1                      5     0        5     0     0     0
lo0   33220 127           127.0.0.1                5     0        5     0     0     0
Now interestingly enough this had happened a couple or six times earlier
today before I did the upgrade. The last time I finally got fed up and
just left a ping running on the console. Despite beating ever harder on
the LAN and the router connections for the rest of the day, no freezes
happened. It's as if the running ping kept things flowing despite
whatever condition apparently triggers the freeze.
Is there anything that'll tell me why the dropped packets were dropped
(i.e. what condition prevented their transmission)? Are they simply due
to the collisions? Should I plug the router into my last spare switch
port and see if that changes anything?
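One way to get at the collisions question would be to sample the counters twice and see whether Drops climbs in step with Colls or on its own. The following is a rough Python sketch of that idea, assuming the 'netstat -ind' column layout shown above; the helper names `parse_netstat_ind` and `counter_deltas` are made up for illustration, and nothing here explains *why* a packet was dropped, it only shows which counters moved together.

```python
FIELDS = ("ipkts", "ierrs", "opkts", "oerrs", "colls", "drops")


def parse_netstat_ind(text):
    """Return {interface: {counter: value}} from `netstat -ind` output.

    Only the <Link> rows are kept, since the per-address rows repeat the
    same counters. The six counters are taken from the end of each row,
    so rows with an empty Address column (such as lo0) still parse.
    """
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 9 or parts[2] != "<Link>":
            continue
        try:
            values = [int(p) for p in parts[-6:]]
        except ValueError:
            continue  # not a counter row after all
        counters[parts[0]] = dict(zip(FIELDS, values))
    return counters


def counter_deltas(before, after):
    """Per-interface change in every counter between two samples."""
    return {
        name: {f: after[name][f] - before[name][f] for f in FIELDS}
        for name in after
        if name in before
    }
```

Feeding it two captures taken a minute or so apart during one of the "tests" (for instance by saving the 'netstat -ind' output to files) would show whether rtk0's Drops rises while Colls stays flat, which would argue against collisions being the whole story.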
Why doesn't the system recover on its own? I haven't waited forever,
but at least once I remember not noticing the problem for about 20
minutes. Once things freeze up like this all traffic backs off from
what I can see of the blinking lights. I'd think that would free up
enough of whatever to get things rolling again, but the only fix seems
to be to actually down the LAN interface.
I've got some trusty old 21041 PCI cards sitting idle at the moment (and
I've noticed they're still about the fastest 10Mbit cards ever, beating
even the Intel fxp's on a much much faster machine!). Should I swap
them into the router and see if that changes anything?
BTW, I've got my switch and the managed hub the router's connected to
both generating SNMP traps when anything goes wildly wrong on the LAN
from their perspectives (and I do get traps even if I pull a connector),
but there's been not a peep from either.
Greg A. Woods
+1 416 218-0098 VE3TCP <email@example.com> <firstname.lastname@example.org>
Planix, Inc. <email@example.com>; Secrets of the Weird <firstname.lastname@example.org>