Subject: Lockup under heavy network use
To: port-cobalt@netbsd.org, port-macppc@netbsd.org
From: John Klos <john@ziaspace.com>
List: port-cobalt
Date: 09/08/2005 18:29:30
Hello,
I'm seeing some interesting lockup problems on two different machines. One
is a 200 MHz PowerPC 603e system, the other a 250 MHz Cobalt RaQ 2. Both
are serving around 20 to 30 Mbps of web traffic, which is about as much as
they can serve. I didn't want faster systems because I didn't want to use
much more bandwidth than that (and altq is not exactly production ready
yet). However, both of them have locked up under heavy network use. The
symptoms are the same: they still respond to ICMP on both IPv4 and IPv6,
but don't actually answer requests. Unfortunately, both are colocated, and
neither has a serial terminal or console (yet).
The only other thing resembling a clue is this output, which appeared in a
root shell on the Cobalt right before the last lockup:
free(100676a8) bad block. (memtop = 100b3800 membot = 10058550)
free(10067688) bad block. (memtop = 100b3800 membot = 10058550)
free(10067668) bad block. (memtop = 100b3800 membot = 10058550)
free(10068608) bad block. (memtop = 100b3800 membot = 10058550)
free(10068c08) bad block. (memtop = 100b3800 membot = 10058550)
free(10067648) bad block. (memtop = 100b3800 membot = 10058550)
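A quick way to sanity-check those addresses is sketched below in Python. It
assumes (this is a guess, not something the output states) that membot and
memtop are the lower and upper bounds of the allocator's heap, and it simply
tests whether each freed pointer lies between them. The names LOG, parse and
in_arena are made up for illustration.

```python
import re

# The six messages exactly as they appeared on the Cobalt's root shell.
LOG = """\
free(100676a8) bad block. (memtop = 100b3800 membot = 10058550)
free(10067688) bad block. (memtop = 100b3800 membot = 10058550)
free(10067668) bad block. (memtop = 100b3800 membot = 10058550)
free(10068608) bad block. (memtop = 100b3800 membot = 10058550)
free(10068c08) bad block. (memtop = 100b3800 membot = 10058550)
free(10067648) bad block. (memtop = 100b3800 membot = 10058550)
"""

PATTERN = re.compile(
    r"free\(([0-9a-f]+)\) bad block\. "
    r"\(memtop = ([0-9a-f]+) membot = ([0-9a-f]+)\)"
)


def parse(log):
    """Return a list of (addr, membot, memtop) integer tuples, one per message."""
    entries = []
    for line in log.splitlines():
        match = PATTERN.search(line)
        if match:
            addr, top, bot = (int(group, 16) for group in match.groups())
            entries.append((addr, bot, top))
    return entries


def in_arena(addr, bot, top):
    """Assumption: membot is the lowest heap address and memtop the upper bound."""
    return bot <= addr < top


entries = parse(LOG)
for addr, bot, top in entries:
    where = "inside" if in_arena(addr, bot, top) else "outside"
    print(f"{addr:#x} is {where} [{bot:#x}, {top:#x})")
```

Under that assumption all six pointers fall inside the range, so the
complaint would not be about wild out-of-range addresses. It may instead
point to damaged allocator bookkeeping or a double free, though that is only
a guess.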
On the PowerMac, I was getting these from time to time, but they hardly
seem that serious:
wm0: excessive collisions
wm0: late collision
wm0: excessive collisions
wm0: excessive collisions
wm0: late collision
wm0: excessive collisions
wm0: late collision
Both systems are running NetBSD 2.1_RC3. netstat -m shows that they are
nowhere near exhausting their nmbclusters (which is set to 16k).
Any ideas?
Thanks,
John Klos
--
I've seen Sun monitors on fire off the side of the multimedia lab.
I've seen NTU lights glitter in the dark near the Mail Gate.
All these things will be lost in time, like the root partition last week.
Time to die...
-- Peter Gutmann