tech-net archive


Network-related lockup, was Re: bnx(4) lockups?



On Thu, 28 Mar 2019, Masanobu SAITOH wrote:
> Back to the first mail:
>
> If connected to the Internet and traffic is flowing, it will lock solid after a while
>
> Does the machine recover from the hard hang after stopping the traffic?
> e.g. removing cable.

Removing the cable makes no difference.

> New patch:
>
>      http://www.netbsd.org/~msaitoh/bnx-n7-20190326-0.dif
>      http://www.netbsd.org/~msaitoh/bnx-n8-20190326-0.dif
>      http://www.netbsd.org/~msaitoh/bnx-cur-20190326-0.dif
>
> This diff might improve stability under heavy interrupt load.
> It seems that bnx(4) also doesn't support flow control.
> I'll add it in a few days.
>
>
> New patches:
>
> 	http://www.netbsd.org/~msaitoh/bnx-n7-20190328-0.dif
> 	http://www.netbsd.org/~msaitoh/bnx-n8-20190328-0.dif
>
> 	And copy the latest bnxfw.h (rev. 1.5)
> 	http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dev/microcode/bnx/bnxfw.h

Further testing shows that this is not bnx(4)-specific. I can also get the hang with a wm(4) I350 (with all the on-board bnx(4) devices disabled). I'm certain it's network-related though.

I've brought the box up in single-user mode with filesystems mounted read-only. If I bring up bnx0 and run netio against it constantly, it will run for an hour without problems. If I bring up bnx1 too and run the same netio test (still against bnx0), it hangs after a couple of minutes. This rules out storage drivers, firewalling, etc.

In some trials I ran "ifconfig bnx1 up" after the machine had been running OK for an hour, and it locked up immediately (even without an address assigned to bnx1).
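
For anyone wanting to try the same sequence, the steps above can be sketched roughly as the script below. This is only an illustration: the addresses are placeholders, and the netio invocation (-t for a TCP test against a host already running the netio server) is an assumption, since the exact command line isn't given here. It prints the commands by default and only executes them with RUN=1.

```shell
#!/bin/sh
# Sketch of the test sequence described above. Dry run by default:
# commands are printed, not executed. Set RUN=1 to really run them.
#
# Assumptions (placeholders, not the values actually used):
#   PEER  host running the netio server
#   ADDR  address to assign to bnx0
#   netio -t <host>  as the TCP client invocation
PEER=${PEER:-192.0.2.1}
ADDR=${ADDR:-192.0.2.2/24}

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

repro_steps() {
    # Step 1: bnx0 only; netio against it was stable for an hour.
    run ifconfig bnx0 inet "$ADDR" up
    run netio -t "$PEER"
    # Step 2: bring up bnx1 as well (no address needed), then repeat
    # the same netio test against bnx0; this is where the hang appears.
    run ifconfig bnx1 up
    run netio -t "$PEER"
}

repro_steps
```

Running it without RUN=1 just lists the four commands in order, which is a reasonable check before trying it on a box that is expected to lock up.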

Note that this is the identical kernel to the one running faultlessly under continual heavy load on a different model of machine adjacent to it. The machine itself is OK, as it was running XenServer (Linux) without problems immediately prior to the NetBSD install.

--
Stephen

