Subject: kern/10315: tlp has severe media negotiation problems when talking to Cisco 3524
List: netbsd-bugs
Date: 06/07/2000 15:33:09
>Number:         10315
>Category:       kern
>Synopsis:       "tlp" really, really loses when talking to a Cisco 3524.  The problems seem to be related to media negotiation.
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jun 07 15:34:00 PDT 2000
>Release:        20000527
>Organization:
	Stevens Institute of Technology
>Environment:
System: NetBSD 1.4Z NetBSD 1.4Z (BOCK) #1: Thu Jun 1 13:18:04 EDT 2000 i386

>Description:
	We have several machines which are connected to Cisco 3524 switches
	with VLANs and QoS turned on (so I am told by our campus networking
	folks).  All ports on the Ciscos are set to autonegotiate speed and
	duplex.

	(Note: these are the *same* machines, on the same switch, for which
	"ex" appears totally unable to select media *at all*, see PR #10289)

	With "tlp" cards (MX98715A, in this case -- known-good cards that
	work in other machines on other switches, including other Cisco
	switches, and including in full-duplex mode) installed, if the
	Ethernet cable is plugged in at the time the interface is brought
	up, the machine hangs hard (requiring power-cycling to reboot) about
	0.5 sec after the "ifconfig" operation that brings the interface up.
	It's not even possible to drop to DDB; on some of my machines the ATX
	soft-power button doesn't even work and I have to pull the power plug.

	(Prior to actually bringing the interface up by giving it an IP
	address, the media type can be set, etc., and this does *not* hang
	the kernel.)

	If the cable is *not* plugged in, the interface is brought up, and
	*then* the cable is plugged in, things work somewhat better, but not
	"right".  Media can be reset, etc., and the switch seems to pick up
	the media changes after a brief interval.  However, in full-duplex
	mode, though throughput is acceptable, the driver spits out tons
	of diagnostic messages, in the following sequence: "tlp0: dribbling
	bit error"; "tlp0: CRC error"; "tlp0: MII error".  Resetting media
	to 100baseTX half-duplex makes the errors go away.  I find "MII
	error" an interesting diagnostic as the boot-time messages did not
	show a PHY at all...

	Obviously, the real problem here is the hard hang if the cable's
	plugged in at initial ifconfig time.  I can't seem to find a
	workaround for this, and it's pretty much a showstopper.

	Perhaps if we can figure out what's going on when "tlp" loses in
	this configuration we can figure out why "ex" does, too, and close
	PR #10289.

	These cards are "SohoWare Fast" (NDC Communications model SFA110A
	revision B4) MX98715AEC-C cards.  The chip probes as a MX98713A
	pass 2.5.  The probe claims that "auto" is among the supported
	media types but ifconfig doesn't seem to think so.

	The cards are in Dell Precision Workstation 410 systems with a
	single processor.


>How-To-Repeat:
	Boot an install disk on a machine with a "tlp" plugged into a
	Cisco 3524; try to configure an IP address; watch system choke.
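
	A rough sketch of the sequences described above, for anyone trying
	to reproduce this.  The interface name tlp0 is taken from the driver
	messages; the address, netmask, and helper function names are
	placeholders of mine, and I am assuming that plain "media 100baseTX"
	with no mediaopt selects half-duplex.  Since the bring-up step is
	the one that hangs the machine, the script only prints the commands
	unless DRYRUN=0 is set.

```shell
#!/bin/sh
# Placeholders: adjust for the machine under test.
IFACE=${IFACE:-tlp0}
ADDR=${ADDR:-192.0.2.10}
MASK=${MASK:-255.255.255.0}
DRYRUN=${DRYRUN:-1}	# 1 = print commands only, 0 = really run them

# Print a command, and execute it only when dry-run is disabled.
run() {
	echo "+ $*"
	[ "$DRYRUN" = 1 ] || "$@"
}

# List the media types the driver claims to support for the interface.
list_media() {
	run ifconfig -m "$IFACE"
}

# Selecting media before an address is assigned did not hang here.
set_full_duplex() {
	run ifconfig "$IFACE" media 100baseTX mediaopt full-duplex
}

# With the cable attached, this is the step that hung the machine.
bring_up() {
	run ifconfig "$IFACE" inet "$ADDR" netmask "$MASK" up
}

# Falling back to half-duplex made the error messages go away
# (assumption: no mediaopt means half-duplex).
set_half_duplex() {
	run ifconfig "$IFACE" media 100baseTX
}
```

	To reproduce the hang, plug the cable in, source the script, and run
	"DRYRUN=0 bring_up".  To see the error messages instead, run bring_up
	with the cable unplugged, plug it in, then run set_full_duplex.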