Subject: kern/10289: "ex" driver doesn't work with newer cards/onboard chips
To: None <gnats-bugs@gnats.netbsd.org>
From: None <tls@cs.stevens-tech.edu>
List: netbsd-bugs
Date: 06/05/2000 11:30:12
>Number:         10289
>Category:       kern
>Synopsis:       The first "ifconfig" operation makes "ex" permanently lose carrier with newer cards.
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jun 05 11:31:00 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     
>Release:        20000601
>Organization:
	Stevens Institute of Technology
>Environment:
System: NetBSD shiner-bock.cs.stevens-tech.edu 1.4Z NetBSD 1.4Z (BOCK) #1: Thu Jun 1 13:18:04 EDT 2000 root@amstel.cs.stevens-tech.edu:/usr/src/sys/arch/i386/compile/BOCK i386


>Description:
	On a newer "ex" (3c905) card such as a 3c905-TX-NM or the embedded
	"3c905-TX" on newer Dell machines such as the Precision Workstation
	410, the first "ifconfig" operation done on the card will make the
	card go from "active" to "no carrier" status, evidently after a
	failed attempt to negotiate media type with the switch.  Explicitly
	setting a media type (e.g. 100baseTX, 100baseTX-FDX, 10baseT) does not
	help.  Oddly enough, until that initial "ifconfig" is done, the card
	appears to have correctly autonegotiated 100baseTX-FDX mode with our
	Cisco 3524 switch.  (You can check this with "ifconfig" as long as
	you *don't* set anything; setting anything at all breaks the link.)

	The initial "ifconfig" operation (the one that breaks things) is
	followed by a substantial system pause of a second or so; subsequent
	ones are not.  I'm not sure why.  I recall there used to be some
	DELAY calls in the mii or phy code; perhaps these are at fault?
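
	To illustrate why blocking DELAY calls would produce a pause of
	this kind, here is a minimal sketch of an autonegotiation poll
	loop.  It is not the actual mii/phy code: the function name, the
	poll count and the per-poll delay are hypothetical, and the delay
	is only tallied rather than slept.  The one fixed fact is the
	"autonegotiation complete" bit (0x0020) in the standard MII basic
	status register.

```python
# IEEE 802.3 MII basic status register (register 1), bit 5:
# autonegotiation complete.
BMSR_ACOMP = 0x0020


def wait_for_autoneg(read_bmsr, max_polls=500, delay_us=1000):
    """Poll BMSR until autonegotiation completes or the budget runs out.

    read_bmsr is any callable returning the current 16-bit BMSR value.
    max_polls and delay_us are hypothetical values, not taken from any
    real driver.  Returns (completed, stalled_us).  No real sleeping is
    done; the time a busy-wait would have burned is only added up.
    """
    stalled_us = 0
    for _ in range(max_polls):
        if read_bmsr() & BMSR_ACOMP:
            return True, stalled_us
        stalled_us += delay_us  # stands in for a blocking DELAY(delay_us)
    return False, stalled_us


# A PHY that never reports completion costs the whole budget.
print(wait_for_autoneg(lambda: 0x0000))       # (False, 500000)

# A PHY that completes on the third read stalls for only two delays.
reads = iter([0x0000, 0x0000, BMSR_ACOMP])
print(wait_for_autoneg(lambda: next(reads)))  # (True, 2000)
```

	With these made-up numbers a negotiation that never completes
	stalls the caller for half a second per attempt, which is the
	order of magnitude of the pause described above.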

	I have not been able to find our sole 905-TX-NM card in some time
	to reproduce this problem, but it did seem to occur there.  That
	card has a PHY that probes as a ukphy.  Older Dell machines with
	embedded "905-TX" chips have had nsphy and those don't seem to have
	the problem.  Our new ones have exphy and the problem exists; the
	problem *still* exists if exphy's not built into the kernel and
	ukphy attaches instead.

	It's getting hard to buy a machine from a number of major PC
	manufacturers that does *not* exhibit this problem -- anything
	built in the last few months with onboard 3Com ethernet and
	exphy seems to lose this way, and on this campus, at least, that
	describes pretty much every machine bought by anyone naive in the
	last few months.

	Curiously, ukphy reported all zeroes for both the OUI and the
	other PHY identifier it prints.
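
	For reference, the sketch below shows how an OUI, model and
	revision are conventionally unpacked from the two standard MII
	PHY identifier registers (registers 2 and 3).  The function name
	is hypothetical, and it is an assumption that ukphy derives the
	values it prints this way.  If it does, all-zero output means both
	ID registers read back as zero.

```python
def decode_phy_id(idr1, idr2):
    """Split the 16-bit PHY identifier registers (MII registers 2 and 3).

    Returns (oui, model, rev), with the OUI bits left in the order in
    which they are packed into the registers.
    """
    idr1 &= 0xFFFF
    idr2 &= 0xFFFF
    oui = (idr1 << 6) | (idr2 >> 10)  # 16 bits from reg 2, top 6 bits of reg 3
    model = (idr2 >> 4) & 0x3F        # 6-bit manufacturer model number
    rev = idr2 & 0x0F                 # 4-bit revision number
    return oui, model, rev


# Both ID registers reading as zero gives an all-zero identity.
print(decode_phy_id(0x0000, 0x0000))  # (0, 0, 0)

# Arbitrary illustrative register values, not taken from a real PHY.
oui, model, rev = decode_phy_id(0x1234, 0xABCD)
print(hex(oui), hex(model), hex(rev))  # 0x48d2a 0x3c 0xd
```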

>How-To-Repeat:
	Find a Dell Precision 410 or other new machine with onboard "ex".
	Connect to a Cisco 3524 switch (unconfirmed, but this probably
	happens with other switches as well -- it does happen with two
	of these machines back-to-back with an xover cable).  Boot an
	install disk.  Try to configure the interface so you can install
	over the net.  The first "ifconfig" leaves the interface at
	"no carrier".

>Fix:
	Unknown.
>Release-Note:
>Audit-Trail:
>Unformatted: