Subject: kern/18051: tlp doesn't configure media, faults on apm reset
To: None <gnats-bugs@gnats.netbsd.org>
From: None <raeburn@raeburn.org>
List: netbsd-bugs
Date: 08/23/2002 13:45:54
>Number:         18051
>Category:       kern
>Synopsis:       tlp doesn't configure media, faults on apm reset
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Aug 23 10:46:00 PDT 2002
>Closed-Date:
>Last-Modified:
>Originator:     Ken Raeburn <raeburn@raeburn.org>
>Release:        1.6 branch as of a few days ago
>Organization:
	not much
>Environment:
	
NetBSD 1.6 branch with kernel based on GENERIC 1.491.4.2
Architecture: i386
Machine: i386
>Description:

Following the lead of the GENERIC config, I updated my machine's
config to use the tulip driver rather than the de driver for my 4-port
card.

#de*	at pci? dev ? function ?	# DEC 21x4x-based Ethernet
tlp*	at pci? dev ? function ?	# DECchip 21x4x (and clones) Ethernet

I brought up my new kernel in single-user mode (so the fact that the
remaining user-land code is much older shouldn't matter much), and
walked away.

During boot, all four ports were identified as "DECchip 21143 Ethernet
pass 4.1".  All four also reported IRQ and ethernet address, and then
these messages:

tlp0: OUI 0x1000e8 model 0x0001 rev 0 at tlp0 phy 1 not configured
tlp0: unable to configure MII
tlp0: no media found!

But ports 0 and 2 are both connected.  The old de driver would report
100baseTX for one, and 10baseT for the other.

After a little while, the APM code kicked in, and when I came back, I
found the machine at the ddb prompt.  A stack trace showed:

	tlp_21142_reset+0x18
	tlp_reset
	tlp_stop
	tlp_power
	dopowerhooks
	...

In tlp_21142_reset is this code:

void
tlp_21142_reset(sc)
	struct tulip_softc *sc;
{
	struct ifmedia_entry *ife = sc->sc_mii.mii_media.ifm_cur;
	struct tulip_21x4x_media *tm = ife->ifm_aux;
	const u_int8_t *cp;
	int i;

	cp = &sc->sc_srom[tm->tm_reset_offset];
	for (i = 0; i < tm->tm_reset_length; i++, cp += 2) {

The instruction at +0x18 is the read of tm->tm_reset_offset, and ddb
indicates that it is reading through a null pointer.  An inserted
printf statement confirms that tm is null at this point.


>How-To-Repeat:

I'm not sure why the "no media found" report comes up, or if it's
critical to reproducing the crash; perhaps bringing up a machine with
a tulip card unplugged from the net would be enough.  Then wait for
APM to shut it down.

>Fix:
	

In tlp_21142_reset, check whether the media info pointer (tm) is null
before dereferencing it.
Alternatively, drop the network device if it has no media?
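
A minimal sketch of the null-pointer check, not a tested patch.  It
uses simplified stand-in structures so that it compiles outside the
kernel; struct softc, struct media_entry, struct media_aux and
reset_sketch are hypothetical names, and only the tm_reset_offset and
tm_reset_length fields come from the code quoted above.  What the
driver should do when there is no media info (skip the reset sequence,
as assumed here, or something else) is an open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct tulip_21x4x_media; only the fields used above. */
struct media_aux {
	int tm_reset_offset;
	int tm_reset_length;
};

/* Stand-in for struct ifmedia_entry. */
struct media_entry {
	void *ifm_aux;
};

/* Stand-in for struct tulip_softc. */
struct softc {
	struct media_entry *cur;	/* sc->sc_mii.mii_media.ifm_cur */
	const uint8_t *srom;		/* sc->sc_srom */
};

/*
 * Walk the SROM reset sequence, but only if media info is present.
 * Returns the number of reset entries visited, 0 if there is none.
 */
static int
reset_sketch(const struct softc *sc)
{
	const struct media_entry *ife;
	const struct media_aux *tm;
	const uint8_t *cp;
	int i, visited = 0;

	assert(sc != NULL);
	ife = sc->cur;

	/* Assumption: with no media info there is nothing to replay. */
	if (ife == NULL || ife->ifm_aux == NULL)
		return 0;

	tm = ife->ifm_aux;
	cp = &sc->srom[tm->tm_reset_offset];
	for (i = 0; i < tm->tm_reset_length; i++, cp += 2) {
		/* The real driver would act on the SROM bytes at cp here. */
		visited++;
	}
	return visited;
}
```

With this guard, the tlp_power -> tlp_stop -> tlp_reset path in the
trace above would return early instead of faulting when tm is null.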
>Release-Note:
>Audit-Trail:
>Unformatted: