Subject: Re: kern/31455: ex (905[BC]) cards can hang in -current kernels
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: ITOH Yasufumi <itohy@netbsd.org>
List: netbsd-bugs
Date: 12/19/2005 14:20:02
The following reply was made to PR kern/31455; it has been noted by GNATS.

From: itohy@netbsd.org (ITOH Yasufumi)
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org
Subject: Re: kern/31455: ex (905[BC]) cards can hang in -current kernels
Date: Mon, 19 Dec 2005 23:18:26 +0900 (JST)

 Hello,
 
 I see a similar hang on 2.1_STABLE.  My card is:
 
 > ex0 at pci0 dev 3 function 0: 3Com 3c556 MiniPCI 10/100 Ethernet (rev. 0x10)
 > ex0: interrupting at irq 11
 > ex0: MAC address 00:01:03:xx:xx:xx
 > tqphy0 at ex0 phy 0: 78Q2120 10/100 media interface, rev. 11
 > tqphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
 > 3Com 3c556 V.90 MiniPCI Modem (miscellaneous communications, revision 0x10) at pci0 dev 3 function 1 not configured
 
 I investigated the problem.
 The device reports TX_COMPLETE|INTR_LATCH status in ex_intr:
 
 > int
 > ex_intr(arg)
 >	void *arg;
 > {
 ...
 >		stat = bus_space_read_2(iot, ioh, ELINK_STATUS);
 "stat" is 0x2005 here.
 So it calls ex_txstat (TX_COMPLETE == 4).
 >		if (stat & TX_COMPLETE) {
 >			ex_txstat(sc);
 >		}
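 As a small worked check of the status value above, here is a
 user-space sketch that decodes 0x2005.  TX_COMPLETE == 4 is the value
 mentioned in this message; INTR_LATCH == 1 is only my assumption,
 inferred from the low bits of 0x2005, and the helper names are
 made up for illustration.  It is not driver code.

```c
#include <stdint.h>

#define TX_COMPLETE 0x0004  /* value stated in this message */
#define INTR_LATCH  0x0001  /* assumed value, inferred from 0x2005 */

/* Nonzero if the TX_COMPLETE bit is set in the status word. */
int tx_complete_set(uint16_t stat)
{
	return (stat & TX_COMPLETE) != 0;
}

/* Nonzero if the (assumed) INTR_LATCH bit is set in the status word. */
int intr_latch_set(uint16_t stat)
{
	return (stat & INTR_LATCH) != 0;
}

/* Bits left over after masking out the two flags above (not decoded here). */
uint16_t other_bits(uint16_t stat)
{
	return (uint16_t)(stat & ~(TX_COMPLETE | INTR_LATCH));
}
```

 With stat == 0x2005 both helpers report their bit as set, and the
 remaining bits are 0x2000, which this sketch does not interpret.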
 
 > static void
 > ex_txstat(sc)
 >	struct ex_softc *sc;
 > {
 ...
 >	while ((i = bus_space_read_2(iot, ioh, ELINK_TIMER)) & TXS_COMPLETE) {
 
 This condition is not met, so the function just returns.
 Nothing is done for the TX_COMPLETE interrupt, and the interrupt is never cleared.
 Hence the hang.
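 To illustrate the difference between a pre-tested and a post-tested
 loop here, this is a minimal user-space model, not the real driver.
 The struct mock_nic, its fields, and all function names are
 hypothetical.  It assumes (following the comment in ex_txstat) that a
 write to the TX status register is what clears the pending interrupt;
 the TXS_COMPLETE value is only for this sketch.

```c
#include <stdint.h>

#define TXS_COMPLETE 0x80	/* assumed bit value, for this sketch only */

/* Hypothetical model of the card, not taken from elinkxl.c. */
struct mock_nic {
	uint16_t tx_status;	/* value a read of TX status returns */
	int intr_pending;	/* models the latched TX_COMPLETE interrupt */
	int writes;		/* number of writes to TX status so far */
};

uint16_t mock_read(struct mock_nic *n)
{
	return n->tx_status;
}

/* Assumption: a write pops the status and clears the interrupt. */
void mock_write(struct mock_nic *n)
{
	n->tx_status = 0;
	n->intr_pending = 0;
	n->writes++;
}

/* Pre-tested loop: the body is skipped if TXS_COMPLETE is already clear. */
void txstat_while(struct mock_nic *n)
{
	uint16_t i;

	while ((i = mock_read(n)) & TXS_COMPLETE)
		mock_write(n);
}

/* Post-tested loop: the write happens at least once. */
void txstat_do_while(struct mock_nic *n)
{
	uint16_t i = mock_read(n);

	do {
		mock_write(n);
		(void)i;	/* the real driver inspects error bits here */
	} while ((i = mock_read(n)) & TXS_COMPLETE);
}
```

 Starting from tx_status == 0 with intr_pending == 1, txstat_while()
 returns with the interrupt still pending and no write performed,
 while txstat_do_while() performs exactly one write and clears it.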
 
 I tested the modification below.
 With this change the kernel no longer hangs forever,
 but a short-term hang remains until the device timeout:
 
 > ex0: device timeout
 
 --- elinkxl.c.orig	Sat Oct 29 04:35:21 2005
 +++ elinkxl.c	Mon Dec 19 14:53:10 2005
 @@ -786,7 +786,8 @@ ex_txstat(sc)
  	 * We need to read+write TX_STATUS until we get a 0 status
  	 * in order to turn off the interrupt flag.
  	 */
 -	while ((i = bus_space_read_2(iot, ioh, ELINK_TIMER)) & TXS_COMPLETE) {
 +	i = bus_space_read_2(iot, ioh, ELINK_TIMER);
 +	do {
  		bus_space_write_2(iot, ioh, ELINK_TIMER, 0x0);
  
  		if (i & TXS_JABBER) {
 @@ -814,7 +815,7 @@ ex_txstat(sc)
  			sc->sc_ethercom.ec_if.if_flags &= ~IFF_OACTIVE;
  		} else
  			sc->tx_succ_ok = (sc->tx_succ_ok+1) & 127;
 -	}
 +	} while ((i = bus_space_read_2(iot, ioh, ELINK_TIMER)) & TXS_COMPLETE);
  }
  
  int
 
 A similar code path also exists in the 3-branch and in -current.
 Any thoughts?
 -- 
 ITOH Yasufumi