Subject: Re: ath hickups ?
To: None <current-users@NetBSD.org>
From: David Young <dyoung@pobox.com>
List: current-users
Date: 06/09/2007 14:43:52
--MfFXiAuoTsnnDAfZ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Jun 09, 2007 at 08:42:29PM +0200, Frank Kardel wrote:
> Hi *,
> 
> I am seeing quite a few device timeout errors with my ath0 device in 
> -current
> ===
> ath0: interrupting at ioapic0 pin 16 (irq 11)
> ath0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
> ath0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 
> 24Mbps 36Mbps 48Mbps 54Mbps
> ath0: mac 5.9 phy 4.3 radio 4.6
> 
> 00:09.0 Ethernet controller: Atheros Communications, Inc. AR5212 
> 802.11abg NIC (rev 01)
>        Subsystem: D-Link System Inc D-Link AirPlus DWL-G520 Wireless 
> PCI Adapter(rev.B)
>        Flags: bus master, medium devsel, latency 80, IRQ 11
>        Memory at fb200000 (32-bit, non-prefetchable)
>        Capabilities: [44] Power Management version 2
> ===
> 
> I seem to remember that there where times where ath0 was working more 
> reliably with NetBSD -
> can anyone share this observation ?

Sorry, my bad.  Looks like I introduced a new bug as I repaired another.

The problem is this: roughly speaking, ath_tx_processq() returns the
number of transmissions acknowledged by the receiver.  It does not return
the number of transmit descriptors that the NIC is finished with, as I
had assumed.  So if your NIC has sent only multicast traffic, which does
not require an 802.11 Acknowledgement, then ath_tx_processq() will always
be 0.  So ath_tx_processq's callers are going to think the NIC is not
finishing any descriptors, when really the NIC is.  Two things will go
wrong: first, ath will exhaust its descriptors, stalling transmissions.
Finally, the driver will countdown sc_tx_timer = 5, 4, ..., 0, and
then timeout.  Timing out resets the h/w and drains the transmit rings,
which is correct, but drastic; it is not going to help your traffic
flow smoothly.

Thanks Mindaugus for prodding me to give this a look.

Please give this patch a shot.

Dave

-- 
David Young             OJC Technologies
dyoung@ojctech.com      Urbana, IL * (217) 278-3933 ext 24

--MfFXiAuoTsnnDAfZ
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="ath-nfin.patch"

? .ath.c.swp
? .wi.c.swp
Index: ath.c
===================================================================
RCS file: /cvsroot/src/sys/dev/ic/ath.c,v
retrieving revision 1.83
diff -p -u -u -p -r1.83 ath.c
--- ath.c	17 Apr 2007 22:01:43 -0000	1.83
+++ ath.c	9 Jun 2007 19:39:13 -0000
@@ -4011,14 +4011,14 @@ ath_tx_processq(struct ath_softc *sc, st
 	struct ath_desc *ds, *ds0;
 	struct ieee80211_node *ni;
 	struct ath_node *an;
-	int sr, lr, pri, nacked;
+	int sr, lr, pri, nfin;
 	HAL_STATUS status;
 
 	DPRINTF(sc, ATH_DEBUG_TX_PROC, "%s: tx queue %u head %p link %p\n",
 		__func__, txq->axq_qnum,
 		(void *)(uintptr_t) ath_hal_gettxbuf(sc->sc_ah, txq->axq_qnum),
 		txq->axq_link);
-	nacked = 0;
+	nfin = 0;
 	for (;;) {
 		ATH_TXQ_LOCK(txq);
 		txq->axq_intrcnt = 0;	/* reset periodic desc intr count */
@@ -4042,6 +4042,8 @@ ath_tx_processq(struct ath_softc *sc, st
 		ATH_TXQ_REMOVE_HEAD(txq, bf_list);
 		ATH_TXQ_UNLOCK(txq);
 
+		nfin++;
+
 		ni = bf->bf_node;
 		if (ni != NULL) {
 			an = ATH_NODE(ni);
@@ -4076,12 +4078,6 @@ ath_tx_processq(struct ath_softc *sc, st
 			 */
 			if ((ds->ds_txstat.ts_status & HAL_TXERR_FILT) == 0 &&
 			    (bf->bf_flags & HAL_TXDESC_NOACK) == 0) {
-				/*
-				 * If frame was ack'd update the last rx time
-				 * used to workaround phantom bmiss interrupts.
-				 */
-				if (ds->ds_txstat.ts_status == 0)
-					nacked++;
 				ath_rate_tx_complete(sc, an, ds, ds0);
 			}
 			/*
@@ -4104,7 +4100,7 @@ ath_tx_processq(struct ath_softc *sc, st
 		STAILQ_INSERT_TAIL(&sc->sc_txbuf, bf, bf_list);
 		ATH_TXBUF_UNLOCK(sc);
 	}
-	return nacked;
+	return nfin;
 }
 
 static inline int
@@ -4148,23 +4144,23 @@ ath_tx_proc_q0123(void *arg, int npendin
 {
 	struct ath_softc *sc = arg;
 	struct ifnet *ifp = &sc->sc_if;
-	int nacked;
+	int nfin;
 
 	/*
 	 * Process each active queue.
 	 */
-	nacked = 0;
+	nfin = 0;
 	if (txqactive(sc->sc_ah, 0))
-		nacked += ath_tx_processq(sc, &sc->sc_txq[0]);
+		nfin += ath_tx_processq(sc, &sc->sc_txq[0]);
 	if (txqactive(sc->sc_ah, 1))
-		nacked += ath_tx_processq(sc, &sc->sc_txq[1]);
+		nfin += ath_tx_processq(sc, &sc->sc_txq[1]);
 	if (txqactive(sc->sc_ah, 2))
-		nacked += ath_tx_processq(sc, &sc->sc_txq[2]);
+		nfin += ath_tx_processq(sc, &sc->sc_txq[2]);
 	if (txqactive(sc->sc_ah, 3))
-		nacked += ath_tx_processq(sc, &sc->sc_txq[3]);
+		nfin += ath_tx_processq(sc, &sc->sc_txq[3]);
 	if (txqactive(sc->sc_ah, sc->sc_cabq->axq_qnum))
 		ath_tx_processq(sc, sc->sc_cabq);
-	if (nacked) {
+	if (nfin) {
 		sc->sc_lastrx = ath_hal_gettsf64(sc->sc_ah);
 		ifp->if_flags &= ~IFF_OACTIVE;
 		sc->sc_tx_timer = 0;
@@ -4185,16 +4181,16 @@ ath_tx_proc(void *arg, int npending)
 {
 	struct ath_softc *sc = arg;
 	struct ifnet *ifp = &sc->sc_if;
-	int i, nacked;
+	int i, nfin;
 
 	/*
 	 * Process each active queue.
 	 */
-	nacked = 0;
+	nfin = 0;
 	for (i = 0; i < HAL_NUM_TX_QUEUES; i++)
 		if (ATH_TXQ_SETUP(sc, i) && txqactive(sc->sc_ah, i))
-			nacked += ath_tx_processq(sc, &sc->sc_txq[i]);
-	if (nacked) {
+			nfin += ath_tx_processq(sc, &sc->sc_txq[i]);
+	if (nfin) {
 		sc->sc_lastrx = ath_hal_gettsf64(sc->sc_ah);
 
 		ifp->if_flags &= ~IFF_OACTIVE;

--MfFXiAuoTsnnDAfZ--