Subject: Re: ath seems still buggy
To: None <current-users@NetBSD.org, perry@piermont.com, smb@cs.columbia.edu>
From: Daniel Carosone <dan@geek.com.au>
List: current-users
Date: 10/20/2005 06:52:48

On Tue, Oct 18, 2005 at 06:05:52PM -0500, David Young wrote:
> I will tell you what I think is happening.  When ath(4) doesn't receive
> the AP's beacon for a while, it times out and tries to reassociate with
> the AP.  In a zone where there's 802.11 congestion, your ath might
> miss a lot of beacons and time-out often.

Hm. I had never seen the kind of problem being discussed in this
thread until my latest kernel. Something changed somewhere between
these two builds:

5704 -rwxr-xr-x   1 root  wheel  5828406 Oct  8 23:22 netbsd.264*
5704 -rwxr-xr-x   1 root  wheel  5830309 Oct 18 15:57 netbsd.265*

Comparing ident output, the only change in even vaguely relevant files is

-      $NetBSD: ath.c,v 1.59 2005/09/13 05:50:29 martin Exp $
+      $NetBSD: ath.c,v 1.60 2005/10/14 00:26:45 gdt Exp $

I don't think this change is the cause, but I'll try backing out just
that revision, probably not until the weekend. (ident diffs for
irrelevant files happily provided on request.)
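For anyone wanting to repeat the comparison, here is a rough sketch of
how two sets of ident lines can be reduced to just the files whose
revision changed. The helper name and the regular expression are my own
invention for illustration; the only input assumed is text in the
"$NetBSD: file,v rev date time author Exp $" form shown above.

```python
import re

# Matches the file name and revision inside an RCS id string such as
# "$NetBSD: ath.c,v 1.59 2005/09/13 05:50:29 martin Exp $".
RCSID = re.compile(r"\$NetBSD: (\S+),v (\S+) ")

def ident_changes(old_lines, new_lines):
    """Return {file: (old_rev, new_rev)} for files whose revision differs.

    Hypothetical helper: files present in only one of the two inputs
    are ignored in this sketch.
    """
    def revisions(lines):
        found = {}
        for line in lines:
            match = RCSID.search(line)
            if match:
                found[match.group(1)] = match.group(2)
        return found

    old, new = revisions(old_lines), revisions(new_lines)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }

old_kernel = ["$NetBSD: ath.c,v 1.59 2005/09/13 05:50:29 martin Exp $"]
new_kernel = ["$NetBSD: ath.c,v 1.60 2005/10/14 00:26:45 gdt Exp $"]
changes = ident_changes(old_kernel, new_kernel)
print(changes)
```

In practice the two lists would come from running ident over each
kernel image and splitting the output into lines.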

Anyway, now I often stop receiving traffic pretty shortly after
bringing up the interface.  Last night, after a few such cycles, the
link seemed to stick and kept working for the rest of the evening.
Tonight, it hasn't yet done that (which has helped me test more :)

I do NOT see the link state DOWN/UP messages that would indicate the
card had disassociated - at least not until I manually force it with
"ifconfig ath0 down up".  There's also no other AP around to cause
congestion; nobody else's beacon frames visible.

However, it's clearly a problem that's related to, or cleared by, a
re-association. I found that setting the nwid or mode (auto) to its
current value is enough to force a reassoc, and traffic flows again.

A ping running at the time shows something interesting:

64 bytes from 203.17.37.65: icmp_seq=455 ttl=255 time=1.550 ms
64 bytes from 203.17.37.65: icmp_seq=456 ttl=255 time=1.557 ms
64 bytes from 203.17.37.65: icmp_seq=457 ttl=255 time=2.172 ms
(pause and manually trigger reassoc..)
64 bytes from 203.17.37.65: icmp_seq=562 ttl=255 time=4717.617 ms
64 bytes from 203.17.37.65: icmp_seq=563 ttl=255 time=3718.939 ms
64 bytes from 203.17.37.65: icmp_seq=564 ttl=255 time=2719.146 ms
64 bytes from 203.17.37.65: icmp_seq=565 ttl=255 time=1719.184 ms
64 bytes from 203.17.37.65: icmp_seq=566 ttl=255 time=719.223 ms
64 bytes from 203.17.37.65: icmp_seq=567 ttl=255 time=1.674 ms
64 bytes from 203.17.37.65: icmp_seq=568 ttl=255 time=1.918 ms

This suggests to me that the ping packets are being transmitted
correctly; it's just that the replies are not being received, and not
being 802.11-ACKed to the AP, where they're buffered until the card is
tickled correctly.  I just confirmed that at the receiver: it is still
responding to pings even when I don't see the replies on the laptop,
and the AP seems to be transmitting them, as seen from a ral(4).  To
see this properly I'll need a capture from another machine with a
useful card in monitor mode next time; I didn't have anything handy now.
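The arithmetic behind the buffering guess can be checked with a small
sketch. It assumes ping was sending at its default one-second interval
(not stated above), so that each request left roughly seq * 1000 ms
after the start; adding the reported round-trip time then gives an
approximate arrival time for each delayed reply.

```python
# (icmp_seq, round-trip time in ms) for the delayed replies in the ping
# output above, i.e. those that arrived after the manual reassoc.
replies = [
    (562, 4717.617),
    (563, 3718.939),
    (564, 2719.146),
    (565, 1719.184),
    (566, 719.223),
]

# Assumption: ping's default send interval of one second.
INTERVAL_MS = 1000.0

def arrival_times(samples, interval_ms=INTERVAL_MS):
    """Approximate arrival time of each reply, in ms since ping started."""
    return [seq * interval_ms + rtt for seq, rtt in samples]

arrivals = arrival_times(replies)
spread_ms = max(arrivals) - min(arrivals)
print(f"replies arrived within {spread_ms:.3f} ms of each other")
```

Under that assumption the five replies, sent a second apart, all land
within about 1.6 ms of each other, which is what you'd expect if they
were queued somewhere and released in one burst.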

I'll also happily capture and trawl through ath debug output, which I
can set from the sysctl readily enough, given an indication of which
debug bits are likely to be interesting.

The card is a D-Link AirPremier DWL-AG660, which the Atheros pages say
has an AR5004X chip, probed thus:

ath0 at cardbus1 function 0
ath0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
ath0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
ath0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
ath0: turboG rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
ath0: mac 5.9 phy 4.3 radio 3.6

ath0: mac 00:0f:b5:38:af:13 bss 00:0f:b5:38:af:13
        node flags 0000
        ess <Geek>
        chan 40 freq 5200MHz flags 0140<ofdm,5GHz>
        capabilities 0001<ess>
        beacon-interval 100 TU tsft 12095897655 us
        rates *6.0 9.0 *12.0 18.0 *24.0 36.0 48.0 [54.0]
        assoc-id 0 assoc-failed 0 inactivity 240s
        rssi 55 txseq 0 rxseq 0
ath0: mac 00:0f:b5:38:af:14 bss 00:0f:b5:38:af:14
        node flags 0001<bss>
        ess <Geek>
        chan 6 freq 2437MHz flags 00f0<turbo,cck,ofdm,2.4GHz>
        capabilities 0431<ess,privacy,short preamble,short slot-time>
        beacon-interval 100 TU tsft 12171264385 us
        rates 1.0 2.0 5.5 6.0 9.0 11.0 12.0 18.0 24.0 36.0 [48.0] 54.0
        assoc-id 49154 assoc-failed 0 inactivity 300s
        rssi 49 txseq 211 rxseq 43488

--
Dan.