Subject: Re: ath hickups ?
To: None <current-users@netbsd.org, kardel@netbsd.org, briggs@netbsd.org>
From: Frank Kardel <kardel@netbsd.org>
List: current-users
Date: 06/09/2007 22:16:46
David Young wrote:
> On Sat, Jun 09, 2007 at 08:42:29PM +0200, Frank Kardel wrote:
>   
>> Hi *,
>>
>> I am seeing quite a few device timeout errors with my ath0 device in 
>> -current
>> ===
>> ath0: interrupting at ioapic0 pin 16 (irq 11)
>> ath0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
>> ath0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 
>> 24Mbps 36Mbps 48Mbps 54Mbps
>> ath0: mac 5.9 phy 4.3 radio 4.6
>>
>> 00:09.0 Ethernet controller: Atheros Communications, Inc. AR5212 
>> 802.11abg NIC (rev 01)
>>        Subsystem: D-Link System Inc D-Link AirPlus DWL-G520 Wireless 
>> PCI Adapter(rev.B)
>>        Flags: bus master, medium devsel, latency 80, IRQ 11
>>        Memory at fb200000 (32-bit, non-prefetchable)
>>        Capabilities: [44] Power Management version 2
>> ===
>>
>> I seem to remember that there were times when ath0 was working more 
>> reliably with NetBSD -
>> can anyone share this observation ?
>>     
>
> Sorry, my bad.  Looks like I introduced a new bug as I repaired another.
>
> The problem is this: roughly speaking, ath_tx_processq() returns the
> number of transmissions acknowledged by the receiver.  It does not return
> the number of transmit descriptors that the NIC is finished with, as I
> had assumed.  So if your NIC has sent only multicast traffic, which does
> not require an 802.11 Acknowledgement, then ath_tx_processq() will always
> return 0.  So ath_tx_processq's callers are going to think the NIC is not
> finishing any descriptors, when really the NIC is.  Two things will go
> wrong: first, ath will exhaust its descriptors, stalling transmissions.
> Second, the driver will count down sc_tx_timer = 5, 4, ..., 0, and
> then time out.  Timing out resets the h/w and drains the transmit rings,
> which is correct, but drastic; it is not going to help your traffic
> flow smoothly.
>
> Thanks Mindaugus for prodding me to give this a look.
>
> Please give this patch a shot.
>   
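For readers following along, the accounting mistake described above can be
modelled in a few lines of C. This is only a sketch under stated assumptions,
not the ath(4) driver code: the names process_queue(), watchdog_fires(),
struct tx_result, NDESC and TX_TIMEOUT are all hypothetical, and only the
countdown from 5 is taken from the description of sc_tx_timer. It shows why
counting acknowledged frames instead of finished descriptors lets the
watchdog expire when only multicast traffic is sent.

```c
#include <stdbool.h>

#define NDESC      4   /* hypothetical ring size, for illustration only */
#define TX_TIMEOUT 5   /* mirrors the sc_tx_timer = 5, 4, ..., 0 countdown */

/* Hypothetical result of one pass over the transmit queue. */
struct tx_result {
    int completed;  /* descriptors the NIC is finished with */
    int acked;      /* of those, frames acknowledged by the receiver */
};

/*
 * Model one pass over the queue: every pending frame completes, but
 * multicast frames never receive an 802.11 acknowledgement.
 */
static struct tx_result
process_queue(const bool *is_multicast, int npending)
{
    struct tx_result r = { 0, 0 };

    for (int i = 0; i < npending; i++) {
        r.completed++;
        if (!is_multicast[i])
            r.acked++;
    }
    return r;
}

/*
 * Model the watchdog: the timer is re-armed whenever the caller sees
 * progress and fires once it has counted down to 0.  With
 * use_acked_count set, progress is measured the buggy way.
 */
static bool
watchdog_fires(bool use_acked_count, const bool *frames, int n)
{
    int timer = TX_TIMEOUT;

    for (int tick = 0; tick < TX_TIMEOUT; tick++) {
        struct tx_result r = process_queue(frames, n);
        int progress = use_acked_count ? r.acked : r.completed;

        if (progress > 0)
            timer = TX_TIMEOUT;     /* progress seen, re-arm */
        else if (--timer == 0)
            return true;            /* "device timeout" */
    }
    return false;
}
```

In this model, an all-multicast workload trips the watchdog only when
progress is measured by the acknowledged count; measuring by completed
descriptors keeps the timer re-armed.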
Thanks for the quick reply - test is underway.

Another observation is that the llinfo route entries behave a bit oddly.
I often get many errors like this:
arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
ath0: device timeout
arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
arpresolve: can't allocate llinfo on ath0 for 192.168.200.1

When this happens, the network route entry often gets lost, and
packets that should go out via ath0 end up on other interfaces
instead.
> Dave
>
>   
Frank