Subject: Re: ath hickups ?
To: None <current-users@NetBSD.org>
From: David Young <dyoung@pobox.com>
List: current-users
Date: 06/09/2007 15:27:42
On Sat, Jun 09, 2007 at 10:16:46PM +0200, Frank Kardel wrote:
> David Young wrote:
> >On Sat, Jun 09, 2007 at 08:42:29PM +0200, Frank Kardel wrote:
> >  
> >>Hi *,
> >>
> >>I am seeing quite a few device timeout errors with my ath0 device in 
> >>-current
> >>===
> >>ath0: interrupting at ioapic0 pin 16 (irq 11)
> >>ath0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
> >>ath0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 
> >>24Mbps 36Mbps 48Mbps 54Mbps
> >>ath0: mac 5.9 phy 4.3 radio 4.6
> >>
> >>00:09.0 Ethernet controller: Atheros Communications, Inc. AR5212 
> >>802.11abg NIC (rev 01)
> >>       Subsystem: D-Link System Inc D-Link AirPlus DWL-G520 Wireless 
> >>PCI Adapter(rev.B)
> >>       Flags: bus master, medium devsel, latency 80, IRQ 11
> >>       Memory at fb200000 (32-bit, non-prefetchable)
> >>       Capabilities: [44] Power Management version 2
> >>===
> >>
> >>I seem to remember that there were times when ath0 was working more 
> >>reliably with NetBSD -
> >>can anyone share this observation ?
> >>    
> >
> >Sorry, my bad.  Looks like I introduced a new bug as I repaired another.
> >
> >The problem is this: roughly speaking, ath_tx_processq() returns the
> >number of transmissions acknowledged by the receiver.  It does not return
> >the number of transmit descriptors that the NIC is finished with, as I
> >had assumed.  So if your NIC has sent only multicast traffic, which does
> >not require an 802.11 Acknowledgement, then ath_tx_processq() will always
> >return 0.  So ath_tx_processq's callers are going to think the NIC is not
> >finishing any descriptors, when really the NIC is.  Two things will go
> >wrong: first, ath will exhaust its descriptors, stalling transmissions.
> >Second, the driver will count down sc_tx_timer = 5, 4, ..., 0, and
> >then timeout.  Timing out resets the h/w and drains the transmit rings,
> >which is correct, but drastic; it is not going to help your traffic
> >flow smoothly.
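
The miscount described in the paragraph above can be sketched roughly as
follows.  This is a toy model, not the driver code: struct txdesc,
struct txcounts, reap_txq() and watchdog_tick() are hypothetical names
made up for illustration, and the timer handling is only assumed to
resemble the sc_tx_timer countdown.  It shows why counting acknowledged
frames instead of completed descriptors makes an all-multicast queue look
stalled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified transmit descriptor; not the real ath layout. */
struct txdesc {
	bool done;	/* hardware has finished with this descriptor */
	bool acked;	/* an 802.11 ACK arrived (never set for multicast) */
};

struct txcounts {
	int completed;	/* descriptors the NIC is finished with */
	int acked;	/* transmissions acknowledged by the receiver */
};

/* Reap finished descriptors from the head of the queue, counting both. */
static struct txcounts
reap_txq(const struct txdesc *q, size_t n)
{
	struct txcounts c = { 0, 0 };

	assert(q != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!q[i].done)
			break;	/* hardware still owns this one and the rest */
		c.completed++;
		if (q[i].acked)
			c.acked++;
	}
	return c;
}

/*
 * One watchdog tick (assumed behaviour): progress disarms the timer,
 * otherwise an armed timer counts down and reports a timeout at 0.
 */
static bool
watchdog_tick(int *tx_timer, int progress)
{
	if (progress > 0) {
		*tx_timer = 0;
		return false;
	}
	if (*tx_timer > 0 && --*tx_timer == 0)
		return true;	/* "device timeout" */
	return false;
}
```

With three finished multicast frames in the queue, feeding the acked
count to watchdog_tick() lets a timer armed at 5 run down to a timeout,
while feeding the completed count disarms it on the first tick.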
> >
> >Thanks Mindaugus for prodding me to give this a look.
> >
> >Please give this patch a shot.
> >  
> Thanks for the quick reply - test is underway.
> 
> Another observation was that the llinfo route entries react a bit funny.
> I often get many errors like this:
> arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
> arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
> arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
> arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
> arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
> ath0: device timeout
> arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
> arpresolve: can't allocate llinfo on ath0 for 192.168.200.1
> 
> When this happens the network route entry often gets lost and
> packets expected to go to ath0 manage to find other interfaces
> than ath0.

I believe that the lost route entry is the cause of the arpresolve
warnings.  When this happens, what do these say?

route -n get 192.168.200.0/24
route -n get 192.168.200.1

What kind of bridging/routing/filtering is active on this box?

Dave

> >Dave
> >
> >  
> Frank

-- 
David Young             OJC Technologies
dyoung@ojctech.com      Urbana, IL * (217) 278-3933 ext 24