Subject: Re: ath(4) regression between 4.99.20 and .26
To: None <current-users@NetBSD.org>
From: Jukka Salmi <j+nbsd@2007.salmi.ch>
List: current-users
Date: 08/03/2007 16:08:39
Steven M. Bellovin --> current-users (2007-08-03 09:08:00 -0400):
> On Fri, 3 Aug 2007 14:42:44 +0200
> Jukka Salmi <j+nbsd@2007.salmi.ch> wrote:
> 
> > Hi,
> > 
> > since running 4.99.26 on my i386 laptop, the kernel logs
> > 
> > 	ath0: device timeout (txq 1)
> > 
> > every few seconds, and several times even lost connection (ifconfig(8)
> > reporting `status: no network'). Stopping wpa_supplicant, bringing the
> > interface down and up again and restarting wpa_supplicant seems to
> > reestablish connection so far.
> > 
> > I remember seeing some `ath0: device timeout' messages previously
> > (4.99.20), but fewer than now, and connection was never lost.
> > 
> > Anybody else seeing this? Any hints?
> > 
> Yes, I'm seeing it, too.  ath worked well for me on a 15 July kernel;
> the failures started (for me) with 4.99.25 built on 31 July.  I see
> this in the CVS log for ic/ath.c:
> 
> 
> revision 1.84
> date: 2007/07/17 01:26:17;  author: dyoung;  state: Exp;  lines: +22 -16
> Suppress spurious timeouts and avoid wedging in OACTIVE state:
> 
>         1 Set or clear OACTIVE as transmit buffers are depleted or
>         replenished, respectively.  Do not use 802.11 acknowledgements
>         as a criteria for clearing OACTIVE.
> 
>         2 Let each transmit queue count down to timeout independently,
>         and get rid of the shared countdown (sc_tx_timer).  When
>         we add a packet to a transmit queue, restart the queue's
>         countdown.  Stop a transmit queue's countdown when the
>         queue empties.
> 
> I haven't yet tried reverting it and ./ic/athvar.h.

Thanks for the hint. I just tried reverting that change, and while the
`ath0: device timeout (txq 1)' messages are gone, the connection is still
lost after a few minutes.
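
For reference, the per-queue countdown described in item 2 of the quoted
log could be pictured roughly as follows. This is only an illustrative
sketch under my own assumptions, not the code in ic/ath.c: the struct,
the function names, the queue count and the timeout length are all made
up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NQUEUES    4	/* hypothetical number of transmit queues */
#define TX_TIMEOUT 5	/* hypothetical countdown length, in watchdog ticks */

/* Hypothetical per-queue state; not the driver's real structure. */
struct txq_sketch {
	int depth;	/* packets currently on the queue */
	int timer;	/* ticks left until timeout; 0 = countdown stopped */
};

/* Adding a packet to a queue restarts that queue's countdown. */
static void
txq_enqueue(struct txq_sketch *q)
{
	assert(q != NULL);
	q->depth++;
	q->timer = TX_TIMEOUT;
}

/* A packet completed; stop the countdown once the queue is empty. */
static void
txq_complete(struct txq_sketch *q)
{
	assert(q != NULL);
	if (q->depth > 0 && --q->depth == 0)
		q->timer = 0;
}

/*
 * One watchdog tick.  Every queue counts down independently; there is
 * no shared timer.  Returns the index of the first queue whose
 * countdown reached zero on this tick, or -1 if none did.
 */
static int
watchdog_tick(struct txq_sketch *qs, int n)
{
	int i, expired = -1;

	for (i = 0; i < n; i++) {
		if (qs[i].timer > 0 && --qs[i].timer == 0 && expired < 0)
			expired = i;
	}
	return expired;
}
```

In this sketch, a packet that is queued on queue 1 and never completes
makes watchdog_tick() return 1 on the TX_TIMEOUT-th tick, which is the
point where a driver would log something like `device timeout (txq 1)'.
A queue that drains in time never reports a timeout.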


Regards, Jukka

-- 
bashian roulette:
$ ((RANDOM%6)) || rm -rf ~