Subject: Re: wi0 going mute?
To: None <current-users@NetBSD.org>
From: David Young <email@example.com>
Date: 01/25/2004 20:24:47
On Sun, Jan 25, 2004 at 02:01:30PM -0600, David Young wrote:
> On Sun, Jan 25, 2004 at 11:21:36PM +1100, Paul Ripke wrote:
> > With -current kernel from around 20040119, userland somewhat more
> > ancient (about 200304??), after giving my system some work, like
> > building a release, its wireless interface has a nasty habit of going
> > mute. Packets are still being received, but nothing sent.
> I have seen this, too. I think it is a bug in the rate adaptation code. I
> don't see what it would have to do with building a release, unless the
> kernel misses interrupts under those conditions....
> > thing I've noticed is that the kernel prints a couple of "wi0: bad idx 6e"
> Ok, the firmware should "never" return a bad index, but obviously I
> should not trust that. Bug noted.
I am going to be really busy for another couple of weeks. Just in
case somebody wants to fix the bug I have mentally noted, here it is.
A rate-adaptation descriptor (sc->sc_rssd) will be "leaked" whenever
a "wi0: bad idx %02x" error occurs in wi_tx_intr or wi_tx_ex_intr.
When the descriptors have run out, no more packets will be transmitted,
because wi_start waits for a descriptor to be free before it transmits a
packet.
When exceptions like "bad idx" occur, both wi_tx_intr and wi_tx_ex_intr
should free any (every?) busy descriptor.
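
To make the leak concrete, here is a toy model in plain C. It is only a
sketch, not the wi(4) code: the pool size, the struct, and every toy_*
name are made up for illustration. It assumes a completion handler that
frees the descriptor named by the reported index, and that a "bad idx"
means we cannot tell which descriptor completed.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NDESC 4	/* hypothetical pool size, not the driver's */

/* Toy stand-in for the pool of rate-adaptation descriptors. */
struct toy_pool {
	bool busy[TOY_NDESC];
};

/* Take a free descriptor; -1 means none is free and transmit stalls. */
static int
toy_alloc(struct toy_pool *p)
{
	for (int i = 0; i < TOY_NDESC; i++) {
		if (!p->busy[i]) {
			p->busy[i] = true;
			return i;
		}
	}
	return -1;
}

/* Free every busy descriptor. */
static void
toy_reclaim_all(struct toy_pool *p)
{
	for (int i = 0; i < TOY_NDESC; i++)
		p->busy[i] = false;
}

/*
 * Toy completion handler.  With reclaim_on_error == false it models
 * the leak: a bad index returns early and nothing is freed.
 */
static void
toy_tx_done(struct toy_pool *p, unsigned idx, bool reclaim_on_error)
{
	if (idx >= TOY_NDESC) {
		/* "bad idx": unknown which descriptor this was */
		if (reclaim_on_error)
			toy_reclaim_all(p);
		return;
	}
	p->busy[idx] = false;
}

/* Count descriptors currently in use. */
static int
toy_count_busy(const struct toy_pool *p)
{
	int n = 0;

	for (int i = 0; i < TOY_NDESC; i++)
		n += p->busy[i] ? 1 : 0;
	return n;
}
```

With reclaim_on_error false, TOY_NDESC bad completions in a row leave
every descriptor busy and toy_alloc returns -1 from then on. With it
true, the pool never runs dry.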
Maybe there should also be a watchdog on each descriptor, just in case
one does not get reclaimed.
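
A per-descriptor watchdog could look roughly like the following. Again
this is a sketch under assumptions: the wd_* names, the tick counter,
and the timeout value are all hypothetical and do not come from the
driver. Each descriptor records when it was taken, and a periodic tick
reclaims any that have been busy for too long.

```c
#include <assert.h>
#include <stdbool.h>

#define WD_NDESC	4	/* hypothetical pool size */
#define WD_TIMEOUT	5	/* hypothetical timeout, in ticks */

/* Toy descriptor: busy flag plus the tick at which it was taken. */
struct wd_desc {
	bool busy;
	unsigned start;
};

/* Take a free descriptor and stamp it; -1 if none is free. */
static int
wd_alloc(struct wd_desc *d, unsigned now)
{
	for (int i = 0; i < WD_NDESC; i++) {
		if (!d[i].busy) {
			d[i].busy = true;
			d[i].start = now;
			return i;
		}
	}
	return -1;
}

/*
 * Periodic check.  Reclaims descriptors busy for WD_TIMEOUT ticks or
 * more and returns how many were reclaimed.  Unsigned subtraction
 * keeps the age correct across counter wraparound.
 */
static int
wd_tick(struct wd_desc *d, unsigned now)
{
	int reclaimed = 0;

	for (int i = 0; i < WD_NDESC; i++) {
		if (d[i].busy && now - d[i].start >= WD_TIMEOUT) {
			d[i].busy = false;
			reclaimed++;
		}
	}
	return reclaimed;
}
```

The watchdog does not replace freeing descriptors on the error path. It
only bounds how long a stall can last if some other path loses one.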
BTW, the descriptor leak needs solving regardless of whether there is
an endianness bug, also:
> Hmm. This could be an endianness issue. What architecture is this? I am
> betting that it is big-endian. 0x6e (110) is Prism's representation for
> 11Mbps. The data rate and the index both go into the same 16-bit word,
> but they are u_int8_t's in the wi_frame struct.
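
The effect described in the quoted text can be shown in isolation. In
this sketch the struct is a hypothetical two-byte stand-in, not the real
wi_frame layout, and the field order is an assumption. It shows only
that byte-swapping a 16-bit word that holds two 8-bit fields makes the
fields trade places, so the index field ends up holding the rate.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in: two 8-bit fields sharing one 16-bit word. */
struct toy_txinfo {
	uint8_t rate;
	uint8_t idx;
};

/* Swap the two bytes of a 16-bit value. */
static uint16_t
toy_swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * Treat the two fields as one 16-bit word, byte-swap it, and view the
 * result as fields again.  The result is the same on any host: the
 * two fields exchange values.
 */
static struct toy_txinfo
toy_swapped_view(struct toy_txinfo in)
{
	uint16_t w;
	struct toy_txinfo out;

	memcpy(&w, &in, sizeof(w));
	w = toy_swap16(w);
	memcpy(&out, &w, sizeof(out));
	return out;
}
```

Starting from rate 0x6e and idx 3, the swapped view has idx 0x6e and
rate 3, which would match a "bad idx 6e" message if a swap like this
happens on only one side.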
David Young OJC Technologies
firstname.lastname@example.org Urbana, IL * (217) 278-3933