tech-kern archive


Re: rbuf starvation in the iwn driver

On Mon April 5 2010 16:19:19 David Young wrote:
> On Mon, Apr 05, 2010 at 08:38:41AM -0600, Sverre Froyen wrote:
> > Monitoring the number-of-free-rbufs counter during network traffic, I
> > find that it normally stays at 32, occasionally dropping into the
> > twenties. Sometimes, however, the count will abruptly jump to zero. At
> > this point, the free count does not recover but remains at zero for a
> > *long* time. The interface does not receive any packets as long as the
> > driver has no free rbufs. After about ten minutes, I see a flurry of
> > calls to iwn_free_rbuf and the free count returns to 32. At this point
> > the interface is working properly again.
> During the flurry of calls to iwn_free_rbuf(), can you get a backtrace
> in iwn_free_rbuf()?  I hope that will show us what mechanism frees them
> in a flurry.

Here is a trace from the first call to iwn_free_rbuf after the interface has 
been locked up for ~10 mins.

DDB lost frame for netbsd:Xsoftintr

This looks like some type of timeout. For what it's worth, I had quit the 
applications that I was using to trigger the lock-up long before the call to 
iwn_free_rbuf (although there were additional programs with network 
connections open at the time of the call; ntpd, apache and openvpn come to
mind).

In the process of collecting the above trace, I added a call to panic if
iwn_free_rbuf was called with a free buffer count of zero. It turns out this
happens rather quickly (long before the interface locks up). Here is the trace 
from a non-locked-up call:



PS I received the following comment from Damien Bergamini:

>This sounds similar to a bug that was fixed in OpenBSD ~3 years
>ago (wpi(4) rev 1.51):
>You should look at NetBSD's wpi(4) as it seems to have this issue
>fixed too (using m_dup).  I have no idea why it has not been
>backported to NetBSD's iwn(4) though.

It looks like the NetBSD wpi changes he refers to must be these:

revision 1.10
date: 2007/06/18 19:40:49;  author: degroote;  state: Exp;  lines: +37 -19
Add a workaround in the case where we have low number of rbuf.
It seems to fix problem of frozen network with wpi.

Going through the if_iwn.c revisions, the iwn driver appears to have had the
m_dup code until rev 1.33 (when iwn_rx_intr was replaced by iwn_rx_done). I'll
see if I can reintegrate the code.
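
For anyone following along, here is a rough userland model of what I
understand the workaround to do. It is only a sketch, not code from if_wpi.c
or if_iwn.c: the names rbuf_pool, rx_done, pkt_free and the RBUF_LOW and
RBUF_SIZE values are made up for illustration, and malloc/memcpy stand in
for m_dup. The idea is that while plenty of rbufs are free, the filled rbuf
is loaned to the stack and replaced in the ring; once the free count gets
low, the frame is copied instead and the rbuf stays in the ring, so a slow
consumer cannot drain the pool to zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NRBUF      32    /* pool size; 32 is the idle free count seen above */
#define RBUF_LOW    8    /* hypothetical low-water mark */
#define RBUF_SIZE 2048   /* hypothetical buffer size */

struct rbuf { unsigned char data[RBUF_SIZE]; };

struct rbuf_pool {
    struct rbuf  bufs[NRBUF];
    struct rbuf *freelist[NRBUF];   /* stack of free rbufs */
    int          nfree;
};

/* Stand-in for an mbuf: data is either a loaned rbuf or a private copy. */
struct pkt {
    unsigned char *data;
    size_t         len;
    struct rbuf   *loan;            /* non-NULL when data lives in an rbuf */
};

void pool_init(struct rbuf_pool *p)
{
    for (int i = 0; i < NRBUF; i++)
        p->freelist[i] = &p->bufs[i];
    p->nfree = NRBUF;
}

/*
 * *slot is the ring's rbuf holding a received frame of len bytes.
 * Returns 0 on success, -1 if the copy could not be allocated.
 */
int rx_done(struct rbuf_pool *p, struct rbuf **slot, size_t len,
    struct pkt *out)
{
    assert(len <= RBUF_SIZE);
    if (p->nfree > RBUF_LOW) {
        /* Plenty free: loan the filled rbuf upward, refill the ring slot. */
        out->data = (*slot)->data;
        out->loan = *slot;
        *slot = p->freelist[--p->nfree];
    } else {
        /* Running low: copy the frame and keep the rbuf in the ring. */
        out->data = malloc(len);
        if (out->data == NULL)
            return -1;
        memcpy(out->data, (*slot)->data, len);
        out->loan = NULL;
    }
    out->len = len;
    return 0;
}

/* Called when the upper layer is done with the packet. */
void pkt_free(struct rbuf_pool *p, struct pkt *pk)
{
    if (pk->loan != NULL) {
        assert(p->nfree < NRBUF);
        p->freelist[p->nfree++] = pk->loan;
    } else {
        free(pk->data);
    }
    pk->data = NULL;
    pk->loan = NULL;
}
```

In this model the free count never drops below RBUF_LOW no matter how long
the upper layer holds on to packets, at the cost of a copy per frame while
the pool is low.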

PPS I used panic(9) to get the traces. Is it safe to "continue" from such a 
diagnostic panic?
