Subject: Re: IFQ_MAXLEN: How large can it be?
To: Christoph Kaegi <kgc@zhwin.ch>
From: Steven M. Bellovin <smb@cs.columbia.edu>
List: tech-net
Date: 11/16/2006 09:35:18
On Thu, 16 Nov 2006 08:44:32 +0100, Christoph Kaegi <kgc@zhwin.ch> wrote:

> On 15.11-10:48, Steven M. Bellovin wrote:
> > > 
> > > So I bumped this number on our quite busy firewall up from 256 
> > > to 1024 and later to 4096, but I still get 1'026'678 dropped 
> > > packets during 8 days uptime.
> > > 
> > It's far from clear to me that this is a big help.  There's a fair amount
> > of literature that says that too-large router queues are bad, since they
> > end up having many retransmissions of the same data.  I suggest that you
> > look at other resources -- CPU and output line rate come to mind -- and
> > start playing with some of the fancier queueing options on your output
> > link.  (I wonder -- it would be nice to be able to do RED on things like
> > the IP input queue.  Is that possible?)
> > 
> 
> What is "RED"? What do you mean bei "output line rate"?
> I wasn't aware I had queueing options on my output links.
> Did you mean ALTQ? Does that work?

"RED" is an output interface queueing discipline (it stands for, as I
recall, "Random Early Drop", though the altq.conf file says "Random Early
Detection"). Essentially, packets in an output queue are dropped with a
probability once the queue reaches a certain length.  Yes, that's right;
it doesnot simply drop newly-arriving packets when the queue has reached
its maximum size.  It seems counter-intuitive, but it works better than
the default tail-drop strategy, because it causes sending TCPs to back off
their retransmission timers more quickly and hence avoids duplicate copies
of packets in the queue.  (I was talking about ALTQ, but I don't see a way
to apply that to the IP input queue, which is too bad.)
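
In case a concrete picture helps, here is a rough sketch in C of the two
drop decisions.  This is not the ALTQ code; the queue limit, the min/max
thresholds, and the linear drop curve are made-up numbers, and real RED
works on a weighted average of the queue length rather than the
instantaneous length:

#include <stdio.h>
#include <stdlib.h>

#define QLIMIT   256	/* hard queue limit (made-up) */
#define RED_MIN   64	/* below this, never drop (made-up) */
#define RED_MAX  192	/* above this, always drop (made-up) */

/* Tail drop: only refuse a packet once the queue is completely full. */
static int
taildrop_accept(int qlen)
{
	return qlen < QLIMIT;
}

/*
 * RED-style drop: between RED_MIN and RED_MAX the drop probability
 * rises linearly from 0 to 1, so some packets are dropped *before*
 * the queue fills, telling senders to slow down early.
 */
static int
red_accept(int qlen)
{
	double p;

	if (qlen < RED_MIN)
		return 1;
	if (qlen >= RED_MAX)
		return 0;
	p = (double)(qlen - RED_MIN) / (RED_MAX - RED_MIN);
	return drand48() >= p;
}

int
main(void)
{
	int qlen;

	srand48(1);
	for (qlen = 0; qlen < QLIMIT; qlen += 32)
		printf("qlen %3d  taildrop accepts: %d  red accepts: %d\n",
		    qlen, taildrop_accept(qlen), red_accept(qlen));
	return 0;
}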

By "output line rate", I was asking how fast the output size runs.  From
your later message about 'netstat -q', that's not the relevant question.
I think your CPU is too slow or too overloaded compared to your input line
speed.  What's happening is that packets are arriving faster than your IP
stack can process them.  The only issue is how long such bursts last;
ultimately, you're going to get packet drops if your CPU can't keep up
with the arrival rate.
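
For what it's worth, the drops that 'netstat -q' counts happen at exactly
that point: the driver hands each received packet to the IP input queue at
interrupt time, and if the IP software interrupt hasn't drained the queue
fast enough, the packet is thrown away on the spot.  A much-simplified
sketch in C -- the structure layout and function names are illustrative,
not the actual NetBSD code, though the real thing uses the IF_QFULL/IF_DROP
macros and the ifq_maxlen limit you've been raising:

#include <stdlib.h>

struct mbuf { struct mbuf *m_next; };

struct ifqueue {
	struct mbuf *ifq_head;	/* queued packets, oldest first */
	struct mbuf *ifq_tail;
	int          ifq_len;	/* current queue length */
	int          ifq_maxlen;	/* IFQ_MAXLEN, or your bumped value */
	long         ifq_drops;	/* what "netstat -q" reports as drops */
};

/* Stand-in for the kernel's m_freem(). */
static void
m_freem(struct mbuf *m)
{
	free(m);
}

void
ip_input_enqueue(struct ifqueue *ifq, struct mbuf *m)
{
	if (ifq->ifq_len >= ifq->ifq_maxlen) {	/* IF_QFULL() */
		ifq->ifq_drops++;		/* IF_DROP()  */
		m_freem(m);
		return;
	}
	m->m_next = NULL;
	if (ifq->ifq_tail == NULL)
		ifq->ifq_head = m;
	else
		ifq->ifq_tail->m_next = m;
	ifq->ifq_tail = m;
	ifq->ifq_len++;
	/* ... then the IP software interrupt is scheduled to drain it. */
}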

Let me repeat what I said earlier: I think that having a 4K queue length
is not only not useful, it's quite possibly counter-productive.
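
To put rough numbers on it: with 1500-byte packets, a 4096-entry queue holds
about 6 MB of data.  If the CPU can only push, say, 100 Mbit/s worth of
packets through the IP stack, a full queue is roughly half a second of
backlog -- comparable to or longer than many senders' retransmission
timeouts, so the queue ends up holding original packets and their
retransmissions.  (Those figures are only illustrative; the point is that a
deeper queue adds delay without adding any processing capacity.)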

You showed four active wm interfaces in a message.  How fast are they
running?  I know that some wm interfaces can run at gigabit speeds -- are
you trying to do that?  How fast is your CPU?  What do 'top' or
'systat' show for CPU idle percentage?

		--Steven M. Bellovin, http://www.cs.columbia.edu/~smb