tech-net archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: [patch] bug fix & TCP networking performance improvements



On Thu, Apr 14, 2011 at 10:44:36PM +0200, Matthias Drochner wrote:
> 
> dyoung%pobox.com@localhost said:
> > When a protocol-drain routines called in an interrupt context sleeps,
> > the system deadlocks.
> 
> What do you mean exactly here? Sleeping in an interrupt context is
> a hard error, not just a deadlock. Or is this about a deadlock
> at a more functional level, where some memory might need to be
> allocated short-term to be able to free a bigger chunk?

I think David meant to say "would sleep".  We independently fixed
the "pool allocator can cause map reclaim which can sleep from interrupt
context" in our tree -- I believe that does not appear in this patch
because it was almost simultaneously fixed in NetBSD-current -- but
the result was that the system could lock up, unable to get mbufs
while holding the softint lock.

Something's got to drain the protocols in low-memory conditions, and,
before, it was those drain routines called in a way that potentially
had locks held.  Fixing that left them effectively never called at
all.  Stealing the fasttimo was a quick and reasonably effective
way of ensuring that the drain routines were ever called at all.

I think at least one other developer has a change in his tree that
uses a separate callout or kthread to do the drains.  That's better
if and when it shows up in NetBSD-current but I think this interim
fix is good enough to go in.

> I can't comment on the VTW implementation itself, just some
> comments on mostly formal things:
> 
> > lower MSLs by default: 2 seconds for loopback
>
> This looks short. In particular with filesystem pressure my boxes
> are often unresponsive for significantly longer. There is no
> justification for the big difference to

I strongly disagree.  Applications that make large numbers of local
connections benefit dramatically from this change, and if the kernel
cannot provide a fin-ack for a matter of *seconds* there is a bug
elsewhere in the kernel.

The 2MSL rule is meant to give packets time to be in-flight in the
network, not to accomodate buggy stacks -- not even local ones.

At least, that is how I see it.

> Some "#ifdef INET6" and so seem to be missing to reduce bloat.
> 
> There is some change about permissions for raw sockets which
> is not explained.

Eep.  This is likely a part of a completely different change  which
leaked in by mistake.  It won't work without security model changes
we cannot currently contribute back.  Thank you for pointing it out!

Thor


Home | Main Index | Thread Index | Old Index