tech-net archive


Bug in FAST_IPSEC



A tunneled packet, once its outer header has been stripped, is requeued
by FAST_IPSEC via if_handoff() in sys/netipsec/ipsec_osdep.h.

The code, as it stands now, does this without taking KERNEL_LOCK.

Under heavy traffic through a FAST_IPSEC tunnel, this unlocked enqueue
races with the normal users of the IP and IPv6 input queues, and mbufs
(and their associated clusters) get permanently lost.
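
For reference, IF_ENQUEUE is a plain linked-list append with no
atomicity at all.  Paraphrased from sys/net/if.h, it is roughly:

#define	IF_ENQUEUE(ifq, m) do {					\
	(m)->m_nextpkt = NULL;					\
	if ((ifq)->ifq_tail == NULL)				\
		(ifq)->ifq_head = (m);				\
	else							\
		(ifq)->ifq_tail->m_nextpkt = (m);		\
	(ifq)->ifq_tail = (m);					\
	(ifq)->ifq_len++;					\
} while (/*CONSTCOND*/ 0)

If this runs concurrently with ipintr()'s IF_DEQUEUE of the last packet
on the queue, the enqueuer can link the new mbuf off a packet that has
already been dequeued: the queue head never reaches it, so it is never
processed and never freed.  That matches the permanent loss described
above.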

This is the change I propose:

--- a/netbsd/src/sys/netipsec/ipsec_osdep.h
+++ b/netbsd/src/sys/netipsec/ipsec_osdep.h
@@ -144,8 +144,10 @@ if_handoff(struct ifqueue *ifq, struct mbuf *m, struct ifnet *ifp, int adjust)
        int need_if_start = 0;
        int s = splnet();
 
+       KERNEL_LOCK(1, NULL);
        if (IF_QFULL(ifq)) {
                IF_DROP(ifq);
+               KERNEL_UNLOCK_ONE(NULL);
                splx(s);
                m_freem(m);
                return (0);
@@ -157,6 +159,7 @@ if_handoff(struct ifqueue *ifq, struct mbuf *m, struct ifnet *ifp, int adjust)
                need_if_start = !(ifp->if_flags & IFF_OACTIVE);
        }
        IF_ENQUEUE(ifq, m);
+       KERNEL_UNLOCK_ONE(NULL);
        if (need_if_start)
                (*ifp->if_start)(ifp);
        splx(s);

Two questions:
1. Should the unlock instead come after (*ifp->if_start)(ifp)?  Some
drivers are MP_SAFE, others are not.  We are using bnx, and all of the
bnx code assumes that KERNEL_LOCK is held when it is called.  (So far I
have not run into trouble with the change as shown above, but I am also
not doing any transport mode, which is when I would guess that path
gets exercised.)
2. If ifp is not NULL, do I need to take KERNEL_LOCK at all?  (If I
don't, then question 1 goes away; see the sketch below.)  If ifp is not
NULL, the packet is going to an interface output queue rather than a
protocol input queue.  Some drivers are MP_SAFE, some are not...
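
To make question 2 concrete, this is roughly what that variant would
look like (untested; the byte/multicast accounting in the ifp != NULL
branch, which is what consumes adjust, is elided):

static int
if_handoff(struct ifqueue *ifq, struct mbuf *m, struct ifnet *ifp, int adjust)
{
	int need_if_start = 0;
	int s = splnet();
	/*
	 * Only the protocol input queues (ifp == NULL) are shared
	 * with ipintr()/ip6intr(), so lock just that case.
	 */
	int need_lock = (ifp == NULL);

	if (need_lock)
		KERNEL_LOCK(1, NULL);
	if (IF_QFULL(ifq)) {
		IF_DROP(ifq);
		if (need_lock)
			KERNEL_UNLOCK_ONE(NULL);
		splx(s);
		m_freem(m);
		return (0);
	}
	if (ifp != NULL) {
		/* ... accounting as in the original ... */
		need_if_start = !(ifp->if_flags & IFF_OACTIVE);
	}
	IF_ENQUEUE(ifq, m);
	if (need_lock)
		KERNEL_UNLOCK_ONE(NULL);
	if (need_if_start)
		(*ifp->if_start)(ifp);	/* KERNEL_LOCK never held here */
	splx(s);
	return (1);
}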

This bug is easy to reproduce using two hosts.  Set up a tunnel (I've
been using ESP) and have one endpoint wget a very large file from the
other endpoint, repeating forever; a sketch of both pieces follows the
monitoring script below.  Then I run this on each of the endpoints:

while [ 1 ]
do 
        date
        for i in 0 1 2 3 4 5 6 7 8 9
        do
                vmstat -m 2> /dev/null | fgrep mclpl
                sleep 30
        done
done
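
For concreteness, the other two pieces of the reproduction look roughly
like this; the addresses, keys, and URL are placeholders, not my real
configuration, and the SAs follow the stock setkey(8) example format:

#!/bin/sh
# Minimal static-key ESP tunnel, run on gateway 192.0.2.1 (mirror the
# policies on the other gateway).  Placeholder addresses and keys.
setkey -c <<'EOF'
add 192.0.2.1 192.0.2.2 esp 9876 -m tunnel -E 3des-cbc "hogehogehogehogehogehoge";
add 192.0.2.2 192.0.2.1 esp 9877 -m tunnel -E 3des-cbc "mogemogemogemogemogemoge";
spdadd 10.0.1.0/24 10.0.2.0/24 any -P out ipsec esp/tunnel/192.0.2.1-192.0.2.2/require;
spdadd 10.0.2.0/24 10.0.1.0/24 any -P in ipsec esp/tunnel/192.0.2.2-192.0.2.1/require;
EOF

# The "repeat forever" download, run on one endpoint; any sufficiently
# large file served by the far side will do.
while true
do
        wget -q -O /dev/null http://10.0.2.1/bigfile
done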

Go do something else for an hour, then look at the output of the
monitoring script.  You will see your mclpl gradually shrink.  After several
hours, you will start getting failures because too many clusters have
been lost.

-Bev
BBN Technologies
bschwart@bbn.com

