tech-net archive


A tunneled packet, once stripped, is requeued by FAST_IPSEC in
sys/netipsec/ipsec_osdep.h in the function if_handoff.

The code, as it stands now, does this without taking KERNEL_LOCK.

Under heavy traffic going through a FAST_IPSEC tunnel, this code
conflicts with the regular use of the IP and IPv6 queues, and
mbufs (and their associated clusters) get permanently lost.
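The failure mode can be shown outside the kernel. What follows is a minimal user-space sketch, not the NetBSD code: struct pkt stands in for an mbuf, struct pktq for an ifqueue, and a pthread mutex plays the role of KERNEL_LOCK. All of these names, and run_producers itself, are hypothetical and exist only for illustration. The point is that the full check, the pointer updates, and the length update must all happen under one lock that every user of the queue takes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_PRODUCERS 16

/* Hypothetical stand-ins: pkt for an mbuf, pktq for an ifqueue. */
struct pkt { struct pkt *next; };

struct pktq {
    struct pkt *head;
    struct pkt *tail;
    int len;
    int maxlen;
    pthread_mutex_t lock;   /* plays the role of KERNEL_LOCK here */
};

struct producer_arg { struct pktq *q; int count; };

/* Returns 1 if queued, 0 if the queue was full (caller still owns p). */
static int pktq_enqueue(struct pktq *q, struct pkt *p)
{
    int ok = 0;

    pthread_mutex_lock(&q->lock);
    if (q->len < q->maxlen) {
        p->next = NULL;
        if (q->tail == NULL)
            q->head = p;
        else
            q->tail->next = p;
        q->tail = p;
        q->len++;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);   /* released on every path */
    return ok;
}

static void *producer(void *v)
{
    struct producer_arg *a = v;

    for (int i = 0; i < a->count; i++) {
        struct pkt *p = malloc(sizeof(*p));
        if (p != NULL && !pktq_enqueue(a->q, p))
            free(p);   /* queue full: drop the packet, do not leak it */
    }
    return NULL;
}

/* Runs nthreads concurrent producers, then counts the packets that are
 * still reachable from the queue head. Returns -1 on a bad argument. */
static int run_producers(int nthreads, int per_thread)
{
    struct pktq q;
    struct producer_arg arg;
    pthread_t tids[MAX_PRODUCERS];
    int started = 0, reachable = 0;

    if (nthreads < 1 || nthreads > MAX_PRODUCERS)
        return -1;

    q.head = q.tail = NULL;
    q.len = 0;
    q.maxlen = nthreads * per_thread;
    pthread_mutex_init(&q.lock, NULL);
    arg.q = &q;
    arg.count = per_thread;

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, producer, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    /* Drain: every queued packet must be reachable by walking the list. */
    while (q.head != NULL) {
        struct pkt *p = q.head;
        q.head = p->next;
        free(p);
        q.len--;
        reachable++;
    }
    assert(q.len == 0);   /* length and list contents agree */
    pthread_mutex_destroy(&q.lock);
    return reachable;
}
```

With the lock in place, run_producers(4, 10000) should find all 40000 packets. Removing the lock/unlock pair from pktq_enqueue makes the result nondeterministic: packets whose pointer update was overwritten by another thread are no longer reachable from the queue, which is the user-space analogue of the lost mbufs and clusters described above.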

This is the change I propose:

--- a/netbsd/src/sys/netipsec/ipsec_osdep.h
+++ b/netbsd/src/sys/netipsec/ipsec_osdep.h
@@ -144,8 +144,10 @@ if_handoff(struct ifqueue *ifq, struct mbuf *m, struct ifnet *ifp, int adjust)

        int need_if_start = 0;
        int s = splnet();
+       KERNEL_LOCK(1, NULL);
        if (IF_QFULL(ifq)) {
+               KERNEL_UNLOCK_ONE(NULL);
                return (0);
@@ -157,6 +159,7 @@ if_handoff(struct ifqueue *ifq, struct mbuf *m, struct ifnet *ifp, int adjust)
                need_if_start = !(ifp->if_flags & IFF_OACTIVE);
        IF_ENQUEUE(ifq, m);
        if (need_if_start)

Two questions:
1. Do I need to do the unlock after (*ifp->if_start)(ifp)? Some drivers are
MP_SAFE, others are not.  We are using bnx, and all of the bnx code assumes
that KERNEL_LOCK is held when it is called.  (So far, I have not run into
trouble with the change as illustrated above, but I'm also not doing any
transport mode, which is when I would guess this code would be tickled.)
2. If ifp is not NULL, do I need to take KERNEL_LOCK at all?  (If I don't,
then question 1 goes away.)  If ifp is not NULL, then this packet is
going to an output queue rather than an input queue.  Some drivers are
MP_SAFE, some are not...

This bug is easy to reproduce using two hosts.  Set up a tunnel.  I've been
using ESP.  Have one endpoint wget a very large file from the other endpoint.
Repeat forever.  Then I run on each of the endpoints:

while [ 1 ]
do
        for i in 0 1 2 3 4 5 6 7 8 9
        do
                vmstat -m 2> /dev/null | fgrep mclpl
                sleep 30
        done
done

Go do something else for an hour.  Then look at the output of the
above script.  You will see your mclpl gradually shrink.  After several
hours, you will start getting failures because too many clusters have
been lost.

BBN Technologies