NetBSD-Bugs archive


bin/44941: racoon drops pfkey messages -> timeout



>Number:         44941
>Category:       bin
>Synopsis:       racoon drops pfkey messages -> timeout
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri May 06 12:40:00 +0000 2011
>Originator:     Dr. Wolfgang Stukenbrock
>Release:        NetBSD 5.1
>Organization:
Dr. Nagler & Company GmbH
>Environment:
        
        
System: NetBSD e010 5.1 NetBSD 5.1 (NSW-svc-ISDN) #2: Thu May  5 13:12:45 CEST 2011  wgstuken@s012:/export/NetBSD-5.1/N+C-build/.OBJDIR_i386/export/NetBSD-5.1/src/sys/arch/i386/compile/NSW-svc-ISDN i386
Architecture: x86_64
Machine: amd64
>Description:
        While trying to connect a Windows 7 client to my system I ran into
        problems.
        The communication starts normally, but during rekeying the connection
        suddenly hangs.
        I turned on debugging and added some additional output to racoon in
        order to find the reason for this instability.
        I found the following scenario:
        While adding a new SA into the kernel, racoon seems to get another
        message from the client.
        The message is a "delete" message, in this case a delete with 0 SPI
        entries.
        OK, this is not very smart of Windows 7, but it triggers a bug in
        racoon.
        When this message is processed, pfkey_sadb_dump() is called. This
        routine sends a dump request to the kernel and then retrieves
        messages until nothing is left.
        In my case it first drops two messages (type = 2 and type = 3).
        This leads to a problem when the "running" add-SA tries to get its
        answer messages from the kernel, because the dump has dropped them.
        You can see the following message in the output:
        "ERROR: 172.16.65.151 give up to get IPsec-SA due to time up to wait"
        After this the tunnel from the Windows 7 client is dead.
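
        The filtering described above can be modelled in a few lines. This
        is only a sketch of the behaviour, not racoon code: struct sim_msg,
        the SIM_* constants and drain_for_dump() are hypothetical names, and
        the numeric values are the PF_KEY v2 message types from RFC 2367
        (2 = SADB_UPDATE, 3 = SADB_ADD, 10 = SADB_DUMP).

```c
#include <assert.h>
#include <stddef.h>

/* PF_KEY v2 message type numbers (RFC 2367), renamed to avoid clashing
 * with the real SADB_* macros from <net/pfkeyv2.h>. */
enum { SIM_SADB_UPDATE = 2, SIM_SADB_ADD = 3, SIM_SADB_DUMP = 10 };

/* Hypothetical stand-in for the two struct sadb_msg fields that matter. */
struct sim_msg {
    int type;
    int pid;
};

/*
 * Model of the dump loop's filter: only SADB_DUMP replies addressed to
 * our pid are kept, everything else is discarded. Returns the number of
 * discarded messages and stores the number of kept ones in *kept.
 */
size_t drain_for_dump(const struct sim_msg *in, size_t n, int pid,
                      size_t *kept)
{
    size_t dropped = 0;

    assert(in != NULL && kept != NULL);
    *kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].type != SIM_SADB_DUMP || in[i].pid != pid) {
            dropped++;          /* the pending SA reply is lost here */
            continue;
        }
        (*kept)++;
    }
    return dropped;
}
```

        Fed with a type 2 and a type 3 message followed by dump replies,
        the model discards exactly the two replies that the pending SA
        negotiation is still waiting for.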
>How-To-Repeat:
        Not easy, because it is very sensitive to timing ....
>Fix:
        The problem is located in
        /usr/src/crypto/dist/ipsec-tools/src/racoon/pfkey.c.
        In pfkey_sadb_dump() messages are dropped if they are not of type
        SADB_DUMP.
        Instead of dropping them, they should be forwarded as is normally
        done in pfkey_handler().
        There are several ways of doing this ...
        I assume that there is no way to figure out when the last message
        that belongs to the dump request has been received. In that case
        pfkey_sadb_dump() has to keep polling, as currently implemented.
        So I recommend splitting the pfkey_handler() routine into two parts
        and reusing the lower part in pfkey_sadb_dump() to deliver the
        currently dropped messages.

        I'm not 100% sure whether the PID should be checked in
        pfkey_handler() too, as is done for the dump part. I'm also not sure
        whether delivering the messages immediately is OK, or whether this
        may create other synchronization problems in the implementation so
        that the delivery should be delayed.
        In that case a larger change is required: a queue to hold these
        messages, which must be checked first at every place where we wait
        for additional kernel messages from the pfkey socket.
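
        A minimal sketch of what such a queue could look like, assuming a
        fixed-size ring buffer is acceptable. None of this exists in racoon:
        struct queued_msg, struct deferred_queue, DEFERRED_MAX,
        deferred_push() and deferred_pop() are hypothetical names, and a
        real version would have to store the complete sadb_msg buffers
        rather than two integers.

```c
#include <assert.h>
#include <stddef.h>

#define DEFERRED_MAX 32         /* arbitrary capacity for this sketch */

/* Hypothetical stand-in for a received PF_KEY message. */
struct queued_msg {
    int type;
    int pid;
};

/* FIFO ring buffer for messages that arrived during a dump. */
struct deferred_queue {
    struct queued_msg slot[DEFERRED_MAX];
    size_t head;                /* index of the oldest queued message */
    size_t count;               /* number of queued messages */
};

/* Queue a non-dump message instead of dropping it.
 * Returns 0 on success, -1 if the queue is full. */
int deferred_push(struct deferred_queue *q, struct queued_msg m)
{
    assert(q != NULL);
    if (q->count == DEFERRED_MAX)
        return -1;
    q->slot[(q->head + q->count) % DEFERRED_MAX] = m;
    q->count++;
    return 0;
}

/* Fetch the oldest queued message before reading the socket again.
 * Returns 1 and fills *out if a message was queued, 0 if empty. */
int deferred_pop(struct deferred_queue *q, struct queued_msg *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return 0;
    *out = q->slot[q->head];
    q->head = (q->head + 1) % DEFERRED_MAX;
    q->count--;
    return 1;
}
```

        With this layout the dump loop would call deferred_push() where it
        currently discards a message, and every place that waits on the
        pfkey socket would first call deferred_pop() until it returns 0.
        Messages come back in arrival order, which keeps the kernel's
        reply sequence intact.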

        For now I've added direct delivery, and it seems to work.
        I got the message
        "IPsec-SA established: ESP/Tunnel 62.153.101.194[500]->172.16.65.151[500] spi=1221443210(0x48cdbe8a)"
        in the middle of the dump processing, and the connection is still
        alive.

        The following patch is what I've added to my 5.1 racoon:

--- pfkey.c     2011/05/06 11:16:17     1.1
+++ pfkey.c     2011/05/06 11:24:36
@@ -190,12 +190,13 @@
  *     0: success
  *     -1: fail
  */
+static int pfkey_handler_x(struct sadb_msg *msg);
+
 int
 pfkey_handler()
 {
        struct sadb_msg *msg;
        int len;
-       caddr_t mhp[SADB_EXT_MAX + 1];
        int error = -1;
 
        /* receive pfkey message. */
@@ -235,19 +236,32 @@
 
                goto end;
        }
+       if (pfkey_handler_x(msg) != 0) goto end;
+// remark: we assume that pfkey_align() will assign msg to mhp[0] in the separated code ....
+
+       error = 0;
+end:
+       if (msg)
+               racoon_free(msg);
+       return(error);
+}
+
+static int pfkey_handler_x(struct sadb_msg *msg)
+{
+       caddr_t mhp[SADB_EXT_MAX + 1];
 
        /* check pfkey message. */
        if (pfkey_align(msg, mhp)) {
                plog(LLV_ERROR, LOCATION, NULL,
                        "libipsec failed pfkey align (%s)\n",
                        ipsec_strerror());
-               goto end;
+               return -1;
        }
        if (pfkey_check(mhp)) {
                plog(LLV_ERROR, LOCATION, NULL,
                        "libipsec failed pfkey check (%s)\n",
                        ipsec_strerror());
-               goto end;
+               return -1;
        }
        msg = (struct sadb_msg *)mhp[0];
 
@@ -256,24 +270,20 @@
                plog(LLV_ERROR, LOCATION, NULL,
                        "unknown PF_KEY message type=%u\n",
                        msg->sadb_msg_type);
-               goto end;
+               return -1;
        }
 
        if (pkrecvf[msg->sadb_msg_type] == NULL) {
                plog(LLV_INFO, LOCATION, NULL,
                        "unsupported PF_KEY message %s\n",
                        s_pfkey_type(msg->sadb_msg_type));
-               goto end;
+               return -1;
        }
 
        if ((pkrecvf[msg->sadb_msg_type])(mhp) < 0)
-               goto end;
+               return -1;
 
-       error = 0;
-end:
-       if (msg)
-               racoon_free(msg);
-       return(error);
+       return(0);
 }
 
 /*
@@ -317,10 +327,17 @@
 
                if (msg->sadb_msg_type != SADB_DUMP || msg->sadb_msg_pid != pid)
                {
-                   plog(LLV_DEBUG, LOCATION, NULL,
-                        "discarding non-sadb dump msg %p, our pid=%i\n", msg, pid);
-                   plog(LLV_DEBUG, LOCATION, NULL,
-                        "type %i, pid %i\n", msg->sadb_msg_type, msg->sadb_msg_pid);
+                   if (msg->sadb_msg_pid == pid && pfkey_handler_x(msg) == 0)
+                   {
+                     plog(LLV_DEBUG, LOCATION, NULL,
+                          "successfully processed msg of type %i while collecting dump messages\n",
+                          msg->sadb_msg_type);
+                   } else {
+                       plog(LLV_DEBUG, LOCATION, NULL,
+                            "discarding non-sadb dump msg %p, our pid=%i\n", msg, pid);
+                       plog(LLV_DEBUG, LOCATION, NULL,
+                            "type %i, pid %i\n", msg->sadb_msg_type, msg->sadb_msg_pid);
+                   }
                    continue;
                }
                

>Unformatted:
        
        

