Current-Users archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: IPFilter issue in -current



Geoff,

On 5/01/2013 4:30 PM, Geoff Adams wrote:
> On Jan 2, 2013, at 9:09 PM, Geoff Adams <gadams%avernus.com@localhost> wrote:
> 
>> This all makes me suspect a mis-calculation of the hash codes, leading to 
>> leaking NAT entries.
> 
> Once I saw it, it was fairly obvious. ;)
> 
> The attached patch fixes the ipf_nat_delete failures. The problem was that 
> ipf_nat_delete
> wasn't swapping inbound and outbound hash codes for inbound NAT entries, so 
> it was essentially
> always looking in the wrong buckets in those cases. But because of the way 
> the linked list
> works, I don't think any NAT entries were actually leaked. But since all the 
> bucket counters
> and chain count were getting messed up, things did start to go bad after a 
> while. (New NAT
> entries wouldn't be created, for instance.)
> 
> The fix is in the ipf_nat_delete function, itself; the other changes are a 
> slight refactoring
> of one method and adding some comments that helped me figure out how the 
> linked list with
> pointer-back-pointers worked.

Thank you for the analysis, that really helps.

My thoughts are that the patch should be somewhat different and rather than add
more code that does the hash swapping, just perform that check once when 
nat_hv[0]
and nat_hv[1] are assigned. This then means that ipf_nat_delete() and other
functions do not need to know about that complexity. Additionally, the NAT code
for IPv6 also needs to be fixed in this area.

If you can, please try out the attached patch.

Cheers,
Darren

Index: ip_nat.c
===================================================================
RCS file: /devel/CVS/IP-Filter/ip_nat.c,v
retrieving revision 2.357.2.41
diff -u -r2.357.2.41 ip_nat.c
--- ip_nat.c    25 Aug 2012 14:44:42 -0000      2.357.2.41
+++ ip_nat.c    6 Jan 2013 06:32:17 -0000
@@ -2322,6 +2322,7 @@
 
                bkt = nat->nat_hv[0] % softn->ipf_nat_table_sz;
                nss = &softn->ipf_nat_stats.ns_side[0];
+               ASSERT(nss->ns_bucketlen[bkt] > 0);
                nss->ns_bucketlen[bkt]--;
                if (nss->ns_bucketlen[bkt] == 0) {
                        nss->ns_inuse--;
@@ -2329,6 +2330,7 @@
 
                bkt = nat->nat_hv[1] % softn->ipf_nat_table_sz;
                nss = &softn->ipf_nat_stats.ns_side[1];
+               ASSERT(nss->ns_bucketlen[bkt] > 0);
                nss->ns_bucketlen[bkt]--;
                if (nss->ns_bucketlen[bkt] == 0) {
                        nss->ns_inuse--;
@@ -3471,8 +3473,13 @@
                /* TRACE nat_nsrcaddr, nat_ndstaddr, hv1 */
        }
 
-       nat->nat_hv[0] = hv0;
-       nat->nat_hv[1] = hv1;
+       if ((nat->nat_dir & NAT_OUTBOUND) == NAT_OUTBOUND) {
+               nat->nat_hv[0] = hv0;
+               nat->nat_hv[1] = hv1;
+       } else {
+               nat->nat_hv[0] = hv1;
+               nat->nat_hv[1] = hv0;
+       }
 
        MUTEX_INIT(&nat->nat_lock, "nat entry lock");
 
@@ -3510,9 +3517,11 @@
 
 /* ------------------------------------------------------------------------ */
 /* Function:    ipf_nat_hashtab_add                                         */
+/* Returns:     int - 0 == sucess, -1 == failure                            */
 /* Parameters:  softc(I) - pointer to soft context main structure           */
 /*              softn(I) - pointer to NAT context structure                 */
 /*              nat(I) - pointer to NAT structure                           */
+/* Write Lock:  ipf_nat                                                     */
 /*                                                                          */
 /* Handle the insertion of a NAT entry into the table/list.                 */
 /* ------------------------------------------------------------------------ */
@@ -3529,14 +3538,6 @@
        hv0 = nat->nat_hv[0] % softn->ipf_nat_table_sz;
        hv1 = nat->nat_hv[1] % softn->ipf_nat_table_sz;
 
-       if (nat->nat_dir == NAT_INBOUND || nat->nat_dir == NAT_DIVERTIN) {
-               u_int swap;
-
-               swap = hv0;
-               hv0 = hv1;
-               hv1 = swap;
-       }
-
        if (softn->ipf_nat_stats.ns_side[0].ns_bucketlen[hv0] >=
            softn->ipf_nat_maxbucket) {
                DT1(ns_bucket_max_0, int,
@@ -4341,14 +4342,17 @@
        if (nat->nat_hnext[0])
                nat->nat_hnext[0]->nat_phnext[0] = nat->nat_phnext[0];
        *nat->nat_phnext[0] = nat->nat_hnext[0];
-       nsp->ns_side[0].ns_bucketlen[nat->nat_hv[0] %
-                                    softn->ipf_nat_table_sz]--;
+       hv0 = nat->nat_hv[0] % softn->ipf_nat_table_sz;
+       hv1 = nat->nat_hv[1] % softn->ipf_nat_table_sz;
+
+       ASSERT(nsp->ns_side[0].ns_bucketlen[hv0] > 0);
+       nsp->ns_side[0].ns_bucketlen[hv0]--;
 
        if (nat->nat_hnext[1])
                nat->nat_hnext[1]->nat_phnext[1] = nat->nat_phnext[1];
        *nat->nat_phnext[1] = nat->nat_hnext[1];
-       nsp->ns_side[1].ns_bucketlen[nat->nat_hv[1] %
-                                    softn->ipf_nat_table_sz]--;
+       ASSERT(nsp->ns_side[1].ns_bucketlen[hv1] > 0);
+       nsp->ns_side[1].ns_bucketlen[hv1]--;
 
        /*
         * Add into the NAT table in the new position
@@ -4360,21 +4364,20 @@
        rhv1 = NAT_HASH_FN(nat->nat_ndstaddr, rhv1 + nat->nat_ndport,
                           0xffffffff);
 
-       hv0 = rhv0 % softn->ipf_nat_table_sz;
-       hv1 = rhv1 % softn->ipf_nat_table_sz;
-
-       if (nat->nat_dir == NAT_INBOUND || nat->nat_dir == NAT_DIVERTIN) {
-               u_int swap;
-
-               swap = hv0;
-               hv0 = hv1;
-               hv1 = swap;
+       if ((nat->nat_dir & NAT_OUTBOUND) == NAT_OUTBOUND) {
+               nat->nat_hv[0] = rhv0;
+               nat->nat_hv[1] = rhv1;
+       } else {
+               nat->nat_hv[0] = rhv1;
+               nat->nat_hv[1] = rhv0;
        }
 
+       hv0 = nat->nat_hv[0] % softn->ipf_nat_table_sz;
+       hv1 = nat->nat_hv[1] % softn->ipf_nat_table_sz;
+
        /* TRACE nat_osrcaddr, nat_osport, nat_odstaddr, nat_odport, hv0 */
        /* TRACE nat_nsrcaddr, nat_nsport, nat_ndstaddr, nat_ndport, hv1 */
 
-       nat->nat_hv[0] = rhv0;
        natp = &softn->ipf_nat_table[0][hv0];
        if (*natp)
                (*natp)->nat_phnext[0] = &nat->nat_hnext[0];
@@ -4383,7 +4386,6 @@
        *natp = nat;
        nsp->ns_side[0].ns_bucketlen[hv0]++;
 
-       nat->nat_hv[1] = rhv1;
        natp = &softn->ipf_nat_table[1][hv1];
        if (*natp)
                (*natp)->nat_phnext[1] = &nat->nat_hnext[1];
Index: ip_nat.h
===================================================================
RCS file: /devel/CVS/IP-Filter/ip_nat.h,v
retrieving revision 2.143.2.12
diff -u -r2.143.2.12 ip_nat.h
--- ip_nat.h    25 Aug 2012 14:44:42 -0000      2.143.2.12
+++ ip_nat.h    6 Jan 2013 13:11:34 -0000
@@ -88,6 +88,21 @@
 /*
  * This structure is used in the active NAT table and represents an
  * active NAT session.
+ *
+ * Generally nat_t structures have references from at least two places.
+ * The first place gives them a position in a linked list of NAT sessions
+ * per instace of IPFilter. In this linked list, nat_next always points to
+ * the next entry in the list and nat_pnext points to the pointer that
+ * introduces the structure. That may be either the top of the list pointer
+ * or simply the nat_next of the previous link in the list. The second place
+ * that a nat_t structure is generally referenced from is the NAT hash table.
+ * Two references from this table are required, one for supporting the of
+ * matching packets being transmitted and one for supporting the matching of
+ * packets being received. The hash table is comprised of buckets, each one
+ * having its own chain of nat_t structures. To support these chains,
+ * nat_hnext is used to point to the next member of the chain and nat_phnext
+ * points back to the pointer that is pointing to the nat_t in the chain,
+ * be it the bucket at the top or simply the previous nat_t chain entry.
  */
 typedef        struct  nat     {
        ipfmutex_t      nat_lock;
Index: ip_nat6.c
===================================================================
RCS file: /devel/CVS/IP-Filter/ip_nat6.c,v
retrieving revision 2.20.2.21
diff -u -r2.20.2.21 ip_nat6.c
--- ip_nat6.c   25 Aug 2012 14:44:42 -0000      2.20.2.21
+++ ip_nat6.c   6 Jan 2013 12:14:41 -0000
@@ -1285,7 +1285,7 @@
        ipf_nat_softc_t *softn;
        nat_t *nat;
 {
-       u_int hv1, hv2;
+       u_int hv0, hv1;
        u_32_t sp, dp;
        ipnat_t *in;
 
@@ -1304,13 +1304,13 @@
                        sp = 0;
                        dp = 0;
                }
-               hv1 = NAT_HASH_FN6(&nat->nat_osrc6, sp, 0xffffffff);
-               hv1 = NAT_HASH_FN6(&nat->nat_odst6, hv1 + dp,
+               hv0 = NAT_HASH_FN6(&nat->nat_osrc6, sp, 0xffffffff);
+               hv0 = NAT_HASH_FN6(&nat->nat_odst6, hv0 + dp,
                                   softn->ipf_nat_table_sz);
 
                /*
                 * TRACE nat6_osrc6, nat6_osport, nat6_odst6,
-                * nat6_odport, hv1
+                * nat6_odport, hv0
                 */
 
                if ((nat->nat_flags & IPN_TCPUDP) != 0) {
@@ -1323,27 +1323,32 @@
                        sp = 0;
                        dp = 0;
                }
-               hv2 = NAT_HASH_FN6(&nat->nat_nsrc6, sp, 0xffffffff);
-               hv2 = NAT_HASH_FN6(&nat->nat_ndst6, hv2 + dp,
+               hv1 = NAT_HASH_FN6(&nat->nat_nsrc6, sp, 0xffffffff);
+               hv1 = NAT_HASH_FN6(&nat->nat_ndst6, hv1 + dp,
                                   softn->ipf_nat_table_sz);
                /*
                 * TRACE nat6_nsrcaddr, nat6_nsport, nat6_ndstaddr,
-                * nat6_ndport, hv1
+                * nat6_ndport, hv0
                 */
        } else {
-               hv1 = NAT_HASH_FN6(&nat->nat_osrc6, 0, 0xffffffff);
-               hv1 = NAT_HASH_FN6(&nat->nat_odst6, hv1,
+               hv0 = NAT_HASH_FN6(&nat->nat_osrc6, 0, 0xffffffff);
+               hv0 = NAT_HASH_FN6(&nat->nat_odst6, hv0,
                                   softn->ipf_nat_table_sz);
-               /* TRACE nat6_osrcip6, nat6_odstip6, hv1 */
+               /* TRACE nat6_osrcip6, nat6_odstip6, hv0 */
 
-               hv2 = NAT_HASH_FN6(&nat->nat_nsrc6, 0, 0xffffffff);
-               hv2 = NAT_HASH_FN6(&nat->nat_ndst6, hv2,
+               hv1 = NAT_HASH_FN6(&nat->nat_nsrc6, 0, 0xffffffff);
+               hv1 = NAT_HASH_FN6(&nat->nat_ndst6, hv1,
                                   softn->ipf_nat_table_sz);
-               /* TRACE nat6_nsrcip6, nat6_ndstip6, hv2 */
+               /* TRACE nat6_nsrcip6, nat6_ndstip6, hv1 */
        }
 
-       nat->nat_hv[0] = hv1;
-       nat->nat_hv[1] = hv2;
+       if ((nat->nat_dir & NAT_OUTBOUND) == NAT_OUTBOUND) {
+               nat->nat_hv[0] = hv0;
+               nat->nat_hv[1] = hv1;
+       } else {
+               nat->nat_hv[0] = hv1;
+               nat->nat_hv[1] = hv0;
+       }
 
        MUTEX_INIT(&nat->nat_lock, "nat entry lock");
 
@@ -2137,8 +2142,8 @@
        ipf_nat_softc_t *softn;
        nat_t *nat;
 {
+       u_int rhv0, rhv1, hv0, hv1;
        nat_t **natp;
-       u_int hv0, hv1;
 
        if (nat->nat_flags & SI_CLONE)
                return;
@@ -2159,25 +2164,27 @@
        /*
         * Add into the NAT table in the new position
         */
-       hv0 = NAT_HASH_FN6(&nat->nat_osrc6, nat->nat_osport, 0xffffffff);
-       hv0 = NAT_HASH_FN6(&nat->nat_odst6, hv0 + nat->nat_odport,
-                          softn->ipf_nat_table_sz);
-       hv1 = NAT_HASH_FN6(&nat->nat_nsrc6, nat->nat_nsport, 0xffffffff);
-       hv1 = NAT_HASH_FN6(&nat->nat_ndst6, hv1 + nat->nat_ndport,
-                          softn->ipf_nat_table_sz);
-
-       if (nat->nat_dir == NAT_INBOUND || nat->nat_dir == NAT_DIVERTIN) {
-               u_int swap;
-
-               swap = hv0;
-               hv0 = hv1;
-               hv1 = swap;
+       rhv0 = NAT_HASH_FN6(&nat->nat_osrc6, nat->nat_osport, 0xffffffff);
+       rhv0 = NAT_HASH_FN6(&nat->nat_odst6, rhv0 + nat->nat_odport,
+                           softn->ipf_nat_table_sz);
+       rhv1 = NAT_HASH_FN6(&nat->nat_nsrc6, nat->nat_nsport, 0xffffffff);
+       rhv1 = NAT_HASH_FN6(&nat->nat_ndst6, rhv1 + nat->nat_ndport,
+                           softn->ipf_nat_table_sz);
+
+       if ((nat->nat_dir & NAT_OUTBOUND) == NAT_OUTBOUND) {
+               nat->nat_hv[0] = rhv0;
+               nat->nat_hv[1] = rhv1;
+       } else {
+               nat->nat_hv[0] = rhv1;
+               nat->nat_hv[1] = rhv0;
        }
 
+       hv0 = nat->nat_hv[0] % softn->ipf_nat_table_sz;
+        hv1 = nat->nat_hv[1] % softn->ipf_nat_table_sz;
+
        /* TRACE nat_osrc6, nat_osport, nat_odst6, nat_odport, hv0 */
        /* TRACE nat_nsrc6, nat_nsport, nat_ndst6, nat_ndport, hv1 */
 
-       nat->nat_hv[0] = hv0;
        natp = &softn->ipf_nat_table[0][hv0];
        if (*natp)
                (*natp)->nat_phnext[0] = &nat->nat_hnext[0];
@@ -2186,7 +2193,6 @@
        *natp = nat;
        softn->ipf_nat_stats.ns_side[0].ns_bucketlen[hv0]++;
 
-       nat->nat_hv[1] = hv1;
        natp = &softn->ipf_nat_table[1][hv1];
        if (*natp)
                (*natp)->nat_phnext[1] = &nat->nat_hnext[1];
Index: ip_state.c
===================================================================
RCS file: /devel/CVS/IP-Filter/ip_state.c,v
retrieving revision 2.312.2.25
diff -u -r2.312.2.25 ip_state.c
--- ip_state.c  13 Aug 2012 11:39:52 -0000      2.312.2.25
+++ ip_state.c  6 Jan 2013 06:35:03 -0000
@@ -3645,10 +3645,10 @@
                is->is_me = NULL;
                is->is_ref--;
        }
-       if (is->is_ref > 1) {
+       is->is_ref--;
+       if (is->is_ref > 0) {
                int refs;
 
-               is->is_ref--;
                refs = is->is_ref;
                MUTEX_EXIT(&is->is_lock);
                if (!orphan)
@@ -3666,7 +3666,7 @@
                }
        }
 
-       is->is_ref = 0;
+       ASSERT(is->is_ref == 0);
 
        if (is->is_tqehead[0] != NULL) {
                if (ipf_deletetimeoutqueue(is->is_tqehead[0]) == 0)


Home | Main Index | Thread Index | Old Index