On Jan 2, 2013, at 9:09 PM, Geoff Adams <gadams%avernus.com@localhost> wrote:

> This all makes me suspect a mis-calculation of the hash codes, leading to
> leaking NAT entries.

Once I saw it, it was fairly obvious. ;)

The attached patch fixes the ipf_nat_delete failures. The problem was that ipf_nat_delete wasn't swapping inbound and outbound hash codes for inbound NAT entries, so it was essentially always looking in the wrong buckets in those cases.

But because of the way the linked list works, I don't think any NAT entries were actually leaked. But since all the bucket counters and chain count were getting messed up, things did start to go bad after a while. (New NAT entries wouldn't be created, for instance.)

The fix is in the ipf_nat_delete function, itself; the other changes are a slight refactoring of one method and adding some comments that helped me figure out how the linked list with pointer-back-pointers worked.

Also note that I haven't looked through the logic in ipf_nat_rehash; it's likely that that might miss some things for the same reason.

I also haven't yet looked into the ipf_nat_newrdr problem with mappings already existing. That'll be next.

- Geoff
Attachment:
ipf_nat_delete.patch
Description: Binary data
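
For readers without the patch at hand, here is a minimal sketch of the failure mode described above. It is not the ipfilter code: struct entry, struct table, pick_buckets, table_insert, table_delete and NBUCKETS are all hypothetical names, and the layout is heavily simplified. It assumes only what the message states: each entry sits on two hash chains, the two hash codes are swapped for inbound entries at insert time, and the chains use a pointer-to-previous-pointer back link.

```c
#include <assert.h>
#include <stddef.h>

#define NBUCKETS 8                    /* hypothetical table size */

/* Hypothetical, simplified entry; not the real nat_t layout. */
struct entry {
    struct entry  *next[2];           /* next entry on each of the two chains */
    struct entry **pprev[2];          /* address of the pointer that points at us */
    unsigned       hv[2];             /* the two hash codes */
    int            inbound;           /* nonzero for inbound entries */
};

struct table {
    struct entry *head[2][NBUCKETS];  /* bucket heads for both hash tables */
    unsigned      len[2][NBUCKETS];   /* per-bucket entry counters */
};

/* Choose the bucket in each table; the hash codes swap for inbound entries. */
static void pick_buckets(const struct entry *e, unsigned *b0, unsigned *b1)
{
    *b0 = (e->inbound ? e->hv[1] : e->hv[0]) % NBUCKETS;
    *b1 = (e->inbound ? e->hv[0] : e->hv[1]) % NBUCKETS;
}

static void link_one(struct table *t, int which, unsigned b, struct entry *e)
{
    struct entry **headp = &t->head[which][b];

    e->next[which] = *headp;
    e->pprev[which] = headp;
    if (*headp != NULL)
        (*headp)->pprev[which] = &e->next[which];
    *headp = e;
    t->len[which][b]++;
}

/* Unlinking goes through pprev and never needs the bucket index;
 * only the counter update depends on b being the right bucket. */
static void unlink_one(struct table *t, int which, unsigned b, struct entry *e)
{
    assert(e->pprev[which] != NULL);
    *e->pprev[which] = e->next[which];
    if (e->next[which] != NULL)
        e->next[which]->pprev[which] = e->pprev[which];
    t->len[which][b]--;
}

void table_insert(struct table *t, struct entry *e)
{
    unsigned b0, b1;

    pick_buckets(e, &b0, &b1);
    link_one(t, 0, b0, e);
    link_one(t, 1, b1, e);
}

/* swap_for_inbound == 0 models the pre-patch behaviour described above. */
void table_delete(struct table *t, struct entry *e, int swap_for_inbound)
{
    unsigned b0 = e->hv[0] % NBUCKETS;
    unsigned b1 = e->hv[1] % NBUCKETS;

    if (swap_for_inbound)
        pick_buckets(e, &b0, &b1);
    unlink_one(t, 0, b0, e);
    unlink_one(t, 1, b1, e);
}
```

In this model, deleting an inbound entry without the swap still removes it from both chains, because the unlink goes through the back pointer rather than walking the bucket. The counters are what go wrong: the bucket the entry actually lived in keeps its count, and an unrelated bucket is decremented instead. That is consistent with entries not leaking while the bucket accounting drifts over time.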