The work that's happening here looks like a scalability nightmare, regardless of whether the kernel lock is held. A couple of things: 1- It should do a better job of determining whether there's any work that it actually has to do. As it is, it's going to take the kernel_lock and traverse all of the buckets in a hash table even if there aren't any entries in it. Even in the NET_MPSAFE scenario, that's pointless work. 2- The kernel_lock, by its nature, is the lock-of-last-resort, and code needs to do everything it can to defer taking that lock until it is deemed absolutely necessary. Even in the not-NET_MPSAFE scenario, there should be a way to determine that IP fragment expiration processing actually needs to occur before taking the kernel_lock. -- thorpej
|
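As a rough illustration of the two points in the comment above, here is a minimal userland sketch of an "is there any work?" check that runs before the big lock is taken. It is not the actual kernel code: `frag_table`, `frag_count`, `frag_insert`, `frag_slowtimo`, `big_lock`, and `timo_lock_acquisitions` are all hypothetical names, and a pthread mutex stands in for the kernel_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define FRAG_NBUCKETS 64	/* hypothetical hash table size */

struct frag_entry {
	struct frag_entry *next;
	int ttl;		/* timer ticks left before expiry */
};

/* All names below are illustrative stand-ins, not real kernel symbols. */
static struct frag_entry *frag_table[FRAG_NBUCKETS];
static atomic_uint frag_count;	/* number of entries currently queued */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned timo_lock_acquisitions;	/* instrumentation for the demo */

/* Queue one entry; the counter is updated while the lock is held. */
static void
frag_insert(unsigned hash, int ttl)
{
	struct frag_entry *fe;

	assert(ttl > 0);
	fe = malloc(sizeof(*fe));
	if (fe == NULL)
		return;
	fe->ttl = ttl;

	pthread_mutex_lock(&big_lock);
	fe->next = frag_table[hash % FRAG_NBUCKETS];
	frag_table[hash % FRAG_NBUCKETS] = fe;
	atomic_fetch_add(&frag_count, 1);
	pthread_mutex_unlock(&big_lock);
}

/* Periodic expiry pass; returns the number of entries it freed. */
static unsigned
frag_slowtimo(void)
{
	unsigned expired = 0;

	/* Cheap unlocked check: an empty table needs no lock and no walk. */
	if (atomic_load(&frag_count) == 0)
		return 0;

	pthread_mutex_lock(&big_lock);
	timo_lock_acquisitions++;
	for (size_t i = 0; i < FRAG_NBUCKETS; i++) {
		struct frag_entry **pp = &frag_table[i];

		while (*pp != NULL) {
			struct frag_entry *fe = *pp;

			if (--fe->ttl <= 0) {
				*pp = fe->next;	/* unlink, then free */
				free(fe);
				atomic_fetch_sub(&frag_count, 1);
				expired++;
			} else {
				pp = &fe->next;
			}
		}
	}
	pthread_mutex_unlock(&big_lock);
	return expired;
}
```

The unlocked read of `frag_count` can race with a concurrent insert, but the worst case in this sketch is that a freshly queued entry is first examined one tick later, which is harmless for coarse-grained expiry. Whether the same trade-off is acceptable in the real code is a judgment for the code's owners.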