tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Likely lock botch in NPF firewall



> Date: Wed, 11 Jan 2023 13:05:04 -0500
> From: Brad Spencer <brad%anduin.eldar.org@localhost>
> 
> I think I know what the root problem is for kern/57136 and
> kern/57181... a couple of PRs I wrote about problems I have been having
> with NPF, but I am not at all sure that my solution is correct.

These are the same issue.  Your analysis is correct that this happens
because the spin lock t_lock is held across copyout, which is
forbidden, because copyout might sleep indefinitely and sleeping at
all (let alone indefinitely) under a spin lock is forbidden.

The kernel with LOCKDEBUG just goes out of its way to detect the
problem early, whereas the non-LOCKDEBUG kernel mostly does the
copyout without sleeping so it gets by without triggering the panic.

However, simply changing the lock to be IPL_SOFTNET isn't enough.
Holding an IPL_SOFTNET lock across copyout is also forbidden (but
LOCKDEBUG might not detect it).  It is forbidden for anything in hard
or soft interrupt context to block for copyout (which might be
happening in another thread).

This code needs to coordinate the table iteration and copyout with
other access (insert/remove/lookup) in another way.  Probably the
easiest way will be to create an rwlock, and use t_gc for lpm-type
tables in addition to thmap-type tables:

- npf_table_remove moves lpm-type entries to the t_gc list instead of
  freeing them immediately
- npf_table_gc takes a write lock and frees the t_gc entries
- npf_table_list takes a read lock (and pserialize if needed to grab
  pointers to entries) around copyout -- never takes t_lock

This could still potentially result in deadlock scenarios, if
npf_table_gc is ever reached from the pagedaemon; if so, npf_table_gc
could do rw_tryenter instead of rw_enter to avoid this deadlock, at
the cost of sometimes delaying the freeing of table entries.


Home | Main Index | Thread Index | Old Index