NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
kern/57574: workqueue_wait use-after-free
>Number: 57574
>Category: kern
>Synopsis: workqueue_wait use-after-free
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Aug 09 07:05:00 +0000 2023
>Originator: Taylor R Campbell
>Release: current, 10, 9, 8
>Organization:
The NetBSD Workqueue
>Environment:
>Description:
1. workqueue_enqueue puts a struct work on q_queue_pending.
2. workqueue_worker transfers a batch of work from q_queue_pending to q_queue_running, and then calls workqueue_runlist without the lock.
3. workqueue_runlist calls the workqueue function on each struct work in the queue -- and, although it doesn't _modify_ anything in q_queue_running, it must not touch the struct work after calling the function, because...
4. The workqueue function is allowed to free the structure containing the struct work.
5. workqueue_wait searches through both q_queue_pending and q_queue_running for the matching struct work.
However, the first struct work in q_queue_running may have been freed by the time workqueue_wait touches it.
>How-To-Repeat:
code inspection
>Fix:
Some options:
(a) Add lock/unlock cycles in workqueue_runlist to keep q_queue_running updated under the lock, and do something to keep a pointer to the current struct work -- not safe to dereference to find the next entry in the list, but safe to compare in workqueue_wait. However, presumably avoiding these lock/unlock cycles on each struct work was an intentional part of the workqueue(9) implementation.
(b) Add a generation number to each queue, incremented each time workqueue_runlist finishes a batch of work, and a flag indicating whether workqueue_runlist is in progress. workqueue_wait must 1. wait until the struct work is not in q_queue_pending, and then 2. if workqueue_runlist is in progress, wait until the generation number changes. If workqueue_wait found the struct work in step (1), we can skip it on all other CPUs. This might raise the cost of workqueue_wait (waiting for more work to finish than it strictly needs to), but workqueue_wait is a slow path on teardown; this avoids raising the cost in the fast path of running work.
Home |
Main Index |
Thread Index |
Old Index