NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

kern/57574: workqueue_wait use-after-free



>Number:         57574
>Category:       kern
>Synopsis:       workqueue_wait use-after-free
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Aug 09 07:05:00 +0000 2023
>Originator:     Taylor R Campbell
>Release:        current, 10, 9, 8
>Organization:
The NetBSD Workqueue
>Environment:
>Description:
1. workqueue_enqueue puts a struct work on q_queue_pending.
2. workqueue_worker transfers a batch of work from q_queue_pending to q_queue_running, and then calls workqueue_runlist without the lock.
3. workqueue_runlist calls the workqueue function on each struct work in the queue -- and, although it doesn't _modify_ anything in q_queue_running, it must not touch the struct work after calling the function, because...
4. The workqueue function is allowed to free the structure containing the struct work.
5. workqueue_wait searches through both q_queue_pending and q_queue_running for the matching struct work.

However, the first struct work in q_queue_running may have been freed by the time workqueue_wait touches it.
>How-To-Repeat:
code inspection
>Fix:
Some options:

(a) Add lock/unlock cycles in workqueue_runlist to keep q_queue_running updated under the lock, and do something to keep a pointer to the current struct work -- not safe to dereference to find the next entry in the list, but safe to compare in workqueue_wait.  However, presumably avoiding these lock/unlock cycles on each struct work was an intentional part of the workqueue(9) implementation.

(b) Add a generation number to each queue, incremented each time workqueue_runlist finishes a batch of work, and a flag indicating whether workqueue_runlist is in progress.  workqueue_wait must 1. wait until the struct work is not in q_queue_pending, and then 2. if workqueue_runlist is in progress, wait until the generation number changes.  If workqueue_wait found the struct work in step (1), we can skip it on all other CPUs.  This might raise the cost of workqueue_wait (waiting for more work to finish than it strictly needs to), but workqueue_wait is a slow path on teardown; this avoids raising the cost in the fast path of running work.



Home | Main Index | Thread Index | Old Index