tech-kern archive


Re: workqueue_destroy() can cause hanging up in the some cases



Hi,

On 2019/09/06 12:45, Taylor R Campbell wrote:
Date: Fri, 6 Sep 2019 12:17:32 +0900
From: Kengo NAKAHARA <k-nakahara%iij.ad.jp@localhost>

I found that workqueue_destroy() on a WQ_PERCPU workqueue can hang
while preemption is disabled. workqueue_destroy() requires each
q_worker kthread to call kthread_exit(). In the implementation, the
caller does cv_wait() (*1) until the q_worker thread sets
q->q_worker to NULL (*2).

- (*1) https://nxr.netbsd.org/xref/src/sys/kern/subr_workqueue.c#227
- (*2) https://nxr.netbsd.org/xref/src/sys/kern/subr_workqueue.c#208

However, while preemption is disabled, the q_worker thread cannot run
on the CPU on which the caller of workqueue_destroy() is running.
That causes the hang.
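
The wait-for-exit handshake described above can be modeled in userland
with POSIX threads. This is only a sketch under stated assumptions:
model_queue, model_queue_create(), model_queue_destroy(), and the
q_exit flag are hypothetical names made up for illustration, not the
code in subr_workqueue.c. It is meant to show why the caller cannot
return until the worker thread has actually run and cleared q_worker.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland stand-in for one queue; not the kernel struct. */
struct model_queue {
	pthread_mutex_t q_mutex;
	pthread_cond_t  q_cv;
	pthread_t       q_thread;
	void           *q_worker;  /* non-NULL while the worker is alive */
	bool            q_exit;    /* assumption: exit request modeled as a flag */
};

/* Worker: sleep until asked to exit, then clear q_worker and wake the waiter. */
static void *
model_worker(void *arg)
{
	struct model_queue *q = arg;

	pthread_mutex_lock(&q->q_mutex);
	while (!q->q_exit)
		pthread_cond_wait(&q->q_cv, &q->q_mutex);
	q->q_worker = NULL;
	pthread_cond_broadcast(&q->q_cv);
	pthread_mutex_unlock(&q->q_mutex);
	return NULL;
}

/* Returns 0 on success, or the pthread_create() error number. */
int
model_queue_create(struct model_queue *q)
{
	int error;

	pthread_mutex_init(&q->q_mutex, NULL);
	pthread_cond_init(&q->q_cv, NULL);
	q->q_exit = false;
	q->q_worker = q;  /* any non-NULL marker; set before the worker starts */
	error = pthread_create(&q->q_thread, NULL, model_worker, q);
	if (error != 0)
		q->q_worker = NULL;
	return error;
}

/* Destroyer: request exit, then wait until the worker has cleared q_worker. */
void
model_queue_destroy(struct model_queue *q)
{
	pthread_mutex_lock(&q->q_mutex);
	assert(!q->q_exit);  /* destroy must be requested only once */
	q->q_exit = true;
	pthread_cond_broadcast(&q->q_cv);
	/* If the worker never gets to run, this loop never terminates. */
	while (q->q_worker != NULL)
		pthread_cond_wait(&q->q_cv, &q->q_mutex);
	pthread_mutex_unlock(&q->q_mutex);

	pthread_join(q->q_thread, NULL);
	pthread_cond_destroy(&q->q_cv);
	pthread_mutex_destroy(&q->q_mutex);
}
```

A caller would pair model_queue_create() with model_queue_destroy() on
the same struct model_queue. The point of the model is only the
dependency: the destroyer makes progress only after the worker thread
has been scheduled at least once more.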

I think it may be enough to just add a note to the workqueue_destroy()
man page, but it should be fixed if possible.

Do you have any comments or fix ideas?

Interesting.

1. Why are you trying to workqueue_destroy with preemption disabled?

Normally creation and destruction routines should be done in thread context without anything blocked or any resources held, because they
may generally sleep waiting for resources.

I found it while debugging the attach processing of ixg(4) and vmx(4);
in particular, I tried calling workqueue_destroy() in the attach function
to check the error case.
# The drivers use workqueue(9) instead of softint(9) to reduce the
# packet processing load.
Device attach processing is done before kpreempt_enable().

Yes, in this situation we can avoid the hang by using
config_finalize_register(). But I worry that there may be other
problems.

2. How do you conclude that disabling preemption makes the
difference? Do you have a minimal working example that hangs if you
disable preemption, but works if you leave it enabled?

I would expect cv_wait to sleep and let the other thread run even if
preemption is disabled.  (We should maybe also have a mechanism for
disabling preemption _and sleep_ if we don't already -- meaning sleeping would cause a panic, in a context where you want to make sure no thread switches can happen.)

Ah... I'm sorry, I was wrong and you are correct.
As I wrote above, I found it in autoconfig processing. Autoconfig
processing is different from a situation where preemption is merely
disabled. Once boot processing has completed, workqueue_destroy() after
kpreempt_disable() works fine.

Hmm, the hang seems to occur only during boot processing. So I
should just call workqueue_destroy() and workqueue_create() in a
function registered with config_finalize_register(). Sorry for the
incorrect report.


Thanks,

--
//////////////////////////////////////////////////////////////////////
Internet Initiative Japan Inc.

Device Engineering Section,
Product Development Department,
Product Division,
Technology Unit

Kengo NAKAHARA <k-nakahara%iij.ad.jp@localhost>

