tech-kern archive


kern/41923: assertion "cur != owner" failed



Hi,
I may have found the cause of this problem:
assertion "cur != owner" failed: file "src/sys/kern/kern_turnstile.c", line 289

In turnstile_block(): we couldn't get a rwlock, so we're about to
sleep. When entering the for (;;) loop we hold the tschain_t mutex, which
has also been "cur" lwp's lock since sleepq_enter(). At this point we also
have l == cur.
We get an owner from (*l->l_syncobj->sobj_owner)(l->l_wchan) whose lock
is not the same as l->l_mutex, so we enter the dolock case. Later, we'll do
a lwp_unlock(l) and, as we are "cur", we release the tschain_t mutex.
The owner in rw_vector_exit() can now grab the turnstile lock and
change the owner, eventually to the one in turnstile_block().
The thread in turnstile_block() continues the for (;;) loop
and hits the KASSERT(cur != owner). Boom.
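
The interleaving above can be sketched as a small single-threaded model.
This is only an illustration, not the kernel code: struct model_lwp,
struct model_rwlock, model_release() and model_turnstile_block() are
hypothetical names, the chain mutex is reduced to a boolean, and the
drop_cur_lock flag is an assumed knob for comparing against a variant
that keeps cur's lock held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real NetBSD structures. */
struct model_lwp {
	const char *name;
};

struct model_rwlock {
	struct model_lwp *owner;	/* what sobj_owner() would report */
	struct model_lwp *waiter;	/* LWP blocked in turnstile_block() */
	bool chain_locked;		/* models the tschain_t mutex */
};

/*
 * Models the hand-off done from rw_vector_exit(): it needs the chain
 * mutex, and if it can take it, ownership passes to the waiter.
 */
static bool
model_release(struct model_rwlock *rw)
{
	if (rw->chain_locked)
		return false;	/* excluded while cur holds the mutex */
	rw->owner = rw->waiter;
	return true;
}

/*
 * Models two passes of the for (;;) loop as seen by "cur".
 * Returns true if a pass sees owner == cur, i.e. the point where
 * KASSERT(cur != owner) would fire.
 */
static bool
model_turnstile_block(struct model_rwlock *rw, struct model_lwp *cur,
    bool drop_cur_lock)
{
	struct model_lwp *owner;

	assert(rw != NULL && cur != NULL);

	rw->waiter = cur;
	rw->chain_locked = true;	/* held on loop entry; also cur's lock */

	/* Pass 1: l == cur, the owner is some other LWP. */
	owner = rw->owner;
	if (owner == NULL || owner == cur)
		return owner == cur;

	/* dolock case: lwp_unlock(l) with l == cur drops the chain mutex. */
	if (drop_cur_lock)
		rw->chain_locked = false;

	/* The previous owner tries to release the rwlock in this window. */
	(void)model_release(rw);

	/* Pass 2: the owner is re-read at the top of the loop. */
	owner = rw->owner;
	return owner == cur;
}
```

With drop_cur_lock set to true the model reports the assertion case;
with it set to false, model_release() is refused and the owner is
unchanged on the second pass.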

If you're lucky, the thread in turnstile_block() won't notice the owner change
and will proceed to sleepq_block(). I wonder if this could explain the
"processes stuck on tstile" problems people have been noticing,
by corruption of the l_pi_lenders queue, or something like that.
In turnstile_wakeup(), we grab a lock only if the lwp's lock is
ci_schedstate.spc_lwplock - what protects the queue in other cases?
It seems that in turnstile_block() it's always protected by the
lwp's lock.

To me it looks like we should never release cur->l_mutex in turnstile_block.
Any comments on this?

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference
--

