NetBSD-Bugs archive

Re: kern/59411 (deadlock on mbuf pool)



On Mon, May 19, 2025 at 01:00:29PM +0000, Taylor R Campbell wrote:
> > Date: Mon, 19 May 2025 10:17:57 +0200
> > From: Manuel Bouyer <bouyer%antioche.eu.org@localhost>
> > 
> > Back to the initial patch in this PR.
> > 
> > I guess running the whole pool_grow() at splvm(), including the busy_wait
> > on PR_GROWINGNOWAIT would work (then we could busy-wait even when called from
> > interrupt context), but I'm not sure it's better than my initial patch.
> 
> Unfortunately, this won't work because mutex_exit will restore spl to
> what it was at mutex_enter (unless it was already raised in
> mutex_enter by holding another spin lock), even if you try something
> like:

Yes, it would need something like:
mutex_exit(&pp->pr_lock);
s = splraiseipl(pp->pr_ipl);
mutex_enter(&pp->pr_lock);

at the entry of pool_grow(), and the opposite at the exit.
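
To illustrate why the ordering matters, here is a small user-space model
(not kernel code). Everything in it is hypothetical: the model_* functions,
struct spin_mutex and the IPL values are stand-ins, and the spl level is
saved per lock rather than per CPU as the real spin mutexes do. It only
assumes what is said above: a spin mutex_enter raises spl and mutex_exit
restores the level seen at enter.

```c
#include <assert.h>

/* Hypothetical model values, not the kernel's definitions. */
enum { IPL_NONE = 0, IPL_VM = 4 };

static int cur_ipl = IPL_NONE;	/* models the current priority level */

struct spin_mutex {
	int ipl;	/* level the lock raises to while held */
	int saved_ipl;	/* level restored when the lock is released */
	int held;
};

/* Raise the level if needed and return the previous one. */
static int
model_splraiseipl(int ipl)
{
	int old = cur_ipl;

	if (ipl > cur_ipl)
		cur_ipl = ipl;
	return old;
}

static void
model_splx(int s)
{
	cur_ipl = s;
}

static void
model_mutex_enter(struct spin_mutex *m)
{
	assert(!m->held);
	m->saved_ipl = model_splraiseipl(m->ipl);
	m->held = 1;
}

static void
model_mutex_exit(struct spin_mutex *m)
{
	assert(m->held);
	m->held = 0;
	model_splx(m->saved_ipl);	/* back to the level seen at enter */
}

/* Raise spl while the lock is already held; return the level seen
 * once the lock has been dropped (e.g. around the backing allocator). */
static int
ipl_unlocked_naive(struct spin_mutex *m)
{
	model_mutex_enter(m);			/* caller holds the lock */
	(void)model_splraiseipl(m->ipl);	/* no effect, already raised */
	model_mutex_exit(m);			/* level falls back here */
	return cur_ipl;
}

/* Exit, raise, re-enter at entry; the opposite at exit. */
static int
ipl_unlocked_reordered(struct spin_mutex *m)
{
	int s, seen;

	model_mutex_enter(m);			/* caller holds the lock */

	model_mutex_exit(m);			/* entry of the grow path */
	s = model_splraiseipl(m->ipl);
	model_mutex_enter(m);			/* saved level is now raised */

	model_mutex_exit(m);			/* lock dropped, spl stays up */
	seen = cur_ipl;
	model_mutex_enter(m);

	model_mutex_exit(m);			/* exit of the grow path */
	model_splx(s);
	model_mutex_enter(m);

	model_mutex_exit(m);			/* caller releases the lock */
	return seen;
}
```

In this model the first function sees the level drop back as soon as the
lock is released, while the second keeps it raised for the whole unlocked
window and still returns to the original level at the end.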

> [...]
> I think your initial patch (assuming you mean 1.293, and then just
> deleting the whole PR_GROWINGNOWAIT machinery)

No, I mean the patch in the first mail in the PR, which skips
pr_drain_hook in pool_allocator_alloc() in the !PR_WAITOK case.

> is likely to be asking
> for trouble by continuing to hold the lock across the backing
> allocator -- and, perhaps worse, across the backing allocator's free
> routine, which sometimes issues a cross-call that requires all other
> CPUs to be responsive.

In the current code, PR_GROWINGNOWAIT has not done anything since rev 1.220,
2017/12/29. NetBSD 8.0_RELEASE did include this change.

rev 1.293 changed a branch that is, AFAIK, never taken, to a KASSERT().

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference