tech-kern archive


Re: bug in softint_execute() ?



On 10-Apr-2008 Andrew Doran wrote:
>> It's a bound kthread and should never migrate to another CPU. Are you using
>> SCHED_M2? Having read the code recently I don't know how that could happen.
>> Could you add KASSERTs on entry and exit that check for LW_BOUND set in
>> l_flag? It may also be a good idea to add assertions that check curlwp is
>> set correctly (IIRC, matches si->si_lwp).
> 
> Looking at the trace above, it occurred to me that softint_overlay() (part
> of the slow path code) is hijacking a user LWP and so it's very unlikely
> to be bound to a CPU, let alone bound to the correct one.
> 
> It's not possible to simply OR the LW_BOUND flag into l_flag because that
> would require locking curlwp twice on every soft interrupt, which is too
> expensive. I think we could move the bound flag into l_pflag, the "thread
> private" flag word. LW_BOUND is only modified or inspected by curlwp, or
> when the LWP is known to be in a quiescent state, eg being created or awoken
> from sleep. So there is no danger of modifications being lost / out of sync.
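
For illustration, here is a rough user-space mock of the idea quoted above: keep
the bound bit in a flag word that only the owning thread touches, so the slow
path can set and clear it without taking the LWP lock. This is not kernel code.
The struct, LP_BOUND_SKETCH and its value, overlay_run() and record_bound() are
all made up for the sketch; only the field names l_flag and l_pflag come from
the discussion.

```c
/* User-space mock, NOT NetBSD kernel code. All names below are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LP_BOUND_SKETCH 0x01    /* hypothetical bit in the private flag word */

struct lwp_sketch {
    int l_flag;     /* shared flag word: changing it would need the LWP lock */
    int l_pflag;    /* private flag word: touched only by the owning thread */
};

typedef void (*handler_fn)(struct lwp_sketch *, void *);

/* True if the mock LWP is currently marked as bound. */
static bool lwp_is_bound(const struct lwp_sketch *l)
{
    return (l->l_pflag & LP_BOUND_SKETCH) != 0;
}

/*
 * Borrow the current LWP to run a handler. The bound bit is set in the
 * private word for the duration and the previous state is restored
 * afterwards. No lock is taken and l_flag is never written.
 */
static void overlay_run(struct lwp_sketch *l, handler_fn fn, void *arg)
{
    assert(fn != NULL);
    int saved = l->l_pflag & LP_BOUND_SKETCH;   /* remember prior state */

    l->l_pflag |= LP_BOUND_SKETCH;              /* mark bound, lock-free */
    fn(l, arg);
    l->l_pflag = (l->l_pflag & ~LP_BOUND_SKETCH) | saved;
}

/* Example handler: records whether the LWP looked bound while it ran. */
static void record_bound(struct lwp_sketch *l, void *arg)
{
    *(bool *)arg = lwp_is_bound(l);
}
```

Calling overlay_run(&l, record_bound, &seen) on an initially unbound mock LWP
leaves seen true and the LWP unbound again on return. Whether the real flag can
be moved this way depends on every reader of LW_BOUND following the access rule
described above, which the mock does not try to model.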

I am not using SCHED_M2, and I tried Matt's patch, but it didn't help.  The
panic is very easy to trigger, so I can test any patch you come up with
fairly reliably.

Do you still want me to throw those KASSERTs in?  It sounds like not.
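
For reference, the assertions being discussed would look roughly like the mock
below. It is a user-space sketch, not a patch: KASSERT is mapped to assert(),
the structs are cut-down stand-ins, the value of MOCK_LW_BOUND is invented, and
softint_execute_sketch() is a hypothetical placeholder for the real function.
Only the names l_flag, si_lwp and LW_BOUND are taken from the thread.

```c
/* User-space mock, NOT NetBSD kernel code. Types and values are assumptions. */
#include <assert.h>
#include <stdbool.h>

#define KASSERT(e)      assert(e)   /* stand-in for the kernel macro */
#define MOCK_LW_BOUND   0x01        /* hypothetical value for LW_BOUND */

struct mock_lwp {
    int l_flag;                     /* flag word carrying the bound bit */
};

struct mock_softint {
    struct mock_lwp *si_lwp;        /* LWP that owns this soft interrupt */
};

/* The two conditions suggested in the quoted mail, as one predicate. */
static bool softint_invariants_hold(const struct mock_lwp *cur,
                                    const struct mock_softint *si)
{
    return (cur->l_flag & MOCK_LW_BOUND) != 0 && cur == si->si_lwp;
}

/* Placeholder body: check on entry, run handlers, check again on exit. */
static void softint_execute_sketch(struct mock_lwp *cur,
                                   struct mock_softint *si)
{
    KASSERT(softint_invariants_hold(cur, si));  /* entry check */

    /* ... pending handlers would be dispatched here ... */

    KASSERT(softint_invariants_hold(cur, si));  /* exit check */
}
```

If the softint_overlay() theory is right, the entry check is the one that would
fire, since a borrowed user LWP would be neither bound nor equal to si_lwp.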

Thanks for looking at this.

---
Tim Rightnour <root%garbled.net@localhost>
NetBSD: Free multi-architecture OS http://www.netbsd.org/
Genecys: Open Source 3D MMORPG: http://www.genecys.org/

