tech-kern archive


Re: a parallel operation problem about softint(9)



Hi,

On 2016/01/13 4:19, Taylor R Campbell wrote:
>    Date: Fri, 18 Dec 2015 11:09:17 +0900
>    From: Kengo NAKAHARA <k-nakahara%iij.ad.jp@localhost>
> 
>    I think softint_disestablish should wait not only SOFTINT_ACTIVE
>    but also SOFTINT_PENDING flag, that is, the following patch is
>    required
> 
> I think that this may not be the correct fix, and that you may be
> masking a legitimate bug elsewhere.
> 
> In particular, I think the scenario you quoted should not happen:
> 
>     (1) CPU#X do softint_schedule() for "handler A"
>         - the softhand_t is set SOFTINT_PENDING flag
>         - the softhand_t is NOT set SOFTINT_ACTIVE flag yet
>     (2) CPU#X begin other H/W interrupt processing
>     (3) CPU#Y do softint_disestablish() for "handler A"
>         - wait until softhand_t's SOFTINT_ACTIVE of all CPUs is clear
>         - the softhand_t is set not SOFTINT_ACTIVE but SOFTINT_PENDING,
>           so CPU#Y does not wait
> 
> The reason it should not happen is that CPU#Y first does a null
> broadcast xcall to wait for *something* to happen in thread context on
> all CPUs before it even tests SOFTINT_ACTIVE.  That means all hard
> interrupt processing on CPU#X should be done, and CPU#X should have
> transitioned from SOFTINT_PENDING to SOFTINT_ACTIVE by the time
> xc_wait returns in softint_disestablish on CPU#Y.
> 
> (The only reason some CPUs might remain SOFTINT_ACTIVE in the loop in
> softint_disestablish, instead of completely finishing, is that the
> softint may sleep, e.g. on an adaptive lock.)
> 
> The caller must promise that before calling softint_disestablish,
> there will be no more calls to softint_schedule on any CPUs.  So the
> caller promises that no CPUs should newly transition to
> SOFTINT_PENDING by the time they call softint_disestablish.
> 
> What I suspect is happening is that there is a code path in gif(4) or
> ip_encap that continues to softint_schedule, which violates the
> caller's contract for softint_disestablish, and you need to find a way
> to prevent that code path.  I called this operation gif_encap_pause in
> another message.

I see.

I will fix the gif(4) code to honor that contract before MP-ifying gif(4).
After that, I will revert kern_softint:r1.42 and add
"KASSERT((flags & SOFTINT_PENDING) == 0)" after xc_wait in
softint_disestablish().
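
As a rough illustration of that plan, here is a minimal single-threaded
userspace sketch. It is not kernel code: every model_* name, the
"stopping" flag, and the flag values are hypothetical and exist only for
this example. model_dispatch() merely stands in for the point where
xc_wait has returned and pending work has been picked up, and assert()
stands in for KASSERT.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real flags are defined in the kernel. */
#define MODEL_SOFTINT_PENDING	0x1u
#define MODEL_SOFTINT_ACTIVE	0x2u

struct model_softhand {
	unsigned flags;
};

struct model_dev {
	struct model_softhand sh;
	bool stopping;		/* hypothetical: set once detach has begun */
};

/* Driver-side wrapper: refuse to schedule once detach has begun. */
static bool
model_schedule(struct model_dev *d)
{
	if (d->stopping)
		return false;
	d->sh.flags |= MODEL_SOFTINT_PENDING;
	return true;
}

/* Models dispatch: PENDING -> ACTIVE, run the handler, clear ACTIVE. */
static void
model_dispatch(struct model_dev *d)
{
	if ((d->sh.flags & MODEL_SOFTINT_PENDING) == 0)
		return;
	d->sh.flags &= ~MODEL_SOFTINT_PENDING;
	d->sh.flags |= MODEL_SOFTINT_ACTIVE;
	/* the handler body would run here */
	d->sh.flags &= ~MODEL_SOFTINT_ACTIVE;
}

/* Models the detach path: keep the contract, then check it. */
static void
model_detach(struct model_dev *d)
{
	d->stopping = true;	/* 1. the caller promises: no new schedules */
	model_dispatch(d);	/* 2. stand-in for the state after xc_wait */
	/* 3. the proposed check: nothing may still be pending here */
	assert((d->sh.flags & MODEL_SOFTINT_PENDING) == 0);
}
```

If some code path bypassed the "stopping" check and set the pending flag
after step 1, the assertion in step 3 is where the sketch would catch it.
A real driver would also need proper synchronization around such a flag,
which this single-threaded model leaves out.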


Thanks,

-- 
//////////////////////////////////////////////////////////////////////
Internet Initiative Japan Inc.

Device Engineering Section,
Core Product Development Department,
Product Division,
Technology Unit

Kengo NAKAHARA <k-nakahara%iij.ad.jp@localhost>
