tech-kern archive


Re: a parallel operation problem about softint(9)



   Date: Fri, 18 Dec 2015 11:09:17 +0900
   From: Kengo NAKAHARA <k-nakahara%iij.ad.jp@localhost>

   I think softint_disestablish should wait not only SOFTINT_ACTIVE
   but also SOFTINT_PENDING flag, that is, the following patch is
   required

I think that this may not be the correct fix, and that you may be
masking a legitimate bug elsewhere.

In particular, I think the scenario you quoted should not happen:

    (1) CPU#X do softint_schedule() for "handler A"
        - the softhand_t is set SOFTINT_PENDING flag
        - the softhand_t is NOT set SOFTINT_ACTIVE flag yet
    (2) CPU#X begin other H/W interrupt processing
    (3) CPU#Y do softint_disestablish() for "handler A"
        - wait until softhand_t's SOFTINT_ACTIVE of all CPUs is clear
        - the softhand_t is set not SOFTINT_ACTIVE but SOFTINT_PENDING,
          so CPU#Y does not wait

The reason it should not happen is that CPU#Y first does a null
broadcast xcall to wait for *something* to happen in thread context on
all CPUs before it even tests SOFTINT_ACTIVE.  That means all hard
interrupt processing on CPU#X should be done, and CPU#X should have
transitioned from SOFTINT_PENDING to SOFTINT_ACTIVE by the time
xc_wait returns in softint_disestablish on CPU#Y.

(The only reason some CPUs might remain SOFTINT_ACTIVE in the loop in
softint_disestablish, instead of completely finishing, is that the
softint may sleep, e.g. on an adaptive lock.)
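
To make the ordering argument concrete, here is a small userspace model
of it. This is only a sketch and not the kernel code: every model_*
name is made up for illustration, the "barrier" is simulated by
visiting each CPU in turn, and the handler is assumed to run to
completion without sleeping. It shows that a handler left in
SOFTINT_PENDING before the barrier has already been dispatched by the
time the flags are tested.

```c
#include <assert.h>
#include <stdbool.h>

#define NCPU            2
#define SOFTINT_PENDING 0x1
#define SOFTINT_ACTIVE  0x2

/* Per-CPU handler state, standing in for the per-CPU softhand_t flags. */
struct model_cpu {
	unsigned flags;
	int runs;		/* times the handler ran on this CPU */
};

static struct model_cpu cpus[NCPU];

/* Models softint_schedule(): mark the handler pending on one CPU. */
static void
model_schedule(int cpu)
{
	assert(cpu >= 0 && cpu < NCPU);
	cpus[cpu].flags |= SOFTINT_PENDING;
}

/*
 * Models a CPU getting back to thread context.  Assumption of this
 * model: a pending softint is dispatched before that point, so the
 * handler goes PENDING -> ACTIVE, runs, and clears ACTIVE.
 */
static void
model_reach_thread_context(int cpu)
{
	struct model_cpu *c = &cpus[cpu];

	if (c->flags & SOFTINT_PENDING) {
		c->flags &= ~SOFTINT_PENDING;
		c->flags |= SOFTINT_ACTIVE;
		c->runs++;			/* the handler body */
		c->flags &= ~SOFTINT_ACTIVE;	/* no sleeping in this model */
	}
}

/* Models the null broadcast xcall followed by xc_wait. */
static void
model_xcall_barrier(void)
{
	int i;

	for (i = 0; i < NCPU; i++)
		model_reach_thread_context(i);
}

/*
 * Models the order of operations in softint_disestablish: barrier
 * first, flag test second.  Returns true if no CPU still has the
 * handler pending or active after the barrier.
 */
static bool
model_disestablish(void)
{
	int i;

	model_xcall_barrier();
	for (i = 0; i < NCPU; i++) {
		if (cpus[i].flags & (SOFTINT_PENDING | SOFTINT_ACTIVE))
			return false;
	}
	return true;
}
```

In this model, calling model_schedule(0) and then model_disestablish()
leaves CPU 0 with no flags set and the handler run exactly once.  The
model only holds if nobody calls model_schedule after the barrier
starts.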

The caller must guarantee that, by the time it calls
softint_disestablish, there will be no more calls to softint_schedule
for that handler on any CPU.  In other words, the caller promises that
no CPU will newly transition to SOFTINT_PENDING once
softint_disestablish has been called.

What I suspect is happening is that there is a code path in gif(4) or
ip_encap that continues to call softint_schedule, which violates the
caller's contract for softint_disestablish, and you need to find a way
to prevent that code path from running.  I called this operation
gif_encap_pause in another message.
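
As a rough illustration of the caller-side contract, here is a
userspace sketch of a "pause" gate that stops new scheduling before
teardown.  It is not the gif(4) code and not a proposed
gif_encap_pause implementation: struct model_softc and all model_*
functions are hypothetical, the counter stands in for
softint_schedule, and clearing the established flag stands in for
softint_disestablish.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-interface state; not the real gif(4) softc. */
struct model_softc {
	atomic_bool paused;	/* gate: no new scheduling once set */
	bool established;	/* stands in for a live softint handle */
	int scheduled;		/* calls that reached the schedule step */
};

static void
model_attach(struct model_softc *sc)
{
	atomic_init(&sc->paused, false);
	sc->established = true;
	sc->scheduled = 0;
}

/*
 * Packet path: schedule the softint only while the gate is open.
 * Returns true if the schedule step was reached.
 */
static bool
model_try_schedule(struct model_softc *sc)
{
	if (atomic_load(&sc->paused))
		return false;
	assert(sc->established);	/* must never schedule a dead handler */
	sc->scheduled++;		/* stands in for softint_schedule() */
	return true;
}

/* Close the gate so that no new scheduling can start. */
static void
model_pause(struct model_softc *sc)
{
	atomic_store(&sc->paused, true);
}

/*
 * Teardown: pause first, disestablish second.  A real kernel version
 * would additionally have to wait for callers that read the gate as
 * open just before it was closed; this single-threaded sketch omits
 * that step.
 */
static void
model_detach(struct model_softc *sc)
{
	model_pause(sc);
	sc->established = false;	/* stands in for softint_disestablish() */
}
```

The point of the sketch is only the ordering in model_detach: the gate
is closed before the handler goes away, so model_try_schedule refuses
to schedule afterwards.  The flag by itself is not enough in a real
kernel, because a caller may already be past the check when the gate
closes.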

