tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Introducing localcount(9)



On Thu, 11 May 2017, Taylor R Campbell wrote:

   (1) Why splsoftserial() is required instead of kpreempt_disable()?
       localcount_drain() uses xc_broadcast(0, ...), that is, it uses
       low priority xcall. Low priority xcall would be done by kthread
       context, so I think kpreempt_disable() would be sufficient to
       prevent localcount_drain() xcall running.

I think you are correct.  Taylor, do you agree?

Yes, I think this is fine.  I probably chose splsoftserial because I
was thinking of pserialize(9).

Committed.

   (2) Why both "localcount_adjust(lc, -1)" and "*lc->lc_totalp -= 1" are
       necessary in "lc->lc_totalp != NULL" case?
       I am afraid of below situation
           [1] CPU#A: mutex_enter(interlock)
           [2] CPU#A: call localcount_drain()
           [3] CPU#A: mutex_exit(interlock) before xc_broadcast
           [4] CPU#B: call localcount_release()
           [5] CPU#B: mutex_enter(interlock)
           [6] CPU#B: localcount_adjust(lc, -1)
           [7] CPU#B: *lc->lc_totalp -= 1
           [8] CPU#B: done localcount_release()
           [9] CPU#A: call xc_broadcast()
           [10]CPU#B: begin localcount_xc() and done
           [11]CPU#A: resume localcount_drain()
           [12]CPU#A: mutex_enter(interlock)
       Can "lc->lc_totalp" be -1 at this point?

localcount_adjust() updates the local CPU's reference counter, while
*lc->lc_totalp is the global sum that we are accumulating.  We want to
keep them synchronized.

What knakahara is worried about -- rightly, I think -- is the
difference between the following two scenarios:

1. CPU A: localcount_drain      (total=0, a=0, b=1)
  CPU B: localcount_xc         (total=1, a=0, b=1)
  CPU B: localcount_release    (total=0, a=0, b=0)

2. CPU A: localcount_drain      (total=0, a=0, b=1)
  CPU B: localcount_release    (total=-1, a=0, b=0)
  CPU B: localcount_xc         (total=-1, a=0, b=0)

If localcount_xc has already run on the local CPU, then we do want to
keep them in sync in localcount_release, which is why I wrote the
logic I did.  But if it has not yet run, we don't want to double-count
the release in the global and the local counts, which is what
knakahara observed.

Ah, yes, got it!

The problem is that in localcount_release, we don't know whether the
total includes the local count yet, and thus whether when subtracting
one from the local count we also need to subtract one from the total
count.

One possible way to remedy this -- haven't thought this through yet,
obviously needs more attention -- is to change localcount_xc to
*transfer* rather than add the local references,

       mutex_enter(interlock);
       localp = percpu_getref(lc->lc_percpu);
       *lc->lc_totalp += *localp;
       *localp -= *localp;   /* i.e., *localp = 0 */
       percpu_putref(lc->lc_percpu);
       mutex_exit(interlock);

and to change localcount_release to decrement only the total count
when it is available:

       mutex_enter(interlock);
       if (--*lc->lc_totalp == 0)
               cv_broadcast(cv);
       mutex_exit(interlock);

Then the scenarios become:

1'. CPU A: localcount_drain     (total=0, a=0, b=1)
   CPU B: localcount_xc        (total=1, a=0, b=0)
   CPU B: localcount_release   (total=0, a=0, b=0)

2'. CPU A: localcount_drain     (total=0, a=0, b=1)
   CPU B: localcount_release   (total=-1, a=0, b=1)
   CPU B: localcount_xc        (total=0, a=0, b=0)

Makes sense. I've made appropriate changes, and will rebuild and re-test.


+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+


Home | Main Index | Thread Index | Old Index