tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: RFC: PERCPU_LIST to fix PR kern/52515



On Mon, Oct 2, 2017 at 11:13 PM, Taylor R Campbell
<campbell+netbsd-tech-kern%mumble.net@localhost> wrote:
> Quick summary of the problem:
>
>    percpu_alloc(9) may move existing per-CPU objects to different
>    locations in memory.  This means you can't reliably store pointers
>    to the per-CPU objects anywhere.
>
>    psref(9) currently uses queue.h LIST, with a per-CPU list head --
>    but LIST stores back-pointers into the list head, which breaks when
>    percpu_alloc(9) moves the list head.
>
> Possible solutions.  I'm leaning toward (6), to open-code the linked
> list operations for this special purpose, with compile-tested patch
> attached.  This changes the text of psref.h, but shouldn't change the
> ABI.  Comments?

How about using SLIST instead of open-coding? The instructions of them
are very similar, but the SLIST version is slightly simpler.

This is a patch: http://www.netbsd.org/~ozaki-r/psref-SLIST.diff

One worry of the implementation is that percpu_{getref,putref} need to
be always called in psref_release while the open-coded one doesn't need
it. However the result of performance evaluations shows that the overhead
is not so big and moreover SLIST version overtakes open-coded one slightly.

These are the results of IP forwarding performance evaluations
with 64 byte frames:
  - original:  144.7 Mbps
  - open-code: 138.5 Mbps
  - SLIST:     140.1 Mbps

  ozaki-r

>
>
> 1. Make psref(9) store a pointer to a separately kmem_alloc-allocated
>    list head.  (This is what Nakahara-san considered but rejected.)
>
>         struct psref_cpu {
> -               struct psref_head       pcpu_head;
> +               struct psref_head       *pcpu_head;
>         };
>
>   Problem: We need to initialize these before psref_acquire/release.
>    We could initialize it in psref_class_create, but for various
>    applications of psref, we may not know how many CPUs there are at
>    that time.
>
>    To fix this properly we need to teach percpu(9) how to initialize
>    and finalize per-CPU objects as CPUs come online, which as a bonus
>    would make it work with dynamic CPU attachment and detachment.  And
>    that's a lot of work.
>
> 2. Make percpu(9) always store pointers into separately kmem_alloc-
>    allocated memory, like userland TLS and pthread keys do.
>
>   Problem: It's a bit of work to basically rewrite subr_percpu.c, and
>    may have performance implications.  I started last night, spent an
>    hour rewriting it, and didn't finish before bedtime came.
>
> 3. Invent a new variant of LIST called PERCPU_LIST that doesn't
>    require back-pointers into the head.  (This is what Nakahara-san
>    suggested before I vanished into a EuroBSDcon vortex.)
>
>   Problem: This is almost the same as LIST -- supports essentially all
>    the same operations with the same performance characteristics --
>    and has very limited utility.  So unless we find other applications
>    that need exactly this too, I'm reluctant to expose it as a
>    general-purpose kernel API right now.
>
>    It can't be a drop-in replacement for LIST, because we have
>    LIST_REMOVE(obj, field) but we need MHLIST_REMOVE(head, obj, field)
>    in order to allow the head to move.
>
>    And we can't just add LIST_REMOVE_MAYBEHEAD(head, obj, field) or
>    something to LIST, because there's no way to distinguish with
>    standard LIST whether an object is at the head of the list or not,
>    without knowing where the head *was* in memory when the object (or
>    one of its predecessors) was inserted.
>
> 4. Teach the percpu(9) API to move objects with an operation other
>    than memcpy.  This would look like
>
>         percpu_t *percpu_alloc(size_t);
> +       percpu_t *percpu_alloc_movable(size_t,
> +           void *(*)(void *, const void *, size_t));
>
>    with percpu_alloc(sz) := percpu_alloc_movable(sz, &memcpy).  Then
>    for psref(9), we could use LIST_MOVE to migrate the old list head
>    to the new list head and fix the one back-pointer that would have
>    been invalidated.
>
>   Problem: This is again an operation of limited utility, since most
>    per-CPU objects (like counters) don't need special operations to
>    move in memory.  And it would compete with extending the API for
>    dynamic CPU initialization/finalization operations, so I'm not sure
>    I want to preemptively step on the toes of that API design space at
>    the moment.
>
> 5. Violate the abstraction of LIST in psref(9), as I proposed as a
>    stop-gap measure in
>    <https://mail-index.netbsd.org/tech-kern/2017/09/19/msg022345.html>.
>
>   Problem: It's gross, makes maintenance of the LIST implementation
>    more difficult, makes understanding the psref(9) code more
>    difficult, breaks QUEUEDEBUG, &c.
>
> 6. Just open-code it in subr_psref.c.  If we ever find more
>    applications of this idiom, we can always upgrade it without
>    trouble to an explicit PERCPU_LIST abstraction.
>
>   Problem: We don't get to share any common LIST code.  This seems
>    like a small price for a single-use gizmo like this.


Home | Main Index | Thread Index | Old Index