tech-kern archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: RFC: PERCPU_LIST to fix PR kern/52515
On Mon, Oct 2, 2017 at 11:13 PM, Taylor R Campbell
<campbell+netbsd-tech-kern%mumble.net@localhost> wrote:
> Quick summary of the problem:
>
> percpu_alloc(9) may move existing per-CPU objects to different
> locations in memory. This means you can't reliably store pointers
> to the per-CPU objects anywhere.
>
> psref(9) currently uses queue.h LIST, with a per-CPU list head --
> but LIST stores back-pointers into the list head, which breaks when
> percpu_alloc(9) moves the list head.
>
> Possible solutions. I'm leaning toward (6), to open-code the linked
> list operations for this special purpose, with compile-tested patch
> attached. This changes the text of psref.h, but shouldn't change the
> ABI. Comments?
How about using SLIST instead of open-coding? The instructions of them
are very similar, but the SLIST version is slightly simpler.
This is a patch: http://www.netbsd.org/~ozaki-r/psref-SLIST.diff
One worry of the implementation is that percpu_{getref,putref} need to
be always called in psref_release while the open-coded one doesn't need
it. However the result of performance evaluations shows that the overhead
is not so big and moreover SLIST version overtakes open-coded one slightly.
These are the results of IP forwarding performance evaluations
with 64 byte frames:
- original: 144.7 Mbps
- open-code: 138.5 Mbps
- SLIST: 140.1 Mbps
ozaki-r
>
>
> 1. Make psref(9) store a pointer to a separately kmem_alloc-allocated
> list head. (This is what Nakahara-san considered but rejected.)
>
> struct psref_cpu {
> - struct psref_head pcpu_head;
> + struct psref_head *pcpu_head;
> };
>
> Problem: We need to initialize these before psref_acquire/release.
> We could initialize it in psref_class_create, but for various
> applications of psref, we may not know how many CPUs there are at
> that time.
>
> To fix this properly we need to teach percpu(9) how to initialize
> and finalize per-CPU objects as CPUs come online, which as a bonus
> would make it work with dynamic CPU attachment and detachment. And
> that's a lot of work.
>
> 2. Make percpu(9) always store pointers into separately kmem_alloc-
> allocated memory, like userland TLS and pthread keys do.
>
> Problem: It's a bit of work to basically rewrite subr_percpu.c, and
> may have performance implications. I started last night, spent an
> hour rewriting it, and didn't finish before bedtime came.
>
> 3. Invent a new variant of LIST called PERCPU_LIST that doesn't
> require back-pointers into the head. (This is what Nakahara-san
> suggested before I vanished into a EuroBSDcon vortex.)
>
> Problem: This is almost the same as LIST -- supports essentially all
> the same operations with the same performance characteristics --
> and has very limited utility. So unless we find other applications
> that need exactly this too, I'm reluctant to expose it as a
> general-purpose kernel API right now.
>
> It can't be a drop-in replacement for LIST, because we have
> LIST_REMOVE(obj, field) but we need MHLIST_REMOVE(head, obj, field)
> in order to allow the head to move.
>
> And we can't just add LIST_REMOVE_MAYBEHEAD(head, obj, field) or
> something to LIST, because there's no way to distinguish with
> standard LIST whether an object is at the head of the list or not,
> without knowing where the head *was* in memory when the object (or
> one of its predecessors) was inserted.
>
> 4. Teach the percpu(9) API to move objects with an operation other
> than memcpy. This would look like
>
> percpu_t *percpu_alloc(size_t);
> + percpu_t *percpu_alloc_movable(size_t,
> + void *(*)(void *, const void *, size_t));
>
> with percpu_alloc(sz) := percpu_alloc_movable(sz, &memcpy). Then
> for psref(9), we could use LIST_MOVE to migrate the old list head
> to the new list head and fix the one back-pointer that would have
> been invalidated.
>
> Problem: This is again an operation of limited utility, since most
> per-CPU objects (like counters) don't need special operations to
> move in memory. And it would compete with extending the API for
> dynamic CPU initialization/finalization operations, so I'm not sure
> I want to preemptively step on the toes of that API design space at
> the moment.
>
> 5. Violate the abstraction of LIST in psref(9), as I proposed as a
> stop-gap measure in
> <https://mail-index.netbsd.org/tech-kern/2017/09/19/msg022345.html>.
>
> Problem: It's gross, makes maintenance of the LIST implementation
> more difficult, makes understanding the psref(9) code more
> difficult, breaks QUEUEDEBUG, &c.
>
> 6. Just open-code it in subr_psref.c. If we ever find more
> applications of this idiom, we can always upgrade it without
> trouble to an explicit PERCPU_LIST abstraction.
>
> Problem: We don't get to share any common LIST code. This seems
> like a small price for a single-use gizmo like this.
Home |
Main Index |
Thread Index |
Old Index