tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: RFC: PERCPU_LIST to fix PR kern/52515



On Fri, Oct 6, 2017 at 1:14 PM, Ryota Ozaki <ozaki-r%netbsd.org@localhost> wrote:
> On Fri, Oct 6, 2017 at 11:58 AM, Taylor R Campbell
> <campbell+netbsd-tech-kern%mumble.net@localhost> wrote:
>>> Date: Fri, 6 Oct 2017 11:26:40 +0900
>>> From: Ryota Ozaki <ozaki-r%netbsd.org@localhost>
>>>
>>> On Mon, Oct 2, 2017 at 11:13 PM, Taylor R Campbell
>>> <campbell+netbsd-tech-kern%mumble.net@localhost> wrote:
>>> > Quick summary of the problem:
>>> >
>>> > Possible solutions.  I'm leaning toward (6), to open-code the linked
>>> > list operations for this special purpose, with compile-tested patch
>>> > attached.  This changes the text of psref.h, but shouldn't change the
>>> > ABI.  Comments?
>>>
>>> How about using SLIST instead of open-coding? The instructions of them
>>> are very similar, but the SLIST version is slightly simpler.
>>
>> I avoided that because n psref_release operations takes expected and
>> worst-case O(n^2) time and there's no constant bound on the latency of
>> a single psref_release operation.  But maybe n is always small enough
>> that it doesn't matter -- perhaps enough that the concrete cost of
>> maintaining a doubly-linked list is higher.
>
> I also suppose that a target being released is at the head of the list
> because targets of psref are typically manipulated in last-in-first-out
> manner. But yes psref_release of SLIST version can be O(n) in worst
> cases theoretically.
>
> Well, when I think this kind of problems I tend to think of our usages,
> i.e., network processing as a router. In such usages, psref is used in
> softint and the last-in-first-out manner would keep in many cases. But
> in other usages such as uses in threads the assumption is perhaps not
> really.
>
>>
>> (My desire to avoid thinking about bounds on n is also what motivated
>> me to use a linked list instead of an array in the first place.)
>>
>> Note that your patch changes the ABI of struct psref!
>
> Let's bump the kernel version!
>
>>
>> I wonder whether the open-coded version would do better if it
>> unconditionally loaded the percpu:
>>
>>         pcpu = percpu_getref(class->prc_percpu);
>>         KASSERTMSG(psref->psref_prevp == NULL || *psref->psref_prevp == psref,
>>             "psref %p prevp %p points at %p",
>>             psref, psref->psref_prevp, *psref->psref_prevp);
>>         KASSERTMSG(psref->psref_prevp != NULL || pcpu->pcpu_first == psref,
>>             "psref %p marked as first but psref_cpu %p on %d first is %p",
>>             psref, pcpu, cpu_index(curcpu()), pcpu->pcpu_first);
>>         *(psref->psref_prevp ? psref->psref_prevp : &pcpu->pcpu_first) =
>>             psref->psref_next;
>>         percpu_putref(class->prc_percpu);
>>
>> With DIAGNOSTIC disabled, I get a conditional move instruction instead
>> of branches this way:
>>
>>  4f9:   e8 00 00 00 00          callq  4fe <psref_release+0x93>
>>                         4fa: R_X86_64_PC32     percpu_getref+0xfffffffffffffffc
>>  4fe:   48 8b 53 08             mov    0x8(%rbx),%rdx
>>  502:   48 85 d2                test   %rdx,%rdx
>>  505:   48 0f 44 d0             cmove  %rax,%rdx
>>  509:   48 8b 03                mov    (%rbx),%rax
>>  50c:   48 89 02                mov    %rax,(%rdx)
>>  50f:   49 8b 7c 24 20          mov    0x20(%r12),%rdi
>>  514:   e8 00 00 00 00          callq  519 <psref_release+0xae>
>>                         515: R_X86_64_PC32     percpu_putref+0xfffffffffffffffc
>>
>> Also, my original patch was missing a percpu_putref.  I hope you put
>> it back before you ran your test!
>
> I'll test with the patch in the other mail later.

The above results were actually measured by knakahara@ and
this time I've measured by myself (the measurement hardware
is the same and the kernel configs of his and mine should
be almost the same though).

...so the results show a slightly different trend:
  - original:  137.9 138.0 (144.7) Mbps
  - open-code: 135.2 134.6 (138.5) Mbps
  - SLIST:     140.7 141.4 (140.1) Mbps

I think knakahara@ (or someone else) needs to re-evaluate :-/

  ozaki-r


Home | Main Index | Thread Index | Old Index