tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: RFC: PERCPU_LIST to fix PR kern/52515



> Date: Fri, 6 Oct 2017 11:26:40 +0900
> From: Ryota Ozaki <ozaki-r%netbsd.org@localhost>
> 
> On Mon, Oct 2, 2017 at 11:13 PM, Taylor R Campbell
> <campbell+netbsd-tech-kern%mumble.net@localhost> wrote:
> > Quick summary of the problem:
> >
> > Possible solutions.  I'm leaning toward (6), to open-code the linked
> > list operations for this special purpose, with compile-tested patch
> > attached.  This changes the text of psref.h, but shouldn't change the
> > ABI.  Comments?
> 
> How about using SLIST instead of open-coding? The instructions of them
> are very similar, but the SLIST version is slightly simpler.

I avoided that because n psref_release operations takes expected and
worst-case O(n^2) time and there's no constant bound on the latency of
a single psref_release operation.  But maybe n is always small enough
that it doesn't matter -- perhaps enough that the concrete cost of
maintaining a doubly-linked list is higher.

(My desire to avoid thinking about bounds on n is also what motivated
me to use a linked list instead of an array in the first place.)

Note that your patch changes the ABI of struct psref!

I wonder whether the open-coded version would do better if it
unconditionally loaded the percpu:

	pcpu = percpu_getref(class->prc_percpu);
	KASSERTMSG(psref->psref_prevp == NULL || *psref->psref_prevp == psref,
	    "psref %p prevp %p points at %p",
	    psref, psref->psref_prevp, *psref->psref_prevp);
	KASSERTMSG(psref->psref_prevp != NULL || pcpu->pcpu_first == psref,
	    "psref %p marked as first but psref_cpu %p on %d first is %p",
	    psref, pcpu, cpu_index(curcpu()), pcpu->pcpu_first);
	*(psref->psref_prevp ? psref->psref_prevp : &pcpu->pcpu_first) =
	    psref->psref_next;
	percpu_putref(class->prc_percpu);

With DIAGNOSTIC disabled, I get a conditional move instruction instead
of branches this way:

 4f9:   e8 00 00 00 00          callq  4fe <psref_release+0x93>
                        4fa: R_X86_64_PC32     percpu_getref+0xfffffffffffffffc
 4fe:   48 8b 53 08             mov    0x8(%rbx),%rdx
 502:   48 85 d2                test   %rdx,%rdx
 505:   48 0f 44 d0             cmove  %rax,%rdx
 509:   48 8b 03                mov    (%rbx),%rax
 50c:   48 89 02                mov    %rax,(%rdx)
 50f:   49 8b 7c 24 20          mov    0x20(%r12),%rdi
 514:   e8 00 00 00 00          callq  519 <psref_release+0xae>
                        515: R_X86_64_PC32     percpu_putref+0xfffffffffffffffc

Also, my original patch was missing a percpu_putref.  I hope you put
it back before you ran your test!


Home | Main Index | Thread Index | Old Index