Re: passive references

To: Taylor R Campbell <riastradh%netbsd.org@localhost>
Subject: Re: passive references
From: Ryota Ozaki <ozaki-r%netbsd.org@localhost>
Date: Fri, 29 Jan 2016 17:39:13 +0900

Hi riastradh,

It's cool! It should help our work.

I'm thinking of applying psref to the bridge member list,
which currently uses its own version of a similar mechanism
(psz + refcount).

To do so, I have some questions:
- Can psref_acquire and psref_release be used in ioctl
  handlers if we use them together with kpreempt_disable?
  - Bridge member lists can be accessed via ioctl,
    which runs in normal LWP context
- Can we use pserialize_read_{enter,exit} for psref
  objects?
  - Not all critical sections of a bridge member list
    need to sleep. If we can use pserialize_read_{enter,exit}
    directly for the non-sleeping critical sections, it would
    make the code simpler and more efficient

Thanks,
  ozaki-r

BTW, psref reminds me of hazard pointers (and OpenBSD's SRP).

On Mon, Jan 25, 2016 at 4:10 AM, Taylor R Campbell <riastradh%netbsd.org@localhost> wrote:
> To make the network stack scale well to multiple cores, the packet-
> processing path needs to share resources such as routes, tunnels,
> pcbs, &c., between cores without incurring much interprocessor
> synchronization.
>
> It would be nice to use pserialize(9) for this, but many of these
> resources are held by code paths during packet processing that may
> sleep, which is not allowed in a pserialize read section.  The two
> obvious ways to resolve this are:
>
> - Change all of these paths so that they don't sleep and can be run
>   inside a pserialize read section.  This is a major engineering
>   effort, because the network stack is such a complex interdependent
>   beast.
>
> - Add a reference count to each route, tunnel, pcb, &c.  This would
>   work to make the network stack *safe* to run on multiple cores, but
>   it incurs interprocessor synchronization for each use and hence
>   fails to make the network stack *scalable* to multiple cores.
>
> Prompted by discussion with rmind@ and dyoung@, I threw together a
> sketch for an abstraction rmind called `passive references' which can
> be held across sleeps on a single CPU -- e.g., in a softint LWP or
> CPU-bound kthread -- but which incur no interprocessor synchronization
> to acquire and release.  This would serve as an intermediary between
> the two options so that we can incrementally adapt the network stack.
>
> The idea is that acquiring a reference puts an entry on a CPU-local
> list, which can be done inside a pserialize read section.  Releasing
> the reference removes the entry.  When an object is about to be
> destroyed -- e.g., you are unconfiguring a tunnel -- then you mark it
> as unusable so that nobody can acquire new references, and wait until
> there are no references on any CPU's list.
>
> The attached file contains a summary of the design, an example of use,
> and a sketch of an implementation, with input and proof-reading from
> riz@.
>
> Thoughts?
>
>
> A variant of this approach which dyoung@ has used in the past is to
> count the number of references, instead of putting them on a list, on
> each CPU.  I first wrote a sketch with a count instead of a list,
> thinking mainly of using this just for ip_encap tunnels, of which
> there are likely relatively few, and not for routes or pcbs.
>
> However, if there are many more objects than references -- as I expect
> to be with most kinds of packet flow that the packet-processing path
> will handle one or two of at a time --, it would waste a lot of space
> to have one count on each CPU for each object, yet the list of all
> references on each CPU (to any object) would be relatively short.
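
The acquire/release/destroy scheme described in the quoted message can
be modelled in a few lines of userland C. This is only a rough,
single-threaded sketch, not the attached implementation: NCPU, struct
target, struct ref, the draining flag and every function name below are
hypothetical, and the "CPU" is just an array index passed in by the
caller.

```c
/*
 * Userland, single-threaded model of per-CPU reference lists.
 * Every name here is hypothetical; this is not the psref API.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPU 4			/* assumed number of simulated CPUs */

struct target {
	bool draining;		/* set once destruction has begun */
};

struct ref {
	struct ref *next;	/* link in the owning CPU's list */
	const struct target *target;
};

/* One list head per simulated CPU; only that CPU touches its list. */
struct ref *cpu_refs[NCPU];

/* Acquire: push an entry on the given CPU's list, unless draining. */
bool
ref_acquire(int cpu, struct ref *r, const struct target *t)
{
	if (t->draining)
		return false;
	r->target = t;
	r->next = cpu_refs[cpu];
	cpu_refs[cpu] = r;
	return true;
}

/* Release: unlink the entry from the same CPU's list. */
void
ref_release(int cpu, struct ref *r)
{
	struct ref **rp;

	for (rp = &cpu_refs[cpu]; *rp != NULL; rp = &(*rp)->next) {
		if (*rp == r) {
			*rp = r->next;
			r->next = NULL;
			r->target = NULL;
			return;
		}
	}
	assert(false);		/* reference was not held on this CPU */
}

/* True if any CPU's list still holds a reference to t. */
bool
target_is_referenced(const struct target *t)
{
	for (int cpu = 0; cpu < NCPU; cpu++)
		for (const struct ref *r = cpu_refs[cpu]; r != NULL;
		    r = r->next)
			if (r->target == t)
				return true;
	return false;
}

/* Begin destruction: from now on nobody can acquire new references. */
void
target_begin_destroy(struct target *t)
{
	t->draining = true;
}
```

A destroyer in this model would call target_begin_destroy() and then
wait until target_is_referenced() returns false before freeing the
object. The model leaves out everything that makes the real problem
hard: the pserialize read section around the acquire, preemption
control, and how the waiter inspects other CPUs' lists safely.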
