tech-kern archive

Re: struct ifnet and ifaddr handling [was: Re: Making global variables of if.c MPSAFE]

On Thu, Nov 13, 2014 at 04:26:52AM +0000, Taylor R Campbell wrote:
>    Date: Thu, 13 Nov 2014 12:43:26 +0900
>    From: Ryota Ozaki <>
>    Here is a new patch:
>    I think the patch reflects rmind's suggestions:
>    - Use pserialize for IFNET_FOREACH
>      - but use a lock for blockable/sleepable critical sections
>    - cpu_intr_p workaround for HW interrupt
>    Any comments?
> Hmm...some quick notes from a non-expert in sys/net:
> - You call malloc(M_WAITOK) while the ifnet lock is held, in
>   if_alloc_sadl_locked, which is not allowed.
> - You call copyout in a pserialize read section, in ifconf, which is
>   not allowed because copyout may block.
> - I don't know what cpu_intr_p is working around but it's probably not
>   a good idea!
> Generally, all that you are allowed to do in a pserialize read section
> is read a small piece of information, or grab a reference to a data
> structure which you are then going to use outside the read section.
> I don't think it's going to be easy to scalably parallelize this code
> without restructuring it, unless as a stop-gap you use a heavier-weight
> reader-writer lock like the prwlock at
> <>.
> (No idea how much overhead this might add.)
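
The rule quoted above (inside a read section, only read a little or grab
a reference, then do the blocking work outside) can be sketched in
userland C as follows.  This is a minimal illustration under stated
assumptions, not code from the patch: read_enter()/read_exit() are
hypothetical stand-ins that merely mimic the shape of NetBSD's
pserialize_read_enter()/pserialize_read_exit(), struct iface is a toy
replacement for struct ifnet, and slow_copy() stands in for a call like
copyout() that may block.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a pserialize-style read section.  They only
 * track nesting so the sketch can assert nothing blocking runs inside. */
static int in_read_section;
static int  read_enter(void)  { return in_read_section++; }
static void read_exit(int s)  { in_read_section = s; }

struct iface {                  /* hypothetical, far smaller than struct ifnet */
	struct iface *next;
	char name[16];
	atomic_int refcnt;
};

/* Look up an interface by name; return it with a reference held, or NULL. */
static struct iface *
iface_get(struct iface *head, const char *name)
{
	struct iface *ifp;
	int s = read_enter();

	for (ifp = head; ifp != NULL; ifp = ifp->next) {
		if (strcmp(ifp->name, name) == 0) {
			/* Short and non-blocking: allowed in the read section. */
			atomic_fetch_add(&ifp->refcnt, 1);
			break;
		}
	}
	read_exit(s);
	return ifp;
}

/* Drop a reference taken by iface_get(). */
static void
iface_put(struct iface *ifp)
{
	atomic_fetch_sub(&ifp->refcnt, 1);
}

/* Stand-in for copyout(): may sleep, so it must run outside the section. */
static size_t
slow_copy(char *dst, size_t len, const struct iface *ifp)
{
	assert(in_read_section == 0);
	strncpy(dst, ifp->name, len - 1);
	dst[len - 1] = '\0';
	return strlen(dst);
}
```

The sketch leaves out the writer side: whoever removes an interface
would have to wait both for readers to leave their sections and for the
reference count to drain before freeing it.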

Parallelizing the network code without restructuring it?  That
sounds like something I have tried before.  Avoid it, if you can!

When I was confronted at a previous job with the problem of rapidly
MP-ifying the network stack, I introduced a lightweight reader/writer
lock for the network configuration.  I called that lock the "corral."
A thread had to enter the corral before it read or modified a route,
ifnet, ifaddr, etc.  There could be multiple readers in the corral,
or one writer in the corral, never a reader and writer at once---i.e.,
usual reader/writer semantics.  The corral was designed so that
readers could enter and exit very quickly without using any locked
instructions or modifying any shared cachelines in the common case.
It was fairly expensive for a writer to enter the corral, or for
a reader to wait for a writer to exit before it entered.
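
The corral's code is not shown in this message, so the following is only
a userland sketch of one way to get a similar cost profile: each reader
writes only its own cacheline-sized slot, and a writer pays for scanning
every slot.  All names here (struct corral, corral_read_enter(), and so
on) are hypothetical, as is the one-slot-per-thread assumption.  Note
that the sequentially consistent C11 atomics used here typically compile
to a fence or locked instruction, so the sketch shows the "no shared
cacheline modified" property but not the "no locked instructions" one.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define CORRAL_NSLOTS 4   /* hypothetical: one slot per CPU or worker thread */

struct corral_slot {
	_Alignas(64) atomic_bool inside;   /* written only by the slot's owner */
};

struct corral {
	struct corral_slot slot[CORRAL_NSLOTS];
	_Alignas(64) atomic_bool writer;   /* true while a writer wants or holds it */
	pthread_mutex_t wlock;             /* serializes writers against each other */
};

static void
corral_init(struct corral *c)
{
	for (int i = 0; i < CORRAL_NSLOTS; i++)
		atomic_init(&c->slot[i].inside, false);
	atomic_init(&c->writer, false);
	pthread_mutex_init(&c->wlock, NULL);
}

/* Reader fast path: one store to a private line, one load of a shared one. */
static void
corral_read_enter(struct corral *c, int me)
{
	for (;;) {
		atomic_store(&c->slot[me].inside, true);
		if (!atomic_load(&c->writer))
			return;
		/* Slow path: a writer is active; back out and wait for it. */
		atomic_store(&c->slot[me].inside, false);
		while (atomic_load(&c->writer))
			sched_yield();
	}
}

static void
corral_read_exit(struct corral *c, int me)
{
	atomic_store_explicit(&c->slot[me].inside, false, memory_order_release);
}

/* Writer: expensive by design.  Announce intent, then wait for every
 * reader slot to empty before proceeding. */
static void
corral_write_enter(struct corral *c)
{
	pthread_mutex_lock(&c->wlock);
	atomic_store(&c->writer, true);
	for (int i = 0; i < CORRAL_NSLOTS; i++)
		while (atomic_load(&c->slot[i].inside))
			sched_yield();
}

static void
corral_write_exit(struct corral *c)
{
	atomic_store(&c->writer, false);
	pthread_mutex_unlock(&c->wlock);
}
```

The reader's store-then-load and the writer's store-then-load form a
Dekker-style handshake: at least one side is guaranteed to see the
other, so a reader and a writer are never inside together.  The sketch
is not reentrant and makes no fairness guarantees.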

The corral was introduced gradually, and the kernel_lock and
softnet_lock were gradually phased out.  In retrospect, I probably
ordered the corral_enter() calls differently relative to the
KERNEL_LOCK() and mutex_enter(softnet_lock) calls than I should
have, and that caused me a lot of grief.  It did not help that the
kernel did not make KERNEL_LOCK() and mutex_enter(softnet_lock)
calls in a consistent order.

IIRC, some of the waits in the corral entailed cv_wait() and other
blocking calls that could not be made from an interrupt context.
Anyway, I ended up running virtually all packet processing
in an LWP context so that I could use whichever synchronization
objects I liked.  NIC interrupts were just responsible for waking
per-CPU threads that processed the Rx rings.  This approach can
work ok if your starting point is the inefficient legacy NetBSD
packet processing and there is a preponderance of small packets.
Matt Thomas once explained to me some serious overheads that crop
up when the inter-packet interval is long and your packet processing
is tuned up, and I might make different trade-offs between
interrupt/LWP processing with 20/20 hindsight.
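
The split described above, where the interrupt only wakes a thread and
the thread does the real work, can be sketched in userland with POSIX
threads.  Everything here is an assumption for illustration: struct
rx_queue, rx_intr(), and rx_thread() are hypothetical names, packets are
plain integers, and the simulated "interrupt" also fills the ring
itself, which real NIC hardware would do by DMA.  A real interrupt
handler would use a spin lock rather than a pthread mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define RX_RING_SIZE 8          /* hypothetical ring size (power of two) */

struct rx_queue {
	pthread_mutex_t lock;
	pthread_cond_t  wake;
	int      ring[RX_RING_SIZE];  /* stand-in for packet descriptors */
	unsigned head, tail;          /* head: next to consume; tail: next free */
	bool     stop;
	long     processed;           /* sum of the packet values handled */
};

static void
rx_queue_init(struct rx_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->wake, NULL);
	q->head = q->tail = 0;
	q->stop = false;
	q->processed = 0;
}

/* "Interrupt handler": no protocol work, just enqueue and wake the thread.
 * Returns false if the ring is full and the packet was dropped. */
static bool
rx_intr(struct rx_queue *q, int pkt)
{
	bool ok = false;

	pthread_mutex_lock(&q->lock);
	if (q->tail - q->head < RX_RING_SIZE) {
		q->ring[q->tail++ % RX_RING_SIZE] = pkt;
		ok = true;
		pthread_cond_signal(&q->wake);
	}
	pthread_mutex_unlock(&q->lock);
	return ok;
}

/* Per-CPU worker: runs in thread context, so it may sleep or take
 * sleepable locks while processing a packet. */
static void *
rx_thread(void *arg)
{
	struct rx_queue *q = arg;

	pthread_mutex_lock(&q->lock);
	for (;;) {
		while (q->head == q->tail && !q->stop)
			pthread_cond_wait(&q->wake, &q->lock);
		if (q->head == q->tail)
			break;          /* stop requested and ring drained */
		int pkt = q->ring[q->head++ % RX_RING_SIZE];
		pthread_mutex_unlock(&q->lock);
		/* Packet processing would happen here, outside the ring lock. */
		pthread_mutex_lock(&q->lock);
		q->processed += pkt;
	}
	pthread_mutex_unlock(&q->lock);
	return NULL;
}

/* Ask the worker to exit once the ring is empty. */
static void
rx_queue_stop(struct rx_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->stop = true;
	pthread_cond_signal(&q->wake);
	pthread_mutex_unlock(&q->lock);
}
```

The wakeup and context switch per batch is the cost of this design; it
amortizes well when packets arrive back to back and less well when the
inter-packet interval is long.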

Anyway, while I worked on the rapid MP-ification project, I had on
my desk an eminent engineer's thoroughgoing and ambitious plan for
MP-ifying a *BSD network stack: a rational MP-ification project.
I don't think that my team could have implemented that plan on a
timescale that met the business need, but if you take a global
perspective---how many times will persons and projects around the
world be hobbled by NetBSD's legacy network stack before it
is finally restructured?---then it's pretty clear that now is the
time to begin a rational project.


David Young    Urbana, IL    (217) 721-9981
