tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Deferred if_start



On Fri, Dec 2, 2016 at 7:55 PM, Joerg Sonnenberger <joerg%bec.de@localhost> wrote:
> On Fri, Dec 02, 2016 at 06:46:12PM +0900, Ryota Ozaki wrote:
>> On Fri, Dec 2, 2016 at 6:32 PM, Joerg Sonnenberger <joerg%bec.de@localhost> wrote:
>> > On Fri, Dec 02, 2016 at 04:07:24PM +0900, Ryota Ozaki wrote:
>> >> We need it for MP-ification of bpf because some drivers call
>> >> if_start (or equivalent, say drvX_start) from hardware
>> >> interrupt, which means that bpf_mtap is also called from
>> >> hardware interrupt. That makes MP-ification hard.
>> >
>> > What is the reason for bpf_mtap to fail in hardware interrupt context?
>>
>> Because we can use only spin mutex.
>>
>> > I really don't like this requirement change.
>>
>> Why?
>
> Especially on older systems, adding non-trivial overhead to RX
> processing can be costly, especially if the technical reasons are
> questionable. So let's look at those reasons again.

I think it's trivial because deferred if_start only runs if it's
requested and the check if it's requested is executed per multiple
packets, not per packet.

Actually the performance down of forwarding throughput is 2 or 3 %.
The overhead mainly comes from preempt_{enable,disable} of percpu(9)
which aren't needed for hardint/softint and if we get rid of them
(i.e., use percpu_getptr_remote directly), the overhead becomes
negligible.

>
> For the BPF hooks to work, two things are necessary: read-only access to
> the filter program and read-write engine to the BPF buffer space. The
> latter is fine with a spin-lock, so that should not be a problem. That
> leaves the filter program. Hardware interrupt context is not allowed to
> sleep at all, so BPF access from this context can only see a filter
> program while it is currently running. So requirements for safe access
> should therefore be:
> (1) Atomic loading the new BPF program.
> (2) Delayed freeing of the old BPF program.
> (3) Access to BPF program must block some synchronisation cross call.
> (4) After the cross call got ACKed by all CPUs, the old program can be
> removed.
>
> I'm not sure if passive references already provides this, but this
> approach should work without any further changes to network drivers.

psz/psref + mutex can do that. I guess using psz/psref + mutex also
doesn't require any changes to network drivers.

>
> Side note: Please don't understand this as objection against moving the
> if_start processing to softint context. That's a different topic and
> decoupling if_start from the queue processing by deferring it into a
> softint / workqueue / whatever seems perfectly reasonable to me.

BTW, how about moving bpf_mtap (and ifp->if_ipackets++) from each driver
to the common percpuq if_input? That's also a requirement for MP-safe bpf.

Thanks,
 ozaki-r



Home | Main Index | Thread Index | Old Index