tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: BPF memstore and bpf_validate_ext()

Alexander Nasonov <> wrote:
> Mindaugas Rasiukevicius wrote:
> > Now that the BPF memory store can be external and can be provided by the
> > caller of the BPF program, we can use to it pass some values or reuse it
> > as a cache.
> I don't think that external memory is needed. You added an auxiliary
> agrument, what is it for if it's not for passing some values?

It is an argument/context to be passed for the BPF coprocessors (only).
We want to let the *caller* of bpf_filter() (or JIT-compiled BPF program)
pass some values to the BPF *program* using the memory store.  It allows
us to re-use the memstore as a cache.

> External memory disables several optimizations in bpfjit for most
> filter programs even if they don't use BPF_COP.
> 1. sljit has a limited number of registers especially memory addressable
>    registers. I will have to allocate a register to a memstore pointer.
>    Because all registers are already assigned, I will have to start
>    moving data. The best I can think of is assigning BPFJIT_TMP2 to a
>    new pointer but I will need to switch to EREG register in 32bit
>    BPF_LD. This register is emulated on some arches, for instance on
>    i386.

Isn't it only i386?  On amd64 or RISC archs it would mean just an extra
fetch, right?  I am not sure what is (or would be) the current register
allocation in the latest bpfjit snapshot.  Still, having memstore as a
cache ought to win way more compared to the extra fetches or push/pops.

> 2. I have (or plan to have) some simple optimizations which aren't
>    possible by rewriting a bpf program. For instance, if there are
>    multiple loads in a linear block, I will only generate one index
>    check.
>    Here is an example: LD+W(0) ST(0) LD+W(4) ST(1) LD+W(8) ST(2) ...
>    It loads packet words with increasing offsets and stores them
>    in memwords for later processing. I generate only only one index
>    check for the first LD instruction (return 0 is packet is shorter
>    than 12 bytes). I can exit early without storing any word because
>    I know that memory will not be available after that bpf program
>    returns.
>    If you make it external, I will have to generate three index
>    checks to make sure that stores are visible to a caller for all
>    possible packet lengths.

Yes, but as we already discussed, for tcpdump/pcap filtering patterns
such optimisation is of little relevance.  Alternatively, we may add a
flag to indicate whether the caller cares about the memstore once BPF
program finishes (because normally - it does not).

> > Hence, here is the proposed API function for this:
> > 
> > int bpf_set_initwords(bpf_ctx_t *bc, uint8_t *wordmask, size_t len);
> Your description of the new API is too terse to be absolutely
> certain but it doesn't look like my proposal I sent to you privately.

The bitmask merely tells bpf_validate() which words of the memstore will
be initialised by the caller, so it could allow the BPF_LD calls on them
i.e. it would basically do invalid = ~wordmask.

> I think you can add a mask to each copfunc which indicates which
> of the words it loads and which words are stored by the copfunc.
> For instance, your npf_cop_l3 will look like
> { .copfunc = &npf_cop_l3,
>   .loads  = BPF_COP_LOAD_NONE,
>   .stores = BPF_COP_STORE(BPF_MW_IPVER) |
>             BPF_COP_STORE(BPF_MW_L4OFF) |
> },
> /* ... other copfuncs ... */
> Validation of BPF_COP instruction will look very natural:
> /* inside BPF_COP in bpf_validate_ext */
> if (bc->copfuncs) {
>       if (bc->copfuncs[i].inwords & invalid)
>               goto out;
>       invalid &= ~bc->copfuncs[i].outwords;
> }

Right, but this is for the coprocessors.  I am fine with it (as we talked
privately, I was considering to propose a very similar API), but it seems
that bpf_set_initwords() already *efficiently* covers all my cases.  With
this change, I will actually be able to remove NPF_COP_L3!

Generally, it might be rare that the coprocessor would set additional words
other than those already initialised by the caller.  In other words, while
this functionality would add additional granularity, there is no use case
for it yet.  So I would not hurry to add it.

> It's a bit more trickier for BPF_COPX because you need to pre-calculate
> loads/stores masks but it's doable.

Yes, it is run-time.  However, I would say let's not bother with it until
we actually have a need for an improvement (I doubt it will ever happen).

> At the moment, bpf_filter_ext() resets the "invalid" mask. This
> translates to .stores = BPF_COP_STORE_ALL.

Correct, that is why NPF is, temporarily, zeroing all words (until some
API gets adopted).


Home | Main Index | Thread Index | Old Index