tech-net archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: RFC: softint-based if_input



On Tue, Feb 2, 2016 at 8:51 AM, Taylor R Campbell
<campbell+netbsd-tech-kern%mumble.net@localhost> wrote:
>    Date: Mon, 1 Feb 2016 14:41:03 +0900
>    From: Ryota Ozaki <ozaki-r%netbsd.org@localhost>
>
>    While I agree on providing a separate API and letting all drivers use
>    it finally, I don't agree on applying it to all drivers that need it
>    at this point. There are so many driver that we have to let use
>    softint-based if_input and they are much more than ones we don't need
>    to do. (I'm assuming your proposal needs to add if_percpuq to every
>    drivers' sc. Am I correct?) My patch intended to minimize the changes
>    to the drivers.
>
>    So I think it's better to adopt two approaches:
>    - Add if_percpuq to ifnet and let drivers use it by default (via if_attach)
>    - Provide a new API that you propose, and let drivers that are ready
>      for softint use it (we use if_initialize instead)
>    - Change drivers to use the new API one by one (it would take long time)
>    - Once we migrate all drivers, then remove if_percpuq from ifnet
>      (I'm assuming if_percpuq is in ifdef _KERNEL and removable)
>
>    Does this approach satisfy you (or not)?
>
> I wouldn't object to that, but I think you may overestimate the work
> to convert each driver to the API I sketched.  It would require
> basically a four-line change for each driver:
>
>  struct xyz_softc {
>  ...
> +       struct if_percpuq       *sc_ipq;
>  }
>
>  xyz_attach(...)
>  {
>         ...
> +       sc->sc_ipq = if_percpuq_create(&if_percpuq_input, &sc->sc_if);
>         ...
>  }
>
>  xyz_detach(...)
>  {
>         ...
> +       if_percpuq_destroy(sc->sc_ipq);
>         ...
>  }
>
>  xyz_rx_intr(...)
>  {
>         ...
> -       (*ifp->if_input)(ifp, m);
> +       if_percpuq_enqueue(&sc->sc_ipq, m);
>         ...
>  }

It's probably simple but time-consuming because there are more than
one and a half hundreds drivers. I just want to avoid spending time
for this task and go back to the main task (L3 MP-ification)...

>
>    > struct if_percpuq {
>    >         void            *ipq_si;
>    >         struct percpu   *ipq_ifqs;      /* struct ifqueue */
>    > };
>    >
>    > struct if_percpuq *
>    >         if_percpuq_create(void (*)(void *), void *);
>    > void    if_percpuq_destroy(struct if_percpuq *);
>    > void    if_percpuq_enqueue(struct if_percpuq *, struct mbuf *);
>
>    if_percpuq_create takes the same argument as softint_establish
>    and the first argument may be a driver-specific if_input function
>    or a some common if_percpuq_softint?
>
> What I had in mind was essentially mimicking the pktqueue(9) API -- I
> didn't finish the sketch, evidently.  We could have a common
> if_percpuq_input function for most drivers to use:
>
> void
> if_percpuq_input(struct if_percpuq *ipq, void *cookie)
> {
>         struct ifnet *ifp = cookie;
>         struct mbuf *m;
>         int s;
>
>         while ((m = if_percpuq_dequeue(ifq)) != NULL)
>                 (*ifp->if_input)(ifp, m);
> }
> ...
>         sc->sc_ipq = if_percpuq_create(&if_percpuq_input, &sc->sc_if);
>
> (This would require passing the if_percpuq to the callback, in
> addition to the per-user cookie.)  Or we could omit a callback
> altogether and just call ifp->if_input in a loop in the softint.

I choose the latter. I think we don't need to make callback
flexible for now.

>
>    I'm a bit worried if we have driver-specific if_input functions,
>    bpf_mtap hooks will be still scattered.
>
> That sounds like a problem we can fix in another pass.  It's not
> immediately obvious that we want to fold it into the same abstraction,
> but I haven't looked at how net80211 is different.

Well, actually if we can ensure all bpf hooks run in softint, putting
bpf_mtap in a common place is not must, but less bpf_mtap hooks make
the work easy :) Anyway net80211 needs special treatment; IIUC we
need to make rx_intr softint for each driver.

>
>    > The reason I suggest a separate API which drivers can voluntarily use
>    > is that I expect we may want alternatives -- e.g., drivers without
>    > multiqueue support may want to use a pktq instead and automatically
>    > distribute to other CPUs[*].
>
>    Do you assume we'll have another API such as if_pktq_*?
>
> Maybe more like an if_pktq_input callback that you can pass to
> pktq_create.  No need for much more than that, I think.

Oh, I see. Each driver basically just uses pktqueue by itself.

>
>    > Also: I think your patch is broken as is for all USB NICs -- all their
>    > `interrupt' routines actually run in softint context, if I'm not
>    > mistaken, so this kassert will fire:
>    >
>    > +void
>    > +if_input(struct ifnet *ifp, struct mbuf *m)
>    > +{
>    > +
>    > +       if (ifp->if_input_ifqs) {
>    > +               struct ifqueue *ifq = percpu_getref(ifp->if_input_ifqs);
>    > +
>    > +               KASSERT(cpu_intr_p());
>    >
>    > The same may go for some other NICs, such as se(4), for which there is
>    > a possible use from thread context which I can't rule out in thirty
>    > seconds.  Another possible case is xennet(4).
>
>    I didn't know them... I'll fix them.
>
>    BTW I shouldn't put KASSERT to warn constraint violations
>    and instead we should log once (by RUN_ONCE or something)?
>
> No, I would encourage using KASSERT, if that really is an essential
> part of the API contract.  Log messages like that usually conceal
> deeper problems and discourage anyone from fixing them properly.

Okay. I'll keep using KASSERT.

>
> In this case, it's not a priori clear to me why the caller must be in
> hardintr context.  Perhaps you want to discourage callers from using
> this if they're already in softint context.  But even if the caller is
> in softint context, maybe it's higher-priority softint context than
> softnet (e.g., I think skrll@ plans to make all USB softints run at
> softserial instead of softnet), and maybe you don't want ifp->if_input
> to run at >softnet priority.

Oh, that's a case I didn't take into account. Running if->if_input
at >softnet priority would break the assumption that we rely on.

We may be able to change the assert to
  KASSERT(cpu_intr_p() || (cpu_softintr_p() && level != SOFTINT_NET))
though, I think we don't need to be paranoid for the case.
(And I don't know how to get level. Is there API?).

  ozaki-r


Home | Main Index | Thread Index | Old Index