tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: RFC: vioif(4) multiqueue support



Hi all,

I'm going to commit the changes on this weekend.

Regards,

Yamaguchi

2018年12月28日(金) 15:36 Shoichi Yamaguchi <yamaguchi%netbsd.org@localhost>:

>
> > > > > https://ftp.netbsd.org/pub/NetBSD/misc/yamaguchi/vioif_mutilq.patch
>
> I have updated the patch.
>
>
> 2018年12月26日(水) 17:20 s ymgch <s.ymgch228%gmail.com@localhost>:
> >
> > Thank you for your helpful comments!
> >
> >
> > 2018年12月26日(水) 11:37 Taylor R Campbell <campbell+netbsd-tech-net%mumble.net@localhost>:
> > >
> > > > Date: Wed, 26 Dec 2018 10:03:15 +0900
> > > > From: Shoichi Yamaguchi <yamaguchi%netbsd.org@localhost>
> > > >
> > > > > I implemented a patch that make vioif(4) support multi-queue. And I have put
> > > > > the patch on ftp.n.o. I used vioif(4) multiqueue on qemu-kvm on Linux kernel
> > > > > 4.19.5. And It seems to be working fine.
> > > > >
> > > >
> > > > Do you have any comments?
> > > > I would like to going to commit the patch if there is no comment until tomorrow.
> > >
> > > Hi, Yamaguchi-san!  A lot of Americans and Europeans are on vacation
> > > this time of year, so it might be better to hold off for another week
> > > or two.  Here's a quick review -- I don't know anything about virtio,
> > > so this is just about use of kernel APIs and abstractions.  Someone
> > > who knows something about virtio should take a look too.
> >
> > Yes, indeed. I'll wait for other comments for more one or two week.
> >
> > >
> > > > diff --git a/sys/dev/pci/if_vioif.c b/sys/dev/pci/if_vioif.c
> > > > index 3bbd300e88e..769b108e793 100644
> > > > --- a/sys/dev/pci/if_vioif.c
> > > > +++ b/sys/dev/pci/if_vioif.c
> > > >
> > > >  /* Feature bits */
> > > > -#define VIRTIO_NET_F_CSUM    (1<<0)
> > > > [...]
> > > > +#define VIRTIO_NET_F_MQ                      (1<<22)
> > >
> > > If you're going to modify all of the lines here, maybe take the
> > > opportunity to convert them to use __BIT?
> > >
> > > > @@ -171,73 +184,110 @@ struct virtio_net_ctrl_vlan {
> > > > [...]
> > > >  /*
> > > >   * if_vioifvar.h:
> > > >   */
> > > > +struct vioif_txqueue {
> > > > +     struct virtqueue        *txq_vq;
> > >
> > > Why not make this an embedded structure?
> >
> > The reason why I don't use "struct virtqueue" but use "struct virtqueue *"
> > is to register a virtqueue array ("struct virtqueue []") to virtio(4)
> > as an argument
> > of virtio_child_attach_start() or virtio_child_attach_set_vqs(). Virtqueues used
> > in a child device of virtio(4) like vioif(4) must be array.
> >
> > >
> > > struct vioif_txqueue {
> > >         struct virtqueue        txq_vq;
> > >         ...
> > > };
> > >
> > > struct vioif_softc {
> > >         struct vioif_txqueue    *sc_txq;
> > >         struct vioif_rxqueue    *sc_rxq;
> > >         struct vioif_ctrlqueue  *sc_ctrlq;
> > >         ...
> > > };
> > >
> > > > +     kmutex_t                *txq_lock;
> > >
> > > Why is this a pointer to kmutex_t with mutex_obj_alloc/free and not
> > > just a kmutex_t with mutex_init/destroy?  Is it reused anywhere?  If
> > > it is reused, this needs explanation in the comments.  If it is not,
> > > just use kmutex_t.
> >
> > It is for the error handling code.
> > Example:
> >     for (...) {
> >         txq[i]->txq_lock = mutex_obj_alloc();
> >         s = softint_establish()
> >         if (s == NULL)
> >             goto err;
> >     }
> >     ....
> >     err:
> >         for (...)
> >             if (txq[i]->txq_lock)
> >                 mutex_obj_free(txq->txq_lock);
> >
> > I use the pointer value as a flag weather the lock already allocated or not, and
> > I didn't want to add a field into vioif_txqueue to save it.
> >
> > >
> > > Can you write a comment summarizing what locks cover what fields, and,
> > > if more than one lock can be held at once, what the lock order is?
> >
> > I'll add comments for the locks.
> > Currently, the locks cover all fields in the structure, and two or more
> > than locks can't be held at once.
> >
> > >
> > > > +struct vioif_rxqueue {
> > > > +     struct virtqueue        *rxq_vq;
> > > > +     kmutex_t                *rxq_lock;
> > >
> > > Likewise.
> > >
> > > > -#define VIOIF_TX_LOCK(_sc)   mutex_enter(&(_sc)->sc_tx_lock)
> > > > -#define VIOIF_TX_UNLOCK(_sc) mutex_exit(&(_sc)->sc_tx_lock)
> > > > -#define VIOIF_TX_LOCKED(_sc) mutex_owned(&(_sc)->sc_tx_lock)
> > > > -#define VIOIF_RX_LOCK(_sc)   mutex_enter(&(_sc)->sc_rx_lock)
> > > > -#define VIOIF_RX_UNLOCK(_sc) mutex_exit(&(_sc)->sc_rx_lock)
> > > > -#define VIOIF_RX_LOCKED(_sc) mutex_owned(&(_sc)->sc_rx_lock)
> > > > +#define VIOIF_TXQ_LOCK(_q)   mutex_enter((_q)->txq_lock)
> > > > +#define VIOIF_TXQ_TRYLOCK(_q)        mutex_tryenter((_q)->txq_lock)
> > > > +#define VIOIF_TXQ_UNLOCK(_q) mutex_exit((_q)->txq_lock)
> > > > +#define VIOIF_TXQ_LOCKED(_q) mutex_owned((_q)->txq_lock)
> > > > +
> > > > +#define VIOIF_RXQ_LOCK(_q)   mutex_enter((_q)->rxq_lock)
> > > > +#define VIOIF_RXQ_UNLOCK(_q) mutex_exit((_q)->rxq_lock)
> > > > +#define VIOIF_RXQ_LOCKED(_q) mutex_owned((_q)->rxq_lock)
> > >
> > > Can we just use mutex_enter/exit/&c. without the macros?  Sometimes we
> > > use macros where they are conditional, depending on NET_MPSAFE, but if
> > > there's no need for that, I would prefer to read direct calls to
> > > mutex_enter/exit/&c.
> > >
> > > > +static int
> > > > +vioif_alloc_queues(struct vioif_softc *sc)
> > > > +{
> > > > +     int nvq_pairs = sc->sc_max_nvq_pairs;
> > > > +     int nvqs = nvq_pairs * 2;
> > > > +     int i;
> > > > +
> > > > +     sc->sc_rxq = kmem_zalloc(sizeof(sc->sc_rxq[0]) * nvq_pairs,
> > > > +         KM_NOSLEEP);
> > > > +     if (sc->sc_rxq == NULL)
> > > > +             return -1;
> > > > +
> > > > +     sc->sc_txq = kmem_zalloc(sizeof(sc->sc_txq[0]) * nvq_pairs,
> > > > +         KM_NOSLEEP);
> > > > +     if (sc->sc_txq == NULL)
> > > > +             return -1;
> > >
> > > Check to avoid arithmetic overflow here:
> > >
> > >         if (nvq_pairs > INT_MAX/2 - 1 ||
> > >             nvq_pairs > SIZE_MAX/sizeof(sc->sc_rxq[0]))
> > >                 return -1;
> > >         nvqs = nvq_pairs * 2;
> > >         if (...) nvqs++;
> > >         sc->sc_rxq = kmem_zalloc(sizeof(sc->sc_rxq[0]) * nvq_pairs, ...);
> > >
> > > Same in all the other allocations like this.  (We should have a
> > > kmem_allocarray -- I have a draft somewhere.)
> >
> > nvq_pairs is always less than VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX(= 0x8000).
> > So, I'll add KASSERT(nvq_pairs <=  VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX).
> >
> > >
> > > > @@ -586,69 +759,109 @@ vioif_attach(device_t parent, device_t self, void *aux)
> > > > [...]
> > > > +             /* Limit the number of queue pairs to use */
> > > > +             if (sc->sc_max_nvq_pairs <= ncpu)
> > > > +                     sc->sc_req_nvq_pairs = sc->sc_max_nvq_pairs;
> > > > +             else
> > > > +                     sc->sc_req_nvq_pairs = ncpu;
> > >
> > > How about sc->sc_req_nvq_pairs = MIN(sc->sc_max_nvq_pairs, ncpu)?
> > >
> > > > +static void
> > > > +vioif_ctrl_release(struct vioif_softc *sc)
> > > > +{
> > > > +     struct vioif_ctrlqueue *ctrlq = &sc->sc_ctrlq;
> > > > +
> > > > +     mutex_enter(&ctrlq->ctrlq_wait_lock);
> > >
> > > KASSERT(ctrlq->ctrlq_inuse != FREE)
> > >
> > > Might be helpful to record the lwp that owns this ctrlq, too, for
> > > diagnostics: KASSERT(ctrlq->ctrlq_owner == curlwp).
> > >
> > > > diff --git a/sys/dev/pci/virtio_pci.c b/sys/dev/pci/virtio_pci.c
> > > > index 65c5222b774..bb972997be2 100644
> > > > --- a/sys/dev/pci/virtio_pci.c
> > > > +++ b/sys/dev/pci/virtio_pci.c
> > > > @@ -604,8 +677,14 @@ virtio_pci_setup_interrupts(struct virtio_softc *sc)
> > > > [...]
> > > >       if (pci_intr_type(pc, psc->sc_ihp[0]) == PCI_INTR_TYPE_MSIX) {
> > > > -             psc->sc_ihs = kmem_alloc(sizeof(*psc->sc_ihs) * 2,
> > > > +             psc->sc_ihs = kmem_alloc(sizeof(*psc->sc_ihs) * nmsix,
> > > >                   KM_SLEEP);
> > >
> > > Check for arithmetic overflow here.
> >
> > I'll take in other comments to the patch.
> >
> > Thanks,
> > Yamaguchi



Home | Main Index | Thread Index | Old Index