NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: kern/60144: virtio(4) cache coherence issue
> Date: Sat, 11 Apr 2026 11:48:32 +0100
> From: Jason Thorpe <thorpej%me.com@localhost>
>
> Let me give a quick primer on them just so we're all on the same
> page regarding now they work.
That's all very interesting, but this is really a question about the
API rules for bus_dma.
In
if (buf->ready)
use(buf->data);
what bus_dmamap_sync barriers are needed to separate the loads of
buf->ready and buf->data? (Similarly if we load buf->ready in a
polling loop: what barriers are necessary in consecutive loads of
buf->ready?)
I have long thought that only POSTREAD is necessary here, since we
already started a DMA transfer into buf a long time ago, with the
matching PREREAD already issued before it started:
buf = alloc();
bus_dmamap_sync(buf, PREREAD);
bus_space_write_4(bst, bsh, TRIGGER_DMA, bufaddr); // start DMA
...
/* interrupt handler -- buf->ready _may_ have been updated */
bus_dmamap_sync(&buf->ready, POSTREAD);
if (buf->ready) {
/* buf->data _has definitely_ been updated */
bus_dmamap_sync(&buf->data, POSTREAD);
use(buf->data);
}
And I'm pretty sure we have a lot of drivers written like this (most
of which I did not write!). For example, here's the wm(4) rx
interrupt handler (Intel ethernet driver):
10217 static bool
10218 wm_rxeof(struct wm_rxqueue *rxq, u_int limit)
10219 {
...
10232 for (i = rxq->rxq_ptr;; i = WM_NEXTRX(i)) {
10233 rxs = &rxq->rxq_soft[i];
...
10238 wm_cdrxsync(rxq, i,
10239 BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
...
10250 if (!wm_rxdesc_dd(rxq, i, status))
10251 break;
...
10277 bus_dmamap_sync(sc->sc_dmat, rxs->rxs_dmamap, 0,
10278 rxs->rxs_dmamap->dm_mapsize, BUS_DMASYNC_POSTREAD);
10279
10280 m = rxs->rxs_mbuf;
...
10385 if_percpuq_enqueue(sc->sc_ipq, m);
https://nxr.netbsd.org/xref/src/sys/dev/pci/if_wm.c?r=1.801#10212
The call wm_rxdesc_dd corresponds to testing buf->ready, and the
subsequent use of the content of rxs->rxs_mbuf corresponds to
use(buf->data).
Here's the mfi(4) interrupt routine (SCSI driver):
1362 int
1363 mfi_intr(void *arg)
1364 {
...
1378 bus_dmamap_sync(sc->sc_dmat, MFIMEM_MAP(sc->sc_pcq), 0,
1379 sizeof(uint32_t) * sc->sc_max_cmds + sizeof(struct mfi_prod_cons),
1380 BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
1381
1382 producer = pcq->mpc_producer;
1383 consumer = pcq->mpc_consumer;
1384
1385 while (consumer != producer) {
...
1389 ctx = pcq->mpc_reply_q[consumer];
...
1397 ccb = &sc->sc_ccb[ctx];
...
1400 bus_dmamap_sync(sc->sc_dmat, MFIMEM_MAP(sc->sc_frames),
1401 ccb->ccb_pframe - MFIMEM_DVA(sc->sc_frames),
1402 sc->sc_frames_size,
1403 BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
1404 ccb->ccb_done(ccb);
https://nxr.netbsd.org/xref/src/sys/dev/ic/mfi.c?r=1.80#1362
Testing pcq->mpc_producer/consumer and loading pcq->mpc_reply_q
roughly corresponds to testing buf->ready, and the subsequent use of
ccb corresponding to pcq->mpc_reply_q[consumer] corresponds to using
buf->data.
The hifn(4) driver (crypto accelerator) does issue another PREREAD,
but not between testing buf->ready and using buf->data -- only in the
polling loop of repeatedly testing buf->ready, essentially:
1931 static int
1932 hifn_intr(void *arg)
1933 {
...
2000 HIFN_RESR_SYNC(sc, i,
2001 BUS_DMASYNC_POSTREAD | BUS_DMASYNC_POSTWRITE);
2002 if (dma->resr[i].l & htole32(HIFN_D_VALID)) {
2003 HIFN_RESR_SYNC(sc, i,
2004 BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
2005 break;
2006 }
...
2011 HIFN_RES_SYNC(sc, i, BUS_DMASYNC_POSTREAD);
2012 cmd = dma->hifn_commands[i];
...
2017 hifn_callback(sc, cmd, dma->result_bufs[i]);
https://nxr.netbsd.org/xref/src/sys/dev/pci/hifn7751.c#1931
Do we need to go through all these drivers and insert another PREREAD
between testing buf->ready and using buf->data? (And between repeated
tests of buf->ready?)
Note that all of these drivers _already_ issue POSTREAD barriers
between all the relevant loads. (There are some drivers that don't do
even that, and I'm willing to accept that those drivers are simply
buggy.)
Home |
Main Index |
Thread Index |
Old Index