NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/60144: virtio(4) cache coherence issue



The following reply was made to PR kern/60144; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Jason Thorpe <thorpej%me.com@localhost>
Cc: Tetsuya Isaki <isaki%pastel-flower.jp@localhost>,
	Nick Hudson <nick.hudson%gmx.co.uk@localhost>,
	gnats-bugs%netbsd.org@localhost,
	kern-bug-people%netbsd.org@localhost,
	netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/60144: virtio(4) cache coherence issue
Date: Sat, 11 Apr 2026 16:53:28 +0000

 I talked it over with Nick on ICB and he convinced me that this
 question about
 
 	buf = alloc()
 	PREREAD
 	trigger DMA
 	interrupt
 	POSTREAD
 	if (buf->ready) {
 		???
 		use(buf->data)
 	}
 
 is not quite analogous, though a polling loop around buf->ready may
 be analogous.  We had a similar issue with ehci on armv5:
 
 https://mail-index.NetBSD.org/source-changes-d/2025/09/29/msg014539.html
 
 Here's the virtio(4) logic in question:
 
 /* virtio_enqueue_commit */
    1255 		vq_sync_uring_all(sc, vq, BUS_DMASYNC_PREREAD);
 ...
    1259 		vq->vq_avail->idx = virtio_rw16(sc, vq->vq_avail_idx);
 ...
    1275 			vq_sync_uring_header(sc, vq, BUS_DMASYNC_POSTREAD);
    1276 			flags = virtio_rw16(sc, vq->vq_used->flags);
    1277 			if (!(flags & VRING_USED_F_NO_NOTIFY))
    1278 				sc->sc_ops->kick(sc, vq->vq_index);
 ...
 
 https://nxr.netbsd.org/xref/src/sys/dev/pci/virtio.c?r=1.84#1247
 
 /* virtio_vq_is_enqueued */
     609 	vq_sync_uring_header(sc, vq, BUS_DMASYNC_POSTREAD);
     610 	if (vq->vq_used_idx == virtio_rw16(sc, vq->vq_used->idx))
     611 		return 0;
     612 	vq_sync_uring_payload(sc, vq, BUS_DMASYNC_POSTREAD);
 
 https://nxr.netbsd.org/xref/src/sys/dev/pci/virtio.c?r=1.84#609
 
 The trouble is that the CPU issues an extra load at line 1276 _before
 the DMA operation may have actually been triggered_ at line 1278, and,
 more subtly, the order of operations by the virtio host device isn't
 quite the same in the two places where there's a (correct) POSTREAD
 barrier.
 
 For normal buf->ready / buf->data access, the protocol is more or
 less:
 
 host device model
 -----------------
 H1. store buf->data
 H2. (barrier)
 H3. set buf->ready := 1
 
 guest driver
 ------------
 G1. test buf->ready
 G2. POSTREAD
 G3. load buf->data
 
 And that guest driver logic is what happens in virtio_vq_is_enqueued,
 with a POSTREAD between testing buf->ready (vq->vq_used->idx) and
 using the data (uring payload).  So that part is fine.
 
 What's different with flags/idx is that the host device _can't_ start
 the DMA operation before setting flags.  So the full host side looks
 like:
 
 virtio host device
 ------------------
 H1'. set vq->vq_used->flags appropriately
 H2'. store-before-load barrier
 H3'. load vq->vq_avail->idx
 H4'. DMA operation to fill uring payload
 H5'. (barrier)
 H6'. store vq->vq_used->idx
 
 Note that the ordering here is reversed: while the simplified host
 device model separates H1 (store buf->data) by some barrier from H3
 (set buf->ready) _in that order_, the virtio host device sets
 vq->vq_used->flags before it can possibly even start the DMA
 operation.  (Then H4'/H5'/H6' correspond to the simplified host device
 model's H1/H2/H3.)
 
 So I'm convinced that our guest virtio(4) driver should issue another
 PREREAD barrier between loading vq->vq_used->flags and either
 (a) kicking the DMA, or
 (b) loading vq->vq_used->idx to check for DMA completion.
 
 --- src/sys/dev/pci/virtio.c
 +++ src/sys/dev/pci/virtio.c
 @@ -1274,6 +1274,7 @@
  		} else {
  			vq_sync_uring_header(sc, vq, BUS_DMASYNC_POSTREAD);
  			flags = virtio_rw16(sc, vq->vq_used->flags);
 +			vq_sync_uring_header(sc, vq, BUS_DMASYNC_PREREAD);
  			if (!(flags & VRING_USED_F_NO_NOTIFY))
  				sc->sc_ops->kick(sc, vq->vq_index);
  		}
 
 What I'm still not sure about -- until further thought after a lunch
 barrier at least -- is:
 
 1. Does a _polling loop_ require additional PREREADs?
 
 	PREREAD;
 	trigger DMA;
 	POSTREAD;
 	while (!buf->ready) {
 		PREREAD;	// <--- ???
 		POSTREAD;
 	}
 	POSTREAD;
 	use(buf->data);
 
    And, can the rule about this be phrased in terms of higher-level
    semantics of DMA operations than details of the 68030 cache
    implementation?
 
 2. Does the VIRTIO_F_RING_EVENT_IDX case require PREREAD too?
 
 --- src/sys/dev/pci/virtio.c
 +++ src/sys/dev/pci/virtio.c
 @@ -1269,6 +1269,7 @@
  		if (sc->sc_active_features & VIRTIO_F_RING_EVENT_IDX) {
  			vq_sync_uring_avail(sc, vq, BUS_DMASYNC_POSTREAD);
  			t = virtio_rw16(sc, *vq->vq_avail_event) + 1;
 +			vq_sync_uring_avail(sc, vq, BUS_DMASYNC_PREREAD);
  			if ((uint16_t) (n - t) < (uint16_t) (n - o))
  				sc->sc_ops->kick(sc, vq->vq_index);
  		} else {
 



Home | Main Index | Thread Index | Old Index