NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/60144: virtio(4) cache coherence issue



The following reply was made to PR kern/60144; it has been noted by GNATS.

From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
To: Tetsuya Isaki <isaki%pastel-flower.jp@localhost>
Cc: Jason Thorpe <thorpej%NetBSD.org@localhost>,
	netbsd-bugs%NetBSD.org@localhost, gnats-bugs%NetBSD.org@localhost
Subject: Re: kern/60144: virtio(4) cache coherence issue
Date: Mon, 30 Mar 2026 03:47:22 +0000

 > Date: Mon, 30 Mar 2026 02:53:00 +0000 (UTC)
 > From: Tetsuya Isaki <isaki%pastel-flower.jp@localhost>
 >=20
 > At least in virtio_is_enqueue(), I suspect that there may be some
 > insufficient cache line invalidation.
 >=20
 >  sys/dev/pci/virtio.c:
 >     601 virtio_vq_is_enqueued(struct virtio_softc *sc, struct virtqueue *=
 vq)
 >     602 {
 >     603
 >     604     if (vq->vq_queued) {
 >     605         vq->vq_queued =3D 0;
 >     606         vq_sync_aring_all(sc, vq, BUS_DMASYNC_POSTWRITE);
 >     607     }
 >     608
 >     609     vq_sync_uring_header(sc, vq, BUS_DMASYNC_POSTREAD);
 >     610     if (vq->vq_used_idx =3D=3D virtio_rw16(sc, vq->vq_used->idx))
 >     611         return 0;
 >     612     vq_sync_uring_payload(sc, vq, BUS_DMASYNC_POSTREAD);
 >     613     return 1;
 >     614 }
 >=20
 > The virtio device incremented its own index and wrote it to
 > vq->vq_used->idx.  But when the interrupt was lost, (un)luckily
 > the data cache still held the previous vq->vq_used->idx, so that
 > CPU read it from the cache(!).
 > As you know, the previous vq->vq_used->idx is the same as
 > vq->vq_used_idx, therefore the function returned 0 (which means
 > vq is empty), even though the device notified as vq-is-enqueued.
 >=20
 > I think that vq_sync_uring_header(sc, vq, BUS_DMASYNC_PREREAD) is
 > necessary to invalidate the cache line before reading fresh
 > vq->vq_used->idx (at line 610) ?
 
 I believe you have got the usage model of the preread vs postread
 wrong -- it is confusing, because `PREREAD' does not mean `before we
 read from the buffer' but rather `before the device does a DMA read
 operation to initialize the buffer'.
 
 Suppose a typical driver wants to request a DMA operation to read
 data, from network or disk or whatever, into a buffer in memory.  The
 idiom the driver should use is roughly:
 
 0. (allocate buffer)
 1. BUS_DMASYNC_PREREAD
 2. trigger DMA read operation
 3. wait for completion / interrupt handler
 4. BUS_DMASYNC_POSTREAD
 5. use data in buffer
 
 	/*
 	 * 1. Prepare the buffer in memory -- usually, this is a
 	 *    matter of forcing any pending stores that the CPU issued
 	 *    so that they happen happen before the DMA operation
 	 *    begins; this way, they don't accidentally overwrite the
 	 *    result of the device's DMA operation.
 	 *
 	 *    For bounced DMA operations, no memcpy is needed here.
 	 */
 	bus_dmamap_sync(BUS_DMASYNC_PREREAD);
 
 	/*
 	 * 2. Trigger the DMA read operation.
 	 */
 	bus_space_write(DMA_CTRL_REG, TRIGGER_DMA);
 
 	/*
 	 * 3. Wait for the DMA operation to complete.
 	 */
 	while (bus_space_read(DMA_STATUS_REG) !=3D DMA_DONE) {
 		wait for interrupt;
 	}
 
 	/*
 	 * 4. Make sure what we next read out of the buffer is
 	 *    up-to-date, i.e., the result of the DMA operation,
 	 *    rather than anything the CPU had previously cached from
 	 *    the memory where the buffer lives -- usually, this is a
 	 *    matter of flushing any cached data so the next time the
 	 *    CPU loads from memory, it doesn't get anything stale
 	 *    from the cache but reads from main memory instead.
 	 *
 	 *    For bounced DMA operations, this is where we memcpy from
 	 *    the bounce buffer to the real buffer.
 	 */
 	bus_dmamap_sync(BUS_DMASYNC_POSTREAD);
 
 	/*
 	 * 5. It is now safe to use the result of the DMA read.
 	 */
 	printf("stuff 0x%x\n", buf[0]);
 
 Here, the `DMA read operation' of virtio is to initialize
 vq->vq_used->idx.
 
 If some kind of cache invalidation must happen between the interrupt
 delivery or DMA_STATUS_REG read at (3), and the use of the data in the
 buffer at (5), then the m68k and virt68k bus_dmamap_sync logic might
 be missing something for BUS_DMAMAP_POSTREAD -- it seems to have cases
 only for BUS_DMAMAP_PREWRITE and BUS_DMAMAP_PREREAD:
 
 https://nxr.NetBSD.org/xref/src/sys/arch/m68k/m68k/bus_dma.c?r=3D1.41#491
 https://nxr.NetBSD.org/xref/src/sys/arch/virt68k/virt68k/bus_dma.c?r=3D1.4#=
 476
 
 Perhaps m68k needs new POSTREAD logic to be written, or perhaps there
 was a mistake and the PREREAD logic should have been for POSTREAD?
 
 PREREAD is generally only needed if the CPU can reorder stores; if the
 CPU issues all stores in order (to regular memory and I/O memory
 alike), then logic like the following likely doesn't need to do
 anything at the PREREAD point:
 
 	buf[0] =3D 0xdeadbeef;			/* last user of buf */
 			/* PREREAD */
 	*(memt + DMA_CTRL_REG) =3D TRIGGER_DMA;	/* bus_space_write */
 
 But, for a device that doesn't participate in the CPU's cache
 coherence protocol, logic like the following almost always needs
 something at the POSTREAD point so the CPU doesn't operate on stale
 cached data:
 
 	x =3D buf[0];				/* caches buf[0] */
 	dostuff(x);				/* last user of buf */
 	free(buf);
 	...
 	buf =3D alloc();				/* same buf reused */
 			/* PREREAD */
 	*(memt + DMA_CTRL_REG) =3D TRIGGER_DMA;	/* bus_space_write */
 	...
 						/* bus_space_read */
 	while (*(memt + DMA_STATUS_REG) !=3D DMA_DONE) wait;
 			/* POSTREAD */
 	printf("stuff 0x%x\n", buf[0]);		/* cached buf[0]? */
 



Home | Main Index | Thread Index | Old Index