tech-kern: Re: bus_dmamap_sync not enough ?

Subject: Re: bus_dmamap_sync not enough ?
To: Stephan Uphoff <ups@stups.com>
From: Jason R Thorpe <thorpej@wasabisystems.com>
List: tech-kern
Date: 02/11/2003 17:57:26
On Tue, Feb 11, 2003 at 03:42:24PM -0500, Stephan Uphoff wrote:

 > I missed that you wrote the driver - excellent then I can point out some

To clarify -- I didn't write the original i82557 driver -- that was
David Greenman @ FreeBSD.  I did, however, add the bus_dma support
to our driver.

 > of the details. (and btw, thanks for all the great code !)
 > 
 > The bus_dma(9) manual states for bus_dmamap_sync:
 > 	
 > 	This function must be called to synchronize DMA buffers before
 >         and after a DMA operation.
 > 	[ ... ]	
 > 	If DMA read and write operations are not preceded and followed by the appro-
 >         priate synchronization operations, behavior is undefined.
 > 
 > If the manual is correct then fxp_rxintr() has a race condition as a DMA read 
 > operation
 > might happen for example between lines  "/* ## A ## */" and "/* ## B ## */"
 > 
 > File i82257.c:
 > fxp_rxintr() 
 > 
 > 	[...]	 
 > 		FXP_RFASYNC(sc, m,
 > 		    BUS_DMASYNC_POSTREAD|BUS_DMASYNC_POSTWRITE);
 > 
 > 		/* ## A ##*/
 > 		rxstat = le16toh(rfa->rfa_status);
 > 
 > 		if ((rxstat & FXP_RFA_STATUS_C) == 0) {
 > 			/*
 > 			 * We have processed all of the
 > 			 * receive buffers.
 > 			 */
 > 			 /* ## B ## */
 > 			FXP_RFASYNC(sc, m, BUS_DMASYNC_PREREAD);
 > 			return;
 > 		}
 > 	[...]
 > 
 > There are several more examples in the driver with possible read AND write DMA
 > operations on buffers last synchronized with  BUS_DMASYNC_POSTREAD|BUS_DMASYNC_
 > POSTWRITE
 > operations.

Hm.  I see what you're saying, but I don't think there is actually a
problem; this is more likely to be a deficiency in the documentation.

So, in your example above, we know that the descriptor has already been
PREREAD|PREWRITE sync'd, and we know that was done before ownership was
given to the chip.  We then POSTREAD|POSTWRITE the descriptor, which
forms a memory barrier, meaning it is now safe to read the descriptor.

If we see that the descriptor is not yet marked as completed, it is not
only safe to only PREREAD sync the descriptor, it is necessary for correct
operation.  Since we did not modify the descriptor (since we did not know
if it was safe for us to do so), we have no need to PREWRITE (a PREWRITE
was performed the last time we modified the descriptor).  Furthermore, if
we were to PREWRITE sync again here, a system using bounce buffers would
copy the CPU's copy of the descriptor back, possibly (probably!) clobbering
the updated copy of the descriptor that the chip has; i.e., we would lose
the C bit.

 > If the driver code is correct (and the manual page is wrong) the architecture
 > specific implementation of _bus_dmamap_sync for i386 must be converted from
 > a no-op to a serializing assembly instruction for BUS_DMASYNC_POSTREAD 
 > synchronization.
 > ( Otherwise you might get out of order CPU reads of in order DMA read 
 > operations )

Err... the i386 version makes a run-time (not compile-time!) test for
a NULL pointer, and jumps through it if non-NULL.  This should act as
a memory clobber for the compiler whether or not the function pointer
is NULL, thus providing the serializing instruction (I agree with you
that it would be bad for the compiler to reorder in this case, I just
think that the compiler already has information that tells it not to
do so :-)

 > I also fail to see how bounce buffers (32bit PCI device and more than 4GB main 
 > memory)
 > would ever work if bus_dmamap_sync(... BUS_DMASYNC_PREWRITE) is allowed to be 
 > executed
 > concurrently with a DMA read operation on the same buffer.

You are correct, it cannot work correctly (hm, now there's a clumsy
sentence :-).  This is why the sync at marker B is only a PREREAD
(which in the case of bounce buffers is a noop, and in the case of
a non-coherent cache is an invalidate-no-writeback).

Out of curiosity, are you chasing an actual bug you've experienced, or
is this purely a result of code observation?

-- 
        -- Jason R. Thorpe <thorpej@wasabisystems.com>