NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: port-i386/38935: x86 bus_dmamap_sync() should use memory barrier



The following reply was made to PR port-i386/38935; it has been noted by GNATS.

From: Andrew Doran <ad%netbsd.org@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc: 
Subject: Re: port-i386/38935: x86 bus_dmamap_sync() should use memory barrier
Date: Tue, 10 Jun 2008 22:16:05 +0100

 On Tue, Jun 10, 2008 at 06:30:00PM +0000, Manuel Bouyer wrote:
 
 >      the x86 implementation of bus_dmamap_sync() turn into a NOP if
 >      we're not using bouce-buffers (which is the common case).
 >      Yet some DMA devices rely on the read or write operations
 >      happening in order (e.g. when the driver can add DMA descripors to
 >      a queue while the device may also be looking at the queue).
 >      Typical examples are the uhci and ehci drivers (but it's not
 >      the only example in our tree). Without appropriate memory barrier,
 >      the CPU's write buffers can reorder writes in a may that makes
 >      the DMA descriptors in an inconsistent state from the device's view.
 
 Writes are never reordered by the CPU unless the target memory region has
 been set as one of the weakly ordered types using an MTRR or PAE equivalent.
 For write-combining regions it could happen but that should never be the
 case for memory mapped for a device. In theory it can also happen with a
 write-back (Intel ordering) region if you use the SSE instructions, but we
 only use those in kernel for zeroing free pages.
 
 The tech docs from Intel are really confused about this issue, but the
 memory ordering rules are (in short):
 
 - loads can be reordered around other loads in any direction
 - a load after a store in program order can be reordered to occur before
   the store
 - a load before a store in program order _cannot_ be reordered to occur
   after the store
 - a store is strongly ordered with respect to all other stores
 - locked transactions prevent all reordering and flush store buffers
 
 amd64 is (was?) pretty much the same, except loads are never reordered
 around stores. 
 
 > @@ -844,6 +854,24 @@
 >              printf("unknown buffer type %d\n", cookie->id_buftype);
 >              panic("_bus_dmamap_sync");
 >      }
 > +end:
 > +    if ((ops & (BUS_DMASYNC_PREWRITE|BUS_DMASYNC_POSTWRITE)) &&
 > +        (ops & (BUS_DMASYNC_PREREAD|BUS_DMASYNC_POSTREAD))) {
 > +            /* both read and write, we need a full barrier */
 > +            x86_mfence();
 > +    } else if (ops & (BUS_DMASYNC_PREWRITE|BUS_DMASYNC_POSTWRITE)) {
 > +            /*
 > +             * all past writes should have completed before this point,
 > +             * and futur writes should not have started yet.
 > +             */
 > +            x86_sfence();
 > +    } else if (ops & (BUS_DMASYNC_PREREAD|BUS_DMASYNC_POSTREAD)) {
 > +            /*
 > +             * all past reads should have completed before this point,
 > +             * and futur reads should not have started yet.
 > +             */
 > +            x86_lfence();
 > +    }
 
 I think x86_lfence() without any if() would be enough. Does it solve the
 problem for you?
 
 Andrew
 


Home | Main Index | Thread Index | Old Index