Port-amd64 archive


Re: Why does membar_consumer() do anything on x86_64?

On 10 Jul 2010, at 21:33, Jean-Yves Migeon wrote:

> On 11.07.2010 05:19, Dennis Ferguson wrote:
>> Hello,
>> Unless I'm truly confused, here's what membar_consumer() and
>> membar_producer() do on an x86_64 processor:
>>    ENTRY(_membar_consumer)
>>            LOCK(25)
>>            addq    $0, -8(%rsp)
>>            ret
>>    ENDLABEL(membar_consumer_end)
>>    ENTRY(_membar_producer)
>>            /* A store is enough */
>>            movq    $0, -8(%rsp)
>>            ret
>>    ENDLABEL(membar_producer_end)
>> I'm trying to figure out why membar_consumer() does that, since the useless
>> read-modify-write is measurably quite expensive.  I'm also curious why
>> membar_producer() is implemented as the useless write.
> On amd64, IIRC, there is one exception regarding ordering; a quick
> search in the spec says that "Loads may be reordered with older stores
> to different locations."
> lock instructions have total order (loads and stores are not reordered
> with lock instructions). As memory barriers always go by 2 (one
> consumer/reader, one producer/writer), the lock avoids load/store
> reordering to different locations; see, "Loads May Be Reordered
> with Earlier Stores to Different Locations."
> http://www.intel.com/Assets/PDF/manual/253668.pdf
> I could get it wrong though, anyway, it seems plausible to me...

I'm sure you are right that loads can be reordered with respect to stores,
so I'd expect membar_enter() and membar_sync(), and maybe membar_exit(), to
actually do something since (according to the man page) those functions
worry about the ordering of loads against stores.

I don't need these, though.  I have one write-only guy running concurrently
with N read-only guys, so the write-only guy only cares about the ordering
of stores while the read-only guys only care about the ordering of loads.
According to the man page, membar_consumer() protects against the
reordering of loads alone and membar_producer() protects against the
reordering of stores alone, so these seem to be the functions I need
to use.  Intel processors already guarantee load-load and store-store
ordering if you don't do anything at all, though.  Other architectures
don't guarantee that, which is why I might want to call these functions
anyway, but it still doesn't make sense to me why membar_consumer() and
membar_producer() in particular would need to do anything at all on
Intel CPUs.
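
To make the pattern concrete, here is a minimal sketch of the one-writer,
N-reader usage I have in mind.  It is written with C11 fences rather than
the NetBSD functions so that it compiles anywhere; the sketch_membar_*()
wrappers, struct record, slot and slot_ready are all names made up for
illustration.  Note that a release fence is somewhat stronger than a pure
store-store barrier, and an acquire fence somewhat stronger than a pure
load-load barrier, so these are only stand-ins for membar_producer() and
membar_consumer().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for membar_producer()/membar_consumer(),
 * built from C11 fences (an assumption, not the NetBSD implementation). */
static inline void sketch_membar_producer(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void sketch_membar_consumer(void)
{
	atomic_thread_fence(memory_order_acquire);
}

/* Hypothetical payload published once by the single writer. */
struct record {
	int a;
	int b;
};

static struct record slot;
static atomic_int slot_ready;	/* 0 = empty, 1 = published */

/* Writer: fill in the data, then make the flag visible after it. */
void writer_publish(int a, int b)
{
	/* This sketch only supports publishing once. */
	assert(atomic_load_explicit(&slot_ready, memory_order_relaxed) == 0);

	slot.a = a;
	slot.b = b;
	sketch_membar_producer();	/* data stores before flag store */
	atomic_store_explicit(&slot_ready, 1, memory_order_relaxed);
}

/* Reader: returns 1 and copies the record if published, else 0. */
int reader_try_read(struct record *out)
{
	if (atomic_load_explicit(&slot_ready, memory_order_relaxed) == 0)
		return 0;
	sketch_membar_consumer();	/* flag load before data loads */
	*out = slot;
	return 1;
}
```

The writer only ever needs its stores kept in order and the readers only
ever need their loads kept in order.  As far as I know, GCC and Clang
typically compile both of these fences to no instructions at all on
x86_64 (they only restrain the compiler), which is what I would have
expected membar_producer() and membar_consumer() to do as well.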

Dennis Ferguson
