Subject: Re: x86 instructions reordering
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Lennart Augustsson <lennart@augustsson.net>
List: port-i386
Date: 03/24/2005 17:14:49
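The x86 can do some reordering (notably, a load can complete before
an earlier store to a different address becomes visible to the other
CPU), but nothing should ever move across an instruction with a lock
prefix.  So with an mb() at every point where the order matters,
you should be safe.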
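
As a rough illustration of where the full barrier would go, here is a
minimal sketch of the quoted sender/receiver pair written with C11
atomics instead of the Linux macros.  Everything beyond the names a, b,
send(), handle_event() and send_event() is an assumption made for the
example: the struct layout, the counters, resetting did_something on
each pass, and the mapping of wmb() to a release store and mb() to
atomic_thread_fence(memory_order_seq_cst).  It is not the actual Xen
code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared page; field names follow the post. */
struct shared {
    atomic_uint a;              /* written by the sender: items produced   */
    atomic_uint b;              /* written by the receiver: items consumed */
};

static struct shared shm;       /* static storage, so both counters start at 0 */
static unsigned events_sent;    /* illustrative: number of send_event() calls  */
static unsigned handled;        /* illustrative: number of items processed     */

static void send_event(void) { events_sent++; }

void send(struct shared *s)
{
    unsigned a = atomic_load_explicit(&s->a, memory_order_relaxed);

    /* do_something: fill in the request before publishing it */

    /* Plays the role of wmb() followed by the store to a. */
    atomic_store_explicit(&s->a, a + 1, memory_order_release);

    /* Plays the role of mb(): the store to a is ordered before the load of b. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&s->b, memory_order_relaxed) == a)
        send_event();           /* receiver had caught up, so wake it */
}

void handle_event(struct shared *s)
{
    bool did_something;

    do {
        did_something = false;
        unsigned a = atomic_load_explicit(&s->a, memory_order_acquire);
        unsigned b = atomic_load_explicit(&s->b, memory_order_relaxed);

        assert(b <= a);         /* consumer never runs ahead (ignoring wraparound) */
        while (b < a) {
            handled++;          /* do something */
            did_something = true;
            b++;
        }
        atomic_store_explicit(&s->b, b, memory_order_release);

        /*
         * Full barrier, not just a compiler barrier: the store to b must
         * be globally visible before a is read again on the next pass.
         */
        atomic_thread_fence(memory_order_seq_cst);
    } while (did_something);
}
```

With a full barrier on both sides, either the receiver's next read of a
sees the sender's increment, or the sender's read of b sees the
receiver's update and an event is sent.  With only a compiler barrier in
the receiver, both sides can read the stale value and the wakeup is
lost.  On x86, compilers typically implement the seq_cst fence as mfence
or a lock-prefixed instruction.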

	-- Lennart


Manuel Bouyer wrote:
> Hi,
> can newer x86 CPUs (a hyperthreaded P4 in my case) reorder instructions
> or memory writes? If so, how can we impose barriers? I didn't find
> anything obvious in the x86 SMP code, besides the atomic bit operations
> (which don't work in my case).
> 
> Basically I have these 2 pieces of code in Xen (NetBSD and Linux), one sender
> and one receiver, using a piece of shared memory.
> The receiver:                          |     The sender:
>                                        |
> handle_event()                         |     send()
> {                                      |     {
>                                        |             a = shared_memory->a;
> again:                                 |             do_something;
>         a = shared_memory->a;          |             wmb();
>         __insn_barrier();              |             shared_memory->a = a + 1;
>         b = shared_memory->b;          |             mb();
>         while (b < a) {                |             if (shared_memory->b == a)
>                 /* do something */     |                     send_event();
>                 did_something = 1;     |     }
>                 b++;                   |
>         }                              |
>         __insn_barrier();              |
>         shared_memory->b = b;          |
>         __insn_barrier();              |
>         if (did_something)             |
>                 goto again;            |
> }                                      |
> 
> The sender is a piece of Linux code; mb() and wmb() are both
> __asm__ __volatile__ ("lock; addl $0,0(%%esp)": : :"memory")
> which is the same as our x86_lfence(). I tried replacing __insn_barrier
> with x86_lfence, but the assembly produced by gcc didn't change.
> 
> So basically, the sender sends an event only if the receiver isn't already
> busy. But sometimes the receiver stops and doesn't get an event. The only
> way I can see this happening is if the reads and writes to memory
> don't happen in the intended order. This problem only occurs if the
> reader and writer are running on different CPUs of the HT P4. I couldn't
> reproduce it when I forced both virtual machines to run on the same CPU, while
> it locks up quickly if each virtual machine runs on a different virtual CPU.
> 
> Any ideas?
> 
> --
> Manuel Bouyer <bouyer@antioche.eu.org>
>      NetBSD: 26 ans d'experience feront toujours la difference
> --
>