tech-net archive


Re: NetBSD 5.1 TCP performance issue (lots of ACK)

On Wed, Nov 23, 2011 at 03:25:36PM -0800, Dennis Ferguson wrote:
> > 
> > This is clearly not my experience. I can say for sure that without lfence
> > instructions, the xen front/back drivers are not working properly
> > (and I'm not the only one saying this).
> I am very sure that adding lfence() calls to that code fixed it.  What I
> suspect is that you don't understand why it fixed it, since I'm pretty
> positive the original problem couldn't have been an Intel CPU reordering
> reads from cached memory.  For example if the thing you did to generate
> the instruction was either a function call or an
> `asm volatile ("lfence":::"memory")' it will have effects beyond just
> adding the instruction and those effects, rather than the instruction,
> might be what mattered.

The change also separated the two reads by a lot more C code - which
would in itself change the timings.

I do remember some docs going way back to the early Pentium days
that implied that some reads/writes might happen out of sequence.
But those rules would actually have broken a lot of legacy code,
so they were probably never actually implemented.

Some of the 'fence' instructions might be required for correct
sequencing between cached and uncached operations - eg to
ensure that a write to cached memory is snoopable before an IO
write is seen.

IIRC the code in question shouldn't need any kind of barrier
- provided the compiler generated the reads in the correct order
which I believe it is required to do for volatile data.

I'd look at the order of the reads in the generated code for the faulty
version (I think they've been noted to be reversed), then add
'asm volatile ("":::"memory")' between them (without changing anything
else), recompile and retest.
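
Something like the following is what I have in mind. It is only a
sketch: 'struct ring', 'ring_read' and 'compiler_barrier' are made-up
names for illustration, not the real xen ring code, and the asm syntax
assumes gcc (or a compatible compiler). The empty asm emits no
instruction at all; it only stops the compiler moving memory accesses
across that point.

```c
#include <stdint.h>

/* Hypothetical shared ring, for illustration only (not the xen layout). */
struct ring {
    volatile uint32_t prod;     /* producer index, written by the other end */
    volatile uint32_t slot[8];  /* payload slots */
};

/* Compiler-only barrier: generates no instruction, but the "memory"
 * clobber stops the compiler reordering memory accesses across it. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Read the producer index first, then the slot it covers.
 * Returns 1 and stores the payload in *out if an entry is available,
 * otherwise returns 0 and leaves *out alone. */
int ring_read(struct ring *r, uint32_t cons, uint32_t *out)
{
    uint32_t prod = r->prod;    /* first read */

    compiler_barrier();         /* keep the slot read after the index read */

    if (prod == cons)
        return 0;
    *out = r->slot[cons % 8];   /* second read */
    return 1;
}
```

If the reads come out in the right order in the object code with only
this change and the problem goes away, then it was the compiler and not
the cpu doing the reordering.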


David Laight:
