Source-Changes-D archive


Re: CVS commit: src/crypto/external/bsd/openssl/lib/libcrypto/arch



On Mon, Jul 25, 2011 at 08:38:13PM +0200, Joerg Sonnenberger wrote:
> On Mon, Jul 25, 2011 at 07:24:57PM +0100, David Laight wrote:
> > On Mon, Jul 25, 2011 at 11:52:52AM +0200, Joerg Sonnenberger wrote:
> > > Much better. One thing remains. It would be nice to replace
> > >   .byte 0xf3,0xc3
> > > with either a simple ret or a ret $0, depending on whether it has a
> > > label on it or not. The reason for this mess seems to be a bug in
> > > certain generation of AMD CPUs. So essentially,
> > 
> > IIRC it is something to do with branch prediction?
> > But my memory keeps thinking of a constraint about the number
> > of branches/labels in a cache line - and I'm sure the non-use of
> > 1 byte return instructions was all related.
> 
> When I asked around, I get the following reference, which seems to
> summarize the situation nicely:
> 
> http://mikedimmick.blogspot.com/2008/03/what-heck-does-ret-mean.html

That is sort of consistent with what I remember from those guides.
I wonder what the additional cost of 'rep ret' and 'ret $0' is
on other CPUs (apart from the obvious extra code bytes).
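
For reference, the size cost of the return forms mentioned above can be
sketched in C as follows. The byte values are the standard x86 encodings
(plain ret is 0xc3, the rep prefix is 0xf3, ret imm16 is 0xc2 followed by a
16-bit immediate); the array names and the extra_bytes() helper are
hypothetical and exist only for this illustration. It says nothing about
cycle cost, only about code size.

```c
#include <stddef.h>

/* x86 encodings of the three near-return forms (names are illustrative). */
static const unsigned char enc_ret[]     = { 0xc3 };             /* ret     */
static const unsigned char enc_rep_ret[] = { 0xf3, 0xc3 };       /* rep ret */
static const unsigned char enc_ret_0[]   = { 0xc2, 0x00, 0x00 }; /* ret $0  */

/* Number of code bytes a return form adds compared with a plain ret. */
static size_t extra_bytes(size_t encoded_len)
{
    return encoded_len - sizeof enc_ret;
}
```

So 'rep ret' adds one byte over a plain ret, and 'ret $0' adds two, because
its immediate operand is always 16 bits wide.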

Looking at the code (now with fewer 'rep ret') I notice that a fair
number of the jumps are unconditional - why have an unconditional jump
to a return instruction?
I also haven't checked what the critical paths are, or what the static
prediction will do. Nor do I know the cycle times of these special
instructions, so I can't tell how much it really matters.

        David

-- 
David Laight: david%l8s.co.uk@localhost

