Port-i386 archive

Re: i386 lazy pmap switching in trap.c



On Sun, Feb 10, 2008 at 02:49:50AM +0900, YAMAMOTO Takashi wrote:
> 
> > Although I'd also put a 'cmp' and 'jz' after the call earlier, which
> > is likely to be beneficial on athlons, but not P4 - which will predict
> > the backwards jump as taken if the branch isn't in the branch cache.
> 
> do you mean the following?
> 
>         1:                                              ; \
>         cmpl    $0, CPUVAR(WANT_PMAPLOAD)               ; \
>         jz      1f                                      ; \
>         call    _C_LABEL(pmap_load)                     ; \
>         cmpl    $0, CPUVAR(WANT_PMAPLOAD)               ; \
>         jz      1f                                      ; \
>         jmp     1b                                      ; \
>         1:

That version is bad: both AMD and Intel processors will mispredict
the second 'jz'.
Adding the second 'cmpl' is probably a gain (if followed by a backward
'jnz').
But Intel processors will predict the backward conditional jump as
taken (assuming it isn't in the branch cache), whereas AMD processors
predict all conditional jumps as 'not taken'.
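
A minimal sketch of that loop form, assuming the same CPUVAR() and
_C_LABEL() macros as the quoted code (the label numbers are arbitrary
and the macro line continuations are omitted):

        cmpl    $0, CPUVAR(WANT_PMAPLOAD)
        jz      2f                      /* nothing to load: skip the call */
    1:
        call    _C_LABEL(pmap_load)
        cmpl    $0, CPUVAR(WANT_PMAPLOAD)
        jnz     1b                      /* backward conditional jump */
    2:

Assuming pmap_load normally clears WANT_PMAPLOAD, the 'jnz' usually
falls through, so a static "taken" prediction for it is wrong in the
common case.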

But if we assume that the pmap is loaded most of the time, what you
actually need is:

        cmpl    $0, CPUVAR(WANT_PMAPLOAD)
        jnz     999f
    998:
        # rest of function

    999:
        call    _C_LABEL(pmap_load)
        jmp     998b


        David

-- 
David Laight: david%l8s.co.uk@localhost


