Port-i386 archive


Re: PAE and balloon benchmarks



On 12.07.2010 17:09, Antti Kantee wrote:
> Hi,
> 
> Thanks for taking the time to thoroughly benchmark this.
> 
> A few remarks/questions out of pure interest:
> 
> * Is the cost expected?  Naively thinking, 15-20% seems quite high.
>   How much do other systems pay for PAE?  Did you attempt to pinpoint
>   where extra time is spent?

Nope, I did not try to pinpoint the issue. I suspect it is a
combination of the following:

- 64-bit atomic ops when manipulating PG/PD on a 32-bit arch

- pmap context switch: in the native case, you just reload %cr3. For
PAE, you update all the entries in the L3 (at IPL_VM level), then
tlbflush. I suspect the tlbflush is the costly part there.
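
To illustrate the first point above, here is a minimal sketch in
portable C11, not the actual pmap code. The type and function names are
hypothetical. It sets flag bits in a 64-bit entry and in a 32-bit entry.
On a 32-bit x86 target, a compiler typically has to turn the 64-bit
version into a compare-and-swap loop (cmpxchg8b), while the 32-bit
version can be a single locked instruction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the NetBSD pmap API. */
typedef _Atomic uint64_t pae_pte_t;    /* PAE entries are 64 bits wide */
typedef _Atomic uint32_t legacy_pte_t; /* non-PAE entries are 32 bits wide */

static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t),
    "a PAE entry is twice the size of a non-PAE one");

/* Atomically OR bits into a 64-bit entry; returns the previous value. */
static uint64_t
pae_pte_setbits(pae_pte_t *pte, uint64_t bits)
{
	uint64_t old = atomic_load(pte);

	/* On failure, old is refreshed with the current value; retry. */
	while (!atomic_compare_exchange_weak(pte, &old, old | bits))
		continue;
	return old;
}

/* Same operation on a 32-bit entry; returns the previous value. */
static uint32_t
legacy_pte_setbits(legacy_pte_t *pte, uint32_t bits)
{
	return atomic_fetch_or(pte, bits);
}
```

Whether this accounts for a measurable share of the slowdown is exactly
what would need profiling; the sketch only shows why the two cases are
not the same amount of work.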

While fork/pthread_create intensive benchmarks are likely to be affected,
I am a bit more concerned by the memory bandwidth one from sysbench. I
think I'll have to dig further into it.

I just noticed that the last L3 entry for the kernel is not marked
global, and is modified upon each context switch. I'll change that in my
patch.
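
As a rough model of that change (a user-space simulation with
hypothetical names, not the patch itself), the difference is between
rewriting all four L3 slots on every switch and rewriting only the user
slots. The sketch assumes the kernel occupies the last slot and that its
entry is identical in every pmap; it does not model the tlbflush.

```c
#include <assert.h>
#include <stdint.h>

#define L3_SLOTS       4  /* a PAE L3 table holds four 64-bit entries */
#define L3_KERNEL_SLOT 3  /* assumption: the kernel uses the last slot */

/* Hypothetical stand-in for the L3 table that %cr3 points to. */
struct l3_table {
	uint64_t slot[L3_SLOTS];
};

/* Rewrite every slot on a switch. Returns the number of stores. */
static int
l3_switch_all(struct l3_table *l3, const uint64_t next[L3_SLOTS])
{
	int writes = 0;

	for (int i = 0; i < L3_SLOTS; i++) {
		l3->slot[i] = next[i];
		writes++;
	}
	return writes;
}

/* Rewrite only the user slots; the kernel slot is left untouched. */
static int
l3_switch_user(struct l3_table *l3, const uint64_t next[L3_SLOTS])
{
	int writes = 0;

	/* Assumed invariant: the kernel entry is the same in every pmap. */
	assert(next[L3_KERNEL_SLOT] == l3->slot[L3_KERNEL_SLOT]);

	for (int i = 0; i < L3_SLOTS; i++) {
		if (i == L3_KERNEL_SLOT)
			continue;
		l3->slot[i] = next[i];
		writes++;
	}
	return writes;
}
```

In this model the second variant does three stores per switch instead of
four and never touches the kernel entry.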

> * How jittery were the results?  I'd be careful to not dismiss performance
>   degradation of several % as "noise" without just cause.  Those percent
>   might be important to someone!

Point taken :) Initially, I wanted to draw the standard deviations on
the diagrams, but they were so small that I arbitrarily decided they did
not matter. I have more detailed results if you want; even for 10+ runs,
the results are within +-3% of the mean value.
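
For reference, a spread figure of this kind can be computed as the
largest relative deviation of any run from the mean. The helper names
below are made up for illustration and this is only a sketch, not the
script used for the benchmarks.

```c
#include <assert.h>
#include <stddef.h>

/* Arithmetic mean of n run times. */
static double
runs_mean(const double *runs, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += runs[i];
	return sum / (double)n;
}

/* Largest |run - mean| / mean over all runs (0.03 means +-3%). */
static double
runs_max_rel_dev(const double *runs, size_t n)
{
	double mean = runs_mean(runs, n);
	double worst = 0.0;

	for (size_t i = 0; i < n; i++) {
		double dev = runs[i] - mean;

		if (dev < 0.0)
			dev = -dev;
		if (dev / mean > worst)
			worst = dev / mean;
	}
	return worst;
}
```

With made-up timings of 97, 100 and 103 seconds, the mean is 100 and the
largest relative deviation is about 0.03, i.e. +-3%.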

> * Did you do any hot cache build.sh testing?  IMHO that is at least as
>   interesting, if not more interesting, than the coldstart one.

Nope, but I can do that; suggestions?

-- 
Jean-Yves Migeon
jeanyves.migeon%free.fr@localhost



