Source-Changes archive


CVS commit: src/sys/arch



Module Name:    src
Committed By:   maxv
Date:           Sat Feb 11 14:11:25 UTC 2017

Modified Files:
        src/sys/arch/x86/include: cpu.h pmap.h
        src/sys/arch/x86/x86: cpu.c pmap.c
        src/sys/arch/xen/x86: cpu.c

Log Message:
Instead of using a global array with per-cpu indexes, embed the tmp VAs
into cpu_info directly. This concerns only {i386, Xen-i386, Xen-amd64},
because amd64 already uses its direct map, which is much faster.

There are two major issues with the global array: it allocates maxcpus
entries even though common i386 machines are unlikely to have that many
cpus, and the base VA of these entries is not cache-line-aligned, which
all but guarantees cache-line thrashing each time the VAs are entered.
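
The alignment problem can be illustrated with a little offset arithmetic.
This is only a sketch under stated assumptions, not the kernel code: the
64-byte line size, the 8-byte entry size, the one-line-per-cpu stride and
all function names are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE       64u  /* assumed cache-line size in bytes */
#define ENTRY_SIZE        8u  /* assumed size of one entry in bytes */
#define ENTRIES_PER_CPU   8u  /* assumed: one cache line's worth per cpu */

/* Byte offset of the first entry of `cpu` in an array starting at `base`. */
static size_t
cpu_first_byte(size_t base, unsigned cpu)
{
	return base + (size_t)cpu * ENTRIES_PER_CPU * ENTRY_SIZE;
}

/* Number of cache lines occupied by the entries of `cpu`. */
static size_t
cpu_lines_touched(size_t base, unsigned cpu)
{
	size_t first = cpu_first_byte(base, cpu);
	size_t last = first + ENTRIES_PER_CPU * ENTRY_SIZE - 1;

	return last / CACHE_LINE - first / CACHE_LINE + 1;
}

/* Return 1 if cpus `a` and `b` have entries in a common cache line. */
static int
cpus_share_line(size_t base, unsigned a, unsigned b)
{
	assert(a != b);

	size_t a_first = cpu_first_byte(base, a) / CACHE_LINE;
	size_t a_last = (cpu_first_byte(base, a + 1) - 1) / CACHE_LINE;
	size_t b_first = cpu_first_byte(base, b) / CACHE_LINE;
	size_t b_last = (cpu_first_byte(base, b + 1) - 1) / CACHE_LINE;

	/* Two closed ranges of line numbers overlap. */
	return a_first <= b_last && b_first <= a_last;
}
```

With an aligned base (offset 0) each cpu's entries sit in exactly one line
and neighbouring cpus never share one. With a misaligned base (for example
offset 24) each cpu's entries straddle two lines, and cpus 0 and 1 both
touch the line in between, so writes from one cpu can invalidate the line
in the other's cache.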

Now the number of tmp VAs allocated is proportional to the number of CPUs
attached (which reduces memory consumption), and the base is properly
aligned.
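
As a rough illustration of the new layout, the sketch below embeds an
aligned group of slots in a per-cpu structure that is allocated only when
a cpu attaches. It is not the actual cpu_info definition: the structure,
its field names, the slot count, the 64-byte line size and the use of
userland C11 aligned_alloc() are all assumptions made for this example.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64u  /* assumed cache-line size in bytes */
#define NTMPVA      4u  /* assumed number of tmp VA slots per cpu */

/* Hypothetical stand-in for a per-cpu structure. */
struct cpu_sketch {
	unsigned id;
	/* The slots start on their own cache line. */
	alignas(CACHE_LINE) uintptr_t tmp_va[NTMPVA];
};

/* Checked at compile time: the embedded slots are line-aligned. */
static_assert(offsetof(struct cpu_sketch, tmp_va) % CACHE_LINE == 0,
    "tmp_va must start on a cache-line boundary");

/*
 * Allocate the per-cpu structure when a cpu attaches, so memory use
 * follows the number of attached cpus rather than a fixed maximum.
 * Returns NULL on allocation failure; release with free().
 */
static struct cpu_sketch *
cpu_sketch_attach(unsigned id)
{
	struct cpu_sketch *ci;

	ci = aligned_alloc(alignof(struct cpu_sketch), sizeof(*ci));
	if (ci == NULL)
		return NULL;

	ci->id = id;
	for (size_t i = 0; i < NTMPVA; i++)
		ci->tmp_va[i] = 0;  /* placeholder; no real VA is reserved here */
	return ci;
}
```

Because the alignment is a property of the structure member, every
attached cpu gets slots that start on a line boundary, and no slot is
allocated for cpus that never attach.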

On my 3-core AMD, the number of DC_refills_L2 events triggered when
performing 5x10^6 calls to pmap_zero_page on two dedicated cores is on
average divided by two with this patch.

Briefly discussed on tech-kern.


To generate a diff of this commit:
cvs rdiff -u -r1.67 -r1.68 src/sys/arch/x86/include/cpu.h
cvs rdiff -u -r1.61 -r1.62 src/sys/arch/x86/include/pmap.h
cvs rdiff -u -r1.122 -r1.123 src/sys/arch/x86/x86/cpu.c
cvs rdiff -u -r1.239 -r1.240 src/sys/arch/x86/x86/pmap.c
cvs rdiff -u -r1.108 -r1.109 src/sys/arch/xen/x86/cpu.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



