Source-Changes archive


CVS commit: src/sys/arch/x86



Module Name:    src
Committed By:   maxv
Date:           Wed Feb 13 08:38:25 UTC 2019

Modified Files:
        src/sys/arch/x86/include: pmap.h
        src/sys/arch/x86/x86: pmap.c

Log Message:
Add the EPT pmap code, used by Intel-VMX.

The idea is that under NVMM, we don't want to implement the hypervisor page
tables manually in NVMM directly, because we want pageable guests; that is,
we want to allow UVM to unmap guest pages when the host comes under
pressure.

Contrary to AMD-SVM, Intel-VMX uses a different set of PTE bits from
native, and this has three important consequences:

 - We can't use the native PTE bits, so each time we want to modify the
   page tables, we need to know whether we're dealing with a native pmap
   or an EPT pmap. This is accomplished with callbacks that handle
   everything PTE-related.

 - There is no recursive slot possible, so we can't use pmap_map_ptes().
   Rather, we walk down the EPT trees via the direct map, and that's
   actually a lot simpler (and probably faster too...).

 - The kernel is never mapped in an EPT pmap. An EPT pmap cannot be loaded
   on the host. This has two sub-consequences: at creation time we must
   zero out all of the top-level PTEs, and at destruction time we force
   the page out of the pool cache and into the pool, to ensure that the next
   allocation will invoke pmap_pdp_ctor() to create a native pmap and not
   recycle some stale EPT entries.
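
As an illustration of the first point, here is a minimal userland sketch of
per-pmap PTE callbacks. It is not the code from pmap.c: every toy_ name is
hypothetical, and only the bit positions are taken from the architecture
(native P/RW/US in bits 0-2, EPT R/W/X in bits 0-2). It shows why the same
raw value means different things depending on the pmap type.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_pte_t;

/* Native x86-64 PTE bits (architectural positions). */
#define TOY_NAT_P	(1ULL << 0)	/* present */
#define TOY_NAT_RW	(1ULL << 1)	/* writable */
#define TOY_NAT_US	(1ULL << 2)	/* user-accessible */

/* EPT PTE bits (architectural positions). */
#define TOY_EPT_R	(1ULL << 0)	/* readable */
#define TOY_EPT_W	(1ULL << 1)	/* writable */
#define TOY_EPT_X	(1ULL << 2)	/* executable */

#define TOY_PTE_FRAME	0x000ffffffffff000ULL

/* Hypothetical pmap: the callbacks are NULL for a native pmap. */
struct toy_pmap {
	toy_pte_t (*pte_make)(uint64_t pa, bool writable);
	bool (*pte_valid)(toy_pte_t pte);
};

/* EPT flavor. A real leaf entry also carries a memory type (omitted). */
toy_pte_t
toy_ept_pte_make(uint64_t pa, bool writable)
{
	toy_pte_t pte = (pa & TOY_PTE_FRAME) | TOY_EPT_R | TOY_EPT_X;

	if (writable)
		pte |= TOY_EPT_W;
	return pte;
}

/* An EPT entry is present if any of R/W/X is set. */
bool
toy_ept_pte_valid(toy_pte_t pte)
{
	return (pte & (TOY_EPT_R | TOY_EPT_W | TOY_EPT_X)) != 0;
}

/* Generic entry points: use the callback if set, else the native bits. */
toy_pte_t
toy_pte_make(const struct toy_pmap *pm, uint64_t pa, bool writable)
{
	toy_pte_t pte;

	if (pm->pte_make != NULL)
		return pm->pte_make(pa, writable);
	pte = (pa & TOY_PTE_FRAME) | TOY_NAT_P;
	if (writable)
		pte |= TOY_NAT_RW;
	return pte;
}

bool
toy_pte_valid(const struct toy_pmap *pm, toy_pte_t pte)
{
	if (pm->pte_valid != NULL)
		return pm->pte_valid(pte);
	return (pte & TOY_NAT_P) != 0;
}
```

With this sketch, a value with only bit 2 set is a present (execute-only)
entry under the EPT callbacks, but a non-present entry under the native
rules, where bit 2 is the user bit.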

To create an EPT pmap, the caller must invoke pmap_ept_transform() on a
newly-allocated native pmap. And that's about it: from then on the EPT
callbacks will be invoked, and the pmap can be destroyed via the usual
pmap_destroy(). The TLB shootdown callback is not initialized, however;
it is the responsibility of the hypervisor (NVMM) to set it.
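
The lifecycle described above can be sketched in userland as follows. This
is a toy model under stated assumptions, not the kernel code: the struct,
the toy_ functions and the "return false when no callback is set" behavior
are all made up for illustration. It only mirrors the three steps from the
text: start from a native pmap, zero the top level on transform, and leave
the TLB callback for the hypervisor to fill in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NTOPLEVEL	512

/* Hypothetical pmap holding only what the sketch needs. */
struct toy_guest_pmap {
	uint64_t top[TOY_NTOPLEVEL];	/* top-level page of the tree */
	bool is_ept;
	void (*tlb_flush)(struct toy_guest_pmap *); /* set by the hypervisor */
	unsigned nflushes;		/* demonstration counter */
};

/* Stand-in for native creation: fake kernel entries in the upper half. */
void
toy_pmap_create_native(struct toy_guest_pmap *pm)
{
	memset(pm->top, 0, sizeof(pm->top));
	for (size_t i = TOY_NTOPLEVEL / 2; i < TOY_NTOPLEVEL; i++)
		pm->top[i] = (0x1000ULL * (i + 1)) | 1;
	pm->is_ept = false;
	pm->tlb_flush = NULL;
	pm->nflushes = 0;
}

/* Turn a fresh native pmap into an EPT one: no kernel mappings remain. */
void
toy_pmap_ept_transform(struct toy_guest_pmap *pm)
{
	memset(pm->top, 0, sizeof(pm->top));
	pm->is_ept = true;
	pm->tlb_flush = NULL;	/* deliberately left to the hypervisor */
}

/* What a hypervisor might install; here it only counts invocations. */
void
toy_hv_tlb_flush(struct toy_guest_pmap *pm)
{
	pm->nflushes++;
}

/* Returns false if an EPT pmap has no flush callback (sketch choice). */
bool
toy_pmap_shootdown(struct toy_guest_pmap *pm)
{
	if (!pm->is_ept)
		return true;	/* native shootdown path elided */
	if (pm->tlb_flush == NULL)
		return false;
	pm->tlb_flush(pm);
	return true;
}
```

A caller in this model would run toy_pmap_create_native(), then
toy_pmap_ept_transform(), then assign tlb_flush before the first shootdown.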

There are some twisted cases that we need to handle. For example, if
pmap_is_referenced() is called on a physical page that is entered both by
a native pmap and by an EPT pmap, we read the Accessed bit from each of the
two pmaps using that pmap's own PTE bit set, and combine the results into a
generic PP_ATTRS_U flag (that does not depend on the pmap type).
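
A rough sketch of that combination, assuming a simplified list of mappings
per physical page. The bit positions are architectural (native Accessed and
Dirty in bits 5 and 6, EPT Accessed and Dirty in bits 8 and 9 when the EPT
A/D feature is enabled); the toy_ names and the flag values are
hypothetical and do not come from pmap.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Native x86-64 PTE bits. */
#define TOY_NAT_A	(1ULL << 5)	/* accessed */
#define TOY_NAT_D	(1ULL << 6)	/* dirty */
#define TOY_NAT_G	(1ULL << 8)	/* global: same position as EPT A */

/* EPT PTE bits (meaningful only with EPT A/D enabled). */
#define TOY_EPT_A	(1ULL << 8)	/* accessed */
#define TOY_EPT_D	(1ULL << 9)	/* dirty */

/* Pmap-independent attributes; the values are made up for the sketch. */
#define TOY_PP_ATTRS_U	0x1u		/* referenced */
#define TOY_PP_ATTRS_M	0x2u		/* modified */

/* One mapping of a physical page, tagged with its pmap type. */
struct toy_mapping {
	bool is_ept;
	uint64_t pte;
};

/* Translate one PTE into generic attributes, using the right bit set. */
unsigned
toy_pte_to_pp_attrs(const struct toy_mapping *m)
{
	unsigned attrs = 0;

	if (m->is_ept) {
		if (m->pte & TOY_EPT_A)
			attrs |= TOY_PP_ATTRS_U;
		if (m->pte & TOY_EPT_D)
			attrs |= TOY_PP_ATTRS_M;
	} else {
		if (m->pte & TOY_NAT_A)
			attrs |= TOY_PP_ATTRS_U;
		if (m->pte & TOY_NAT_D)
			attrs |= TOY_PP_ATTRS_M;
	}
	return attrs;
}

/* OR together the generic attributes of every mapping of the page. */
unsigned
toy_page_attrs(const struct toy_mapping *maps, size_t n)
{
	unsigned attrs = 0;

	for (size_t i = 0; i < n; i++)
		attrs |= toy_pte_to_pp_attrs(&maps[i]);
	return attrs;
}

bool
toy_page_is_referenced(const struct toy_mapping *maps, size_t n)
{
	return (toy_page_attrs(maps, n) & TOY_PP_ATTRS_U) != 0;
}
```

Note that bit 8 is the Global bit in a native PTE, so testing the EPT
Accessed bit on a native entry would give a wrong answer; translating per
pmap type before combining avoids that.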

Given that the EPT layout is a 4-level tree with the same address space as
native x86_64, we allow ourselves to use a few native macros in EPT, such
as pmap_pa2pte(), rather than re-defining them with "ept" in the name.
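
Because the layout matches native x86_64 (four levels, 9 index bits per
level, 4KB pages), a walk through the direct map reduces to a short loop.
The following is a userland sketch only: the "direct map" is simulated with
a small array of pages, and all toy_ names are hypothetical rather than
taken from pmap.c.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE	4096
#define TOY_NPAGES	8
#define TOY_NPTES	512
#define TOY_PTE_FRAME	0x000ffffffffff000ULL

#define TOY_EPT_R	(1ULL << 0)
#define TOY_EPT_W	(1ULL << 1)
#define TOY_EPT_X	(1ULL << 2)
#define TOY_EPT_RWX	(TOY_EPT_R | TOY_EPT_W | TOY_EPT_X)

/* Simulated physical memory: page N lives at physical address N * 4096. */
static uint64_t toy_physmem[TOY_NPAGES][TOY_NPTES];

/* Simulated direct map: turn a page-aligned PA into a usable pointer. */
uint64_t *
toy_direct_map(uint64_t pa)
{
	return toy_physmem[pa / TOY_PAGE_SIZE];
}

/* Index into the table at the given level (4 = top, 1 = leaf). */
unsigned
toy_level_index(uint64_t gpa, int level)
{
	return (unsigned)((gpa >> (12 + 9 * (level - 1))) & 0x1ff);
}

/* Helper to populate the toy tree. */
void
toy_set_pte(uint64_t table_pa, unsigned idx, uint64_t pte)
{
	toy_direct_map(table_pa)[idx] = pte;
}

/*
 * Walk from the root down to the leaf table. Returns a pointer to the
 * leaf entry for gpa, or NULL if an intermediate entry is not present.
 * Large pages are ignored in this sketch.
 */
uint64_t *
toy_ept_walk(uint64_t root_pa, uint64_t gpa)
{
	uint64_t pa = root_pa;

	for (int level = 4; level > 1; level--) {
		uint64_t pte = toy_direct_map(pa)[toy_level_index(gpa, level)];

		if ((pte & TOY_EPT_RWX) == 0)
			return NULL;
		pa = pte & TOY_PTE_FRAME;
	}
	return &toy_direct_map(pa)[toy_level_index(gpa, 1)];
}
```

No temporary mapping of the tree is needed at any step: each table is
reached directly from the physical address stored in its parent entry.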

Even though this EPT code is rather complex, it is not too intrusive: just
a few callbacks in a few pmap functions, predicted-false to give priority
to native. So this comes with no messy #ifdef or performance cost.


To generate a diff of this commit:
cvs rdiff -u -r1.96 -r1.97 src/sys/arch/x86/include/pmap.h
cvs rdiff -u -r1.321 -r1.322 src/sys/arch/x86/x86/pmap.c

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.



