Source-Changes-D archive


Re: CVS commit: src/sys/arch/x86



On 13/02/2019 at 10:08, Cherry G.Mathew wrote:
(resent to source-changes-d@)

"Maxime Villard" <maxv%netbsd.org@localhost> writes:
  - There is no recursive slot possible, so we can't use pmap_map_ptes().
    Rather, we walk down the EPT trees via the direct map, and that's
    actually a lot simpler (and probably faster too...).

Does this mean that nvmm hosts have to have __HAVE_DIRECT_MAP?

Yes, and in practice they all do by default (GENERIC), so it's not a
problem.

It becomes a problem on certain special configurations, such as KASAN, which
disables the direct map. In that case the EPT code is not compiled. So if you
use KASAN but don't use NVMM+Intel, there is no problem.
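
To illustrate what "walking down the EPT trees via the direct map" looks
like, here is a minimal sketch. This is not the actual NetBSD code: dmap_va()
and the constants are hypothetical stand-ins for whatever translates a
physical address into its direct-map virtual address.

#include <stdint.h>

typedef uint64_t pt_entry_t;
typedef uint64_t paddr_t;

/* Hypothetical helper: direct-map VA for a physical address. */
extern void *dmap_va(paddr_t pa);

#define EPT_FRAME	0x000ffffffffff000ULL	/* PA bits of an entry */
#define EPT_R		0x1ULL			/* read permission */

/*
 * Return a pointer to the L1 EPT entry mapping a guest-physical address,
 * or NULL if an intermediate level is not present (simplified check).
 * With a recursive slot this would need pmap_map_ptes()-style temporary
 * mappings; with a direct map it is a plain pointer walk.
 */
static pt_entry_t *
ept_lookup(paddr_t eptroot, uint64_t gpa)
{
	pt_entry_t *tree = dmap_va(eptroot);
	int level;

	for (level = 3; level >= 1; level--) {
		pt_entry_t pte = tree[(gpa >> (12 + 9 * level)) & 511];
		if ((pte & EPT_R) == 0)
			return NULL;
		tree = dmap_va(pte & EPT_FRAME);
	}
	return &tree[(gpa >> 12) & 511];
}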

In fact, maybe I should add direct map support to KASAN. Initially I didn't
do it because I wanted to force all kernel allocations through pmap_kenter_pa,
to have 100% KASAN coverage of the KVA. But if there were two separate
flags, such as

	__HAVE_DIRECT_MAP = whether the kernel has a direct map
	__USE_DIRECT_MAP  = whether the allocators can use the direct map

then under KASAN we could have __HAVE_DIRECT_MAP=1 and __USE_DIRECT_MAP=0,
meaning that EPT can use the direct map internally but nothing in UVM/etc.
can use it for allocations. This would maintain KASAN coverage and would
enable KASAN+EPT support.
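
To make the proposal concrete, a rough sketch of how the two flags could be
used. The flag names come from above; the functions and the #ifdef placement
are made up for illustration.

#define __HAVE_DIRECT_MAP	1	/* the kernel has a direct map */
/* __USE_DIRECT_MAP left undefined in a KASAN kernel */

extern void *allocate_via_direct_map(void);	/* hypothetical */
extern void *allocate_via_kenter(void);		/* hypothetical */

void *
alloc_kernel_page(void)
{
#ifdef __USE_DIRECT_MAP
	/* Hand out a direct-map address; bypasses pmap_kenter_pa(). */
	return allocate_via_direct_map();
#else
	/*
	 * Go through pmap_kenter_pa()-style mappings so KASAN keeps
	 * 100% coverage of the KVA.
	 */
	return allocate_via_kenter();
#endif
}

#ifdef __HAVE_DIRECT_MAP
/*
 * The EPT code only needs the direct map to exist, so it can be compiled
 * even when allocators are not allowed to use the direct map.
 */
void ept_walk(void);
#endif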

  - The kernel is never mapped in an EPT pmap. An EPT pmap cannot be loaded
    on the host. This has two sub-consequences: at creation time we must
    zero out all of the top-level PTEs, and at destruction time we force
    the page out of the pool cache and into the pool, to ensure that a next
    allocation will invoke pmap_pdp_ctor() to create a native pmap and not
    recycle some stale EPT entries.

Can you not use a separate poolcache? This could isolate host/guest-related
memory pressure as well?

The poolcache I was talking about is the one that stores the top-level page
of the page tables (PML4). It is only a 4KB page, and there is only one such
page per guest. Therefore there is no need to separate further; one page per
guest is pretty insignificant.

However, it is true that we could separate the caches of the non-top-level
page tables (L3, L2 and L1). But I think the benefit is not huge: under
pressure UVM will unmap guest pages, and when that happens we also free the
emptied L3/L2/L1 pages, so the pressure on the shared poolcaches is
"naturally" reduced.
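
In sketch form, the destruction-time behaviour quoted from the commit message
(forcing the top-level page out of the cache and into the pool) relies on the
standard pool_cache(9) calls; the cache name and the is_ept flag below are
illustrative stand-ins, not the actual code.

#include <sys/types.h>
#include <sys/pool.h>

extern pool_cache_t pmap_pdp_cache;	/* cache of top-level (PML4) pages */

void
pmap_pdp_release(void *pdp, bool is_ept)
{
	if (is_ept) {
		/*
		 * Force the page through the destructor and back into the
		 * backing pool, so that the next pool_cache_get() runs
		 * pmap_pdp_ctor() again instead of handing back a
		 * constructed page that still holds EPT-format entries.
		 */
		pool_cache_destruct_object(pmap_pdp_cache, pdp);
	} else {
		/* Native pmaps can recycle the constructed page as-is. */
		pool_cache_put(pmap_pdp_cache, pdp);
	}
}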

