Subject: bad effect of keeping page table mapped in user space ?
To: None <tech-kern@netbsd.org>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 11/27/2007 00:35:46
Hi,
can anyone see a bad side effect or security issue of having a process's
page table mapped in the process's VM space?

Because of the way Xen works on amd64, the kernel runs in ring 3 (same as
user processes); so we can't rely on PG_u to hide page table entries from
userspace processes (kernel memory is also mapped PG_u). It's not a problem
because the hypervisor switches address spaces on kernel entry/exit, and
%cr3 never points to a user page table when in kernel, and vice-versa.
This also means that pmap_map_ptes() will always use APTE when not called
on pmap_kernel(). So to map a user page table in kernel space, we have to
have APTE point to the user L4 page, and have the PTE entry filled in this
L4 page.
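
To make this concrete, here is a rough sketch of what pmap_map_ptes() has to
arrange for a non-kernel pmap under Xen. It is only an illustration:
xen_set_l4_entry() is a hypothetical stand-in for a queued mmu-update
hypercall, and the other names (PDIR_SLOT_PTE, PDIR_SLOT_APTE, APTE_BASE,
pm_pdir, pm_pdirpa) follow the x86 pmap's recursive-mapping conventions but
shouldn't be read as the actual code:

/*
 * Sketch only.  xen_set_l4_entry() is hypothetical; real code would go
 * through the Xen mmu-update machinery.
 */
static pt_entry_t *
sketch_map_user_ptes(struct pmap *pmap)
{
	pd_entry_t *kpd = pmap_kernel()->pm_pdir; /* kernel L4 (mapped read-only) */
	pd_entry_t *upd = pmap->pm_pdir;          /* user L4 (mapped read-only)   */
	paddr_t upd_pa  = pmap->pm_pdirpa;        /* physical address of user L4  */

	/*
	 * 1. Point the kernel L4's alternate recursive slot (APTE) at the
	 *    user L4.  Active page tables are read-only under Xen, so the
	 *    write has to go through a hypercall.
	 */
	xen_set_l4_entry(&kpd[PDIR_SLOT_APTE], upd_pa | PG_V);

	/*
	 * 2. Fill the recursive PTE slot of the user L4 itself, so the walk
	 *    through the APTE window reaches every level of the user page
	 *    table.  This is the "L4 PTE entry" discussed below.
	 */
	xen_set_l4_entry(&upd[PDIR_SLOT_PTE], upd_pa | PG_V);

	/* The user pmap's PTEs are now visible through APTE_BASE. */
	return (pt_entry_t *)APTE_BASE;
}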

The current code (for Xen) fills in the L4 PTE entry on pmap_map_ptes()
and clears it on pmap_unmap_ptes() through hypercalls. This makes pmap_extract()
really, really slow (about 10x slower). Hence my question about keeping the
L4 PTE entry valid on return to userspace, which makes it possible for a user
process to read its PTE entries (not write them: an active page table is
always mapped read-only). I don't see a problem with it, but I may be missing
something.
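
As a sketch of what the change amounts to (same hypothetical
xen_set_l4_entry() helper as above, not the real pmap.c code), the difference
is only whether pmap_unmap_ptes() tears the recursive L4 entry back down:

/* Current behaviour: a hypercall on every unmap (and another on the
 * next map), i.e. around every pmap_extract(). */
static void
sketch_unmap_user_ptes_current(struct pmap *pmap)
{
	pd_entry_t *upd = pmap->pm_pdir;

	xen_set_l4_entry(&upd[PDIR_SLOT_PTE], 0);
}

/* Proposed behaviour: leave the recursive slot valid. */
static void
sketch_unmap_user_ptes_proposed(struct pmap *pmap)
{
	/*
	 * Nothing to do: the entry stays valid across the return to
	 * userspace, so the process can read (but not write) its own
	 * PTEs through the recursive window, since an active page
	 * table is always mapped read-only.
	 */
}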

Here are the results of a 'make' in pkgsrc/pkgtools/digest:
first for a stock XEN3_DOMU kernel:
       74.26 real         3.20 user        58.15 sys
       72.64 real         3.42 user        57.87 sys
       72.73 real         3.66 user        57.72 sys
       72.50 real         3.54 user        57.73 sys
       72.63 real         3.36 user        57.98 sys

and for a XEN3_DOMU kernel which doesn't set/clear the L4 PTE entry on each
pmap_extract():
       13.43 real         5.42 user         5.41 sys
       11.92 real         5.38 user         5.47 sys
       11.87 real         5.52 user         5.33 sys
       11.87 real         5.48 user         5.36 sys
       11.88 real         5.34 user         5.51 sys

The performance gain is worth it :)

For the record, the same test with a native NetBSD/amd64 kernel:
        5.56 real         3.60 user         1.76 sys
        5.56 real         3.60 user         1.75 sys
        5.55 real         3.55 user         1.79 sys
        5.57 real         3.59 user         1.76 sys
        5.56 real         3.52 user         1.84 sys

There's still room for improvement :)

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 years of experience will always make the difference