tech-kern archive


Re: uvm object page remove question



On Sat, May 16, 2020 at 05:24:32AM -0700, Chuck Silvers wrote:
> On Wed, May 13, 2020 at 08:20:15PM +0200, Manuel Bouyer wrote:
> > Hello,
> > for Xen I need some non-standard VM operation: the tools want to map
> > some Xen objects for which we don't have a physical address.
> > The map/unmap operations are done with hypercalls which do the
> > page table update. In my implementation the tools ask the kernel to
> > do this via an ioctl on /kern/xen/privcmds
> > 
> > When a tool wants to map one of these Xen objects, it first does a
> > mmap() to get some virtual space, and then an ioctl to map it.
> > The ioctl allocates a uvm_object and maps it with uvm_map() for the range.
> > The pgo_fault() handler will map the VA to the Xen object using the
> > dedicated hypercall; this works fine.
> > 
> > But I have a problem for unmap: when uvm wants to remove the mapping
> > (either because the process called munmap() or because it exited),
> > pmap_remove() will try to clear the page table entries for this
> > special mapping, and Xen will kill the VM. This has to be done with
> > the right hypercall.
> > 
> > I see 2 ways to fix this: add a pgo_remove() and have UVM call it,
> > or intercept the pmap_remove() calls at the pmap level, and check
> > the VA against our uvm_objects.
> > 
> > Is there something I missed to get a custom page remove for uvm objects?
> > What would be the best way to handle this?
> 
> I don't think you're missing anything...
> currently the kernel is always able to clear a PTE by simply writing a zero.
> 
> There are more cases to consider, eg. pmap_protect() on this magic mapping
> would probably kill the VM too, since that will also try to modify the PTE.

Not sure; maybe changing the protection bits would be allowed.

We're already more or less in this world with pmap_protect_ma(), which takes
the remote domain as an (optional) argument, just for this.

> 
> Are all of these mappings single pages?

Some are single pages, but some also contain several pages (I guess e.g.
for the framebuffer, or for loading the initial code).

> If not, does xen allow removing
> one page of a magic mapping and leaving the rest in place?

Yes

> 
> Why does xen feel it necessary to kill the VM in these cases
> where the guest is reducing its access to these magic pages?
> This would be a lot easier to deal with if xen just let the guest
> operate on these PTEs the same way it operates on other PTEs.

Sure. But I think Xen wants to track the usage, and we may
have to take some extra action on unmap too (like a notification sent to
the remote domain).

Also, the exact way to do the mapping depends on the type of the remote
domain (PV vs HVM).

> 
> What are these magic mappings for, exactly?

Various memory mapped structures (like the xenstore), emulated devices
(e.g. framebuffer).


> 
> Would it be practical to just map the magic mappings in the kernel
> and then have the tools use ioctls to access the magic mappings?
> That would avoid all of these problems.

This would require very deep modifications to the tools.


For now I implemented this by re-using the hooks in x86/pmap.c for NVMM
and EPT tables (pm_remove and pm_data), and it's enough for my needs.
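
To make the hook approach concrete, here is a rough userspace toy model,
not the actual x86/pmap.c code. Only the field names pm_remove and pm_data
come from the text above; the hook signature, the toy_* names, the range
structure and the way the hook splits work between "hypercall" and "direct
clear" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

typedef uintptr_t toy_vaddr_t;

/* Toy stand-in for a pmap with an optional remove hook and private data. */
struct toy_pmap {
	void (*pm_remove)(struct toy_pmap *, toy_vaddr_t, toy_vaddr_t);
	void *pm_data;
	int direct_clears;	/* pages whose PTE was "zeroed" directly */
};

/* Hypothetical private data: one VA range backed by a Xen object. */
struct toy_privcmd_range {
	toy_vaddr_t start;
	toy_vaddr_t end;	/* exclusive */
	int hypercall_unmaps;	/* pages removed via the (simulated) hypercall */
};

/* Default path: clear each PTE directly (modelled as a counter). */
static void
toy_clear_direct(struct toy_pmap *pm, toy_vaddr_t sva, toy_vaddr_t eva)
{
	for (toy_vaddr_t va = sva; va < eva; va += TOY_PAGE_SIZE)
		pm->direct_clears++;
}

/* Hook: pages inside the special range go through the "hypercall". */
static void
toy_privcmd_remove(struct toy_pmap *pm, toy_vaddr_t sva, toy_vaddr_t eva)
{
	struct toy_privcmd_range *r = pm->pm_data;

	for (toy_vaddr_t va = sva; va < eva; va += TOY_PAGE_SIZE) {
		if (va >= r->start && va < r->end)
			r->hypercall_unmaps++;	/* stands in for the unmap hypercall */
		else
			toy_clear_direct(pm, va, va + TOY_PAGE_SIZE);
	}
}

/* Entry point: use the hook when one is installed, else clear directly. */
static void
toy_pmap_remove(struct toy_pmap *pm, toy_vaddr_t sva, toy_vaddr_t eva)
{
	if (pm->pm_remove != NULL)
		pm->pm_remove(pm, sva, eva);
	else
		toy_clear_direct(pm, sva, eva);
}
```

The point of the indirection is that callers of the remove path do not need
to know about the special range; only the pmap that carries the hook does.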

I have another question: when we implement a uvm_object, is it possible to
map the whole range at uvm_map() time?
If you look at the IOCTL_PRIVCMD_MMAPBATCH code in sys/arch/xen/xen/privcmd.c,
we validate the machine addresses using a temporary mapping in kernel space.
The mapping in userland itself is done when a fault occurs.
It would be easier to map the whole range at ioctl() time, so that a
fault never occurs.
I guess I would need to allocate virtual pages and register them in the map?
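
To illustrate the difference I mean, here is a small userspace toy model
(all toy_* names are hypothetical; this says nothing about what UVM actually
allows). It contrasts mapping pages lazily from a fault handler with mapping
the whole range up front, so that later accesses never fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 4

/* Toy object: tracks which pages are mapped and how they got mapped. */
struct toy_obj {
	bool mapped[TOY_NPAGES];
	int faults;	/* number of accesses that hit an unmapped page */
	int map_calls;	/* number of (simulated) mapping hypercalls */
};

/* Stands in for the per-page mapping hypercall. */
static void
toy_map_page(struct toy_obj *o, size_t idx)
{
	o->mapped[idx] = true;
	o->map_calls++;
}

/* Lazy path: an access to an unmapped page "faults" and maps that page. */
static void
toy_access(struct toy_obj *o, size_t idx)
{
	if (!o->mapped[idx]) {
		o->faults++;
		toy_map_page(o, idx);
	}
}

/* Eager path: map every page of the range at "ioctl time". */
static void
toy_premap_all(struct toy_obj *o)
{
	for (size_t i = 0; i < TOY_NPAGES; i++) {
		if (!o->mapped[i])
			toy_map_page(o, i);
	}
}
```

With the eager path the number of mapping calls is fixed by the size of the
range, while the lazy path only pays for pages that are actually touched.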

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference
--

