tech-kern archive


Re: Problems with UVM pagefaults



On Tue, Jan 08, 2013 at 08:16:30PM +0100, Roger Pau Monn wrote:
> So far I've been able to get the ptes, pass them to Xen and establish the
> mapping. Writing to that memory area from userspace seems to work fine
> (using pread), but the problem comes when the userspace program executes
> something like:
> 
> pwrite(fd, buf,...)
> 
> Where "buf" is a region of the memory mapped by the gnt device. This
> triggers a page fault in UVM, and this fault will try to modify the pte
> of the mapped memory region. This pte should not be modified, because if
> we modify the content of the pte, Xen will probably complain and crash,
> and if Xen doesn't crash we won't be able to unmap the pte later on,
> since the pte doesn't contain the value that Xen expects.
> 
> I've added a little hack to my gnt device, to be able to know who is
> trying to change the content of the pte, and I've got the following trace:
> 
> breakpoint() at netbsd:breakpoint+0x5
> vpanic() at netbsd:vpanic+0x1f2
> printf_nolog() at netbsd:printf_nolog
> xpq_flush_queue() at netbsd:xpq_flush_queue+0x180
> pmap_enter_ma() at netbsd:pmap_enter_ma+0x5c1
> pmap_enter() at netbsd:pmap_enter+0x35
> uvm_fault_upper_enter.clone.4() at netbsd:uvm_fault_upper_enter.clone.4+0x22a
> uvm_fault_internal() at netbsd:uvm_fault_internal+0x28f4
> uvm_fault_wire() at netbsd:uvm_fault_wire+0x53
> genfs_directio() at netbsd:genfs_directio+0x16a
> ffs_write() at netbsd:ffs_write+0x43a
> VOP_WRITE() at netbsd:VOP_WRITE+0x55
> vn_write() at netbsd:vn_write+0xf9
> do_filewritev() at netbsd:do_filewritev+0x1fd
> sys_pwritev() at netbsd:sys_pwritev+0x2b
> syscall() at netbsd:syscall+0x94
> --- syscall (number 290) ---
> 
> Is there any way to prevent UVM from faulting? The address on that VA is
> already set AFAIK, but I know almost nothing about how UVM works,
> so I would like to ask if someone could help me with that.
> 
> I'm attaching the code of the gntdev, the main function that contains
> interesting code is gntmap_grant_ref, that's where I try to get the ptes
> and set the mapping. This is not finished code, but I would like to
> understand why these page faults happen, and how I can solve this problem.

hi roger,

the problem is that UVM doesn't expect PTEs to be modified behind its back
like this.  operations that want memory to be wired (such as the O_DIRECT
I/O that you're doing above) will simulate page faults on the range
in order to update the page wired counters, and this ends up replacing
the pmap entries (ie. PTEs on x86) with what it thinks should be there,
which are the original MAP_ANON pages instead of the gntdev ones.
there's currently no way to prevent this from happening.

it would be better if we could arrange for the PTEs that are created by
the normal UVM page-fault logic to be the ones that are actually wanted.
one way of doing that would be to use a mapping of a character device
instead of a MAP_ANON mapping.  I see that you've got a new "gntdev" device
driver in your patch; you can just mmap that and have the d_mmap method
return the paddr_t values for the PTEs that you want.  the userspace side
of this would change a bit: instead of creating a memory mapping with mmap
and then replacing what that mapping points to with an ioctl, the new approach
would call the ioctl first to establish the device-offset-to-PTE mapping
inside the gntdev driver and return the new device offset to the user code,
then mmap the gntdev device with the offset returned from the ioctl.
will this approach work for the other components of this system?
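
to make the device-offset-to-PTE bookkeeping concrete, here is a rough
userland model of the table the driver would keep.  this is only a sketch,
not code from the patch: struct gnt_table, gnt_register, gnt_lookup and the
GNT_* constants are all made-up names, the addresses are plain integers
rather than real paddr_t values, and a real driver would need locking and
a way to release entries.  gnt_register stands in for the ioctl and
gnt_lookup for what the d_mmap method would do with the offset it is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GNT_PAGE_SHIFT 12                          /* assumed 4 KiB pages */
#define GNT_PAGE_SIZE  ((uint64_t)1 << GNT_PAGE_SHIFT)
#define GNT_MAX_PAGES  64                          /* arbitrary toy limit */

/* Hypothetical per-device table: slot i backs device offset i * page size. */
struct gnt_table {
    uint64_t addr[GNT_MAX_PAGES];  /* address to report for each page */
    size_t   used;                 /* number of slots handed out so far */
};

/*
 * ioctl side: remember npages addresses and return the device offset
 * that userspace should later pass to mmap, or -1 if the table is full.
 */
static int64_t
gnt_register(struct gnt_table *t, const uint64_t *addrs, size_t npages)
{
    assert(t != NULL && addrs != NULL);
    if (npages == 0 || npages > GNT_MAX_PAGES - t->used)
        return -1;
    int64_t off = (int64_t)(t->used << GNT_PAGE_SHIFT);
    for (size_t i = 0; i < npages; i++)
        t->addr[t->used + i] = addrs[i];
    t->used += npages;
    return off;
}

/*
 * d_mmap side: translate a page-aligned device offset back into the
 * recorded address, or -1 if nothing was registered at that offset.
 */
static int64_t
gnt_lookup(const struct gnt_table *t, int64_t off)
{
    assert(t != NULL);
    if (off < 0 || ((uint64_t)off & (GNT_PAGE_SIZE - 1)) != 0)
        return -1;
    size_t idx = (size_t)(off >> GNT_PAGE_SHIFT);
    if (idx >= t->used)
        return -1;
    return (int64_t)t->addr[idx];
}
```

with a table like this, every fault on the mapping (real or simulated)
resolves through the same lookup, so UVM always re-enters the address the
driver registered.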

if we can't (or don't want to) change the userspace part of this, then
another way to achieve this goal would be to have the ioctl that changes
the PTEs also change the UVM mapping to match, ie. to be a mapping of
the right offset of the gntdev device.  then any simulated page faults
on that mapping would do the right thing, ie. not change the PTEs
since the existing ones already have the values that UVM wants.
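
the difference between the two situations can be shown with a toy model.
this is not UVM or pmap code, just an illustration with invented names
(struct toy_page and the toy_* functions): one field stands for the PTE,
the other for what the VM layer believes should be mapped, and the
simulated wiring fault re-enters whatever the VM layer expects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a single page in an address space. */
struct toy_page {
    uint64_t pte;       /* what the page table currently holds */
    uint64_t expected;  /* what the VM layer thinks should be mapped */
};

/* Driver rewrites only the PTE, behind the VM layer's back. */
static void
toy_set_pte_only(struct toy_page *p, uint64_t addr)
{
    p->pte = addr;
}

/* Driver rewrites the PTE and updates the VM layer's record to match. */
static void
toy_set_pte_and_map(struct toy_page *p, uint64_t addr)
{
    p->pte = addr;
    p->expected = addr;
}

/*
 * Simulated fault used for wiring: re-enter what the VM layer expects.
 * Returns true if the PTE had to be changed.
 */
static bool
toy_wire_fault(struct toy_page *p)
{
    assert(p != NULL);
    if (p->pte == p->expected)
        return false;          /* already what the VM layer wants */
    p->pte = p->expected;      /* the driver's value is overwritten */
    return true;
}
```

after toy_set_pte_only the fault overwrites the PTE, which is the case in
the trace above; after toy_set_pte_and_map it leaves the PTE alone.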

it would be best to abstract any code that manipulates vm_map structures
into the UVM code proper, and similarly move the code that manipulates
the pmap data structures into pmap.c or xen_pmap.c.

-Chuck

