Subject: Re: mmap
To: Jonathan Stone <jonathan@dsg.stanford.edu>
From: Chuck Cranor <chuck@maria.wustl.edu>
List: current-users
Date: 11/14/1998 22:36:03
hi-
>The first failure ("shared map does not match within page") is:
> open file,
> mmap file as shared,
> write(2) to mapped backing file,
> do msync(MS_INVALIDATE),
> read addresses in mapped file corresponding to write(2)
>The error is that after the msync(), the contents of the MAP_SHARED
>region are not the same as the mapped (backing) file. the msync()
>should guarantee that they are.
>I think that's a bug in UVM. Chuck?
yes. its due to me trying to copy the semantics of the BSD VM system
calls. it turns out the BSD VM system's msync has a bug too (but it
comes out in a different way). here's the story:
the msync system call [sys___msync13() in uvm_mmap.c] calls
uvm_map_clean() [from uvm_map.c] to clean out the map. if
you look at uvm_map_clean in rev 1.32 of uvm_map.c, note the
following code:
/*
* flush pages if writing is allowed. note that object is
* locked.
* XXX should we continue on an error?
*/
if (object && object->pgops &&
(current->protection & VM_PROT_WRITE) != 0) {
if (!object->pgops->pgo_flush(object, offset,
offset+size, flags)) {
what that says is if the mapping has a valid backing object
and the user process has write access to the mapping then call
the pgo_flush pager operation to flush out the old data.
the problem here is that in the test program the memory mapping
is establish by mmap(2) with PROT_READ (read-only), thus
(current->protection & VM_PROT_WRITE) == 0.
after thinking about it, it seems to me that that permission check
doesn't really belong there, so i removed it and that fixed the
problem.
this bug got me wondering why i put the permission check in there
in the first place. i started looking at BSD VM, and discovered
this in vm_map_clean() in vm_map.c:
/*
* Flush pages if writing is allowed.
* XXX should we continue on an error?
*/
if ((current->protection & VM_PROT_WRITE) &&
!vm_object_page_clean(object, offset, offset+size,
syncio, FALSE)) {
vm_object_unlock(object);
vm_map_unlock_read(map);
return(KERN_FAILURE);
}
if (invalidate)
vm_object_page_remove(object, offset, offset+size);
[note that vm_object_page_clean does what uvm's pgo_flush does]
that explains where i got the check from (i was attempting to
make sure the UVM system calls had the same semantics as BSD VM).
however, if you look at the code quoted above, you will note that
it has a major bug in it! the invalidate check is outside of the
protection check. this allows BSD VM to throw away valid modified
data without saving it to backing store... thus you can easily get
[or even trigger] data loss.
consider the following case:
you've got two users: user1 and user2. suppose the following
sequence of events happens:
user1 user2
- create a file, mode 644
- write random data to the file
- mmap the file in, shared
- modify shared mapping
(thus overwriting the
previous data)
- open user1's file O_RDONLY (to
avoid EPERM)
- mmap the file PROT_READ, SHARED
- call msync(2) with MS_INVALIDATE
set
now at the point where user2 calls msync(2), the file has the
orig data in the buffer cache, but it has the modified data in
the VM system's vm_object (recall that we have separate VM and
buffer cache). when the quoted section of BSD VM's vm_map_clean
runs, the vm_object_page_clean() call will _not_ happen because
user2 does not have write access to the mapping. however, the
vm_object_page_remove() _will_ happen and this will caused the
valid data in the VM system's vm_object to get discarded
incorrectly.
i verified this problem by running a couple of test programs under
BSD VM.
so for UVM, what i'm going to do is remove the unnecessary permission
check to fix the bug.
i suppose the thing to do for BSD VM is to remove the permission check
there also?
chuck