Subject: Re: panic
To: Frank van der Linden <frank@wins.uva.nl>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: tech-kern
Date: 03/25/1999 10:06:45
> I'm not exactly happy with this "fix".. 

I'm not particularly happy with it, either, but having a "crash your
system in one easy unprivileged read() call" bug isn't exactly pretty.

> it merely masks an error in the kernel and might make an application
> fail in mysterious ways.  I'd really like to see it get fixed for
> real (yes, I know that the vnode locking "protocol" is a horror).

Another approach to fixing this occurred to me:

In vn_read/vn_write, if the vnode being operated on is mmap'ed by
*anyone*, we might have this particular screw case, so we wire the
user buffer pages before we lock the vnode (which has the side effect
of paging them in if they are not present), and restore the previous
wired state afterwards.

A more concrete proposal:

Define a new function exported by uvm:

       int uvm_uio_pageable(struct uio *uio, int pageable);

This calls uvm_map_pageable() for each iov component referenced from
the uio.  (The uio contains a pointer to the appropriate user-space
process, from which we can find the appropriate map.)  This will
increase/decrease the wire count for each page referenced by the uio,
so if the memory is already wired, we won't unwire it.  If this fails,
we make the I/O fail with EFAULT, since it would have failed that way
anyway.
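
To make the wire-count behaviour concrete, here is a rough user-space
model of the idea.  It is not kernel code: every model_* name, the toy
page map, and the rollback on failure are assumptions made up for
illustration, and model_map_pageable() only stands in for
uvm_map_pageable().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE   4096u
#define MODEL_NPAGES 16u

/* Toy address space: one validity bit and one wire count per page. */
struct model_map {
	int valid[MODEL_NPAGES];
	int wired[MODEL_NPAGES];
};

struct model_iovec { uintptr_t iov_base; size_t iov_len; };

struct model_uio {
	struct model_iovec *uio_iov;
	int uio_iovcnt;
	struct model_map *uio_map;	/* hypothetical shortcut to the map */
};

/*
 * Stand-in for uvm_map_pageable(): pageable == 0 wires [start, end),
 * pageable != 0 unwires it.  All pages are checked first, so a failure
 * changes nothing.
 */
static int
model_map_pageable(struct model_map *m, uintptr_t start, uintptr_t end,
    int pageable)
{
	size_t first = start / MODEL_PAGE;
	size_t last = (end + MODEL_PAGE - 1) / MODEL_PAGE;
	size_t p;

	if (last > MODEL_NPAGES)
		return EFAULT;
	for (p = first; p < last; p++)
		if (!m->valid[p])
			return EFAULT;
	for (p = first; p < last; p++) {
		if (pageable) {
			assert(m->wired[p] > 0);
			m->wired[p]--;
		} else
			m->wired[p]++;
	}
	return 0;
}

/* Wire (pageable == 0) or unwire every iov component of the uio. */
int
model_uio_pageable(struct model_uio *uio, int pageable)
{
	int i, error;

	for (i = 0; i < uio->uio_iovcnt; i++) {
		struct model_iovec *iov = &uio->uio_iov[i];

		if (iov->iov_len == 0)
			continue;
		error = model_map_pageable(uio->uio_map, iov->iov_base,
		    iov->iov_base + iov->iov_len, pageable);
		if (error == 0)
			continue;
		/* Assumption: undo earlier components if wiring fails. */
		while (!pageable && i-- > 0) {
			iov = &uio->uio_iov[i];
			if (iov->iov_len != 0)
				(void)model_map_pageable(uio->uio_map,
				    iov->iov_base,
				    iov->iov_base + iov->iov_len, 1);
		}
		return error;
	}
	return 0;
}
```

Because the model counts wirings rather than keeping a single bit, a
page that was already wired before the call is still wired after the
matching unwire.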

In vn_read/vn_write, before the vn_lock call, we check if the vnode is
mapped by looking at whether (vn->v_uvm.u_flags & UVM_VNODE_VALID) is
set (is this the right way to do this?)

If set, we set a flag, copy the uio and its iov (since uiomove will
side-effect the pointers in it), and call uvm_uio_pageable(uio,
FALSE).

After the VOP_UNLOCK, if the flag was set earlier, we undo this with
uvm_uio_pageable(uio, TRUE), passing the saved copy of the uio.
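
The ordering of these steps, and the reason for copying the uio, can be
sketched as a small user-space model.  Again this is only an
illustration under assumptions: the model_* names, MODEL_VNODE_MAPPED
(standing in for the UVM_VNODE_VALID test), MODEL_MAXIOV, and the byte
counters are all invented here, and the "wiring" merely records how
many bytes it was asked to cover.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAXIOV       8	/* hypothetical limit on saved iovs */
#define MODEL_VNODE_MAPPED 0x1	/* stand-in for UVM_VNODE_VALID */

struct model_iovec { char *iov_base; size_t iov_len; };
struct model_uio { struct model_iovec *uio_iov; int uio_iovcnt; };
struct model_vnode {
	int v_flags;
	int v_locked;
	const char *v_data;
	size_t v_size;
};

static size_t model_wired_bytes, model_unwired_bytes;

/* Fake wiring: just record how many bytes the uio describes. */
static int
model_uio_pageable(struct model_uio *uio, int pageable)
{
	size_t total = 0;

	for (int i = 0; i < uio->uio_iovcnt; i++)
		total += uio->uio_iov[i].iov_len;
	if (pageable)
		model_unwired_bytes += total;
	else
		model_wired_bytes += total;
	return 0;
}

/* Fake VOP_READ: copies file data out and, like uiomove, advances the iovs. */
static int
model_vop_read(struct model_vnode *vp, struct model_uio *uio)
{
	size_t off = 0;

	assert(vp->v_locked);
	for (int i = 0; i < uio->uio_iovcnt && off < vp->v_size; i++) {
		struct model_iovec *iov = &uio->uio_iov[i];
		size_t n = vp->v_size - off;

		if (n > iov->iov_len)
			n = iov->iov_len;
		memcpy(iov->iov_base, vp->v_data + off, n);
		iov->iov_base += n;
		iov->iov_len -= n;
		off += n;
	}
	return 0;
}

int
model_vn_read(struct model_vnode *vp, struct model_uio *uio)
{
	struct model_iovec saved_iov[MODEL_MAXIOV];
	struct model_uio saved_uio;
	int wired = 0, error;

	if (vp->v_flags & MODEL_VNODE_MAPPED) {
		if (uio->uio_iovcnt > MODEL_MAXIOV)
			return EINVAL;
		/* Snapshot the uio/iov before the read modifies them. */
		memcpy(saved_iov, uio->uio_iov,
		    (size_t)uio->uio_iovcnt * sizeof(saved_iov[0]));
		saved_uio = *uio;
		saved_uio.uio_iov = saved_iov;
		if (model_uio_pageable(&saved_uio, 0) != 0)
			return EFAULT;
		wired = 1;
	}

	vp->v_locked = 1;		/* vn_lock */
	error = model_vop_read(vp, uio);
	vp->v_locked = 0;		/* VOP_UNLOCK */

	if (wired)
		(void)model_uio_pageable(&saved_uio, 1);
	return error;
}
```

After a read, the caller's iov has been advanced past the bytes
transferred, so unwiring through the original uio would cover the wrong
range; the snapshot keeps the wire and unwire calls symmetric.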

One change in application-visible behavior here is that a read() call
done on an mmap'ed file when the application "knows" the file is
shorter than the passed buffer length (and doesn't give it a
full-length buffer) will lose.  Conceivably we could work around this
by making uvm_uio_pageable work a page at a time, stopping (and
truncating its copy of the uio/iov) when it hits a bad page.
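
A rough model of that page-at-a-time variant, with the same caveats as
any user-space toy: the model_* names and the fixed-size page map are
invented for illustration, and the uio passed in is assumed to be the
private copy, so truncating it does not affect the I/O itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE   4096u
#define MODEL_NPAGES 8u

struct model_map { int valid[MODEL_NPAGES]; int wired[MODEL_NPAGES]; };
struct model_iovec { uintptr_t iov_base; size_t iov_len; };
struct model_uio {
	struct model_iovec *uio_iov;
	int uio_iovcnt;
	struct model_map *uio_map;	/* hypothetical shortcut to the map */
};

/*
 * Wire one page at a time.  At the first bad page, truncate the uio so
 * that it describes only the bytes whose pages were wired, and stop.
 * Returns the number of page wirings performed.
 */
int
model_uio_wire_partial(struct model_uio *uio)
{
	struct model_map *m = uio->uio_map;
	int npages = 0;

	for (int i = 0; i < uio->uio_iovcnt; i++) {
		struct model_iovec *iov = &uio->uio_iov[i];
		uintptr_t end = iov->iov_base + iov->iov_len;
		uintptr_t va = iov->iov_base - iov->iov_base % MODEL_PAGE;

		if (iov->iov_len == 0)
			continue;
		for (; va < end; va += MODEL_PAGE) {
			size_t p = va / MODEL_PAGE;

			if (p >= MODEL_NPAGES || !m->valid[p]) {
				/* Keep only the bytes below the bad page. */
				iov->iov_len = va > iov->iov_base ?
				    va - iov->iov_base : 0;
				uio->uio_iovcnt = i + 1;
				return npages;
			}
			m->wired[p]++;
			npages++;
		}
	}
	return npages;
}

/* Undo the wirings described by a (possibly truncated) uio. */
void
model_uio_unwire(struct model_uio *uio)
{
	struct model_map *m = uio->uio_map;

	for (int i = 0; i < uio->uio_iovcnt; i++) {
		struct model_iovec *iov = &uio->uio_iov[i];
		uintptr_t end = iov->iov_base + iov->iov_len;
		uintptr_t va = iov->iov_base - iov->iov_base % MODEL_PAGE;

		if (iov->iov_len == 0)
			continue;
		for (; va < end; va += MODEL_PAGE) {
			assert(m->wired[va / MODEL_PAGE] > 0);
			m->wired[va / MODEL_PAGE]--;
		}
	}
}
```

Since the truncated copy describes exactly the pages that were wired,
the later unwire pass can walk it without touching the bad page.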

Comments?

					- Bill