Subject: Re: uvm & flushing
To: Bill Studenmund <wrstuden@nas.nasa.gov>
From: Chuck Silvers <chuq@chuq.com>
List: tech-kern
Date: 12/09/1999 00:22:26
On Wed, Dec 08, 1999 at 11:32:52AM -0800, Bill Studenmund wrote:
> On Tue, 7 Dec 1999, Chuck Silvers wrote:
> > presumably, once you start the archiving process, you'd prevent the
> > creation of new dirty pages so that your archive copy remains valid.
> > if there are any dirty pages when you're about to mark the file
> > non-resident, those pages would not be in the archived version,
> > so flushing them to disk at this point wouldn't help much.
> > (I'm making some guesses about how your system works, so if what
> > I'm saying doesn't make any sense then I've no doubt guessed wrong.)
> 
> You've hit the important points, but our system's a little different.
> 
> We can't really prevent the creation of new dirty pages while archiving,
> for two reasons. First off, I'm not sure how to make uvm do that. :-)
> 
> Second off (and more important), we are trying to preserve as much of the
> normal user experience as possible. So if you see your file, and you want
> to go edit it, you can. Even if we're in the process of archiving it.

ok, so you're trying to detect the creation of dirty pages rather than
prevent it.


> We already have logic in the code so that a write to a file being archived
> will make the archive fail (admittedly at the end of the archive, but
> that's not bad). This step is easy as all the writes come through our
> layer and we just update the bits after a successful write.
> 
> It's not so easy with uvm. Our layer wouldn't see writes until the page
> daemon came along. :-) What should I do here? I am mainly interested in
> knowing if our vnode has dirty pages before marking it archived. How can I
> get this info?

unfortunately there's not a really good answer for this now.  the best thing
I can think of is to synchronously flush all pages before starting the archive
and remember the mtime.  then when the archive is done, do the flush again
and see if the mtime is different.  if so, then some pages were dirty and
the archive copy is invalid. 
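
a rough sketch of the flush-and-compare-mtime idea, written as a toy
user-space model rather than real kernel code.  everything named toy_*
is hypothetical: toy_vnode is not struct vnode, toy_flush_sync() stands
in for whatever synchronous flush you end up calling, and the "clock"
is a fake counter.  the only assumption taken from the above is that a
flush which actually writes dirty pages changes the mtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* toy stand-in for a vnode; NOT the real struct vnode */
struct toy_vnode {
    bool dirty[NPAGES]; /* dirty bits known only to the vm system */
    long mtime;         /* what the filesystem layer can see */
    long clock;         /* fake time source */
};

/* a store through a mapping: the filesystem layer sees nothing yet */
static void toy_dirty_page(struct toy_vnode *vp, size_t pg)
{
    assert(pg < NPAGES);
    vp->dirty[pg] = true;
}

/* synchronous flush: write back dirty pages, bump mtime if any were written */
static void toy_flush_sync(struct toy_vnode *vp)
{
    bool wrote = false;

    for (size_t i = 0; i < NPAGES; i++) {
        if (vp->dirty[i]) {
            vp->dirty[i] = false;
            wrote = true;
        }
    }
    if (wrote)
        vp->mtime = ++vp->clock;
}

/* before archiving: flush, then remember the mtime */
static long toy_archive_begin(struct toy_vnode *vp)
{
    toy_flush_sync(vp);
    return vp->mtime;
}

/* after archiving: flush again; a changed mtime means pages were dirtied */
static bool toy_archive_valid(struct toy_vnode *vp, long start_mtime)
{
    toy_flush_sync(vp);
    return vp->mtime == start_mtime;
}
```

in this model, dirtying a page before toy_archive_begin() is harmless
(the first flush absorbs it), while dirtying one between begin and the
final check makes toy_archive_valid() return false.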

the annoying thing here is that clean pages can become dirty immediately
after you check for them, and what's worse is that the filesystem layer
doesn't find out about this until somebody flushes the pages.  so if you
allow VOP_READ()s to occur, some of those reads could be the vm system
filling a page cache page, and that page could become dirty right afterward.

it seems that the only way to safely get into the "file is not resident"
state is to play the flush-and-check-mtime game and then mark the file
not-resident while holding the vnode lock the whole time.

(also note that this will change with UBC, and I think the result will be
somewhat less gross than what I've described above.  instead of the
above game, you'll need to flush+invalidate before starting the archive,
and in your getpages routine you'll mark the archive invalid when you
get write faults.  for read faults, you'll need to mark the pages you
get back from the lower layer as PG_RDONLY so that subsequent writes
to those pages will fault again.  this way you'll know immediately
when the archive becomes invalid, and checking whether it's safe to
go from "archived" to "not-resident" is just checking a flag.)
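
the UBC scheme in the parenthetical above could be modeled roughly like
this.  again a toy user-space sketch under stated assumptions: the
toy_* names, TOY_PG_RDONLY (a stand-in for PG_RDONLY) and the shape of
toy_getpages() are all made up for illustration and are not the actual
UBC interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8
#define TOY_PG_RDONLY 0x1 /* stand-in for PG_RDONLY */

enum toy_access { TOY_READ, TOY_WRITE };

struct toy_page {
    bool resident;
    int flags;
};

/* toy stand-in for the layered filesystem's per-file state */
struct toy_node {
    struct toy_page pages[NPAGES];
    bool archive_valid;
};

/* flush+invalidate before starting the archive, so every access faults */
static void toy_archive_start(struct toy_node *np)
{
    for (size_t i = 0; i < NPAGES; i++) {
        np->pages[i].resident = false;
        np->pages[i].flags = 0;
    }
    np->archive_valid = true;
}

/* hypothetical getpages routine of the archiving layer */
static struct toy_page *toy_getpages(struct toy_node *np, size_t pg,
                                     enum toy_access acc)
{
    assert(pg < NPAGES);
    struct toy_page *p = &np->pages[pg];

    if (acc == TOY_WRITE) {
        /* write fault: the archive copy is stale from here on */
        np->archive_valid = false;
        p->flags &= ~TOY_PG_RDONLY;
    } else if (!p->resident) {
        /* read fault: hand the page back read-only so a later
         * store faults again and lands in the branch above */
        p->flags |= TOY_PG_RDONLY;
    }
    p->resident = true;
    return p;
}

/* models a store instruction: faults if the page is absent or read-only */
static void toy_store(struct toy_node *np, size_t pg)
{
    assert(pg < NPAGES);
    struct toy_page *p = &np->pages[pg];

    if (!p->resident || (p->flags & TOY_PG_RDONLY))
        toy_getpages(np, pg, TOY_WRITE);
}

/* "archived" -> "not-resident" is safe iff no write fault was seen */
static bool toy_can_mark_nonresident(const struct toy_node *np)
{
    return np->archive_valid;
}
```

the point of marking read-faulted pages read-only is that the first
store to such a page cannot slip by unnoticed: it has to come back
through toy_getpages() as a write fault, which clears the flag.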

-Chuck