Subject: Re: dump and nodump flag
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Brian C. Grayson <bgrayson@marvin.ece.utexas.edu>
List: tech-kern
Date: 03/03/1999 22:56:31
On Wed, Mar 03, 1999 at 05:00:12PM +0100, Manuel Bouyer wrote:
> The directory tree is already walked in pass 2. Adding a pass to propagate
> nodump seems just redundant to me. And the code you added has a lot of
> functionality redundant with the code used by pass 2. It's when trying
> to merge this code that I found that an extra pass was not needed.

  Hm.  Is there any reason we can't do pass 1 and 2 (including
the nodump stuff) at the same time?  Correct me if I'm wrong on
any of this:

Pass 1:  Find all inodes that need to be dumped (modified since
    last dump).

Pass 2:  Look at each directory, and figure out if it contains
    any files flagged in pass 1.

  Couldn't we just use a recursive traversal of the (possibly
unmounted) FS, and do it all, including checking for nodump
flags, all in one pass?

Pseudo-code:
DumpPass1And2(inode) {
  /*  Returns the size estimate for this inode and all of its
      descendants; 0 means nothing here needs to be dumped.  */
  if (inode is a directory) {
    if (inode has the nodump flag) return 0;
    return_sum = 0;
    for each child: return_sum += DumpPass1And2(child);
    if (return_sum == 0) return 0;  /*  We don't need to be dumped either.  */
    set the inode bit for inode;
    set the directory bit for inode;
    return return_sum + estimate for this directory;
  }
  /*  Otherwise, it's a file.  */
  if (not dumpable) return 0;  /*  nodump, or unchanged since last dump  */
  set the inode bit for inode;
  return estimate for this file;
}

/*  The tape estimate is easy to calculate.  :)  */
tape_estimate = DumpPass1And2(rootino);
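
  To make the idea concrete, here is a rough C sketch of the same
traversal over a toy in-memory tree.  It is not taken from dump(8):
struct toy_inode, dump_ino_map, dump_dir_map and MAX_INODES are all
made-up names for illustration, and a real version would read
inodes and directory blocks from the raw device instead.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_INODES 64   /* hypothetical limit for this toy example */

/* Hypothetical in-memory stand-in for an on-disk inode. */
struct toy_inode {
    int ino;                      /* inode number, must be < MAX_INODES */
    bool is_dir;                  /* true for directories */
    bool nodump;                  /* nodump flag is set */
    bool modified;                /* changed since the last dump */
    long blocks;                  /* size estimate for this inode alone */
    struct toy_inode **children;  /* directory entries (directories only) */
    size_t nchildren;
};

/* Illustrative bitmaps: inodes to dump, and directories to dump. */
static bool dump_ino_map[MAX_INODES];
static bool dump_dir_map[MAX_INODES];

/*
 * Combined pass 1 and 2: returns the size estimate for ip and
 * everything below it, marking the maps along the way.
 */
long dump_pass_1_and_2(const struct toy_inode *ip)
{
    if (ip->is_dir) {
        long sum = 0;

        /* A nodump directory hides its whole subtree. */
        if (ip->nodump)
            return 0;
        for (size_t i = 0; i < ip->nchildren; i++)
            sum += dump_pass_1_and_2(ip->children[i]);
        /* Nothing below us needs dumping, so neither do we. */
        if (sum == 0)
            return 0;
        dump_ino_map[ip->ino] = true;
        dump_dir_map[ip->ino] = true;
        return sum + ip->blocks;
    }

    /* Plain file: skip if flagged nodump or unchanged. */
    if (ip->nodump || !ip->modified)
        return 0;
    dump_ino_map[ip->ino] = true;
    return ip->blocks;
}
```

  Like the pseudo-code, this uses a zero return to mean "nothing to
dump here", so a modified file whose own estimate is 0 would be
overlooked by its parent; real code would probably want a separate
flag for that.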

  In fact, a traversal, with the new nodump mods, would
actually perform fewer disk accesses, as the inodes for files
under a nodump directory would never be examined.  It would not
necessarily walk over the inodes in order, though, so maybe
it's a wash from the performance point of view.  But I think it
would be much cleaner code!  And many fewer lines, too!

  One other advantage of going to a cleanly-written
traversal-style method is that it would be nearly trivial to write
dump_ext2fs and dump_lfs (not to be confused with the existing
dumplfs, which is really more like lfsdb), and maybe even
dump_nfs, dump_msdos, dump_mfs, dump_ados.

  Not for 1.4, of course...

> I looked at this, but it had a problem: if /tmp is MFS, you lose.
> Also, I found the size of the dump depends on how the filesystem is
> set up and how many files are on it. I changed this to use a fresh
> filesystem on a vnd device, which gives more reliable results.

  Good idea!  My scripts accepted a margin of error, but it
wouldn't have been enough for all possible FFS systems.  :)

  Brian