Subject: filesystem salvage software
To: None <netbsd-users@NetBSD.ORG>
From: VaX#n8 <vax@linkdead.paranoia.com>
List: netbsd-users
Date: 01/17/1998 21:55:43
I have a drive which started flaking out under high loads only, and an
old backup, and am wondering if anyone else has run into this scenario and
written scripts to help out.

Basically I want to merge changes which occurred between the backup and
current date on the flaky drive to the copy on the new drive (restored
from backup).  Essentially, both copies have diverged from a common
ancestor.

There is also the additional fun that certain inodes on the
flaky drive are corrupted beyond repair (fsck will not fix them).
This leads to very bizarre files in that filesystem, with huge
uids, gids, odd modes, and size and block numbers which don't add up.

I've written some Perl scripts myself, should anyone want them:
should_copy.pl - verifies that the modes are sane, the file types
                 are not unusual, that the (size, blocks) is possible,
                 that uids and gids have /etc/{passwd,group} entries,
                 etc.
compare_fs.pl - given a (flaky) filesystem, traverses it, performing
                sanity checks along the way (as above, and then some),
                then compares to another live (new) filesystem,
                issuing differences to stderr and stdout.
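
For anyone who wants the flavor of the checks without the scripts, here is
a rough Python sketch of the kind of tests described for should_copy.pl.
It is not the Perl code itself. The function names, the 8192-byte block
size, and the slack allowed for indirect blocks are all assumptions made
for illustration, and the size/blocks test is only a heuristic.

```python
import grp
import os
import pwd
import stat

# Assumed values for illustration; not taken from the original scripts.
FS_BLOCK_SIZE = 8192    # assumed filesystem block size in bytes
STAT_BLOCK_UNIT = 512   # st_blocks is counted in 512-byte units

KNOWN_TYPES = {stat.S_IFREG, stat.S_IFDIR, stat.S_IFLNK, stat.S_IFCHR,
               stat.S_IFBLK, stat.S_IFIFO, stat.S_IFSOCK}


def inode_problems(mode, size, blocks, uid, gid, known_uids, known_gids):
    """Return a list of reasons to distrust an inode (empty if it looks sane)."""
    problems = []
    # The file-type bits must name one of the ordinary file types.
    if (mode & 0o170000) not in KNOWN_TYPES:
        problems.append("unknown file type")
    # Nothing should be set above the type and permission bits.
    if mode & ~0o177777:
        problems.append("stray mode bits")
    if size < 0 or blocks < 0:
        problems.append("negative size or block count")
    else:
        # Sparse files may allocate less than their size, but a file should
        # never allocate much more. The slack for indirect blocks and
        # rounding is a guess, not a filesystem-derived bound.
        ceiling = size + size // 1024 + 4 * FS_BLOCK_SIZE
        if blocks * STAT_BLOCK_UNIT > ceiling:
            problems.append("more blocks than size allows")
    if uid not in known_uids:
        problems.append("uid has no passwd entry")
    if gid not in known_gids:
        problems.append("gid has no group entry")
    return problems


def system_ids():
    """Collect the uids and gids known to the passwd and group databases."""
    uids = {entry.pw_uid for entry in pwd.getpwall()}
    gids = {entry.gr_gid for entry in grp.getgrall()}
    return uids, gids


def path_problems(path, known_uids, known_gids):
    """Run the checks on a path without following symlinks."""
    st = os.lstat(path)
    return inode_problems(st.st_mode, st.st_size, st.st_blocks,
                          st.st_uid, st.st_gid, known_uids, known_gids)
```

A file would be copied only when path_problems() returns an empty list;
anything else is reported and left for a human to look at.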

There are all kinds of interesting things you could try to fix during
such a comparison.  For example, should one worry about the inode numbers
or change times?  What do you do if a directory has changed?  If the number
of links to a file has changed?  Access times?  What about whiteouts?
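
One possible answer, sketched in Python under some assumptions: compare
only the attributes a restore can be expected to preserve, and ignore
inode numbers, access times and change times. The field list and the
function names below are hypothetical choices, not a description of
compare_fs.pl, and file flags and whiteouts are not handled at all.

```python
import os

# Attributes compared by default. Inode numbers, access times and change
# times are left out on the assumption that a restore does not preserve them.
DEFAULT_FIELDS = ("st_mode", "st_uid", "st_gid", "st_nlink",
                  "st_size", "st_mtime")


def metadata_differences(flaky_path, new_path, fields=DEFAULT_FIELDS):
    """Return {field: (flaky_value, new_value)} for attributes that differ."""
    flaky_st = os.lstat(flaky_path)
    new_st = os.lstat(new_path)
    diffs = {}
    for name in fields:
        flaky_value = getattr(flaky_st, name)
        new_value = getattr(new_st, name)
        if flaky_value != new_value:
            diffs[name] = (flaky_value, new_value)
    return diffs


def compare_trees(flaky_root, new_root):
    """Yield (relative_path, status, details) for entries under flaky_root."""
    for dirpath, dirnames, filenames in os.walk(flaky_root):
        for name in dirnames + filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, flaky_root)
            dst = os.path.join(new_root, rel)
            # lexists() so that a dangling symlink still counts as present.
            if not os.path.lexists(dst):
                yield rel, "missing on new", {}
                continue
            diffs = metadata_differences(src, dst)
            if diffs:
                yield rel, "differs", diffs
```

This walks only the flaky tree, so files that exist solely on the new
drive are not reported; a second pass in the other direction would be
needed for that. Directories will often show st_size or st_nlink
differences that mean nothing more than "the contents changed".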

I could use an explanation of file flags and whiteouts (beyond what is
in sys/stat.h).

Then, there's the data-centric stuff.  What files do you worry about
differences in?  I run diff on some pairs of files.  What about binary
files?  What about mail spools?  What about MH folders?  CVS repositories
and working directories (particularly if the changes to the repository
are lost)?
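
For the plain text-versus-binary split, something like the following
Python sketch might serve as a starting point. It assumes that a NUL byte
in the first 8192 bytes marks a file as binary, which is only a heuristic,
and it reads whole files into memory. The function names are made up for
this example. It does nothing clever for mail spools, MH folders or CVS
repositories, which need format-aware handling.

```python
import difflib
import hashlib


def looks_binary(data):
    """Heuristic: treat data with a NUL byte near the start as binary."""
    return b"\0" in data[:8192]


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_contents(flaky_path, new_path):
    """Return (status, lines); lines holds a unified diff for text files."""
    try:
        if file_digest(flaky_path) == file_digest(new_path):
            return "identical", []
        with open(flaky_path, "rb") as handle:
            flaky_data = handle.read()
        with open(new_path, "rb") as handle:
            new_data = handle.read()
    except OSError as err:
        # A failing drive may return I/O errors instead of data.
        return "unreadable", [str(err)]
    if looks_binary(flaky_data) or looks_binary(new_data):
        return "binary differs", []
    # latin-1 maps every byte to a character, so decoding cannot fail.
    # The diff runs from the restored copy to the flaky copy.
    diff = difflib.unified_diff(
        new_data.decode("latin-1").splitlines(),
        flaky_data.decode("latin-1").splitlines(),
        fromfile=new_path, tofile=flaky_path, lineterm="")
    return "text differs", list(diff)
```

The "unreadable" status matters on a flaky drive: a read error is a reason
to keep the restored copy, not evidence that the file changed.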