Subject: Re: filesystem salvage software
To: VaX#n8 <vax@linkdead.paranoia.com>
From: CyberPeasant <listread@bedford.net>
List: netbsd-users
Date: 01/18/1998 09:43:12
> I have a drive which started flaking out under high loads only, and an
> old backup, and am wondering if anyone else has had this scenario and
> written scripts to help out.

Well, here's the wise-guy script:

	0) Backup
	1) Remove Drive
	2) Buy new drive
	3) Install new drive
	4) Restore from backup
	5) Sell old drive at an online auction, "Working pull"

> Basically I want to merge changes which occurred between the backup and
> current date on the flaky drive to the copy on the new drive (restored
> from backup).  Essentially, both copies have diverged from a common
> ancestor.

You are taking votes, and I would advise a tie-breaker.  Usually
this is done with "triple redundancy", i.e. taking the two that are
the same from a group of three.
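A rough Python sketch of that vote, assuming you can get three candidate
copies of a file (say flaky disk, old backup, and some third source). The
function names digest and majority are made up for illustration, and SHA-256
is just one convenient way to compare contents:

```python
import hashlib
from collections import Counter

def digest(path, chunk=1 << 20):
    """Return the SHA-256 hex digest of a file, or None if it cannot be read."""
    h = hashlib.sha256()
    try:
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                h.update(block)
    except OSError:
        # A read error on the flaky disk counts as "no vote".
        return None
    return h.hexdigest()

def majority(paths):
    """Return a path whose content matches at least one other copy, else None."""
    digests = {p: digest(p) for p in paths}
    counts = Counter(d for d in digests.values() if d is not None)
    if not counts:
        return None
    winner, votes = counts.most_common(1)[0]
    if votes < 2:
        return None          # no two copies agree, so a human has to decide
    return next(p for p in paths if digests[p] == winner)
```

With only two copies, majority() can tell you they differ but not which one
to believe, which is the whole point of the tie-breaker.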

> There is also the additional fun that certain inodes on the
> flaky drive are corrupted beyond repair (fsck will not fix them).
> This leads to very bizarre files in that filesystem, with huge
> uids, gids, odd modes, and size and block numbers which don't add up.

This means that the current disk is not going to be amenable to
a normal backup.

See badsect(8); there's a procedure there for isolating these turkey
blocks. You might want to integrate your scripts with badsect.

> 
> I've written some PERL scripts myself, should anyone want them:
> should_copy.pl - verifies that the modes are sane, the file types
>                  are not unusual, that the (size, blocks) is possible,
>                  that uids and gids have /etc/{passwd,group} entries,
>                  etc.
> compare_fs.pl - given a (flaky) filesystem, traverses it, performing
>                 sanity checks along the way (as above, then some),
>                 then compares to another live (new) filesystem,
>                 issuing differences to stderr and stdout.

These sound useful for a *healthy* disk.  
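
For readers without the Perl, here is a minimal Python sketch of the kind of
inode sanity check described for should_copy.pl. It is not the poster's
script: check_stat, the SLACK allowance, and the exact tests are assumptions
made for illustration, and the size-versus-blocks test is only a heuristic.

```python
import grp
import os
import pwd
import stat

BLOCK = 512        # st_blocks is counted in 512-byte units
SLACK = 1 << 20    # assumed allowance for indirect blocks and rounding

def check_stat(st, uids, gids):
    """Return a list of complaints about one stat result; empty means plausible."""
    problems = []
    mode = st.st_mode
    if not (stat.S_ISREG(mode) or stat.S_ISDIR(mode) or stat.S_ISLNK(mode)):
        problems.append("unusual file type")
    if st.st_uid not in uids:
        problems.append("uid %d has no passwd entry" % st.st_uid)
    if st.st_gid not in gids:
        problems.append("gid %d has no group entry" % st.st_gid)
    if st.st_size < 0:
        problems.append("negative size")
    elif st.st_blocks * BLOCK > st.st_size + st.st_size // 16 + SLACK:
        # Sparse files may use fewer blocks than their size suggests,
        # so only an excess of blocks is treated as suspicious.
        problems.append("more blocks allocated than the size can explain")
    return problems

def should_copy(path):
    """Return True if path can be stat'ed and its inode looks plausible."""
    uids = {p.pw_uid for p in pwd.getpwall()}
    gids = {g.gr_gid for g in grp.getgrall()}
    try:
        st = os.lstat(path)
    except OSError:
        return False
    return not check_stat(st, uids, gids)
```

Keeping check_stat separate from the lstat call means it can be tried against
hand-made stat values before it is let loose on the bad disk.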

> There are all kinds of interesting things you could try to fix during
> such a comparison.  For example, should one worry about the inode numbers
> or change times?  What do you do if a directory has changed?  If the number
> of links to a file has changed?  Access times?  What about whiteouts?
> 
> I think I could use an explanation of file flags, and some of the
> flags and whiteouts (beyond what is in sys/stat.h).
> 
> Then, there's the data-centric stuff.  What files do you worry about
> differences in?  I run diff on some pairs of files.  What about binary
> files?  What about mail spools?  What about MH folders?  CVS repositories
> and working directories (particularly if the changes to the repository
> are lost)?

"What files should I *not* worry about?" is the better question.
How many stray setuid bits does your security model tolerate? 
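
As one concrete case, a sketch in Python that lists regular files carrying
setuid or setgid bits under a tree, so they can be checked by hand against
the restored copy. find_setid is a made-up name, and this only reports what
the (possibly corrupt) inodes claim:

```python
import os
import stat

def find_setid(root):
    """Yield (path, permission bits) for regular files with setuid or setgid set."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue     # unreadable inode; skip it here, deal with it elsewhere
            if stat.S_ISREG(st.st_mode) and st.st_mode & (stat.S_ISUID | stat.S_ISGID):
                yield path, stat.S_IMODE(st.st_mode)
```

Something like `for path, mode in find_setid("/mnt/flaky"): print(oct(mode), path)`
would print the list, where /mnt/flaky is only a placeholder mount point.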

If that disk has to be kept on line, first get it to pass fsck. There's
no real point in production-mounting a fs that fails fsck. Maybe there's
some low-level formatting that will help, maybe that will even fix it.
Then mount it read-only for the rest of its life, which may be short.
If you have to write to the filesystems on that disk, mount a more
reliable disk's filesystem on top of it, using mount_union, or the union
option to regular mount. Avoid ever powering off the bad disk.

Maybe all the bad sectors cluster in a certain region of the disk.
Create a dead partition that includes all those sectors.
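
The arithmetic for such a partition can be sketched in Python as below. This
assumes you already have a list of bad sector numbers and know the sectors per
cylinder from the disklabel; dead_partition and its margin are hypothetical,
and the result is not clamped to the end of the disk.

```python
def dead_partition(bad_sectors, sectors_per_cylinder, margin_cylinders=1):
    """Return (offset, size) in sectors of a cylinder-aligned span covering bad_sectors."""
    if not bad_sectors:
        raise ValueError("no bad sectors given")
    # Widen the span by a few cylinders on each side, without going below zero.
    first_cyl = max(min(bad_sectors) // sectors_per_cylinder - margin_cylinders, 0)
    last_cyl = max(bad_sectors) // sectors_per_cylinder + margin_cylinders
    offset = first_cyl * sectors_per_cylinder
    size = (last_cyl + 1 - first_cyl) * sectors_per_cylinder
    return offset, size
```

The offset and size would then go into an otherwise unused partition entry,
so that nothing ever allocates those sectors.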

Alas, disks don't get better, they just get worse and worse.  One cause
of these intermittent failures is that the magnetic media can separate
from the platter and fly around inside the case, a little dust-storm
of random errors, waiting to land. Some disks (old ones?) have filters
inside them, but this just puts off the inevitable.

If the disk is as bad as it sounds, what you are doing is trying to
slay the Hydra.

Disk prices are at an all-time low. Now would be a good time...

Worst case: your username suggests this disk is on a VAX, and
replacements (say a nice used RA-81) are unavailable.  Sigh.
Get a junkyard 386, a new $200 IDE drive, and export NFS to
the old gentleman. Or substitute your preferences for 386/IDE/NFS.
(pmax/SCSI/NFS, for example).

Dave