Subject: recovering from a bad crash, ffs recovery
To: None <netbsd-users@netbsd.org>
From: Charles Shannon Hendrix <shannon@widomaker.com>
List: netbsd-users
Date: 01/04/2004 19:09:58
Last night at 2am my system crashed.

It was a malloc failure in the kernel map.  This happens every once in
a while, and has since the later 1.5 kernels.  This was a 1.6.1
kernel running on a Sun SPARCstation 5.

Usually, this happens under some kind of high load.  It's also usually
mostly harmless.  I've reported the problem before, but was just told
that it is "going to be fixed" at some point.

This time around, it has been more than annoying.  One of my drives is
completely empty, like someone ran newfs on it.  Nothing in the logs to
indicate a problem.

I unmounted the drive, and it passes all its diagnostics, etc.  I then
piped the drive through strings and I can see all of my old directory
entries, data from the files, etc.  The entire drive is readable, and
full of my files.
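
Since the raw data is still readable, one thing that could be tried
beyond strings is looking for surviving FFS superblock copies on the
raw device or, better, on an image of it.  What follows is only a rough
sketch, not a tested recovery tool.  It assumes the FFS magic number is
0x011954 and that the fs_magic field sits 1372 bytes into the
superblock; both values should be checked against ufs/ffs/fs.h on the
system in question.  The function names are made up for illustration.
It opens the file read-only and never writes to it.

```python
import struct

FS_MAGIC = 0x011954      # assumed FFS superblock magic (check ufs/ffs/fs.h)
MAGIC_OFFSET = 1372      # assumed offset of fs_magic within the superblock


def find_superblocks(data, base=0):
    """Return sorted (superblock_offset, byte_order) pairs for magic hits.

    data is a bytes-like buffer; base is the file offset of data[0].
    Both byte orders are tried, since SPARC is big-endian.
    """
    hits = []
    for order, fmt in (("big", ">I"), ("little", "<I")):
        pattern = struct.pack(fmt, FS_MAGIC)
        pos = data.find(pattern)
        while pos != -1:
            start = base + pos - MAGIC_OFFSET
            if start >= 0:          # a superblock cannot start before byte 0
                hits.append((start, order))
            pos = data.find(pattern, pos + 1)
    return sorted(hits)


def scan_image(path, chunk_size=1 << 20):
    """Scan a disk image (opened read-only) chunk by chunk for magic hits."""
    hits = []
    base = 0        # file offset of the chunk about to be read
    tail = b""      # last 3 bytes of the previous buffer, to catch
                    # a 4-byte magic that straddles a chunk boundary
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            hits.extend(find_superblocks(buf, base - len(tail)))
            tail = buf[-3:]
            base += len(chunk)
    return sorted(hits)
```

Each hit is only a candidate: the same four bytes can turn up in
ordinary file data, so every offset needs checking by hand.  If a
plausible backup superblock does turn up, fsck_ffs can be pointed at an
alternate superblock with its -b option, preferably on a copy of the
drive rather than the original.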

I would really like to get my filesystem back.  I've changed its
/etc/fstab entry to be a read-only mount, and have been searching for
some tips, but don't see much.  The drive hasn't been written to since
the reboot.

I'm in the situation of not being able to afford a backup system which
actually works, so there are files I don't have archived.  I've tried
reading the latest backup on 4mm tape, and naturally, it isn't readable.

Right now, I'm not sure what happened, except the kernel on first
booting was confused about the drive layout.  However, that shouldn't
have caused any overwriting of the filesystem.  My Sun's PROM doesn't
seem to like me telling it to boot rnetbsd, the kernel I normally use,
and I believe it tried to boot netbsd instead.

The only difference is that rnetbsd (run kernel) is 1.6.0, and the
netbsd kernel is 1.6.1.  The old kernel probably has drive order
hard-coded.

None of the slices used for things like swap space coincide with the
slice holding the filesystem that's gone missing, so all that should
have happened is that things got mounted in the wrong place.

The system on that boot reported fsck problems with the root drive
(which is fine), so I told it to restart with the rnetbsd kernel, and
that boot was fine.

Well, except for the missing files!

Any tips appreciated.  

Free DLT tape drives and media appreciated too... :)




-- 
UNIX/Perl/C/Pizza____________________s h a n n o n@wido !SPAM maker.com