Subject: Re: fsck fscked-up my filesystem
To: None <Richard.Earnshaw@buzzard.freeserve.co.uk>
From: Luke Mewburn <lukem@netbsd.org>
List: current-users
Date: 09/18/2001 18:18:02
On Thu, Sep 13, 2001 at 09:53:54PM +0100, Richard Earnshaw wrote:
> I'm not sure exactly what happened, but I've just ended up with a very
> corrupt filesystem after a crash. It could be that the old version of
> fsck that I have (-current of circa April) is incompatible with the latest
> kernels, but I'm not sure. Anybody any ideas?
>
> If I'm right, and it is related to the relative versions of the two, then
> folks need to be very careful not to run fsck manually after rebooting
> with a new kernel since old fscks will really mess your directory
> structure up.
>
> Symptoms: system crashes leaving filesystem not unmounted cleanly. Fsck
> -p runs and complains that it can't do the job, run manually; run
> manually and many things seem broken (the whole of my /usr partition ended up
> in lost+found); forcibly run fsck again and the file-system still seems
> broken (this is repeatable). Back off to old kernel; fsck fixes disks ok
> (though they are still messed up by the first fsck run).
>
> (of course, I may just have built a duff kernel somehow).
I've been trying to reproduce this problem, and I've had no luck.
Using a "newdirpref" kernel on my NetBSD/alpha PC164, a -current
fsck_ffs (matched to the kernel) and the fsck_ffs from 1.5.2 and
from 1.5 all worked "as expected" in the following circumstances:
* cleanly unmounted file system
* file system currently mounted
* file system mounted, but "idle", and the machine rebooted
  with "halt -qn"
* file system mounted, with an untar operation in action,
  and the machine rebooted with "halt -qn"
* file system mounted with "softdeps", with an untar operation
  in action, and the machine rebooted with "halt -qn"
In almost all circumstances, the output and behaviour were identical
between fsck_ffs from -current and from the releases. There was one
minor problem that -current's fsck found and 1.5's didn't, which is
probably explained by bug fixes to fsck_ffs in -current that don't
appear to have been back-ported to the 1.5 branch.
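A comparison like the one described above could be scripted roughly as
follows. This is only a sketch, not the procedure actually used: the
function name compare_fsck and any paths passed to it are hypothetical,
and it assumes each fsck_ffs build accepts -n (answer "no" to every
question, so nothing is written) and -f (check even if the file system
is marked clean).

```shell
#!/bin/sh
# Sketch: run several fsck_ffs builds read-only against one device and
# report which ones produce output that differs from a reference build.
# All binary paths and the device name are supplied by the caller.

# usage: compare_fsck DEVICE REFERENCE_FSCK OTHER_FSCK...
compare_fsck() {
    dev=$1
    ref=$2
    shift 2
    outdir=$(mktemp -d) || return 2

    # Reference run: -n keeps the check read-only, -f forces a full check.
    "$ref" -n -f "$dev" > "$outdir/ref.out" 2>&1

    status=0
    i=0
    for bin in "$@"; do
        i=$((i + 1))
        "$bin" -n -f "$dev" > "$outdir/$i.out" 2>&1
        if cmp -s "$outdir/ref.out" "$outdir/$i.out"; then
            echo "same: $bin"
        else
            echo "DIFFERS: $bin"
            status=1
        fi
    done

    rm -rf "$outdir"
    # 0 if every build matched the reference, 1 otherwise.
    return $status
}
```

It might be called as, for example,
compare_fsck /dev/rsd0e /sbin/fsck_ffs /old/1.5.2/fsck_ffs /old/1.5/fsck_ffs
where the device and the /old/... paths are placeholders for wherever
the older binaries have been unpacked.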
If anyone can help reproduce this problem with the "newdirpref" code
(or any other ffs problem) that occurs when a -current kernel is used
with an older userland (fsck_ffs, etc.), especially when softdep is NOT
being used, I'd be extremely interested.
Luke.