tech-kern archive


Re: Lost file-system story

James Chacon writes:

> On Tue, Dec 13, 2011 at 4:09 PM, Greg A. Woods wrote:
>> At Wed, 14 Dec 2011 09:06:23 +1030, Brett Lymn wrote:
>> Subject: Re: Lost file-system story
>>> On Tue, Dec 13, 2011 at 01:38:57PM +0100, Joerg Sonnenberger wrote:
>>> >
>>> > fsck is supposed to handle *all* corruptions to the file system that can
>>> > occur as part of normal file system operation in the kernel. It is doing
>>> > best effort for others. It's a bug if it doesn't do the former and a
>>> > potential missing feature for the latter.
>>> There are a lot of slips twixt cup and lip.  If you are really unlucky
>>> you can get an outage at just the wrong time that will cause the
>>> filesystem to be hosed so badly that fsck cannot recover it.  Sure, fsck
>>> can run to completion but all you have is most of your FS in lost+found
>>> which you have to be really really desperate to sort through.  I have
>>> been working with UNIX for over 20 years now and I have only seen this
>>> happen once and it was with a commercial UNIX.
>> I've seen that happen more than once unfortunately.  SunOS-4 once I think.
>> I agree 100% with Joerg here though.
>> I'm pretty sure at least some of the times I've seen fsck do more harm
>> than good it was due to one or more kernel bugs breaking assumptions about
>> ordered operations.
>> There have of course also been some pretty serious bugs in various fsck
>> implementations across the years and vendors.
> I'd be suspicious of fsck failing on a regularly mounted disk with
> corruption that can't otherwise be tracked to outside influences (bad
> RAM, bad disk cache, etc). I've seen some bizarre things happen on RAM
> errors over the years, for instance.

I once got an infinite sequence of nested subdirectories on new hardware
running "stable" FreeBSD 5.3. Something like
fsck refused to work there.

