tech-kern archive


Re: panic: bad dir: mangled entry, fsck: missing dot/dotdot



> I'm getting two "Bad file descriptor" errors, one on a directory and
> another on a regular file, both in the same directory.  What do you
> suggest to do?

Hm.

What do you get those errors from?  find(1)?  I think the first thing
I'd try to do is provoke them deliberately by hand - eg, try using find
on a directory one or two levels up rather than the whole filesystem,
if possible - and try to capture the error with ktrace.  I'm wondering
what syscall is producing the error.
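
As a rough complement to ktrace (not a replacement for it), here is a
minimal Python sketch of provoking the errors by hand on a small
subtree.  The function name probe_tree is made up for illustration.  It
only tells you which path fails and whether the failure is in the
lstat-style call or the directory-read call; ktrace is still what shows
the actual syscall.

```python
import errno
import os
import stat


def probe_tree(root):
    """Walk root without following symlinks.

    Returns a list of (path, call, errno_name) for every call that failed.
    """
    failures = []
    pending = [root]
    while pending:
        path = pending.pop()
        try:
            st = os.lstat(path)
        except OSError as e:
            name = errno.errorcode.get(e.errno, str(e.errno))
            failures.append((path, "lstat", name))
            continue
        if not stat.S_ISDIR(st.st_mode):
            continue
        try:
            names = os.listdir(path)
        except OSError as e:
            name = errno.errorcode.get(e.errno, str(e.errno))
            failures.append((path, "listdir", name))
            continue
        # Queue the children; each one is lstat'ed when it is popped.
        pending.extend(os.path.join(path, n) for n in names)
    return failures
```

Pointing it at a directory one or two levels above the bad entries
should list EBADF against the affected paths, if the error reproduces
that way at all.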

You say it's an FFSv2 filesystem; I don't know much about how FFSv2
differs from FFSv1, and it's v1 I know well.  So the rest of this will
be written for v1, in the hope it's similar enough to v2 for the
remarks to be useful.

I'd have a close look at the containing directory.  In particular, I'd
make sure the d_type values match the types of the pointed-to inodes.
(It wouldn't surprise me if two entries with different d_type values
but the same inumber could produce surprising results, for example.)
I'd also have an intensive look at the entries which produce errors,
and at the inodes named by them.  If you just want to repair it, rather
than figuring out what's going on, and you can afford to lose what's in
the file, I'd suggest clri on all three inodes (containing directory,
file, and contained directory), then fsck and fishing things out of
lost+found to clean up the damage.  (If any two of those inodes are the
same, something is definitely corrupt.)
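
A minimal sketch of the d_type check, in Python, with loud caveats: it
goes through the kernel rather than reading the raw directory blocks, so
it sees only what readdir and lstat are willing to report; os.scandir
skips "." and ".."; and if the system reports no d_type for an entry,
Python falls back to lstat and a mismatch cannot show up.  The names
check_directory, dirent_kind and inode_kind are made up for
illustration.

```python
import os
import stat
from collections import defaultdict


def dirent_kind(entry):
    """Type according to the directory entry (d_type, when available)."""
    if entry.is_symlink():
        return "symlink"
    if entry.is_dir(follow_symlinks=False):
        return "dir"
    if entry.is_file(follow_symlinks=False):
        return "file"
    return "other"


def inode_kind(mode):
    """Type according to the inode's mode bits."""
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISDIR(mode):
        return "dir"
    if stat.S_ISREG(mode):
        return "file"
    return "other"


def check_directory(path):
    """Return a list of (name_or_inumber, description) for anything odd."""
    problems = []
    kinds_by_inum = defaultdict(set)
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                dk = dirent_kind(entry)
                ik = inode_kind(os.lstat(entry.path).st_mode)
            except OSError as e:
                problems.append((entry.name, "error: %s" % e))
                continue
            kinds_by_inum[entry.inode()].add(dk)
            if dk != ik:
                problems.append(
                    (entry.name, "entry says %s, inode says %s" % (dk, ik)))
    # Same inumber under different entry types is the case worth flagging.
    for inum, kinds in kinds_by_inum.items():
        if len(kinds) > 1:
            problems.append(
                (str(inum), "one inumber, entry types %s" % sorted(kinds)))
    return problems
```

An empty result does not prove the directory is sane; for a look at what
is actually on disk, a raw filesystem debugger is the better tool.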

I can't really give full instructions, since this would be an
exploratory sort of investigation, with most of it guided by what
earlier work found.

> I'll have a look at that.  I do have photographs of the fsck dealing
> with the first dot-lacking directory.

That would be interesting, though I'm not sure how informative it would
be; the thing of real interest is something fsck missed and thus
probably won't be mentioned in fsck's output.

>> meaning each one held whatever was last written to either of them,
>> something filesystems do not deal with well.
> With all due respect to FFS's stability, I would expect more havoc if
> that were the case.

Depends on what the blocks get used for.  If, for example, each of them
happened to be a block of inodes, nothing will happen until inodes that
happen to end up on the same piece of disk get used - and the defaults,
in my experience, provide _way_ more inodes than needed, so that could
be a rare event, and will strike only a handful of inodes in any case.
If they happen to be data blocks, nothing will happen except that file
contents will get corrupted.  It's when they're indirect blocks or
superblocks, or one's inodes and the other isn't, that I'd expect
serious havoc.

> Fortunately, the components are SAS discs.

>> reading quirk lists makes me think such a thing is depressingly
>> plausible.
> Even on SAS?

Well, I wouldn't totally rule it out - at this point there's very
little I'd totally rule out - but I'd definitely investigate other
possibilities first.

If you have the space - which you may well not, given how large the
filesystem is - I'd suggest capturing a snapshot of the filesystem for
investigation, then working on the live copy with an eye to optimizing
for time-to-repair, with understanding deferred to later investigations
on the snapshot.
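
One way to take such a copy is a plain block-for-block image of the
underlying device.  The following is a minimal Python sketch under
several assumptions: the filesystem is unmounted (or at least mounted
read-only) while it runs, there is room for the image, and the chunk
size suits the device.  The function name copy_image is made up for
illustration, and this is not the only way to get a snapshot.

```python
import hashlib


def copy_image(src, dst, chunk_size=1 << 20):
    """Copy src to dst in chunks.

    Returns (bytes_copied, sha256_hex) for the data that was read, so the
    image can be checked later against a second pass.
    """
    digest = hashlib.sha256()
    total = 0
    # buffering=0 makes each read() a direct request of chunk_size bytes.
    with open(src, "rb", buffering=0) as fin, open(dst, "wb") as fout:
        while True:
            block = fin.read(chunk_size)
            if not block:
                break
            fout.write(block)
            digest.update(block)
            total += len(block)
    return total, digest.hexdigest()
```

This stops at the first read error rather than skipping past it, which
is the conservative choice when the hardware itself is under suspicion.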

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mouse%rodents-montreal.org@localhost
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

