Subject: Re: Large filesystems, yet again
To: None <tech-kern@netbsd.org>
From: Christos Zoulas <christos@astron.com>
List: tech-kern
Date: 02/01/2006 19:35:40
In article <200602011858.NAA19345@Sparkle.Rodents.Montreal.QC.CA>,
der Mouse  <mouse@Rodents.Montreal.QC.CA> wrote:
>> The filesystem was 8k/64k, too, and at least one person wrote me
>> off-list saying "ISTR hearing of issues with [bsize=64k], but I have
>> not looked into it at all".
>
>I remade the filesystem with fsize=1k bsize=8k, figuring that was a
>very well-tested size combination.
>
>Then I created 418 files of 4G each, split up into five directories:
>00/0001-00/0099, 01/0100-01/0199, ..., 04/0400-04/0418.  (There's
>nothing special about 418; I just had it keep creating files until df
>reported at least 90% full, and that's where it happened to stop.)
>Each one has distinctive content; given a disk block belonging to any
>of them, the content lets me tell which one it belongs to and where it
>belongs within that one.
>
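A minimal sketch of the "distinctive content" idea described above, so that anyone reproducing the test can generate comparable files. The original tool and its on-disk pattern are not shown in the message, so the stamp layout here (a 64-bit file id plus a 64-bit block index, repeated to fill the block) and the names make_block, identify_block and write_test_file are assumptions for illustration only.

```python
import struct

BLOCK_SIZE = 8192              # matches bsize=8k from the message (assumed unit)
STAMP = struct.Struct(">QQ")   # hypothetical layout: (file id, block index)


def make_block(file_id, block_index, block_size=BLOCK_SIZE):
    """Return one block filled with repeated (file_id, block_index) stamps."""
    stamp = STAMP.pack(file_id, block_index)
    # block_size is assumed to be a multiple of the 16-byte stamp size.
    return stamp * (block_size // STAMP.size)


def identify_block(block):
    """Recover (file_id, block_index) from the first stamp in the data."""
    return STAMP.unpack_from(block, 0)


def write_test_file(path, file_id, n_blocks):
    """Write n_blocks stamped blocks to path, one after another."""
    with open(path, "wb") as f:
        for i in range(n_blocks):
            f.write(make_block(file_id, i))
```

Because the stamp repeats every 16 bytes, any 16-byte-aligned piece of a block read back with dd still identifies its file and block, which is what makes it possible to tell where a stray block belongs.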
>Then I unmounted the filesystem and ran fsck.  No joy.  The fsck output
>is quite long and not very regular, so I don't want to quote it all
>here.  But it's got lots of "INCORRECT BLOCK COUNT"s, a number of
>inodes with "EXCESSIVE BAD BLKS" (or s/BAD/DUP/), and a bunch of
>"PARTIALLY TRUNCATED INODE"s (most of which also exhibit "EXCESSIVE
>DUP BLKS").
>
>Looking (with my own tools and/or dd) at what's actually on the disk,
>it appears that the errors are due to trashed indirect blocks.  I
>haven't checked them all - that would take a lot of very tedious manual
>work - but the ones I spot-checked all are.  The "INCORRECT BLOCK
>COUNT"s seem to be indirect blocks that have got filled with zeros; the
>BADs and DUPs seem to be indirect blocks filled with nonzero trash.
>
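The spot-checking described above could be partly automated. The sketch below is not the tool used in the message; it is a rough heuristic that assumes an indirect block is an array of fixed-size fragment addresses (4 bytes each on FFSv1, 8 on FFSv2) and that the caller knows the filesystem size in fragments and the on-disk byte order. The function name and the three labels are made up for illustration.

```python
def classify_indirect_block(block, total_frags, ptr_size=4, byteorder="little"):
    """Classify a raw indirect block as 'zeroed', 'plausible' or 'trashed'.

    block       -- raw bytes of one indirect block (e.g. read with dd)
    total_frags -- filesystem size in fragments (assumed known to the caller)
    ptr_size    -- bytes per pointer: 4 for FFSv1, 8 for FFSv2
    byteorder   -- on-disk byte order, which depends on the filesystem
    """
    if not any(block):
        # All bytes are zero: every pointer is a hole.
        return "zeroed"
    for off in range(0, len(block) - ptr_size + 1, ptr_size):
        ptr = int.from_bytes(block[off:off + ptr_size], byteorder)
        # A pointer at or beyond the end of the filesystem cannot be valid.
        if ptr >= total_frags:
            return "trashed"
    return "plausible"
```

This only catches out-of-range pointers. It says nothing about duplicate references, so a block reported as "plausible" may still contribute to the DUP errors fsck prints.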
>I'm not sure where to go from here.  Would 3.0 be expected to do
>better?  (The machine is still running 2.0.)  If so, can I just drop in
>a 3.0 kernel and leave the rest of the machine running 2.0?  That would
>make it a lot more practical to test 3.0 quickly.  In particular, can I
>use the GENERIC kernel from the 3.0 install ISO, or does that one not
>have 2.0 compat in it?

Definitely worth trying. I would file a PR with all that information.

christos