Subject: possible resize_ffs/fsck trouble
To: None <tech-userlevel@netbsd.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-userlevel
Date: 07/06/2004 02:26:16
I just had an unpleasant experience which leads me to suspect that
resize_ffs has exposed yet another bug in fsck and/or the kernel.

I had a filesystem of some 6G, less than 1/3 full, which I shrank down
to about 2G with resize_ffs (more precisely, with my own version, which,
judging from a brief look at the source, is algorithmically the same as
the NetBSD one).  Everything seemed fine.

But then - this filesystem included a built source tree.  I tried to
remove all the built files.  Bam, panic "blkfree: bad size".  Reboot,
let fsck fix, try again.  Same panic.  After the third one, I moved the
drive to another system, copied the data off, rebuilt the filesystem,
and copied it back on.

Thus, as with the other problems with resize_ffs, there is definitely a
bug in at least one of fsck and the kernel: fsck was happy with the
filesystem, yet the kernel panicked on it.  This is with a relatively
old system, but a brief look at the code leads me to suspect the
problems are still present.

I've had a quick look at the test that controls that panic and I have
some suspicions, but I haven't yet tried to verify any of them.  This
note is just a heads-up that yet another way has been found in which a
filesystem can be OK per fsck and still panic the kernel when used -
and that a relatively sane use of resize_ffs can produce such.
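
For anyone following along without the source handy, the test behind
that panic has roughly the shape sketched below.  This is a simplified
stand-alone approximation modeled on the check in the 4.4BSD-derived
ffs_blkfree() (ffs_alloc.c), not the kernel code itself: struct fs_geom
and blkfree_size_ok() are hypothetical names made up for illustration,
and the details should be checked against the actual source for
whatever system is in use.

```c
#include <stdbool.h>

/*
 * Hypothetical, cut-down stand-in for the superblock fields the test
 * looks at.  The real kernel uses struct fs and its own macros.
 */
struct fs_geom {
	long bsize;	/* block size, in bytes */
	long fsize;	/* fragment size, in bytes */
	long frag;	/* fragments per block (bsize / fsize) */
};

/*
 * Returns true if freeing `size` bytes starting at fragment number
 * `bno` is a request the check would accept; the kernel panics with
 * "blkfree: bad size" in the cases where this returns false.
 */
static bool
blkfree_size_ok(const struct fs_geom *fs, long bno, long size)
{
	/* Nothing larger than one full block may be freed at once. */
	if (size < 0 || size > fs->bsize)
		return false;
	/* The size must be a whole number of fragments. */
	if (size % fs->fsize != 0)
		return false;
	/* The run of fragments must not cross a block boundary. */
	if (bno % fs->frag + size / fs->fsize > fs->frag)
		return false;
	return true;
}
```

With 8K blocks and 1K fragments, for example, freeing 4K starting at
fragment 4 of a block is accepted by this sketch, while freeing 4K
starting at fragment 5 is not, since it would run past the end of the
block.  A block pointer or size that is inconsistent with the
filesystem geometry in that way would trip the test.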

If anyone wants to look at this, I expect I can reproduce it and produce
a filesystem image.  Drop me a line off-list if you want me to.

Maybe resize_ffs should be turned into a regression test for fsck?
(Once the bugs it's exposed are fixed, of course.)

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse@rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B