current-users: panics in ext2fs filesystem code (wrong inode number range)

Subject: panics in ext2fs filesystem code (wrong inode number range)
To: NetBSD Current Users <current-users@netbsd.org>
From: Barry Bouwsma <NOSPAM@Net.BSD.Linux.dk>
List: current-users
Date: 11/17/2006 12:12:54
[This message was composed some months ago, but I think it is still
 relevant.  Don't reply to me, as I'm almost never online now]


Hej,

I've had two different panics with an ext2fs filesystem and the
[what had been] latest -current kernel from mid-August.

The first was a readily repeatable dup alloc panic.  When a newly
created file can no longer fit in the cg, a quadratic search followed
by a brute force search for a free inode is attempted.  In my case,
this resulted in inode 32 being allocated, and the next attempt to
allocate an inode again attempted to allocate inode 32.

I believe the root of the problem may be due to setting the preferred
inode to `0' when calling any of the fallback allocation strategies.
This 0 is then converted to -1.  By restoring it to 0, I'm able to
successfully exhaust all available inodes in my ext2fs filesystem
without a panic.  There may be a different reason; see below.
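
To make the 0 -> -1 step concrete, here is a minimal sketch.  It is
not the NetBSD code: both helper names are hypothetical, and the
modulo step is only an assumption about what an allocator might do
with the decremented value.  It shows that a "no preference" value
of 0 turns into a negative slot unless it is clamped back to 0.

```c
#include <assert.h>

/*
 * Hypothetical helpers, for illustration only.
 * ipref is a 1-based preferred inode number; 0 means "no preference".
 * ipg is the number of inodes per cylinder group.
 */
static int pref_slot_unguarded(int ipref, int ipg)
{
	assert(ipg > 0);
	ipref--;		/* a preference of 0 becomes -1 here */
	return ipref % ipg;	/* C99: -1 % ipg is -1, a negative slot */
}

static int pref_slot_guarded(int ipref, int ipg)
{
	assert(ipg > 0);
	ipref--;
	if (ipref == -1)
		ipref = 0;	/* the workaround: restore -1 to 0 */
	return ipref % ipg;
}
```

With a made-up ipg of 272, pref_slot_unguarded(0, 272) yields -1,
while pref_slot_guarded(0, 272) yields 0.  For a real preference
such as 33, both return slot 32.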

The next panic occurred when removing all the files created to fill
the filesystem, with a `range' error.  The NetBSD ext2fs code checks
if the inode number is >= the maximum number of inodes.  However, the
panic occurred when removing inode 2176 out of 2176.  Apparently, the
ext2fs inode numbering goes from 1 to MAX, not to MAX-1, which may
also be related to the first panic (where the previously-existing
inodes seemed to be somewhat randomly taken from various cg's rather
than being allocated to fill a cg).

Changing the range check to accept an inode number equal to the
maximum number of inodes let me delete every inode that had been
created to fill the fs.
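
The difference between the two range checks can be sketched in
isolation.  This is a standalone illustration, not kernel code: the
function names are hypothetical, and SKETCH_FIRST_INO is assumed to
be 11, the usual first non-reserved inode on a revision 0 ext2
filesystem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FIRST_INO 11u	/* assumption, stands in for EXT2_FIRSTINO */

/* Old check: treats valid inode numbers as ending at icount - 1. */
static bool ino_in_range_old(uint32_t ino, uint32_t icount)
{
	assert(icount >= SKETCH_FIRST_INO);
	return !(ino >= icount || ino < SKETCH_FIRST_INO);
}

/* Changed check: accepts inode numbers up to and including icount. */
static bool ino_in_range_new(uint32_t ino, uint32_t icount)
{
	assert(icount >= SKETCH_FIRST_INO);
	return !(ino > icount || ino < SKETCH_FIRST_INO);
}
```

With icount = 2176, the old check rejects inode 2176 (the case that
panicked) and the changed check accepts it.  Both still reject 2177
and anything below SKETCH_FIRST_INO.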

If the ext2fs code is written based on the assumption of inodes from
0 to MAX-1 rather than from 1 to MAX, there may well be other related
bugs lurking elsewhere in the ext2fs code (possibly in fsck_ext2fs),
that I haven't taken the time to try to wrap my brain around, about,
and across.  Someone with intimate familiarity with the NetBSD ext2fs
and general UFS code should take a look at this instead.
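
For reference, the usual ext2 convention maps a 1-based inode number
to a group and an index within that group by subtracting 1 first.
The sketch below contrasts that with a mapping that assumes 0-based
numbering.  The function names and the ipg value used in the example
afterwards are hypothetical, chosen only so that the arithmetic is
easy to follow.

```c
#include <assert.h>
#include <stdint.h>

/* 1-based numbering: inode 1 is slot 0 of group 0. */
static uint32_t ino_to_group(uint32_t ino, uint32_t ipg)
{
	assert(ino >= 1 && ipg > 0);
	return (ino - 1) / ipg;
}

static uint32_t ino_to_index(uint32_t ino, uint32_t ipg)
{
	assert(ino >= 1 && ipg > 0);
	return (ino - 1) % ipg;
}

/* What a 0-based assumption would compute instead. */
static uint32_t ino_to_group_zero_based(uint32_t ino, uint32_t ipg)
{
	assert(ipg > 0);
	return ino / ipg;
}
```

Taking 2176 inodes split into 8 groups of 272 (a made-up layout),
inode 2176 maps to group 7, index 271 under the 1-based rule.  The
0-based variant puts it in group 8, which does not exist.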

There are no doubt a number of ways to fix the first problem --
perhaps replacing the `ipref' value of 0 with 1 in all the calls to
hashalloc for inodes.  I took the easy way out, not knowing what the
code was supposed to be doing, of correcting -1 to 0 after ipref was
adjusted by -1.


I've chosen not to send my ugly patches, in the hope that someone
with far more intimate knowledge of the ufs/ext2fs filesystems and an
overall grasp of the code (that I don't have) would better understand
the problem and come up with a correct solution.  However, here's
what I've done, for reference:
NOT A COMPLETE PATCH  @@ -441,9 +453,21 @@
        ipref--; /* to avoid a lot of (ipref -1) */
+       if (ipref == -1) ipref = 0;  /* XXX HACK */
        fs = ip->i_e2fs;
also...  @@ -554,7 +583,11 @@
        fs = pip->i_e2fs;
-       if ((u_int)ino >= fs->e2fs.e2fs_icount || (u_int)ino < EXT2_FIRSTINO)
+       if ((u_int)ino > fs->e2fs.e2fs_icount || (u_int)ino < EXT2_FIRSTINO)
                panic("ifree: range: dev = 0x%x, ino = %llu, fs = %s",

The panics can be reproduced by taking a native-Linux ext2fs
filesystem of not-too-big size, mounting it read-write on
NetBSD-current, and touching files until one runs out of inodes.
The second panic is then triggered by trying to remove the
highest-numbered inode after one successfully runs out of inodes
(or just rm everything).
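
The "touch files until the inodes run out" step could be automated
with something like the helper below.  This is a rough sketch: the
function name and the e2fill_ file prefix are made up, and it should
only be pointed at a scratch filesystem, since the whole point is to
provoke the panics described above.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Create up to max empty files named e2fill_0, e2fill_1, ... in dir.
 * Returns how many files were created before the first failure
 * (for example ENOSPC once the filesystem has no free inodes left).
 */
static unsigned long fill_with_files(const char *dir, unsigned long max)
{
	char path[1024];
	unsigned long n;

	for (n = 0; n < max; n++) {
		int len = snprintf(path, sizeof(path), "%s/e2fill_%lu", dir, n);
		if (len < 0 || (size_t)len >= sizeof(path))
			break;		/* path did not fit in the buffer */

		int fd = open(path, O_WRONLY | O_CREAT, 0644);
		if (fd < 0)
			break;		/* stop at the first error */
		close(fd);
	}
	return n;
}
```

Calling it with the mount point and a max larger than the inode
count fills the filesystem.  Removing the created files afterwards
exercises the second code path.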


Apologies if this is no longer a problem, and for not being able
to follow up or to send this as a PR.


thanks
barry bouwsma