Subject: Re: bin/23725: possible quotacheck enhancement
To: NetBSD GNATS submissions and followups <gnats-bugs@gnats.netbsd.org>
From: Robert Elz <kre@munnari.OZ.AU>
List: netbsd-bugs
Date: 12/18/2003 20:12:25
    Date:        Wed, 17 Dec 2003 17:37:23 -0500 (EST)
    From:        "Greg A. Woods" <woods@weird.com>
    Message-ID:  <m1AWkIF-0003QZC@proven.weird.com>

  | Yes it still goes in a loop.

I kind of knew that it would ...

  | I tried fixing this but I couldn't quite get the logic right.

If it had been easy, I would have fixed that when I was poking around.
Instead I thought about it for a few minutes, concluded "that's hard"
and left it alone (for now).

I think this is going to require using something other than uids for the
loop termination (generally I hate surplus variables!)

  | If I managed to get the loop to exit then I ended
  | up with a zero-byte quota file somehow.

That would be the ftruncate(), I expect (with highid == 0).

  | (I think repquota could also use a similar fix, though it does
  | eventually finish running.)

Maybe, I'll look at that one sometime.

  | There is another bug somewhere, perhaps in the checkfstab() subroutine
  | borrowed from fsck, or in the way it's used.

The fstab stuff (and borrowing code from fsck) was all added after I touched
this last (the lousy algorithm is/was my fault, but I at least have the
excuse that uids were just 16 bits back when this stuff got written, and
64K times around a loop isn't all that bad...)

But I will look and see if I can see what is happening there.


  | 	/build: root     fixed: inodes 0 -> 25  blocks 0 -> 5592
  | 	/build: woods    fixed: inodes 0 -> 251329      blocks 0 -> 23006232
  | 	*** Checking user quotas for /dev/rraid1a (/mfbd)
  | 	quotacheck: creating quota file /mfbd/quota.user
  | 	/mfbd: root     fixed:  inodes 0 -> 3   blocks 0 -> 192
  | 	/mfbd: woods    fixed:  inodes 0 -> 364 blocks 0 -> 10442488
  | 	/mfbd: root     fixed:  inodes 3 -> 27  blocks 192 -> 5624
  | 	/mfbd: woods    fixed:  inodes 364 -> 251693    blocks 10442488 -> 33448720
  | 	#
  | 
  | The result is a bogus quota file.  The first values (e.g. 364 inodes,
  | 10442488 blocks for my uid) are correct.

Did you notice that 23006232+10442488 == 33448720
and that 251329+364 == 251693?

I doubt that is a coincidence.

I kind of suspect a logic error in the parallelism code, along with
some nicely uninitialised variables (ie: presumed 0 variables).


  | At one point I also ended up with a quota recorded for a non-existent
  | user (not listed in the passwd db) who doesn't seem to actually own any
  | files, at least none that can be found:

That one is odd - not that the files exist, that can happen, but that
it was fixed by another quotacheck.  I don't suppose there was any activity
on those filesystems while all this was happening, aside from the quotachecks
of course?

Nothing in there should be attributing blocks/inodes to users that don't
own them; it's a little hard to see a rationale for that.

  | This may also have been caused by the dueling quotacheck processes
  | though.  Turning off quotas, removing the quota.user files, and
  | re-running quotacheck, "fixed" it (even with '-q'), provided I do one
  | filesystem at a time by hand.

Ah, "turning off": I missed that in my read-ahead before.  That might have
an impact; the internal kernel usage records for currently open files
might be relevant here, perhaps.

  | I haven't noticed any problems with "fsck -p" running more than once on
  | any filesystem so perhaps this is only a problem in how quotacheck uses
  | checkfstab()?

Probably how it uses the results from there.

  | I've not yet tried fiddling with the fs_passno field to
  | see if making it different on each entry will help or not.

I doubt it.   I suspect a correctly positioned "exit()" in the code
might make a difference though.   I'll look at it in a few days.
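
Why the position of an exit() matters in fork-based parallel code, as
a general sketch: a child that finishes its own job and does not exit
falls back into the loop it was forked from and carries on with the
remaining jobs, so they get done more than once.  The code below
assumes one child per filesystem; check_one() and check_all() are
hypothetical names and this is not the quotacheck or fsck source.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for checking one filesystem; 0 means success. */
static int check_one(int index)
{
    return index < 0;
}

/*
 * Fork one child per filesystem, then reap them all.
 * Returns the number of children that exited with status 0.
 */
int check_all(int nfs)
{
    int ok = 0;

    assert(nfs >= 0);
    for (int i = 0; i < nfs; i++) {
        pid_t pid = fork();
        if (pid == -1)
            break;                    /* stop launching, still reap below */
        if (pid == 0) {
            /*
             * Child: do this one job and leave.  Without the _exit()
             * the child would continue the for loop and handle the
             * remaining filesystems as well.
             */
            _exit(check_one(i));
        }
    }
    for (;;) {
        int status;
        pid_t pid = wait(&status);
        if (pid == -1) {
            if (errno == EINTR)
                continue;
            break;                    /* ECHILD: nothing left to reap */
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

_exit() is used in the child so that stdio buffers inherited from the
parent are not flushed a second time.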

kre