Subject: about kern/33642: panic: free dquot isn't
To: None <netbsd-bugs@netbsd.org>
From: None <lasse-list-netbsd-bugs-2005@plastictree.net>
List: netbsd-bugs
Date: 09/04/2006 18:44:27
The issue was:

| Sometimes, a sequence of commands like the following leads to a panic:
| 
| # groupadd test
| # useradd -g test -m test
| # mkdir /bulk/home/test
| # chown test.test /bulk/home/test
| 
| 
| panic: free dquot isn't
| Stopped in pid 22336.1 (chown) at netbsd:cpu_Debugger+0x4: leave
| 
| 
| Mounted partitions:
| 
| # mount
| /dev/wd0a on / type ffs (noatime, soft dependencies, local)
| /dev/raid0f on /var type ffs (noatime, soft dependencies, local)
| /dev/raid0e on /usr type ffs (noatime, soft dependencies, local)
| /dev/raid0g on /bulk type ffs (noatime, soft dependencies, local, with quotas)
| /dev/raid1e on /home type ffs (noatime, soft dependencies, local, with quotas)
| kernfs on /kern type kernfs (local)
| 
| >How-To-Repeat:
| 
| Difficult. Just repeating the above sequence does not always
| lead to a panic. Because this ought to be a production
| machine, I have not conducted extensive tests. I can do so, if
| it is required.

It was conjectured that the combination of softdeps and quotas is the cause.
I could not verify this; on the contrary, switching off softdeps did not make
the problem go away. I then switched off quotas and switched softdeps back
on. I still had frequent crashes with different panic messages (I did not
write them down, sorry).

There *were* hardware problems with this machine before, but the mainboard
(including the SATA controller) was replaced around the beginning of
this year. To rule out hardware problems, I switched off both quotas and
softdeps. The machine was stable and reached an uptime of 50 days before I 
decided to try out the 3.1_RC1 kernel. I switched on quotas and softdeps and 
did a lot of tests (blogbench, bonnie++, creating and deleting users, 
chowning files, etc.). It looked good, but a few days later, during a pkgsrc
bulk build, the "free dquot isn't" panic appeared again. It seems to have
happened close to a useradd or a similar command, because /etc/passwd was
gone after the reboot. Fortunately, this was in a chroot environment.

I consider this a very serious issue, given the possible damage to the
system. In view of the upcoming releases, I suggest that it receive some
attention. I offer my help in fixing this, but I do not know where to start.
I am willing to run whatever tests are necessary on my machine.

It would also be interesting to know whether anyone else currently uses
quotas and softdeps on a RAID 0 under NetBSD/i386, and what his or her
experiences have been.

Regards, lk.