Current-Users archive


Filesystem corruption in current 9.99.92 (posix1eacl & log enabled FFSv2)



Hello,

I would appreciate some advice on tracking down an FFS issue in -current. The system is a NetBSD 9.99.92 Xen/PV VM (storage provided by a file-backed vnd). The kernel is built from ~2021-11-27 CVS source. The root partition is a normal FFSv2 with WAPBL. In addition, there is a data partition on which I have posix1eacls enabled (for Samba network shares and sysvol).

The data partition causes problems. Although the host has never crashed or been shut down uncleanly, the filesystem seems to have become inconsistent. I first noticed this because the "find" run by the daily cron job was still running late in the morning at 100% CPU load with no disk I/O.

Then I took the filesystem offline for safety and forced an fsck. Errors were detected and repaired:

```
$ doas fsck -f NAME=export
** /dev/rdk3
** File system is already clean
** Last Mounted on /export
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
CG 31: PASS5: BAD MAGIC NUMBER
ALTERNATE SUPERBLK(S) ARE INCORRECT
SALVAGE? [yn]

CG 31: PASS5: BAD MAGIC NUMBER
ALTERNATE SUPERBLK(S) ARE INCORRECT
SALVAGE? [yn] y

SUMMARY INFORMATION BAD
SALVAGE? [yn] y

BLK(S) MISSING IN BIT MAPS
SALVAGE? [yn] y

CG 799: PASS5: BAD MAGIC NUMBER
CG 801: PASS5: BAD MAGIC NUMBER
CG 806: PASS5: BAD MAGIC NUMBER
CG 823: PASS5: BAD MAGIC NUMBER
CG 962: PASS5: BAD MAGIC NUMBER
CG 966: PASS5: BAD MAGIC NUMBER
482470 files, 113827090 used, 67860178 free (3818 frags, 8482045 blocks, 0.0% fragmentation)

***** FILE SYSTEM WAS MODIFIED *****
```

I did not find much information on what the magic number of a cylinder group means or what could have caused it to be "bad" :-/ Anyway, a repeated fsck shows no further errors, so I thought it should be fine. However, after mounting the FS on /export, with

```
$ find /export
```

I can still reproducibly trigger the 100% CPU problem mentioned above. find always hangs at the same directory entry.
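To narrow down whether the hang happens while reading the directory or while looking up one particular entry, a small walker along the following lines might help. This is only a minimal sketch, assuming Python 3 is available on the VM; `walk_verbose` is a hypothetical helper name, not an existing tool. It logs each step before performing it, so the last line printed should name the operation that never returned.

```python
import os
import sys


def walk_verbose(root, out=sys.stderr):
    """Walk root depth-first, logging each step before it is executed.

    Returns the list of directories that were entered.
    """
    visited = []
    stack = [root]
    while stack:
        path = stack.pop()
        # Log and flush first: if the listing never returns, this is the last line.
        print("readdir", path, file=out, flush=True)
        visited.append(path)
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    # Force an inode lookup for every entry, logged beforehand.
                    print("lstat", entry.path, file=out, flush=True)
                    os.lstat(entry.path)
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError as exc:
            # Report the error and continue with the remaining directories.
            print("error", path, exc, file=out, flush=True)
    return visited
```

Calling `walk_verbose("/export")` would then show whether the last logged step is a `readdir` of the directory or an `lstat` of a single entry inside it.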

Does anyone have an idea how I can investigate this further? I have already run ktrace on find, but in the state in question there seems to be no activity in find itself.

Kind regards
Matthias

