Re: Possible problem with WAPBL on FFSV1

To: Brian Buhrow <buhrow%nfbcal.org@localhost>
Subject: Re: Possible problem with WAPBL on FFSV1
From: Reinoud Zandijk <reinoud%NetBSD.org@localhost>
Date: Fri, 31 Mar 2023 11:17:42 +0200

Dear Brian,

On Wed, Mar 22, 2023 at 12:21:41PM -0700, Brian Buhrow wrote:
> 	Hello.  Recently I saw a panic on two different 9.2_stable machines involving the
> filesystem.  The two machines in question are virtual machines, running under Xen, but I don't
> think that's relevant here.  While I'm not sure what the initial panic message was, since they
> were rebooted by an external monitoring script, the result was that they would continually
> panic when a specific directory was accessed.
> 
> Setup:
> Both machines are running a single FFSV1 root filesystem and one directory has over 32,000
> files in it.  This directory is continually appended and, once a day, files are purged from
> it.  When the panic occurs, the systems reboot, run their WAPBL logs, don't check the
> filesystem and, once they access the very large directory, panic again.

Could it be that the VMs were not shut down cleanly at some point? Say, through
a crash of qemu or a power failure of the host? A VM can write out every block
correctly while those blocks stay queued on the host, waiting to be written to
disk. The VM then considers everything written, but the buffers still in
flight are lost if qemu or the host fails catastrophically.
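
As a rough illustration of that scenario, here is a toy Python model. It is
only a sketch: the class and function names are made up for this example, and
it does not model WAPBL, Xen, or qemu internals. It only shows how a write
that was acknowledged to the guest can vanish when the host fails before
flushing its queue.

```python
# Toy model with hypothetical names: a host-side write-back cache that
# acknowledges guest writes before they reach stable storage.

class HostWriteBackCache:
    def __init__(self):
        self.disk = {}      # blocks that reached stable storage
        self.pending = {}   # blocks acknowledged to the guest, still queued

    def guest_write(self, blkno, data):
        # From the guest's point of view this write has completed.
        self.pending[blkno] = data
        return True

    def flush(self):
        # The host writes out everything queued so far.
        self.disk.update(self.pending)
        self.pending.clear()

    def crash(self):
        # The emulator or the host dies: queued blocks are gone.
        lost = sorted(self.pending)
        self.pending.clear()
        return lost


def simulate():
    host = HostWriteBackCache()
    # An earlier block makes it to stable storage ...
    host.guest_write(100, "earlier block")
    host.flush()
    # ... but a later block is still queued when the host fails.
    host.guest_write(7, "directory block")
    lost = host.crash()
    return host.disk, lost


if __name__ == "__main__":
    disk, lost = simulate()
    print("on disk:", disk)
    print("lost   :", lost)
```

In this model the guest believes both blocks are safe, yet only the first one
survives the crash.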

> 
> Once I brought up the VM's single user and ran fsck against the root filesystem, fsck
> complained that the very large directory contained empty blocks.  I told it to clean the
> filesystem and it advised me to run fsck against the same filesystem when I was done with the
> current run of fsck.  I did, it checked out okay, and things seem to be running again without a
> problem.

Empty blocks do seem to support my theory, but maybe an FFS/WAPBL guru can
elaborate?

With regards,
Reinoud

References:
- Possible problem with WAPBL on FFSV1
  - From: Brian Buhrow
