tech-kern archive


Re: Possible problem with WAPBL on FFSV1



Dear Brian,

On Wed, Mar 22, 2023 at 12:21:41PM -0700, Brian Buhrow wrote:
> 	Hello.  Recently I saw a panic on two different 9.2_stable machines involving the
> filesystem.  The two machines in question are virtual machines, running under Xen, but I don't
> think that's relevant here.  While I'm not sure what the initial panic message was, since they
> were rebooted by an external monitoring script, the result was that they would continually
> panic when a specific directory was accessed.
> 
> Setup:
> Both machines are running a single FFSV1 root filesystem and one directory has over 32,000
> files in it.  This directory is continually appended and, once a day, files are purged from
> it.  When the panic occurs, the systems reboot, run their WAPBL logs, don't check the
> filesystem and, once they access the very large directory, panic again.

Could it be that the VMs were not shut down cleanly at some point? Say, a
crash of qemu or a power failure on the host? A VM can write out every block
correctly while those blocks remain queued on the host, waiting to be
written. The VM then considers everything written, but the buffers still in
flight are lost if qemu or the host fails catastrophically.
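
To make the scenario above concrete, here is a toy model in Python. It is
only a sketch under my own assumptions: the HostDisk class, its methods and
the block numbers are all hypothetical, and it models neither Xen, qemu nor
WAPBL. It only shows how a block the guest believes is written can read
back empty after the host side fails.

```python
class HostDisk:
    """Toy host-side disk with a volatile write-back queue (hypothetical)."""

    def __init__(self):
        self.stable = {}  # blocks that reached stable storage
        self.queued = {}  # blocks acknowledged to the guest, not yet stable

    def write(self, blkno, data):
        # The guest treats the write as complete as soon as this returns.
        self.queued[blkno] = data

    def flush(self):
        # The host commits everything it has queued to stable storage.
        self.stable.update(self.queued)
        self.queued.clear()

    def crash(self):
        # qemu or the host fails: whatever is still queued is lost.
        self.queued.clear()

    def read(self, blkno):
        # Queued data wins over stable data; an unknown block reads as empty.
        if blkno in self.queued:
            return self.queued[blkno]
        return self.stable.get(blkno, b"")


def demo():
    disk = HostDisk()
    disk.write(100, b"first")   # this block gets flushed before the failure
    disk.flush()
    disk.write(200, b"second")  # acknowledged to the guest, still queued
    disk.crash()
    return disk.read(100), disk.read(200)


if __name__ == "__main__":
    print(demo())
```

In this model the first block survives and the second one reads back as an
empty block, even though the guest was told both writes had completed.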

> 
> Once I brought up the VM's single user and ran fsck against the root filesystem, fsck
> complained that the very large directory contained empty blocks.  I told it to clean the
> filesystem and it advised me to run fsck against the same filesystem when I was done with the
> current run of fsck.  I did, it checked out okay, and things seem to be running again without a
> problem.

Empty blocks do seem to support my theory, but maybe an FFS/WAPBL guru can
elaborate?

With regards,
Reinoud


