NetBSD-Bugs archive
Re: kern/58553: ffs: garbage data appended after crash
The following reply was made to PR kern/58553; it has been noted by GNATS.
From: Robert Elz <kre%munnari.OZ.AU@localhost>
To: campbell+netbsd%mumble.net@localhost
Cc: gnats-bugs%netbsd.org@localhost
Subject: Re: kern/58553: ffs: garbage data appended after crash
Date: Mon, 05 Aug 2024 01:51:02 +0700
Date: Sun, 4 Aug 2024 15:30:01 +0000 (UTC)
From: campbell+netbsd%mumble.net@localhost
Message-ID: <20240804153001.C637C1A923F%mollari.NetBSD.org@localhost>
| 1. start a write-heavy workload
That's not necessarily needed ... I've seen cases where this kind of
thing happens on an almost idle system, where metadata updates were
all done, but data hadn't been written to files when the system crashed
(sudden complete power loss I think it was) about 12 hours after the
data had been written. The data writes depend upon something in the
system bothering to do them. With a write-heavy workload that is unlikely
to take long, but without one it can sometimes be a very long time.
In my case I could easily tell as the data that was "lost" (not really,
I had copies) was incoming e-mail - the mail files all looked to be there,
had appropriate modify times, sizes, etc, but garbage contents.
[Since then I have my own replacement for update(8) running all the time!]
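For illustration only, the general idea of such a periodic flusher can be sketched as below. This is not the replacement mentioned above, whose details aren't given here; the function name `periodic_sync`, its parameters, and the 30 second interval are all assumptions made for the sketch. It relies on Python's `os.sync()`, which asks the kernel to write out dirty buffers.

```python
import os
import time


def periodic_sync(interval_s=30.0, rounds=None, sync_fn=os.sync, sleep_fn=time.sleep):
    """Call sync_fn every interval_s seconds (hypothetical helper).

    rounds=None runs forever; a number limits the calls, which is
    mainly useful for testing. Returns the number of syncs issued.
    """
    done = 0
    while rounds is None or done < rounds:
        sync_fn()  # ask the kernel to write out dirty buffers
        done += 1
        if rounds is None or done < rounds:
            sleep_fn(interval_s)  # wait before the next flush
    return done
```

Run as `periodic_sync()` it loops forever. Note that this only bounds how stale unwritten data can get; it does nothing about the ordering of data and metadata writes.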
| >Fix:
I doubt that can be called a fix. A hack which might work around some
of the issues - perhaps the most common case - but not a fix.
Two major issues I can see. First, nothing in your proposal covers
the case of data overwrites, where the metadata (other than the mtime)
isn't being altered at all, but several blocks of data are being written
somewhere in the middle of a file - some of those might be written, and
others not, leading to garbage in the file which is neither its before
nor intended after state. Your "at the end" case is just the common
case of that, but to be considered a fix, all of it would need fixing.
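A toy model of that overwrite case, assuming a file is just a byte string divided into fixed-size blocks and that any subset of the overwritten blocks may have reached the disk before the crash. The name `crash_image` and the 4-byte block size are invented for the example.

```python
def crash_image(before, after, block_size, written_blocks):
    """Contents seen after a crash if only written_blocks reached the disk.

    before/after are the file contents prior to and following an
    in-place overwrite of the same length (hypothetical model).
    """
    if len(before) != len(after):
        raise ValueError("in-place overwrite must not change the size")
    out = bytearray(before)
    for b in written_blocks:
        lo = b * block_size
        hi = lo + block_size
        out[lo:hi] = after[lo:hi]  # this block made it to disk
    return bytes(out)


before = b"AAAABBBBCCCCDDDD"
after = b"AAAAxxxxyyyyDDDD"
torn = crash_image(before, after, 4, {1})  # block 1 written, block 2 not
```

Here `torn` matches neither `before` nor `after`, even though the size and block allocation never changed, so a record that only tracks growth at the end of the file has nothing to say about it.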
And:
| 3. Change ffs_fsync and ffs_full_fsync so that if they are syncing any
| prefix of the interval [k0, k1],
And if not syncing a prefix - but some data in the middle? Easy to just
not update things in that case, but sometime later, when the earlier part
of the interval has been written, the record would need to account for all
of these other blocks as well, as their writes won't happen again. The
typical solution to that is
to split the record into two on any write to a segment in the interval, one
for what is still to come before, and one for what comes after, omitting
either, or both, of those if empty. In hard cases that can deteriorate
into a real mess.
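The splitting can be sketched as follows, as a minimal illustration rather than anything proposed in the PR. Intervals are taken as half-open `(start, end)` pairs, which is an assumption (the quoted text writes [k0, k1]), and `split_pending` / `apply_sync` are made-up names.

```python
def split_pending(pending, synced):
    """Split one pending interval around a range that was just synced.

    Returns the pieces still to be written: the part before the synced
    range and the part after it, omitting either if empty.
    """
    p0, p1 = pending
    s0, s1 = max(synced[0], p0), min(synced[1], p1)  # clip to the record
    if s0 >= s1:
        return [pending]  # no overlap, record unchanged
    pieces = []
    if p0 < s0:
        pieces.append((p0, s0))  # what is still to come before
    if s1 < p1:
        pieces.append((s1, p1))  # what comes after
    return pieces


def apply_sync(records, synced):
    """Apply one synced range to every pending record."""
    out = []
    for rec in records:
        out.extend(split_pending(rec, synced))
    return out


records = [(0, 100)]
for synced in [(10, 20), (40, 50), (70, 80)]:
    records = apply_sync(records, synced)
```

After three scattered writes the single record has become four, which is the kind of growth that turns into a mess in the hard cases.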
However:
| (We can also use truncate(n,k) records to make truncate itself atomic
that one probably would be a benefit, though whether it is sufficiently
useful to add this extra mechanism, and forgo backward compat, I doubt.
After all, everything needed to finish a truncate is in the metadata, if
the size says the file should be 100 bytes, and there are blocks allocated
beyond that, those can easily be removed during file system cleanup, after
the crash. That is, we can deduce what was happening from the state
that remains.
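As a rough sketch of that deduction, assuming a simplified model in which a file has a byte size and a set of allocated logical block numbers (fragments and indirect blocks are ignored, and `blocks_past_eof` is a hypothetical name, not an fsck routine):

```python
def blocks_past_eof(size, allocated, block_size):
    """Logical blocks allocated beyond what the recorded size needs.

    size is the file size from the inode, allocated the logical block
    numbers that have storage assigned (simplified model).
    """
    needed = -(-size // block_size)  # ceiling division: blocks covering size
    return sorted(b for b in allocated if b >= needed)


# Size says 100 bytes, but blocks 1, 2 and 5 are still allocated.
leftover = blocks_past_eof(100, {0, 1, 2, 5}, 512)
```

Everything in `leftover` can be released during cleanup using only the state found on disk, with no extra record needed.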
This is all much much harder than it looks. If we really believe some
kind of better method is needed, we should probably bribe Kirk to come
and make softdeps work in NetBSD. Not that even that is a full solution:
data corruption after a crash is extremely hard to avoid without doing
fully synchronous (all the way to the flash or platter) I/O - which is
not something most people would tolerate most of the time.
kre