tech-kern archive
Re: Removing softdep
On 10-Jun-08, at 4:09 AM, Vincent wrote:
> Sort of. Let's say there could be two levels of reliability: the first
> would still enable copy-on-write, but write block and data
> synchronously, beginning by the latter, so that what could happen at
> worst would be a loss of data, but no file corruption or exposition of
> sensitive data. A second level would bypass the copy-on-write and
> implement write-through, so that no data would be lost, or a minimal
> amount.
It's not quite that simple as far as I understand. I'm also not so
sure that 'mount -o sync' isn't already almost as good as you
suggest. I think the 'sync' flag on FFS only avoids the buffer cache
for writes thus reducing the amount of data loss/corruption (and
exposure) to just the last block(s) being written to the file(s) being
written to at the time of the crash. Many good safety-conscious
applications already do that without giving up the basic benefits of
the buffer cache by writing new data to temporary files and then doing
an fsync() before closing and finally renaming them into place. It's
the unix way. :-)
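For anyone who hasn't seen that idiom spelled out, here is a minimal
sketch of it in C. The function name atomic_replace() and the
".tmp.XXXXXX" naming scheme are just assumptions for illustration; the
calls themselves (mkstemp, write, fsync, close, rename) are the usual
POSIX ones.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Replace `path` with `len` bytes from `buf` so that a crash leaves
 * either the old contents or the complete new contents, never a mix.
 * Returns 0 on success, -1 on failure.
 */
int
atomic_replace(const char *path, const void *buf, size_t len)
{
	char tmp[1024];
	const char *p = buf;
	int fd, n;

	assert(path != NULL && buf != NULL);

	/* Hypothetical naming scheme: keep the temporary file beside the
	 * target so that rename() stays within one file system. */
	n = snprintf(tmp, sizeof(tmp), "%s.tmp.XXXXXX", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;
	if ((fd = mkstemp(tmp)) == -1)
		return -1;

	/* Write through the buffer cache as usual; loop on short writes. */
	while (len > 0) {
		ssize_t w = write(fd, p, len);
		if (w == -1)
			goto fail;
		p += w;
		len -= (size_t)w;
	}

	/* Force the new data to stable storage before it becomes visible. */
	if (fsync(fd) == -1)
		goto fail;
	if (close(fd) == -1) {
		fd = -1;
		goto fail;
	}
	fd = -1;

	/* Atomically swap the new file into place. */
	if (rename(tmp, path) == -1)
		goto fail;
	return 0;

fail:
	if (fd != -1)
		close(fd);
	unlink(tmp);
	return -1;
}
```

Note that mkstemp() creates the file with mode 0600, so a real program
would probably want to copy the old file's permissions, and a really
careful one might also fsync() the containing directory after the
rename. Neither is shown here.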
I think you can only prevent corruption or exposure at the FS layer if
you go one step further. You have to write all the FS metadata
carefully (i.e. in the right order such that a repair tool can clean
up any incomplete updates or inconsistencies), but you also have to
mark the block list as allocated and pending, then you have to write
the data to those blocks, and finally after every block write is
finished you have to update the block list to say that the
just-written block is now up-to-date and contains "valid" data. I.e.
add another map, or flags to the block list, or something such that
blocks can be separately allocated and then marked as valid; thus in
effect replicating at the block level what an application does by
using temporary files and fsync(); rename().
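To make the ordering concrete, here is a toy in-memory model of that
allocate-pending / write / mark-valid sequence. Everything in it
(struct toy_fs, the blk_* names, the sizes) is made up for
illustration and has nothing to do with the real FFS code; in a real
file system each of the three steps would have to reach the disk
before the next one starts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NBLOCKS 8
#define BLKSIZE 16

enum blk_state { BLK_FREE, BLK_PENDING, BLK_VALID };

struct toy_fs {
	enum blk_state state[NBLOCKS];	/* the extra "valid" map */
	char data[NBLOCKS][BLKSIZE];	/* stands in for on-disk blocks */
};

/* Step 1: allocate a block, but only as "pending". Returns -1 if full. */
int
blk_alloc_pending(struct toy_fs *fs)
{
	for (int b = 0; b < NBLOCKS; b++) {
		if (fs->state[b] == BLK_FREE) {
			fs->state[b] = BLK_PENDING;
			return b;
		}
	}
	return -1;
}

/* Step 2: write the data into the pending block. */
void
blk_write(struct toy_fs *fs, int b, const char *src, size_t len)
{
	assert(fs->state[b] == BLK_PENDING && len <= BLKSIZE);
	memcpy(fs->data[b], src, len);
}

/* Step 3: only after the data write has finished, mark the block valid. */
void
blk_commit(struct toy_fs *fs, int b)
{
	assert(fs->state[b] == BLK_PENDING);
	fs->state[b] = BLK_VALID;
}

/* Readers never see a block that was not committed, so whatever stale
 * contents it held before allocation cannot be exposed. */
const char *
blk_read(const struct toy_fs *fs, int b)
{
	return fs->state[b] == BLK_VALID ? fs->data[b] : NULL;
}

/* What a repair tool would do after a crash: release every block that
 * was allocated but never marked valid. Returns how many it released. */
int
blk_recover(struct toy_fs *fs)
{
	int n = 0;

	for (int b = 0; b < NBLOCKS; b++) {
		if (fs->state[b] == BLK_PENDING) {
			fs->state[b] = BLK_FREE;
			n++;
		}
	}
	return n;
}
```

A crash between steps 1 and 2, or between 2 and 3, leaves the block
pending, so recovery just throws it away: you lose the data being
written but you never get to read somebody else's old block.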
That's going to be terribly slow on any mechanical rotating storage
device without a write-back cache somewhere below in the hardware
layer, and just as unreliable with a write-back cache if you can't
guarantee it will get safely flushed before the hardware is reset
somehow.
At least conceptually a journalling style of FS can give you all of
that reliability and integrity all of the time and as a bonus you get
some decent performance along with it too. A good journalling FS
shouldn't need a full fsck after any crash either.
--
Greg A. Woods; Planix, Inc.
<woods%planix.ca@localhost>