Re: Removing softdep

To: Vincent <10.50%free.fr@localhost>
Subject: Re: Removing softdep
From: "Greg A. Woods; Planix, Inc." <woods%planix.ca@localhost>
Date: Tue, 10 Jun 2008 09:33:40 -0400


On 10-Jun-08, at 4:09 AM, Vincent wrote:

> Sort of. Let's say there could be two levels of reliability: the first would still enable copy-on-write, but write block and data synchronously, beginning by the latter, so that what could happen at worst would be a loss of data, but no file corruption or exposition of sensitive data. A second level would bypass the copy-on-write and implement write-through, so that no data would be lost, or a minimal amount.

It's not quite that simple, as far as I understand. I'm also not so sure that 'mount -o sync' isn't already almost as good as you suggest. I think the 'sync' flag on FFS only avoids the buffer cache for writes, thus reducing the amount of data loss/corruption (and exposure) to just the last block(s) being written to the file(s) being written to at the time of the crash. Many good safety-conscious applications already do that without giving up the basic benefits of the buffer cache, by writing new data to temporary files and then doing an fsync() before closing and finally renaming them into place. It's the unix way. :-)
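For illustration, the temporary-file / fsync() / rename() idiom mentioned above might look roughly like the following. This is only a minimal sketch, not code from any particular application: the helper name atomic_write() is made up, and the final directory fsync is an extra step assumed here so that the rename itself reaches stable storage.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` so that a crash leaves either the old
    file or the complete new file, never a partially written one.
    (Hypothetical helper name; a sketch of the idiom only.)"""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same file system as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # drain Python's own userspace buffer
            os.fsync(tmp.fileno())  # ask the kernel to put the data on disk
        # os.replace() is rename(2) on POSIX systems: the name switches
        # from the old contents to the new contents in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: also fsync the containing directory so the new directory
    # entry is durable. This works on typical Unix systems.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that mkstemp() creates the file with mode 0600, so a real application would probably also want to copy the original file's permissions before renaming.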

I think you can only prevent corruption or exposure at the FS layer if you go one step further. You have to write all the FS metadata carefully, i.e. in the right order such that a repair tool can clean up any incomplete updates or inconsistencies. Beyond that, you have to mark the blocks in the block list as allocated and pending, then you have to write the data to those blocks, and finally, after every block write is finished, you have to update the block list to say that the just-written block is now up-to-date and contains "valid" data. I.e. add another map, or flags to the block list, or something such that blocks can be separately allocated and then marked as valid, thus in effect replicating at the block level what an application does by using temporary files and fsync();rename().
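To make the three-step ordering concrete, here is a toy in-memory model of it. It is purely a sketch under stated assumptions: ToyDisk, its per-block state flags, and the crash_* parameters are all invented for illustration and do not correspond to any real FFS structure, and each step is assumed to be a synchronous write that completes in order.

```python
from enum import Enum


class State(Enum):
    FREE = 0
    PENDING = 1  # allocated to a file, data not yet confirmed on disk
    VALID = 2    # data write finished, contents may be trusted


class ToyDisk:
    """Hypothetical model: one state flag per block plus the block data."""

    def __init__(self, nblocks):
        self.state = [State.FREE] * nblocks
        # Pretend every block still holds leftovers from some deleted file.
        self.data = [b"stale"] * nblocks

    def write_block(self, blkno, payload,
                    crash_before_data=False, crash_before_valid=False):
        # Step 1: record the block as allocated but pending.
        self.state[blkno] = State.PENDING
        if crash_before_data:
            return
        # Step 2: write the data itself.
        self.data[blkno] = payload
        if crash_before_valid:
            return
        # Step 3: only now mark the block as holding valid data.
        self.state[blkno] = State.VALID

    def read_block(self, blkno):
        # Never hand out contents of a block that is not marked valid,
        # so stale data from another file cannot be exposed.
        if self.state[blkno] is not State.VALID:
            return None
        return self.data[blkno]

    def repair(self):
        """fsck-like pass: release every block left pending by a crash."""
        reclaimed = [i for i, s in enumerate(self.state)
                     if s is State.PENDING]
        for i in reclaimed:
            self.state[i] = State.FREE
        return reclaimed


disk = ToyDisk(4)
disk.write_block(0, b"hello")                          # completes normally
disk.write_block(1, b"secret", crash_before_valid=True)
disk.write_block(2, b"x", crash_before_data=True)
```

After the two simulated crashes, blocks 1 and 2 are still pending, so read_block() refuses to return their contents and repair() hands them back to the free list. The cost is visible too: every data block needs two ordered metadata updates around it.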

That's going to be terribly slow on any mechanical rotating storage device without a write-back cache somewhere below in the hardware layer, and just as unreliable with a write-back cache if you can't guarantee it will get safely flushed before the hardware is reset somehow.

At least conceptually a journalling style of FS can give you all of that reliability and integrity all of the time, and as a bonus you get some decent performance along with it too. A good journalling FS shouldn't need a full fsck after any crash either.


--
                                        Greg A. Woods; Planix, Inc.
                                        <woods%planix.ca@localhost>

