Subject: Re: fsck and soft updates (soft dependencies)
To: NetBSD Users's Discussion List <netbsd-users@netbsd.org>
From: Chuck Swiger <cswiger@mac.com>
List: netbsd-users
Date: 09/05/2006 10:19:43
On Sep 4, 2006, at 8:53 PM, Greg A. Woods wrote:
>> For the most part, filesystem snapshots have worked pretty well under
>> a bunch of cases on FreeBSD; the example of using dump/restore is a
>> good one, also for trying to obtain a more consistent file-level
>> backup of an active database, and so forth.
>
> Unless you've done something to integrate some kind of consistency
> flush/halt feature into your database software there's no guarantee
> of a consistent file or file-system level backup of any active
> database, especially not if you're talking about anything more
> sophisticated than flat text files maintained with awk and join
> et al.

You are right that there is no guarantee; however, using snapshots  
means that you'll get a copy of the database with all OS-level  
buffers flushed to disk, and most "real" databases take some care  
that what they scribble to disk can be made consistent by rolling  
backwards with the transaction logs until you've removed any pending  
or incomplete transactions from the time of the snapshot.
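
To make the rollback idea concrete, here is a minimal sketch in
Python.  It is a toy undo log, not the recovery code of any real
database; the record layout and the names recover, log and
pages_at_snapshot are made up for illustration.

```python
# Toy undo-log recovery; an illustration only, not modeled on any
# particular database.  Log records are hypothetical tuples:
#   ("write", txn_id, key, old_value, new_value)
#   ("commit", txn_id)

def recover(pages, log):
    """Undo every write whose transaction has no commit record."""
    committed = {txn for op, txn, *_ in log if op == "commit"}
    restored = dict(pages)
    # Walk the log backwards so the oldest pre-image is applied last.
    for op, txn, *rest in reversed(log):
        if op == "write" and txn not in committed:
            key, old_value, _new_value = rest
            if old_value is None:
                restored.pop(key, None)   # key did not exist before
            else:
                restored[key] = old_value
    return restored


log = [
    ("write", 1, "alice", 100, 70),
    ("write", 1, "bob", 50, 80),
    ("commit", 1),
    ("write", 2, "alice", 70, 20),   # txn 2 still in flight at snapshot
]
pages_at_snapshot = {"alice": 20, "bob": 80}

print(recover(pages_at_snapshot, log))   # {'alice': 70, 'bob': 80}
```

Transaction 1 committed, so its writes stay; transaction 2 had not,
so its write to "alice" is undone.  This only works if the data and
the log were captured at the same moment.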

The purpose of transaction logging, or of a "majority rules" algorithm
for resolving whether an update has been applied, was originally to
handle the case where the system went down completely (due to power
failure or severe software error) and to get the DB back to a
consistent state with minimal loss.  These mechanisms do not work
nearly as well if the database files are archived at different times
(i.e., as tar or dump proceeds sequentially through the directory
hierarchy).  Using a snapshot, you can archive all of the database
files as of a specific instant, which significantly improves the
consistency and recoverability of a live database dump.
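
A rough simulation of the difference, again in Python.  The two
"files" and the transfer between them are invented for the example;
the only point is that a file-by-file copy can straddle an update,
while a copy taken from a point-in-time image cannot.

```python
import copy

# Hypothetical two-file database: a transfer debits one file and
# credits the other, so the balances should always sum to 100.

def transfer(files, amount):
    files["a.db"]["balance"] -= amount
    files["b.db"]["balance"] += amount

def total(files):
    return sum(f["balance"] for f in files.values())


live = {"a.db": {"balance": 60}, "b.db": {"balance": 40}}

# Sequential archive (what tar or dump does on a live filesystem):
# the files are read at different times and an update lands between.
archive = {}
archive["a.db"] = copy.deepcopy(live["a.db"])
transfer(live, 25)
archive["b.db"] = copy.deepcopy(live["b.db"])

# Snapshot: freeze every file at one instant, then archive from the
# frozen image while the live copy keeps changing.
snapshot = copy.deepcopy(live)
transfer(live, 10)
from_snapshot = {name: copy.deepcopy(snapshot[name])
                 for name in ("a.db", "b.db")}

print(total(archive), total(from_snapshot))   # 125 100
```

The sequential archive ends up with 25 units that exist in both files,
which no transaction log can sort out after the fact, while the copy
taken from the snapshot still sums to 100.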

> You must still stop your database completely to ensure that any
> data still cached in any process is flushed and the file-level
> data is 100% consistent.

Agreed.  I used the phrase "more consistent" rather than "100%  
consistent".

-- 
-Chuck