tech-kern archive


Re: ffs snapshots patch



On Mon, Apr 18, 2011 at 09:36:25AM +0200, Juergen Hannken-Illjes wrote:
> [...]
> > Fixing 2) is trickier. To avoid the heavy writes to the snapshot file
> > with the fs suspended, the snapshot appears with its real length and
> > blocks at the time of creation, but is marked invalid (only the
> > inode block needs to be copied, and this can be done before suspending
> > the fs). Now BLK_SNAP should never be seen as a block number, and we skip
> > ffs_copyonwrite() if the write is to a snapshot inode.
> 
> I strongly object here.  There are good reasons to expunge old snapshots.
> 
> Even if it were done right, without deadlocks and locking-against-self,
> the resulting snapshot loses at least two properties:
> 
> - A snapshot is considered stable.  Whenever you read a block you get
>   the same contents.  Allowing old snapshots to exist but not running
>   copy-on-write means these blocks will change their contents.
> 
> - A snapshot will fsck clean.  It is impossible to change fsck_ffs
>   to check a snapshot, as these old snapshots' indirect blocks will
>   now contain garbage.

Maybe we should relax these constraints, then.

> 
> You cannot copy blocks before suspension without rewriting them once
> the file system is suspended.
> 
> The check in ffs_copyonwrite() will only work as long as the old
> snapshot exists.  As soon as it gets removed we will run COW
> on the blocks used by the old snapshot.

Is that a problem?

On Sat, Apr 23, 2011 at 10:40:05AM +0200, Juergen Hannken-Illjes wrote:
> [...]
> 
> These times depend on the file systems block size.  With contiguous indirect
> blocks (ffs_balloc.c rev 1.54) I did timings on a 1.4 TByte UFS1 non-logging
> file system created on 3 concatenated WD5003ABYX.  For every block size
> I created four persistent snapshots (unmounting the file system after
> each creation) and got these times (seconds):
> 
> Layout            create suspended
> 
> 91441948 x 16384  385.713   22.785
> 91441948 x 16384  414.170   59.580
> 91441948 x 16384  474.164   91.385
> 91441948 x 16384  652.556  111.314
> 
> 45720974 x 32768   43.478    0.420
> 45720974 x 32768   40.790    5.642
> 45720974 x 32768   49.700   12.748
> 45720974 x 32768   55.599   18.612
> 
> 22860487 x 65536    7.005    0.600
> 22860487 x 65536   10.558    2.436
> 22860487 x 65536   14.365    4.122
> 22860487 x 65536   18.615    5.739
> 
> For me, snapshots are created reasonably fast with a block size of 32k or 64k.

On my test system (16k/2k UFS2, logging, quotas) I get:
/usr/bin/time fssconfig fss0 /home /home/snaps/snap0
      141.69 real         0.00 user         1.22 sys
/home: suspended 14.716 sec, redo 0 of 2556
/usr/bin/time fssconfig fss1 /home /home/snaps/snap1
      213.87 real         0.00 user         1.98 sys
/home: suspended 64.027 sec, redo 0 of 2556
/usr/bin/time fssconfig fss2 /home /home/snaps/snap2
      290.82 real         0.00 user         3.06 sys
/home: suspended 120.641 sec, redo 0 of 2556
/usr/bin/time fssconfig fss3 /home /home/snaps/snap3
      342.11 real         0.00 user         3.92 sys
/home: suspended 170.733 sec, redo 0 of 2556

Even a 14 s hang is a long time for an NFS server (workstations will be
frozen for that long). And even if we can make it shorter with some
filesystem tuning, it still doesn't scale with the size of the filesystem
and the number of snapshots (having 12 persistent snapshots on a
filesystem is not an unreasonable number).
Other OSes can do this with almost no freeze, so it should be possible
(the snapshot may not be fsck-able, but I'm not sure that's the most
important property of FS snapshots).

-- 
Manuel Bouyer <bouyer%antioche.eu.org@localhost>
     NetBSD: 26 years of experience will always make the difference

