Subject: rsync numbers
To: None <current-users@netbsd.org>
From: Andrew Cagney <ac131313@cygnus.com>
List: current-users
Date: 11/11/2000 15:19:06
This is just to follow up on a suggestion I made for an rsync server:
that it could be used to sync unpacked distributions (and hence upgrade
a system without the need to download everything).

Below are two experiments to directly test the theory.  In the
experiments I'm using uncompressed numbers.  Normally you would also
compress all incoming and outgoing data streams.



Experiment 1: rsync an unpacked distribution.

A quick poke at /usr appears to suggest:

  # rsync --archive --delete --dry-run \
      /home/scratch/tmp/usr/{bin,lib.X11R6,games,libexec,sbin}  /usr
  ...
  wrote 12950 bytes  read 2528 bytes  4422.29 bytes/sec
  total size is 27888642  speedup is 1801.82

but I don't believe this :-)  Let's look at one directory, /usr/bin, in
more detail.

Using two copies of that directory, with files dated:

  -r-xr-xr-x  3 root  wheel    63688 Jul 13 12:14 Mail

and:

  -r-xr-xr-x  3 root  wheel    63688 Nov  6 22:57 Mail

Rsync --dry-run gives:

  # rsync --archive --delete --dry-run /tmp/beta/bin /tmp/alpha
  ...
  wrote 6614 bytes  read 1408 bytes  5348.00 bytes/sec
  total size is 13016190  speedup is 1622.56

while the real transfer results in:

  # rsync --archive --delete --verbose /tmp/beta/bin /tmp/alpha
  ...
  wrote 6714886 bytes  read 113842 bytes  130071.01 bytes/sec
  total size is 13016190  speedup is 1.91

so the real transfer moves roughly half the data (a speedup of 1.91),
while rsync --dry-run gives misleading numbers :-(
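
The speedup figures above appear to be the total file size divided by
the bytes actually written and read.  A minimal Python sketch that
reproduces them from the reported numbers (the function names are mine,
not anything rsync provides):

```python
def speedup(total_size, wrote, read):
    """Total file size divided by the bytes that crossed the wire."""
    return total_size / (wrote + read)


def fraction_saved(total_size, wrote, read):
    """Share of the total size that did not have to be transferred."""
    return 1 - (wrote + read) / total_size


# Numbers copied from the two /usr/bin runs above.
dry_run = speedup(13016190, 6614, 1408)
real = speedup(13016190, 6714886, 113842)
saved = fraction_saved(13016190, 6714886, 113842)

print(f"dry run: {dry_run:.2f}  real: {real:.2f}  saved: {saved:.1%}")
```

On these numbers the real transfer saves a little under half of the
total size, which is where the rough 50% figure comes from.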



Experiment 2: rsync the uncompressed tarballs.

Since the files involved are larger, the probability of finding common
data blocks is increased slightly.  It also gives a pretty good
indication of the likely efficiency of rsyncing an entire unpacked
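distribution.  The files are uncompressed because rsyncing compressed
data is pretty much useless: a small change in the input tends to alter
most of the compressed output, leaving few common blocks to find.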

alpha:
  total 355424
  37222400 Nov 11 14:59 base.tar
  44072960 Nov 11 15:00 comp.tar
    563200 Nov 11 15:00 etc.tar
   6809600 Nov 11 15:00 games.tar
  19957760 Nov 11 15:00 man.tar
  10045440 Nov 11 15:00 misc.tar
   4208640 Nov 11 15:00 text.tar
   7884800 Nov 11 15:00 xbase.tar
   7843840 Nov 11 15:00 xcomp.tar
    542720 Nov 11 15:00 xcontrib.tar
   7475200 Nov 11 15:00 xfont.tar
  35112960 Nov 11 15:00 xserver.tar

beta:
  total 364996
  38574080 Nov 11 15:00 base.tar
  46161920 Nov 11 15:00 comp.tar
    614400 Nov 11 15:00 etc.tar
   6952960 Nov 11 15:00 games.tar
  23214080 Nov 11 15:00 man.tar
   7864320 Nov 11 15:00 misc.tar
   4208640 Nov 11 15:00 text.tar
   7895040 Nov 11 15:00 xbase.tar
   7854080 Nov 11 15:00 xcomp.tar
    542720 Nov 11 15:00 xcontrib.tar
   7475200 Nov 11 15:00 xfont.tar
     51200 Nov 11 15:00 xmisc.tar
  35256320 Nov 11 15:01 xserver.tar

(xmisc.tar is a new file).  And the numbers:

  wrote 109286198 bytes  read 589148 bytes  693219.85 bytes/sec
  total size is 186664960  speedup is 1.70
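
As a cross-check, the reported total size should be the sum of the file
sizes listed for beta.  A small Python sketch, with the sizes copied
from the listing above (the variable names are only for illustration):

```python
# Sizes in bytes of the tarballs in beta, copied from the listing above.
beta = {
    "base.tar": 38574080,
    "comp.tar": 46161920,
    "etc.tar": 614400,
    "games.tar": 6952960,
    "man.tar": 23214080,
    "misc.tar": 7864320,
    "text.tar": 4208640,
    "xbase.tar": 7895040,
    "xcomp.tar": 7854080,
    "xcontrib.tar": 542720,
    "xfont.tar": 7475200,
    "xmisc.tar": 51200,
    "xserver.tar": 35256320,
}

total_size = sum(beta.values())
transferred = 109286198 + 589148  # bytes written + bytes read

ratio = total_size / transferred
saved = 1 - transferred / total_size

print("total size:", total_size)
print(f"speedup: {ratio:.2f}  saved: {saved:.1%}")
```

The sum agrees with the total size rsync reports, and the saving works
out to a bit over 40% of the data.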



So what can be concluded?  Inconclusive really.  There is definitely a
significant saving; however, it does come at some cost in other
resources (namely CPU time).  If you're stuck at the wrong end of a 28k
modem you may want to pursue this.
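
To put the modem case in rough numbers, here is a back-of-the-envelope
Python sketch using the tarball figures from experiment 2.  It assumes
a nominal 28800 bit/s link, 8 bits per byte, and no protocol overhead
or compression; none of that was measured, so treat the result as an
order-of-magnitude estimate only:

```python
LINK_BITS_PER_SEC = 28800  # assumption: a nominal 28.8k modem link


def hours_to_transfer(num_bytes, bits_per_sec=LINK_BITS_PER_SEC):
    """Idealised transfer time in hours, ignoring all overhead."""
    return num_bytes * 8 / bits_per_sec / 3600


full = hours_to_transfer(186664960)            # every tarball in full
delta = hours_to_transfer(109286198 + 589148)  # bytes rsync wrote + read

print(f"full download: {full:.1f} h  rsync: {delta:.1f} h")
```

Under those assumptions the saving is a matter of hours, which is the
case where the extra CPU time is most likely to be worth it.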

	enjoy,
		Andrew