Subject: Re: anoncvs problems
To: Teemu Rinta-aho <teemu@rinta-aho.org>
From: Steven M. Bellovin <smb@cs.columbia.edu>
List: current-users
Date: 02/05/2005 18:30:23
In message <42054D4E.4050003@rinta-aho.org>, Teemu Rinta-aho writes:
>Thor Lancelot Simon wrote:
>> On Sat, Feb 05, 2005 at 03:30:33PM +0200, Teemu Rinta-aho wrote:
>> 
>>>Would it require less resources for anoncvs if it was
>>>only available through cvsup/cvsync/whatever and rsync?
>> 
>> 
>> All three of the options you propose have major problems.
>> 
>> CVSup is written in Modula-3 and it is basically impossible to build a
>> native M3 toolchain on NetBSD.  We quite simply refuse to put our users
>> at risk by running precompiled binaries from third-party sources on our
>> official servers.
>> 
>> CVSync is a better option, but unfortunately its "pull" model does not
>> make it possible to do incremental update of the public server in
>> realtime.  We need to be able to re-scan just _part_ of the tree as
>> changes are committed to the master repository.
>> 
>> rsync, to be blunt, is a horrible pig.  A single copy of rsync quickly
>> explodes to tens or even hundreds of megabytes in size, and its disk
>> access patterns are arguably even worse than those of cvs.
>
>Ok, I see. Well there's then room for one more good piece of software. 
>Anyone with extra time could write an "xsync" or whatever... :-)
>

Hmm -- thinking out loud...  Suppose there were a daily job that ran 
something like 'rsync -avn --delete' or the CVS equivalent.  That 
provides a list of changed files and deleted files, or -- with CVS -- 
a set of diff files plus a list of deleted files.  People could 
download that delta, and use a fairly simple script to apply the 
patches and delete the dregs.
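
To make the daily job concrete, here is a rough Python sketch of the 
server side.  It assumes the server keeps yesterday's snapshot of the 
tree next to today's; the names tree_digests and build_delta are made 
up for illustration, and a real version would emit diffs for the 
changed files rather than just their names.

```python
import hashlib
import os


def tree_digests(root):
    """Map each file's path (relative to root) to a SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as fh:
                digests[rel] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def build_delta(old_root, new_root):
    """Return (changed, deleted) as sorted lists of relative paths.

    changed: files that are new, or whose contents differ, in new_root
    deleted: files present in old_root but gone from new_root
    """
    old = tree_digests(old_root)
    new = tree_digests(new_root)
    changed = sorted(p for p, d in new.items() if old.get(p) != d)
    deleted = sorted(p for p in old if p not in new)
    return changed, deleted
```

The server runs this once per day and publishes the result; every 
client downloads the same small file, so the cost does not grow with 
the number of clients.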

Of course, that's not quite what you want because you'd need a sequence 
of these, going back to the last time you pulled a delta file.  You'd 
need a local state file recording the last delta you applied, and then 
you'd pull in every delta published since, in order.
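
The client-side bookkeeping might look something like the Python 
sketch below.  It assumes the deltas are numbered with consecutive 
integers and that the client can get a list of the numbers the server 
currently offers; both are my assumptions, as are the function names.  
If the server has already expired a delta the client still needs, the 
client has to give up and fetch the whole tree.

```python
import os


def read_state(path):
    """Return the last applied delta number, or 0 if no state file exists."""
    try:
        with open(path) as fh:
            return int(fh.read().strip())
    except FileNotFoundError:
        return 0


def write_state(path, seq):
    """Record seq as the last applied delta, replacing the file atomically."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        fh.write(f"{seq}\n")
    os.replace(tmp, path)


def pending_deltas(last_applied, available):
    """Return the delta numbers to fetch, in the order to apply them.

    Raises ValueError if the server no longer offers a contiguous run
    starting at last_applied + 1, in which case a full fetch is needed.
    """
    wanted = sorted(n for n in set(available) if n > last_applied)
    expected = list(range(last_applied + 1, last_applied + 1 + len(wanted)))
    if wanted != expected:
        raise ValueError("gap in delta sequence; full resync required")
    return wanted
```

The client would call write_state after each delta it applies, so an 
interrupted run picks up where it left off.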

Anyway -- there are a lot of details to work out.  But the principle 
I'm trying to describe is "force the client to do the hard work".  All 
the server is doing is handing out a small set of comparatively small 
files.  Maybe....  As I said, I'm thinking out loud.

Anyone feel like writing some code?

		--Prof. Steven M. Bellovin, http://www.cs.columbia.edu/~smb