Re: Automated report: NetBSD-current/i386 build failure

To: Andreas Gustafsson <gson%gson.org@localhost>
Subject: Re: Automated report: NetBSD-current/i386 build failure
From: "Greg A. Woods" <woods%planix.ca@localhost>
Date: Sat, 12 Nov 2011 15:04:20 -0800

At Sat, 12 Nov 2011 12:49:29 +0200, Andreas Gustafsson 
<gson%gson.org@localhost> wrote:
> 
> Thor Lancelot Simon wrote:
> > On Fri, Nov 11, 2011 at 11:33:10PM +0200, Andreas Gustafsson wrote:
> > > 
> > > Your assumption is correct, and the next build did succeed.  Sorry about
> > > the false alarm.  If only there was a way to mirror the repository that
> > > would guarantee a consistent snapshot...
> > 
> > Not really possible; CVS does not have changelists.  You can't do any
> > better than it does for the developers themselves!

Indeed, CVS can do as well as the average developer might normally do
(and NetBSD developers seem to be well above average at doing this).  It
doesn't really need these so-called "change lists" to be as deeply
integrated as people often seem to believe they should be.

While the files in the repository don't have any level of "direct"
atomicity with respect to "simultaneous" commits to multiple files,
there is a "view" available which can show the transaction points quite
reliably, and that is through the same mechanism which generates the
source-changes e-mails.  I.e. the mechanism which generates the commit
reports reliably identifies change sets to precisely the same degree
that developers are able to "mark" them by using one "cvs commit"
command to commit all related changes in all related files at once.  As
I say, NetBSD developers do seem to be, on the whole, quite well
disciplined at using CVS in this way.

> So when I'm talking about getting a consistent snapshot of the
> repository, I'm really talking about read atomicity only.  I'm willing
> to take the small risk of snapshotting the repository while it is in
> the middle of a commit, but I'd like to eliminate the much greater
> risk of a commit happening in the middle of mirroring.  There is no
> fundamental reason why this couldn't be done despite the broken
> semantics of CVS, but that still doesn't mean it's easy...

I've been running periodic rsyncs (every 6 hours) of the repository
since sometime in the middle of 2001, and after each rsync I run a "cvs
update" in several different working directories, each of which is a
checkout of one of the main source modules on one of the branches I'm
interested in, and those include src and xsrc trunk checkouts (i.e. the
most active modules and their most active branches).  (Further, I
regularly run "cvs update" by hand in yet other working directories
where I keep local changes, without even thinking about whether an rsync
happens to be running at the time.)

To the best of my memory I've never seen CVS trip over a file which was
"corrupted" because rsync caught it in the middle of being updated,
either in the reports from my automated updates, or from my random
manual updates.

I.e. CVS updates the repository in such a way that revisions prior to
the update activity can be reliably retrieved concurrently with any
ongoing commits.  IIRC this is a guarantee by design.  CVS, I believe,
does do commits atomically on a file-by-file basis, such that, I think,
even a foreign agent can safely copy the repository during commit
activity.  My use over the past decade seems to confirm this in
practice.  (The one caveat is that I rsync from anoncvs, and that's
already a copy, though I think it too is made without regard to ongoing
commit activity.)

So, to create or update a checkout of a module such that you get as
stable a set of sources as possible, in between any commit activity, one
need only specify a date and time that's somewhere mid-way between when
the last two sufficiently separated(*) source-changes messages were
generated just before the rsync started.  That's as perfect as can be
with the existing tools(**) I think, and you can't do any better than
the developers who are feeding the changes into the repository anyway.

(*) You need to find two commits which are far enough apart in time that
other possibly concurrent commits won't overlap the gap, and of course
such that their time delta is greater than any possible clock skew
between the original CVS server and the build server.  It will be a
compromise of course, since I doubt you'll find enough quiescent times
long enough to more than span really large commits.
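
As a rough illustration of the selection rule above, here is a minimal
Python sketch.  It assumes the timestamps of the source-changes messages
have already been collected somehow; the names pick_checkout_time,
cvs_date and min_gap are hypothetical, and min_gap stands in for
"sufficiently separated", so it should exceed both the longest expected
commit and any clock skew.  The date string is in a form I believe
"cvs update -D" accepts, but treat that as an assumption too.

```python
from datetime import datetime, timedelta, timezone


def pick_checkout_time(commit_times, rsync_start, min_gap):
    """Return the midpoint of the most recent gap of at least min_gap
    between two consecutive commit reports that both precede rsync_start,
    or None if no such gap exists."""
    times = sorted(t for t in commit_times if t < rsync_start)
    # Walk consecutive pairs from the newest pair back to the oldest.
    for earlier, later in zip(reversed(times[:-1]), reversed(times[1:])):
        if later - earlier >= min_gap:
            return earlier + (later - earlier) / 2
    return None


def cvs_date(when):
    """Format a timezone-aware datetime for use as a -D argument
    (assumed format: ISO-like date and time followed by GMT)."""
    return when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S GMT")


if __name__ == "__main__":
    utc = timezone.utc
    reports = [
        datetime(2011, 11, 12, 10, 0, tzinfo=utc),
        datetime(2011, 11, 12, 10, 1, tzinfo=utc),
        datetime(2011, 11, 12, 10, 40, tzinfo=utc),
        datetime(2011, 11, 12, 10, 41, tzinfo=utc),
    ]
    stamp = pick_checkout_time(reports,
                               datetime(2011, 11, 12, 11, 0, tzinfo=utc),
                               timedelta(minutes=10))
    if stamp is not None:
        print('cvs update -D "%s"' % cvs_date(stamp))
```

The result would then be passed to "cvs update -D" (or "cvs checkout
-D") in each working directory after the rsync completes.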

(**) it might be possible to add more reliable "quiescent time" markers
to a log by adding some kind of activity counter and activity logging to
the scripts which generate the commit reports.  A quiescent time will be
one where there is a sufficient span of time between when the log is
updated with the last active commit finishing (number of active commits
transitions to zero) and the next commit starting (active commits
transitions to 1).
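
The activity-counter idea in (**) might look something like the
following Python sketch.  It assumes the commit scripts were changed to
log a "start" and a "finish" event with a timestamp for every commit;
no such log exists today as far as this message says, and the event
format and the names quiescent_windows and min_span are hypothetical.

```python
def quiescent_windows(events, min_span):
    """Given time-ordered (timestamp, kind) events, where kind is
    "start" or "finish", return a list of (idle_from, idle_until) spans
    during which no commit was active for at least min_span."""
    active = 0          # number of commits currently in progress
    idle_since = None   # when the counter last dropped to zero
    windows = []
    for when, kind in events:
        if kind == "start":
            # Counter goes 0 -> 1: this closes any idle span.
            if active == 0 and idle_since is not None:
                if when - idle_since >= min_span:
                    windows.append((idle_since, when))
            active += 1
            idle_since = None
        elif kind == "finish":
            active -= 1
            if active == 0:
                # Counter reached zero: an idle span begins here.
                idle_since = when
        else:
            raise ValueError("unknown event kind: %r" % (kind,))
    return windows


if __name__ == "__main__":
    # Timestamps here are plain seconds; datetime objects with a
    # timedelta min_span would work the same way.
    log = [(0, "start"), (5, "start"), (8, "finish"), (20, "finish"),
           (100, "start"), (110, "finish"), (115, "start"), (120, "finish")]
    print(quiescent_windows(log, 60))
```

Any time inside one of the returned spans, less an allowance for clock
skew, would be a candidate date for a checkout.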

-- 
                                                Greg A. Woods
                                                Planix, Inc.

<woods%planix.com@localhost>       +1 250 762-7675        http://www.planix.com/


