At Sat, 12 Nov 2011 12:49:29 +0200, Andreas Gustafsson <gson%gson.org@localhost> wrote:
Subject: Re: Automated report: NetBSD-current/i386 build failure
>
> Thor Lancelot Simon wrote:
> > On Fri, Nov 11, 2011 at 11:33:10PM +0200, Andreas Gustafsson wrote:
> > >
> > > Your assumption is correct, and the next build did succeed.  Sorry about
> > > the false alarm.  If only there was a way to mirror the repository that
> > > would guarantee a consistent snapshot...
> >
> > Not really possible; CVS does not have changelists.  You can't do any
> > better than it does for the developers themselves!

Indeed, CVS can do as well as the average developer might normally do
(and NetBSD developers seem to be well above average at doing this).  It
doesn't really need these so-called "change lists" to be as deeply
integrated as people often seem to believe they should be.

While the files in the repository don't have any level of "direct"
atomicity with respect to "simultaneous" commits to multiple files,
there is a "view" available which can show the transaction points quite
reliably, and that is through the same mechanism which generates the
source-changes e-mails.

I.e. the same mechanism which generates the commit reports reliably
identifies change sets to precisely the same degree that developers are
able to "mark" them by using one "cvs commit" command to commit all
related changes in all related files at once.

As I say, NetBSD developers do seem to be, on the whole, quite well
disciplined at using CVS in this way.

> So when I'm talking about getting a consistent snapshot of the
> repository, I'm really talking about read atomicity only.  I'm willing
> to take the small risk of snapshotting the repository while it is in
> the middle of a commit, but I'd like to eliminate the much greater
> risk of a commit happening in the middle of mirroring.  There is no
> fundamental reason why this couldn't be done despite the broken
> semantics of CVS, but that still doesn't mean it's easy...
I've been running periodic rsyncs (every 6 hours) of the repository
since sometime in the middle of 2001, and after each rsync I run a "cvs
update" in several different working directories, each of which is a
checkout of one of the main source modules on one of the branches I'm
interested in.  Those include src and xsrc trunk checkouts (i.e. the
most active modules and their most active branches).  (Further, I
regularly run "cvs update" manually in yet other working directories in
which I'm keeping changes, without even thinking about whether an rsync
happens to be running at the time.)

To the best of my memory I've never seen CVS trip over a file which was
"corrupted" because rsync caught it in the middle of being updated,
either in the reports from my automated updates or in my random manual
updates.

I.e. CVS updates the repository in such a way that revisions prior to
the update activity can be reliably retrieved concurrently with any
ongoing commits.  IIRC this is a guarantee by design.  CVS, I believe,
does do commits atomically on a file-by-file basis, even, I think, such
that a foreign agent can copy the repository safely during commit
activity.  My use over the past decade seems to confirm this in
practice.  (The one caveat is that I rsync from anoncvs, and that's
already a copy, though I think it too is made without regard to ongoing
commit activity.)

So, to create or update a checkout of a module such that you get as
stable a set of sources as possible, in between any commit activity, one
need only specify a date and time that's somewhere mid-way between when
the last two sufficiently separated(*) source-changes messages were
generated just before the rsync started.  That's as perfect as can be
with the existing tools(**), I think, and you can't do any better than
the developers who are feeding the changes into the repository anyway.
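As a rough illustration of the date selection described above, here is a
minimal Python sketch.  It assumes you have already extracted the
timestamps of the source-changes messages into a list.  The function
name, the min_gap parameter and the sample times are all hypothetical
(they are not part of any existing tool), and the date format printed
for "cvs update -D" is assumed to be one your CVS accepts.

```python
from datetime import datetime, timedelta, timezone


def pick_snapshot_time(commit_times, rsync_start, min_gap):
    """Return the midpoint of the most recent gap of at least min_gap
    between consecutive commit-message times before rsync_start, or
    None if there is no such gap.  (Hypothetical helper.)"""
    earlier = sorted(t for t in commit_times if t < rsync_start)
    # Walk over consecutive (older, newer) pairs, newest pair first.
    for older, newer in zip(reversed(earlier[:-1]), reversed(earlier[1:])):
        gap = newer - older
        if gap >= min_gap:
            return older + gap / 2
    return None


if __name__ == "__main__":
    utc = timezone.utc
    # Made-up message times, for illustration only.
    times = [
        datetime(2011, 11, 12, 10, 0, tzinfo=utc),
        datetime(2011, 11, 12, 10, 1, tzinfo=utc),
        datetime(2011, 11, 12, 10, 30, tzinfo=utc),
        datetime(2011, 11, 12, 10, 31, tzinfo=utc),
    ]
    started = datetime(2011, 11, 12, 11, 0, tzinfo=utc)
    when = pick_snapshot_time(times, started, timedelta(minutes=10))
    if when is not None:
        # Date string intended for something like: cvs update -D "<date>"
        print(when.strftime("%Y-%m-%d %H:%M:%S UTC"))
```

The min_gap value is where the compromise from footnote (*) shows up: it
has to exceed both the duration of the longest plausible commit and any
clock skew between the CVS server and the build machine.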
(*) You need to find two commits which are far enough apart in time
such that other possibly concurrent commits won't be overlapping, and of
course such that their time delta is greater than any possible clock
skew between the original CVS server and the build server.  It will be a
compromise of course, since I doubt you'll find enough quiescent times
long enough to more than span really large commits.

(**) It might be possible to add more reliable "quiescent time" markers
to a log by adding some kind of activity counter and activity logging to
the scripts which generate the commit reports.  A quiescent time will be
one where there is a sufficient span of time between when the log is
updated with the last active commit finishing (number of active commits
transitions to zero) and the next commit starting (active commits
transitions to 1).

--
						Greg A. Woods
						Planix, Inc.

<woods%planix.com@localhost>       +1 250 762-7675        http://www.planix.com/