At Mon, 14 Nov 2011 19:36:08 +0200, Andreas Gustafsson <gson%gson.org@localhost> wrote:
Subject: Re: Automated report: NetBSD-current/i386 build failure
>
> I don't agree that "the part of the rsync process which can actually
> be affected by a commit in progress is relatively short-lived".  Even
> small commits often affect files spread across multiple directories
> that are arbitrarily far apart in the repository tree, such as a
> header file in src/include and some source files in src/usr.bin, or a
> newly added file to be installed and its corresponding entry under
> src/distrib/set/lists, so the time between rsync visiting the two will
> be a sizable fraction of the time it takes to scan the entire tree.

If I'm not too far off track, I think the effect of a single commit
adding changes to multiple files will happen quite quickly in the
repository no matter how far "apart" the files might be in the tree; at
least, IIRC, that should be true when CVS is used in client/server mode.

I think I do see what you mean now though: rsync may have already
scanned some portion of the tree, and thus not have seen the change to
the first file affected, but by the time it gets to the second file,
that file has changed and rsync will update it.

So, yes, choosing any random timestamp prior to the rsync starting will
collapse the window of "vulnerability" to one instant in time instead of
allowing it to be as wide as the time it takes rsync to read in the
last-modified times of all the files.

I guess the question is:  how long a time span is there on the files
affected by the "average" commit?  If it's only about a second or two
for most multi-file commits, then using any random timestamp from just
before the rsync starts really will be quite reasonable for doing
automated builds.  After all, the repository isn't really that busy.

> You don't even need multi-file commits for this to happen.  I saw one
> case where two separate commits were made in different parts of the
> tree, more than two minutes apart, and rsync picked up the later one
> but not the earlier one.

In theory multiple commits of related changes shouldn't happen; that's
the part I mentioned previously, where I expressed my opinion that
NetBSD developers are on the whole really quite good at actually using
a single commit per set of related changes.  I.e., and IIUC, multi-file
commits are supposed to be the normal procedure in NetBSD whenever
inter-related changes to multiple files depend upon each other.

There have been notable exceptions, such as when huge swaths of changes
are being made across the kernel, and especially across all the
machine-dependent code.  I wouldn't expect the time spanning such big
changes to be a time when things are stable anyway.  The only thing I
would worry about in this scenario is that notifications of build breaks
during this time might annoy some folks unnecessarily.  Perhaps if
developers were able to wave some kind of flag to pause the builds, just
as they usually already do for us humans by sending warning messages to
current-users, then even this minor annoyance could be avoided.

As an aside, I think you're saying above that you've seen rsync take
more than two minutes to do the file scan on the remote side.  If I'm
not mistaken, this is given as the "file list generation time" in the
"--stats" report, and I normally see an average of about 70 seconds for
the NetBSD anoncvs server, with the shortest time being about 30 seconds.

-- 
Greg A. Woods
Planix, Inc.

<woods%planix.com@localhost>    +1 250 762-7675    http://www.planix.com/
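
To make the rsync argument above concrete, here is a minimal toy model in
Python.  It is only a sketch under assumed numbers: the file names, the
commit times, and the scan times are all hypothetical, and the two helper
functions merely imitate an rsync scan and a date-based checkout; they
do not call rsync or CVS.

```python
# Toy model: each file's history is a list of commit times, in seconds.
# One hypothetical commit at t=50 touches both files.
history = {
    "src/include/foo.h": [10, 50],
    "src/usr.bin/foo/foo.c": [10, 50],
}

def rsync_scan(history, visit_times):
    """Mirror, for each file, only the revisions that exist when the scan visits it."""
    return {
        path: [t for t in revs if t <= visit_times[path]]
        for path, revs in history.items()
    }

def checkout_as_of(mirror, cutoff):
    """Pick the newest mirrored revision at or before cutoff (a date-based checkout)."""
    return {
        path: max((t for t in revs if t <= cutoff), default=None)
        for path, revs in mirror.items()
    }

# The scan starts after t=40, reaches the header at t=45 (before the commit)
# and the source file at t=100 (after the commit).
visit_times = {"src/include/foo.h": 45, "src/usr.bin/foo/foo.c": 100}
mirror = rsync_scan(history, visit_times)

head = checkout_as_of(mirror, cutoff=float("inf"))  # newest of whatever was mirrored
pinned = checkout_as_of(mirror, cutoff=40)          # timestamp taken before the scan began

print(head)    # the two files come from different commits
print(pinned)  # both files come from the same, older commit
```

Checking out the newest mirrored revisions mixes the old header with the
new source file, while the checkout pinned to a timestamp from before the
scan sees a consistent tree.  The pinned checkout can still split a commit
whose files straddle the chosen instant, which is the one-instant window
mentioned above.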