tech-kern: Re: kernel filesystem code move

Subject: Re: kernel filesystem code move
To: Perry E. Metzger <perry@piermont.com>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 01/02/2003 19:49:28
[ On Saturday, December 28, 2002 at 23:53:43 (-0500), Perry E. Metzger wrote: ]
> Subject: Re: kernel filesystem code move
>
> cgd@broadcom.com writes:
> > At Sun, 29 Dec 2002 00:47:20 +0000 (UTC), "Perry E. Metzger" wrote:
> > > How about if one did a copy but altered the dates in the duplicate? It
> > > is pretty easy to do that. One then retains a modicum of revision
> > > history.
> > 
> > So, for the sake of "keeping history" you change that history?  "Hmm."
> 
> I have a slightly better suggestion, then. You alter CVS to allow you
> to specify a "don't do checkouts by date on this file before date X"
> or "before revision X" or some such. That allows us to move the file
> and still avoid breaking checkout by date.

You still have the problem of the missing/modified tags.  You have to
remove or uniquely modify all the existing release and branch tags in
the new copy.  That's definitely altering some very important history
information.  If you do the simplest and safest hack of just removing all
old tags, then you still often have to refer to the old copy for what
many (most?) of us consider to be extremely important details for a
project like NetBSD.  If you don't remove the tags but instead find some
way to uniquely rename them while preserving their old meaning to a
human reader then you have to ensure your naming conventions for the
renamed tags allow for multiple moves of files and you have to hope
users can intuit the meaning of these renamed tags when they can't find
the documentation for your tag naming conventions (keep in mind that tag
names are effectively severely limited in length).

Doing file moves the plain and proper CVS way (i.e. "cvs rm" the old
file and "cvs add" the new one) really is the best, and the only sure and
safe, way to move files in a project managed with CVS.

Anyone who says that the "rm/add" process deletes revision history isn't
really looking at the big picture.  Not doing moves the standard way is
what mucks up revision history!  Nothing is lost with the "rm/add"
process -- and if a suitable commit message is included for both
operations then all the information necessary to track multiple moves is
in fact added and nothing is ever deleted:  it's all right there where
it always was.
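
As a minimal sketch of such a move, the Python helper below only builds
the commands to run; it does not execute them.  The "Moved to:" and
"Moved from:" phrases are a hypothetical convention (CVS attaches no
meaning to them), and the function name and example paths are invented
for illustration.  It also assumes the destination directory has already
been added to the repository.

```python
import shlex

def cvs_move_commands(old_path, new_path, reason):
    """Return the shell commands for a plain "rm/add" move of one file.

    The "Moved to:"/"Moved from:" phrases are a hypothetical convention,
    not something CVS itself interprets.
    """
    old_q, new_q = shlex.quote(old_path), shlex.quote(new_path)
    rm_msg = f"{reason} (Moved to: {new_path})"
    add_msg = f"{reason} (Moved from: {old_path})"
    return [
        f"mv {old_q} {new_q}",   # move the working file; the old ,v file is untouched
        f"cvs rm {old_q}",       # schedule removal of the old path
        f"cvs add {new_q}",      # schedule addition of the new path
        f"cvs commit -m {shlex.quote(rm_msg)} {old_q}",   # records where it went
        f"cvs commit -m {shlex.quote(add_msg)} {new_q}",  # records where it came from
    ]

# Illustrative paths only.
for cmd in cvs_move_commands("old/dir/file.c", "new/dir/file.c", "Reorganize."):
    print(cmd)
```

Because each commit message names the other location, the old file's log
says where the file went and the new file's log says where it came from.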

Wrapper scripts can even be implemented which make all the standard
auditing operations work across moves.  One often recommended way to do
this is by creating and later interpreting meaningful special phrases at
key locations in the commit messages used for the removed and added
revisions.  Thus even without the help of these wrapper scripts a
competent CVS user can still very easily follow everything that's
happened to a file.
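
One way such a wrapper might interpret those phrases is sketched below.
It assumes a hypothetical "Moved from:" line in the first commit message
of each moved file, and it takes the messages as a plain dictionary
rather than calling "cvs log", so the parsing of real log output is left
out.  All names and paths are illustrative.

```python
import re

# Hypothetical marker phrase; CVS itself attaches no meaning to it.
MOVED_FROM = re.compile(r"^Moved from:[ \t]*(\S+)[ \t]*$", re.MULTILINE)

def previous_path(first_commit_message):
    """Return the old path recorded in a first commit message, or None."""
    match = MOVED_FROM.search(first_commit_message)
    return match.group(1) if match else None

def trace_moves(path, first_messages):
    """Follow "Moved from:" markers back to the file's original location.

    first_messages maps each path to the commit message of its first
    revision (a real wrapper would extract this from "cvs log").
    """
    chain = [path]
    while True:
        older = previous_path(first_messages.get(chain[-1], ""))
        if older is None or older in chain:  # stop at the origin, or on a cycle
            return chain
        chain.append(older)

# Illustrative data: a file that has been moved twice.
messages = {
    "c/file.c": "Reorganize again.\nMoved from: b/file.c",
    "b/file.c": "Reorganize.\nMoved from: a/file.c",
    "a/file.c": "Initial revision",
}
print(trace_moves("c/file.c", messages))
```

A wrapper around "cvs log" or "cvs diff" could then walk the returned
list of paths, newest first, and run the underlying command on each one.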

I do sort of like the idea of also including the old file's full "cvs
log" output in the first commit message of the new file.  This makes the
release and branch tags and their revision numbers, as well as all
of the commit messages, available immediately in the "history" of the
new file, and that can eliminate at least one step from the job of
looking at previous changes in the old file.  One could also always add
a custom "newphrase" to hold this data in the delta header, though then
you'd probably need a custom version of CVS to conveniently access it
again.
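
A minimal sketch of composing such a first commit message follows.  It
assumes the old file's "cvs log" output has already been captured as
text; the "Moved from:" line and the "| " prefix on each copied line are
hypothetical conventions.  The prefix is there on the guess that it keeps
the separator lines in the old log from confusing tools that split log
output on them.

```python
def first_commit_message(old_path, reason, old_log):
    """Compose the first commit message for the new copy of a moved file.

    old_log is the text printed by "cvs log" for the old path.  The
    "Moved from:" line and the "| " prefix are hypothetical conventions.
    """
    lines = [reason, f"Moved from: {old_path}", "", "Log of the old file:"]
    lines += ["| " + line for line in old_log.splitlines()]
    return "\n".join(lines) + "\n"

# Invented, heavily abbreviated stand-in for real "cvs log" output.
sample_log = (
    "symbolic names:\n"
    "\trelease-1-0: 1.4\n"
    "----------------------------\n"
    "revision 1.4\n"
    "Fix a bug.\n"
)
print(first_commit_message("old/dir/file.c", "Reorganize.", sample_log), end="")
```

The result can be written to a file and passed to "cvs commit -F" when
committing the newly added file.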

> You might ask, why do I care. The reason is primarily dealing with
> things like vendor branches and merges with other codebases
> cleanly. It is nice to be able to keep history around and use it.

Yes, it is nice to keep vendor history around, BUT:

CVS "vendor" branches are fundamentally incompatible with "normal"
branches in the same files/module (due to the way the "default" branch
must be managed to make vendor branches work properly), especially if
any important branch (e.g. "-current") is in fact the trunk (because of
course a "cvs import" can immediately change the head revision of what's
checked out from the trunk and will thus disrupt everyone working on
the trunk).  CVS vendor branches are nowhere near as useful as was hinted
they might be in the original CVS paper.  They work great for a small
team managing stand-alone modules with frequent vendor imports with very
few local changes and no local branches, but that's about it.

If you really want to track vendor revisions to files which are part of
a larger collective set of modules then you should create a full,
normal, CVS branch for that sub-module and you should commit vendor
releases directly to that branch (i.e. work directly on that branch as
if you were the vendor working on a private branch) and do all merges to
other branches the normal "manual" way.
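
One possible reading of that workflow is sketched below as a Python
helper that lists the commands without running them.  The branch and tag
names are placeholders, the merge target is assumed to be the trunk, and
the order of steps is an interpretation rather than a prescribed
procedure.

```python
def vendor_release_commands(branch, release_tag, previous_tag=None,
                            create_branch=False):
    """List the commands for committing one vendor release to an ordinary
    CVS branch and, if an earlier release exists, merging to the trunk.

    All names passed in are placeholders chosen by the caller.
    """
    cmds = []
    if create_branch:
        cmds.append(f"cvs tag -b {branch}")   # one-time: create a normal branch
    cmds += [
        f"cvs update -r {branch}",            # switch the working copy to the branch
        "# unpack the vendor release here; 'cvs add' new files, 'cvs rm' deleted ones",
        f"cvs commit -m 'Vendor release {release_tag}'",
        f"cvs tag {release_tag}",             # mark this release on the branch
    ]
    if previous_tag is not None:
        cmds += [
            "cvs update -A",                  # return to the trunk
            f"cvs update -j {previous_tag} -j {release_tag}",  # merge the vendor's changes
            "# resolve conflicts by hand, then 'cvs commit' the merge",
        ]
    return cmds

# Placeholder names for a hypothetical vendor package "foo".
for cmd in vendor_release_commands("vendor-foo", "foo-1-2", "foo-1-1"):
    print(cmd)
```

Nothing here touches the default branch, so a new vendor release never
changes what people working on the trunk check out until someone commits
the merge.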

The CVS-style "vendor branch" should never ever have been introduced to
any files in any of NetBSD's repositories.  It may even be worthwhile
writing a script to untangle the current mess in those files that do
now have vendor branches.

Once you've got vendor releases on their own normal CVS branch then once
again it's simplest just to do the standard "rm/add" to move a file --
since then you can fully track the moves on a per-branch basis.

> > As far as I'm concerned, disincentive to move files around would be a
> > positive thing.  8-)
> 
> I don't think so. Reorganizing the kernel sources, for instance, would
> be nice. The desire to move the filesystem related files and possibly
> reorganize portions of sys/arch by CPU type etc. seem quite
> reasonable, but are projects people have long avoided for fear of
> breaking the log revision history too badly.

While I agree rather strongly with Chris, it really doesn't matter.  So
long as CVS (or anything even remotely like it that uses just RCS or
SCCS to track changes, i.e. anything without a complete way to track not
only file contents but also file locations) is being used to manage the
repository, the whole issue is moot.

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>