tech-repository archive

Some notes on bzr (formerly bazaar-ng)

Hi everybody. I work at the U of Saskatchewan with Greg Oster, and by coincidence asked him today if NetBSD had considered finally moving off of CVS. :-) I've been using bzr for about 6 months, including following the bzr mailing lists, and he asked that I post some of my observations here. Before deciding on bzr, I also toyed with Mercurial (hg), and looked at git (and was turned off by the sheer number of different commands).

I've been using bzr for developing two smaller projects (that are actually hosted on a Subversion server), as well as using bzr for my own personal projects. I've transitioned my old CVS projects into bzr using tailor, which seemed to work pretty well. I've also been tracking pkgsrc through periodic mirroring into a bzr tree. I'll never willingly use CVS or SVN on their own!

I have two general observations about DVCSs, gleaned from reading the bzr mailing list:

* It seems that most devs, once they 'get' the concepts behind DVCS, prefer the first DVCS tool to which they were introduced.

* The "Big 3" (git, hg, and bzr) are all under active development and are improving with every release. For example, many of the criticisms levelled at bzr in the opensolaris and mozilla writeups have been addressed. So (1) I wouldn't rely on previous (especially older) benchmarks, and (2) any current failings could be addressed within months.

There's been an effort around a common fast-import/fast-export format to simplify transitioning between different VCSs (see git-fast-import, BzrFastImport, and cvs2svn). So NetBSD could decide on one system now and switch to another in the future with little pain (one sore point would be embedded revision-ids, though I seem to recall that some systems can accept revision-ids from a different system). It will certainly be less painful than the initial move from CVS.

I like bzr, and I'll describe why below. Bzr's one downside is that it is slower than git and, apparently, than Mercurial. In the time I've been using bzr, I've seen two projects switch to it: Emacs and MySQL. Some in the Emacs crowd have been particularly vocal about the performance gap relative to git; these issues seem to stem largely from the long history in the Emacs repository. They are being actively addressed.

About Bzr:

Bzr is written in Python. Many of the Bazaar devs came from Darcs, but wanted to restart. Bzr is backed by Canonical (the Ubuntu folks). There currently seem to be 4 active developers, and several other pretty active developers. The developers are very responsive. The user community is active and helpful.

Bzr supports branches (independent clones of a branch) and checkouts (branches that are bound to another branch, e.g., the NetBSD mainline repository). Checkouts correspond to more centralized approaches like CVS and Subversion: a commit is only performed on the local branch if it succeeds on the bound branch too.
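
To make the branch/checkout distinction concrete, here is a toy Python model of the commit rule described above. It is only a sketch of the idea, not bzr's code or API: the Branch class, the reachable flag, and CommitError are all names made up for the illustration.

```python
class CommitError(Exception):
    """Raised when a branch cannot accept a commit."""


class Branch:
    """Toy stand-in for a branch: just a list of commit messages."""

    def __init__(self, name, bound_to=None, reachable=True):
        self.name = name
        self.bound_to = bound_to    # master branch for a checkout, else None
        self.reachable = reachable  # hypothetical flag: simulates a server being down
        self.revisions = []

    def commit(self, message):
        if not self.reachable:
            raise CommitError(f"cannot commit to {self.name}")
        if self.bound_to is not None:
            # Checkout: the bound branch must accept the commit first.
            # If this raises, the local branch is left untouched.
            self.bound_to.commit(message)
        self.revisions.append(message)
        return len(self.revisions)  # the new revno on this branch


mainline = Branch("mainline")
checkout = Branch("my-checkout", bound_to=mainline)
standalone = Branch("my-branch")

checkout.commit("fix typo")      # recorded in both the checkout and mainline
standalone.commit("experiment")  # recorded only locally

mainline.reachable = False
try:
    checkout.commit("will not land")
except CommitError:
    pass  # neither the checkout nor mainline gained a revision
```

The standalone branch keeps working whether or not the mainline is reachable, which is the practical difference between the two modes.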

Bzr has a very nice plugin mechanism (providing you know Python :-). Much of the functionality of the other systems is achieved in Bzr through plugins. For example, the bzr-svn plugin means I can use bzr to natively push and pull from Subversion repositories. The bzr-xmlout plugin supports retrieving information in XML form. The bzr-webserve plugin provides web-browsing support.

Bzr uses Subversion-like revnos; these revnos are only unique to a branch. These revnos map to a hash-based revision-id, which is unique between branches. You can easily reference a revno for a particular branch, as in:

        $ bzr diff -r revno:9:$HOME/pkgsrc

When pulling in a different branch, the revnos for revisions unique to that branch use a CVS-like dotted notation (e.g., 94.1.5, where the 94 is the common ancestor, 1 will be a branch id, and 5 the revision on that branch). So people can reference 'revno 1 on the mainline'. You can also use the revision-id (revid:HASH), or the most recent common ancestor to another branch (ancestor:path/to/branch), dates, tags, the last revision, ... (see `bzr help revisionspec').
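
For anyone who finds the dotted notation easier to read as code, here is a small Python sketch that splits a revno into the three parts described above. This is my own illustration, not how bzr parses revnos internally; parse_revno and the Revno field names are hypothetical.

```python
from typing import NamedTuple, Optional


class Revno(NamedTuple):
    ancestor: int               # mainline revno (the common ancestor for dotted revnos)
    branch_id: Optional[int]    # None for a plain mainline revno
    branch_rev: Optional[int]   # None for a plain mainline revno


def parse_revno(text: str) -> Revno:
    """Split '94' or '94.1.5' into its numeric components."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) == 1:
        return Revno(parts[0], None, None)
    if len(parts) == 3:
        return Revno(parts[0], parts[1], parts[2])
    raise ValueError(f"expected N or N.B.R, got {text!r}")


merged = parse_revno("94.1.5")   # revision 5 on branch 1, forked at mainline 94
mainline = parse_revno("9")      # a plain mainline revno
```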

Bzr is slightly different from the other systems in that its base unit is a branch, rather than a repository. Branches correspond to a directory tree on a file system. You can organize branches as a repository however you like (e.g., a Subversion-like trunk/, branches/, tags/), but they are conceptually treated as different branches. Bzr does have a concept of a repository to share the common information held between the branches.

Bzr aims for monthly releases; this last month was a bit of an anomaly as they're landing some big changes which seem to address some of the requests posted on this list:

* stacked repositories: this means you won't have to bring down the full history of the project. Instead your branch assumes the presence of another branch for the (few) times you need it. But it won't be a true, stand-alone clone.

* The next release will include support for processing files with keyword expansion (the equivalents of $Id$, etc.). This is part of a larger ability to do content-filtering.
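
The stacked-repository idea in the first item above amounts to a lookup with a fallback. A rough Python sketch, assuming nothing about bzr's real storage format (the Repository class, its methods, and the revision-ids are invented for the example):

```python
class Repository:
    """Toy revision store that may defer to another repository."""

    def __init__(self, revisions=None, stacked_on=None):
        self.revisions = dict(revisions or {})  # revid -> commit message
        self.stacked_on = stacked_on            # fallback repository, or None

    def get_revision(self, revid):
        if revid in self.revisions:
            return self.revisions[revid]
        if self.stacked_on is not None:
            # History we never downloaded: ask the repository we are stacked on.
            return self.stacked_on.get_revision(revid)
        raise KeyError(revid)

    def is_standalone(self):
        return self.stacked_on is None


full = Repository({"rev-1": "initial import", "rev-2": "add README"})
stacked = Repository({"rev-3": "my local change"}, stacked_on=full)

local = stacked.get_revision("rev-3")    # found locally
remote = stacked.get_revision("rev-1")   # fetched through the fallback
```

The stacked copy holds only its own revisions, so it is cheap to create, but it stops being usable for old history if the repository underneath it goes away.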

Some of the advantages of bzr (i.e., why did I choose bzr?)

* A simple and natural command line syntax. It's very svn-like (which I think is a good thing).

* A bzr branch or repository can be pulled/merged from across a dumb http server; although it has a smart server (which optimizes the transport), you don't have to use it. You can push and pull branches across sftp too.

* Excellent plug-in support, and the bzr-svn plugin means I can use bzr to natively push and pull from Subversion repositories.

* Rename support for files *and* directories. (And there's a great plugin, automv, that uses file similarity to detect file moves). Git uses hashing to guess at file moves, which apparently works 80% of the time.

Some of the disadvantages of Bzr:

* Speed: This shows in two forms: (1) bzr's use of Python means that initial startup can take several seconds as the various Python imports are loaded and cached (this can be mitigated somewhat using the bzr-shell interface). (2) The various operations are apparently slower than git; this is actively being improved.

* Although it supports tags, the semantics for updating tags (or rather, pushing changes to tags) aren't that clear.

* This one is specific to bzr-svn: it uses Subversion properties to track the distributed metadata (as does svk); these are visible to other Subversion users and can be seen as polluting the commits.

I'd encourage you to take a look at the user docs: they're very good, IMHO.

And finally, bzr is in pkgsrc as devel/bzr.


Brian de Alwis | Software Practices Lab | UBC |
     "Amusement to an observing mind is study." - Benjamin Disraeli
