Subject: Welcome pbulk!
To: None <tech-pkg@netbsd.org>
From: Joerg Sonnenberger <joerg@britannica.bec.de>
List: tech-pkg
Date: 06/19/2007 22:39:09
----- Forwarded message from Joerg Sonnenberger <joerg@netbsd.org> -----

Log Message:
Initial import of pbulk, the new pkgsrc bulk build framework.

----- End forwarded message -----

OK folks, welcome pbulk to the tree. It is still missing quite a bit of
documentation, so this is a short introduction to using it.

The default configuration is what I am using for the DragonFly bulk
builds, so you definitely want to modify that :-) Most basic functions
can be adjusted using pbulk.conf; for other changes, copy the reference
script, edit the copy, and point pbulk.conf at the new location.

Two good examples that you might want to modify:
- pkg-up-to-date is used to determine whether a package needs to be
rebuilt. This check is extremely strict right now. It compares the
recorded RCS IDs from +BUILD_INFO with the current versions in the tree.
It checks that all dependencies used for the build are the same. It
checks that no depending package is newer than the package. For most
setups, you might want to comment out the last test.
- build, pre-build and client-clean have some hard-coded rules for
cleanup. Again, this is what I'm using in the bulk builds, but you
should definitely look at it first. Especially if you run qmail...

A few rough notes about the configuration:
base_url is the beginning of the URL in the report mail.

master_mode controls whether or not parallel building is done. Look at
build-client-start and scan-client-start for ways to use that. If you
have a multi-CPU system, you could for example set up a chroot for each
CPU and run pbulk-build in client mode inside. For a network-based setup
like I'm using, the important parameters are the (public) IP of the
server and the list of clients to start. More than 5 clients can create
a problem with the TCP listen backlog; I'm not sure how to best deal
with that so far. The communication itself is done over TCP. If secure
transport is needed, set up either IPsec or ssh port forwarding.
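
As a rough sketch of the network setup described above, the relevant
part of pbulk.conf might look like the fragment below. Only master_mode
is named in this mail; master_ip, scan_clients, build_clients and all
addresses are assumptions, so check the names against the pbulk.conf
you actually have.

```shell
# Hypothetical pbulk.conf fragment for a networked build.
# Variable names other than master_mode are assumptions.
master_mode=yes                 # assumed value: enable parallel building
master_ip=192.0.2.10            # (public) IP of the server, example address

# List of clients to start. Keep it short: more than 5 clients may
# run into the TCP listen backlog.
scan_clients="192.0.2.21 192.0.2.22"
build_clients="192.0.2.21 192.0.2.22"
```

For ssh port forwarding, something along the lines of
"ssh -N -L PORT:127.0.0.1:PORT user@server" on each client would be one
way to do it, with PORT being whatever port your setup listens on.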

*_rsync_args is a reasonable default for uploading the reports and
packages. Try without --delete-excluded first, otherwise the wrong files
might be removed.

*_rsync_target is the destination to use for the upload. Only
non-restricted packages are uploaded by default, and if any dependency
is restricted, the whole package is skipped.
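
A minimal sketch of the upload settings, assuming that the "*" stands
for a pkg_ and a report_ prefix (for packages and reports) and that the
upload goes over ssh. The host name and paths are made up.

```shell
# Hypothetical pbulk.conf fragment for uploads.
# The pkg_/report_ prefixes are assumed expansions of *_rsync_args
# and *_rsync_target; host and paths are placeholders.
pkg_rsync_args="-av -e ssh"      # no --delete-excluded for the first runs
pkg_rsync_target="bulk@archive.example.org:/archive/packages/current"

report_rsync_args="-avz -e ssh"
report_rsync_target="bulk@archive.example.org:/archive/reports/current"
```

Once the first uploads look right on the target, --delete-excluded can
be added back to the arguments.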

bootstrapkit can be unset on NetBSD. It should be the name of the binary
bootstrap kit, with your own mk.conf changes merged. You can also use it
to always include e.g. distcc. There's no support for using a
non-default PKGDB, so in that case you might want to create a tarball
with just pkgdb and pkg_install even when on NetBSD.

bulklog is the area where the meta data and the build logs are written
to. This can be shared between clients or aggregated later.

packages is the same as PACKAGES in mk.conf; similarly for prefix,
pkgsrc, pkgdb and varbase.
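
For illustration, the path settings could look like this. The values
are assumptions modelled on common pkgsrc defaults, not necessarily
what the shipped pbulk.conf contains; whatever you use has to agree
with the corresponding mk.conf settings.

```shell
# Hypothetical pbulk.conf fragment; all paths are example values.
bulklog=/bulklog        # meta data and build logs end up here
packages=/packages      # same as PACKAGES in mk.conf
prefix=/usr/pkg         # installation prefix
pkgsrc=/usr/pkgsrc      # location of the pkgsrc tree
pkgdb=/var/db/pkg       # keep the default, non-default is not supported
varbase=/var
```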

pkg_install_prefix should be the same as prefix if using pkg_install
from pkgsrc, otherwise set it to /usr. Be warned that building with a
non-default pkgdb is not supported yet! If pkg_install_prefix == prefix,
external_pkg_info needs to point to a binary outside prefix. A good
approach for that is a second bootstrap e.g. in /usr/pkg_bulk, which is
useful to install the other dependencies like rsync into as well.
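
A sketch of the case where pkg_install comes from pkgsrc, assuming a
second bootstrap in /usr/pkg_bulk. The sbin/pkg_info location inside
that bootstrap is an assumption; use wherever your bootstrap actually
put the binary.

```shell
# Hypothetical pbulk.conf fragment; paths are example values.
prefix=/usr/pkg
pkg_install_prefix=${prefix}    # pkg_install from pkgsrc lives in prefix

# pkg_install_prefix == prefix, so pkg_info must come from outside
# prefix, here from the second bootstrap.
external_pkg_info=/usr/pkg_bulk/sbin/pkg_info
```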

The rest of the file is mostly the location of the tools. Don't change
loc unless you know what you are doing!

Joerg