Subject: Re: FreeBSD's package cluster
To: Jan Schaumann <jschauma@netmeister.org>
From: Lars Nordlund <lars.nordlund@hem.utfors.se>
List: tech-cluster
Date: 05/20/2005 00:03:40
On Mon 2005-05-16 at 10:37 -0400, Jan Schaumann wrote:
> See http://people.freebsd.org/~kris/bsdcan/Package%20Cluster.pdf
> 
> What's the status on our pkgsrc clustering / parallel building?
> 

Without knowing for sure, I would say "no official status". Please
prove me wrong on this one. :-)

Perhaps you saw the patch I posted a couple of weeks ago? It adds a
'parallel' target to bsd.pkg.mk. There were no replies to it. Perhaps
people do not own SMP machines, or they have already solved the problem
without telling the world about it. Well, to be fair, most likely
everyone is busy with other things, and developer time is never on our
side.
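
In short, the target writes a makefile to stdout in which every package
is a target and its dependencies are ordinary make prerequisites, so
make's own -j scheduling does all the work. Roughly like this (package
names invented for the example; this is not the exact output of the
patch):

    # sketch of 'make parallel' output
    all: net/bar

    lang/perl5:
    	cd /usr/pkgsrc/lang/perl5 && ${MAKE} package

    devel/libfoo: lang/perl5
    	cd /usr/pkgsrc/devel/libfoo && ${MAKE} package

    net/bar: devel/libfoo
    	cd /usr/pkgsrc/net/bar && ${MAKE} package

Fed to 'make -j N -f -', independent packages build concurrently while
dependents wait for their prerequisites.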


To comment on the FreeBSD approach:

It seems they beat me to the "generate a makefile and let make itself
handle the job scheduling" approach by six or so years (the original
work dates from 1999, according to the PDF). :-)

They have a bunch of scripts to handle the scheduling between different
build hosts.

They build in chroot environments, doing pkg_add/pkg_delete. I am not
exactly sure how or why; mixing kernels/userlands from different
FreeBSD versions might be the reason.
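
As far as I can tell from the slides, each package build is roughly a
cycle like the following (my sketch of the idea, not their actual
scripts; the paths are invented):

    # one build cycle inside a fresh chroot (sketch)
    chroot /build/chroot1 /bin/sh -c '
        pkg_add /packages/All/dep-*.tgz   # install prebuilt dependencies
        cd /usr/ports/devel/foo && make package
        pkg_delete -a                     # wipe everything for the next build
    '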

Build performance seems quite nice.


My idea for a bulk build cluster based on the parallel patch was to
use something like parallel/glunix, which, I think, can itself
distribute work across the cluster with the supplied 'glmake' binary. I
boldly wrote that before I had actually tried to install the glunix
package... It now appears that the glunix package is broken(?) due to a
missing file, makedepend.tar.gz, which seems to have fallen off the Net.
Anyway, with the help of some cluster software there would be no need
to write a bunch of scripts to handle the work distribution. This would
save some time in getting things up and running, and the resulting
solution would be more robust, I guess. In the FreeBSD report they say
that nodes coming and going is a problem and that the scripts need
better error handling. Scripts are hard that way: shaky and difficult
to make robust.

Furthermore, I would not bother with pkg_add/pkg_delete. I would just
let the bulk-cluster nodes populate an NFS-mounted /usr/pkg tree and
save the packages created as the build goes along.
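
Concretely, each node would just need something like this (server name
and export paths invented for the example):

    # on every bulk-cluster node
    mount -t nfs pkgmaster:/export/pkg      /usr/pkg
    mount -t nfs pkgmaster:/export/packages /usr/pkgsrc/packages

Every node then sees what the others have already installed, and the
binary packages all end up in one place. (The pkg database under
/var/db/pkg would presumably have to be shared the same way.)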

I do not think chroot builds are worth the effort either. Just let the
machines reboot into different root filesystems if we want to build for
different releases. This is because it is probably too tricky to get
something like chroot to work together with the cluster software.


Another feature of my parallel patch, which FreeBSD currently lacks (I
think?), is the possibility to use it at any moment: not just while
doing complete bulk builds from scratch, but also on a partly
"pkgsrc-installed" machine like the average NetBSD workstation! This
works because the makefile it generates checks with pkg_admin whether a
dependency is already covered or not. Therefore I use 'make parallel |
make -j 2 -k -f -' on almost all my pkgsrc builds. It does not do much
on a single-CPU machine with IDE disks, but on an SMP rig it is
considerably faster for the average package. The drawback, of course,
is that running pkg_admin 6000 times in a bulk build is just a waste of
time, since one knows that there are no previous packages installed.
However, changing the parallel target to not check for package
existence is trivial.
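
For illustration, each generated rule essentially has this shape (using
pkg_info -e here as a stand-in for the exact pkg_admin invocation in
the patch):

    # sketch of one guarded rule in the generated makefile
    devel/libfoo: lang/perl5
    	@if ! pkg_info -qe libfoo; then \
    		cd /usr/pkgsrc/devel/libfoo && ${MAKE} package; \
    	fi

For a from-scratch bulk build, the target would simply emit the rule
body without the guard.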


Best regards
	Lars Nordlund