Subject: Re: pkgsrc on SMP machines
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Lars Nordlund <lars.nordlund@hem.utfors.se>
List: tech-pkg
Date: 12/22/2005 22:54:23
On Thu, 22 Dec 2005 19:38:54 +0100
Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> What the bulk build scripts do is keep in the sandbox only the packages
> on which the package being built depends.
> This is to avoid hidden dependencies, because a configure script would
> pick up a third-party binary or library if it finds it, even if we don't
> want this package to depend on it. This is something that has to
> stay (or we have to fix the configure scripts).

Ok, no problem. It is just so much easier to solve a problem when
all the requirements have been specified.
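
To check that I have understood the requirement, here is a minimal
sketch in Python of what has to happen between two builds. The function
name and the package names are made up for illustration; this is not
code from the bulk build scripts.

```python
def sandbox_changes(installed, needed):
    """Return (to_delete, to_add) so the sandbox ends up holding exactly `needed`.

    `installed` is what is in the sandbox now, `needed` is the full
    dependency list of the package about to be built.
    """
    installed, needed = set(installed), set(needed)
    # Anything not needed must go, or configure might pick it up.
    to_delete = sorted(installed - needed)
    # Anything needed but missing must be added.
    to_add = sorted(needed - installed)
    return to_delete, to_add


to_delete, to_add = sandbox_changes({"perl", "qt3-libs", "png"}, {"png", "jpeg"})
print(to_delete, to_add)  # ['perl', 'qt3-libs'] ['jpeg']
```

The first list would be fed to pkg_delete and the second to pkg_add
before the build starts.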

> This is how the bulk build script works too.

Ok.

> Installing/deinstalling packages takes a fair amount of time, and
> a large part of the builds would just reinstall packages which have
> just been removed (it's common to build several packages in a row which
> all depend on the same things - see the kde locale packages for example).

This optimization will be tricky to keep. Perhaps teach make(1) to sort
build jobs so the ones with the same dependencies are done "as much in
parallel as possible".. Uhm, no. They should not be done in parallel;
instead they should be put into one specific 'job-queue'.. And how to
control which 'job-queue' a target is run in? No, forget this. :-)
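
For the record, the grouping itself would be the easy part; it is
telling make(1) about the queues that I do not see how to do. A rough
sketch in Python, with a made-up dependency table and a hypothetical
function name:

```python
from collections import defaultdict


def job_queues(deps):
    """Group packages whose dependency sets are identical into one queue.

    `deps` maps a package name to the list of packages it depends on.
    Packages in the same queue can be built one after another without
    touching the sandbox in between.
    """
    queues = defaultdict(list)
    for pkg in sorted(deps):
        # frozenset makes the grouping independent of dependency order.
        queues[frozenset(deps[pkg])].append(pkg)
    # Sort the queues by their dependency lists to get a stable result.
    return [queues[key] for key in sorted(queues, key=sorted)]


deps = {
    "kde-i18n-sv": ["kdelibs", "qt"],
    "kde-i18n-de": ["qt", "kdelibs"],
    "gimp": ["gtk2"],
}
print(job_queues(deps))  # [['gimp'], ['kde-i18n-de', 'kde-i18n-sv']]
```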

I think the win will be bigger from being able to schedule builds over
many machines than from optimizing for the fewest pkg_add/pkg_delete runs.

> Yes, just start a script in a chroot. But probably with a lot of parameters
> to the scripts.
> For optimisations, and in order to avoid recomputing the same thing
> again and again, it'll probably have to be a bit more complex than that,
> with caches.

FreeBSD used to have (perhaps still has?) problems with OpenOffice (and a
few other beasts) in their bulk build cluster. They wanted to make
make(1) start on them as soon as possible, rather than spend all the
machines on the smaller packages first and leave the big ones for the
end.. Do not know how to explain better.. Take for example
qt/kdelibs/kdebase .. One chain of dependencies.. Big packages also..
You really want to make sure you have enough of the other packages
available to build in parallel with these big boys.

So.. I have been thinking about some way to hint about "heavy targets"
in a Makefile to get better multi-job scheduling out of make.. But this
is also a bit too much science fiction for now.
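
To make the idea a bit more concrete, here is a sketch in Python of how
such hints could be used. It is not something make(1) does today. The
weights are invented numbers (think "build minutes"), and the function
name is hypothetical. Each target is ranked by the heaviest chain of
builds that starts with it, so the head of a long, heavy chain gets
started before the small stuff.

```python
from functools import lru_cache


def chain_weights(weight, needs):
    """For each target, return the weight of the heaviest chain starting at it.

    `weight` maps target -> hinted cost, `needs` maps target -> prerequisites.
    """
    # Invert the dependency table: who is waiting for each target?
    dependents = {t: [] for t in weight}
    for target, prereqs in needs.items():
        for p in prereqs:
            dependents[p].append(target)

    @lru_cache(maxsize=None)
    def heaviest(t):
        # Own cost plus the most expensive chain among the targets waiting on t.
        return weight[t] + max((heaviest(d) for d in dependents[t]), default=0)

    return {t: heaviest(t) for t in weight}


weight = {"qt": 60, "kdelibs": 45, "kdebase": 50, "zsh": 2, "mutt": 3}
needs = {"kdelibs": ["qt"], "kdebase": ["kdelibs"]}

prio = chain_weights(weight, needs)
# Targets with no prerequisites can start now; the heaviest chain goes first.
ready = [t for t in weight if not needs.get(t)]
ready.sort(key=lambda t: prio[t], reverse=True)
print(ready)  # ['qt', 'mutt', 'zsh']
```

With two jobs, qt would start at once and the small packages would fill
the other slot while the qt/kdelibs/kdebase chain grinds on.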

> > Never having used the bulk and tested mk/bulk/mksandbox.. Is it
> > possible to have two or more of these active on the same machine at the
> > same time?
> 
> Yes, but in different chroots :)

Ah :-)


Best regards,
	Lars Nordlund