Subject: Re: pkgsrc on SMP machines
To: Lars Nordlund <lars.nordlund@hem.utfors.se>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-pkg
Date: 12/22/2005 19:38:54
On Thu, Dec 22, 2005 at 01:16:10AM +0100, Lars Nordlund wrote:
> On Tue, 20 Dec 2005 19:22:30 +0100
> Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> > > The sandbox must be cheap enough to create so the host system can
> > > create 5000 of them rather fast.
> > 
> > Each requires several null mounts; I'm not sure a system would support
> > that many. You certainly don't want one sandbox per package, but
> > one per job (if you can run 4 jobs in parallel, you want 4 sandboxes)
> 
> What kind of breakage do you foresee if builds are happening in more
> than one place in the pkgsrc tree at the same time? Is this what you are
> afraid of and therefore want one sandbox per job?

What the bulk build scripts do is keep in the sandbox only the packages
on which the package being built depends, and nothing else.
This is to avoid hidden dependencies: a configure script would
pick up a third-party binary or library if it finds one, even if we don't
want this package to depend on it. This is something that has to
stay (or we have to fix the configure scripts).
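
As a rough illustration of "only the required packages", here is a minimal
Python sketch that computes the set a sandbox would be allowed to contain.
It is not taken from mk/bulk. The DEPENDS map and every package name other
than foo/bar are hypothetical, and it assumes indirect dependencies count as
required too.

```python
# Hypothetical dependency map: package -> direct dependencies.
DEPENDS = {
    "foo/bar": ["devel/libbaz", "lang/qux"],
    "devel/libbaz": ["lang/qux"],
    "lang/qux": [],
    "misc/unrelated": [],
}


def required_packages(pkg, depends):
    """Return every package reachable from pkg through its dependencies."""
    required = set()
    stack = list(depends[pkg])
    while stack:
        dep = stack.pop()
        if dep not in required:
            required.add(dep)
            # Follow indirect dependencies as well.
            stack.extend(depends[dep])
    return required
```

Anything installed in the sandbox but absent from this set (misc/unrelated
here) is what a configure script could pick up by accident.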

> 
> The following is probably already clear to you, but I'll say it anyway;
> 
> With the 'make parallel'-patch, and how make works and so on, only
> "leaves" will be built at any given moment. When a package is built, it
> will never have to discover that some other package is missing and go
> and build it.

This is how the bulk build script works too.
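
The "only leaves are built" rule can be sketched in Python as below. This is
an illustration under assumptions, not the actual bulk build code: the
dependency map and the package names other than foo/bar are made up.

```python
def ready_packages(depends, built):
    """Return unbuilt packages whose direct dependencies are all built.

    depends: dict mapping package -> list of direct dependencies
    built:   set of packages already built and packaged
    """
    return sorted(
        pkg
        for pkg, deps in depends.items()
        if pkg not in built and all(dep in built for dep in deps)
    )


# Hypothetical tree: foo/bar needs devel/libbaz, which needs lang/qux.
EXAMPLE_DEPENDS = {
    "foo/bar": ["devel/libbaz"],
    "devel/libbaz": ["lang/qux"],
    "lang/qux": [],
}
```

Calling ready_packages() again after each finished build yields the next set
of leaves, so a package never has to stop and build a missing dependency.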

> > I'm not sure your system can be used for bulk builds. The scripts in
> > pkgsrc/mk/bulk do much more than that (like making sure only the required
> > packages are installed)
> 
> The make rule could be adjusted to this:
> 
> (example for package bar in category foo, which has no un-built
> dependencies)
> 
> foo/bar: <some already built and packaged dependencies>
> 	<somehow pick a free sandbox and use it>
> 	<install required packages from package repository>
> 	make package clean
> 	<copy package to repository>
> 	<deinstall all packages in this sandbox>

As an optimisation, we're currently doing:
foo/bar: <some already built and packaged dependencies>
	<somehow pick a free sandbox and use it>
	<install required packages from package repository>
	<remove installed but non-required packages>
	make package clean
	<copy package to repository>

Installing/deinstalling packages takes a fair amount of time, and
a large part of the builds would just reinstall packages which have
just been removed (it's common to build in a row several packages which
all depend on the same things - see the kde locale packages for example)
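
The saving comes down to two set differences. A minimal Python sketch,
assuming the installed and required package sets are already known (the
function name and the package names are hypothetical):

```python
def plan_sandbox(installed, required):
    """Return (to_install, to_remove) for reusing a sandbox.

    installed: set of packages currently in the sandbox
    required:  set of packages the next build depends on
    """
    to_install = sorted(required - installed)  # missing dependencies
    to_remove = sorted(installed - required)   # leftovers from earlier builds
    return to_install, to_remove


# Two consecutive builds sharing the same (hypothetical) dependencies:
installed = {"x11/somelibs", "devel/sometool", "misc/leftover"}
required = {"x11/somelibs", "devel/sometool"}
to_install, to_remove = plan_sandbox(installed, required)
```

In this case nothing has to be installed and only misc/leftover is removed,
whereas deinstalling everything after each build would reinstall both shared
packages every time.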

> I am at this point not sure how 'pick a free sandbox' could be
> implemented. Perhaps a wrapper script which does the rest of the above,
> and chroot it? Hmm. It is late, I have probably missed something
> obvious.. :-)

Yes, just start a script in a chroot, though the script will probably
need a lot of parameters.
As an optimisation, and in order to avoid recomputing the same things
again and again, it'll probably have to be a bit more complex than that,
with caches.
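
One possible shape for "pick a free sandbox", sketched in Python. Everything
here is an assumption for illustration: the /sandbox/N paths, pkgsrc living
at /usr/pkgsrc inside each sandbox, and the helper names. It only builds the
chroot command line and does not run it.

```python
import queue
from contextlib import contextmanager


class SandboxPool:
    """Hand out pre-created sandbox directories, one per parallel job."""

    def __init__(self, paths):
        self._free = queue.Queue()
        for path in paths:
            self._free.put(path)

    @contextmanager
    def acquire(self):
        path = self._free.get()  # blocks until a sandbox is free
        try:
            yield path
        finally:
            self._free.put(path)  # hand it back, even if the build failed


def build_command(sandbox, pkgpath):
    """Return the argv that would build pkgpath inside the sandbox."""
    script = "cd /usr/pkgsrc/%s && make package clean" % pkgpath
    return ["chroot", sandbox, "/bin/sh", "-c", script]
```

Each job would wrap its build in "with pool.acquire() as sandbox:" and pass
build_command(sandbox, "foo/bar") to something like subprocess.run(), so
the number of sandboxes caps the number of concurrent builds.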

> 
> Never having used the bulk and tested mk/bulk/mksandbox.. Is it
> possible to have two or more of these active on the same machine at the
> same time?

Yes, but in different chroots :)

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference
--