Subject: Re: pkgsrc on SMP machines
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Lars Nordlund <lars.nordlund@hem.utfors.se>
List: tech-pkg
Date: 12/20/2005 00:43:14
On Mon, 19 Dec 2005 23:03:40 +0100
Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> You'll have to statically define how many jobs you want to have in parallel,
> and create a sandbox for each of them. Then the scheduler will have
> to start partial builds in each of them. This approach has several benefits:

The number of jobs in parallel would be the -j N argument to make?

What is a sandbox? A pkg_comp-chroot thingy? A chroot which offers both
ssh access and NFS exports? A virtual machine setup with all required
packages and a work directory ready to be built in by the remote system?

The sandbox must be cheap enough to create so the host system can
create 5000 of them rather fast.

As you can see in my patch, I basically do 'make install clean' in each
target in the generated makefile. This can be changed into something
that sets up a sandbox and tickles a remote system into starting to
build inside the sandbox with the help of a distcc compiler (which of
course can get CPU power from some other place). This make-rule should
not return until a package has been created and the sandbox has been
destroyed.
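
To make that lifecycle concrete, here is a rough sketch in Python. It is
only an illustration: a temporary directory stands in for a real sandbox
(it gives no isolation at all), and the names sandbox() and
build_in_sandbox() are made up for this example, not taken from my patch
or from pkgsrc.

```python
import contextlib
import shutil
import subprocess
import sys
import tempfile


@contextlib.contextmanager
def sandbox(prefix="pkgbuild-"):
    """Create a throwaway build directory and always remove it afterwards.

    Assumption: a plain temp directory stands in for a real sandbox
    (chroot, VM, ...); it offers no protection between builds.
    """
    path = tempfile.mkdtemp(prefix=prefix)
    try:
        yield path
    finally:
        # Runs on success and on failure, so no sandbox is left behind.
        shutil.rmtree(path, ignore_errors=True)


def build_in_sandbox(pkg, command):
    """Run `command` inside a fresh sandbox for `pkg`.

    Does not return until the command has finished and the sandbox has
    been destroyed. Returns (exit status, path of the removed sandbox).
    """
    with sandbox(prefix=pkg.replace("/", "-") + "-") as path:
        result = subprocess.run(command, cwd=path)
        return result.returncode, path


if __name__ == "__main__":
    # Hypothetical package name; the command is a harmless placeholder
    # for whatever actually starts the build.
    status, used = build_in_sandbox("devel/example",
                                    [sys.executable, "-c", "pass"])
    print(status, used)
```

The point is only the ordering: set up, build, tear down, and then
return, so make(1) sees the target as finished at the right moment.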

Well, it depends upon the sandboxes. I do not know how cheaply they can
be made and still offer full protection between two package builds. Or
what kind of functionality they can offer.

> - if we carefully choose the way the scheduler starts builds in different
>   sandboxes, we can have parallel builds on a cluster of systems for free.

For free?

The scheduler? You mean make(1)? Or what do you have in mind?

> - if we split the scheduling from the build processes, it should be possible
>   to have the scheduling done on a system different from the target
>   system (hum, all the prebuild stage done on a fast x86 box when
>   bulk-building for sparc :)

As long as the dependency graph is identical on the two systems, it
should work.
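
A rough sketch of why the graph is all the scheduler needs, again in
Python and again only under assumptions: schedule() and the tiny
dependency table are invented for this example, and the build callback
is a stand-in for whatever starts a build on the target system. It uses
graphlib and concurrent.futures from the standard library (Python 3.9
or newer).

```python
import concurrent.futures
import graphlib


def schedule(deps, build, jobs=2):
    """Build every package after its dependencies, `jobs` at a time.

    deps maps a package name to the set of packages it depends on.
    build is any callable taking a package name; it could just as well
    hand the job to a different machine.
    Returns the packages in the order their builds finished.
    """
    sorter = graphlib.TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    finished = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=jobs) as pool:
        running = {}
        while sorter.is_active():
            # Start everything whose dependencies are already built.
            for pkg in sorter.get_ready():
                running[pool.submit(build, pkg)] = pkg
            done, _ = concurrent.futures.wait(
                running, return_when=concurrent.futures.FIRST_COMPLETED)
            for future in done:
                pkg = running.pop(future)
                future.result()  # re-raise a build failure here
                finished.append(pkg)
                sorter.done(pkg)
    return finished


if __name__ == "__main__":
    # Hypothetical dependency graph: app needs lib, lib needs tool.
    deps = {"app": {"lib"}, "lib": {"tool"}, "tool": set()}
    print(schedule(deps, lambda pkg: None, jobs=2))
```

Nothing in schedule() looks at the target architecture. If the graph
computed on the fast box differs from the one the target would compute,
the build order is wrong, which is why the graphs have to match.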

> Ultimately this would allow a system like that: bulk-building for sparc,
> the scheduler runs on a x86 which starts builds on several sparc systems,
> and the builds use distcc against a cluster of x86 systems.
> 
> Or you can change sparc to vax, and "several sparc systems" to "several
> simh instances running on a 4-way x86 system" :)

It is software. Anything is possible.


Best regards,
	Lars Nordlund