Subject: Re: pkgsrc on SMP machines
To: Lars Nordlund <lars.nordlund@hem.utfors.se>
From: Dieter Baron <dillo@danbala.tuwien.ac.at>
List: tech-pkg
Date: 12/22/2005 10:21:15
On Thu, Dec 22, 2005 at 01:16:10AM +0100, Lars Nordlund wrote:
> On Tue, 20 Dec 2005 19:22:30 +0100
> Manuel Bouyer <bouyer@antioche.eu.org> wrote:
> > > The sandbox must be cheap enough to create so the host system can
> > > create 5000 of them rather fast.
> > 
> > Each require several null mounts, I'm not sure a system would support
> > that much. You certainly don't want one sandbox per package, but
> > one per jobs (if you can run 4 jobs in parallel, you want 4 sandboxes)
> 
> What kind of breakage do you foresee if builds are happening in more
> than one place in the pkgsrc tree at the same time? Is this what you are
> afraid of and therefore want one sandbox per job?

  Let me explain how the bulk build works and why it is done that way:

  Bulk builds are usually run in a chroot to provide a clean and
well-defined environment (and also, because they install and deinstall
packages, to keep the build system from being changed).  This is not
strictly necessary, however; they could also be run directly on a
clean system.

  First, out of date distfiles and binary packages are deleted.

  Then, the dependency graph of all packages to be built is
constructed and sorted topologically (dependencies come before the
packages that depend on them).
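
  As an illustration of the sorting step, here is a minimal Python
sketch.  It is not the code in pkgsrc/mk/bulk; the package names and
the deps table are made up for the example, and it relies on the
standard graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical dependency table: package -> set of packages it depends on.
deps = {
    "devel/libtool": set(),
    "devel/gmake": set(),
    "graphics/png": {"devel/libtool"},
    "graphics/gd": {"graphics/png", "devel/gmake"},
}


def build_order(deps):
    """Return the packages so that every dependency precedes its dependents."""
    # TopologicalSorter expects node -> predecessors, which is exactly
    # the "package -> dependencies" shape used here.
    return list(TopologicalSorter(deps).static_order())


order = build_order(deps)
print(order)
```

  The exact order among independent packages is not fixed; the only
guarantee is that a package never appears before one of its
dependencies.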

  Then for each package in turn, the following is done:
    . if the package is marked as broken, skip this package
    . if a binary package is available and up to date, skip this package
    . deinstall all installed packages that are not dependencies of this package
    . install all dependencies not already installed
    . build package
    . if the build failed, mark all packages that depend on it as broken
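
  The loop above can be modelled roughly as follows.  This is only a
sketch in Python, not the real bulk build scripts: try_build stands in
for the actual build, the installed set stands in for the package
database, and the sketch also records the failed package itself as
broken so that it shows up in the report.

```python
def all_dependencies(pkg, deps):
    """Return the direct and indirect dependencies of pkg."""
    seen, stack = set(), list(deps[pkg])
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(deps[dep])
    return seen


def bulk_build(order, deps, have_binary, try_build):
    """Walk the sorted package list once; return (built, broken)."""
    installed, broken, built = set(), set(), []
    for pkg in order:
        if pkg in broken:
            continue                      # marked as broken: skip
        if pkg in have_binary:
            continue                      # binary package is up to date: skip
        needed = all_dependencies(pkg, deps)
        installed -= installed - needed   # deinstall everything that is not a dependency
        installed |= needed               # install dependencies not already installed
        if try_build(pkg):
            have_binary.add(pkg)
            built.append(pkg)
        else:
            broken.add(pkg)
            # Everything that depends on the failed package is broken too.
            broken.update(p for p in order if pkg in all_dependencies(p, deps))
    return built, broken


# Hypothetical example: "b" fails to build, so "c" (which needs "b") is skipped.
deps = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"a"}}
built, broken = bulk_build(["a", "b", "c", "d"], deps,
                           have_binary={"a"},
                           try_build=lambda pkg: pkg != "b")
print(built, sorted(broken))
```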

  Lastly, a report is generated, listing the broken packages.

  This ensures that when we build a package, only its dependencies are
installed (to avoid random interference from other packages).  It also
ensures that at the start of a package build, either all dependencies
are available as binary packages, or we know that we won't have to try
building this package.

> The following is probably already clear to you, but I'll say it anyway;
> 
> With the 'make parallel'-patch, and how make works and so on, only
> "leaves" will be built at any given moment. When a package is built, it
> will never have to discover that some other package is missing and go
> and build it.
> 
> Yes I know about the pkgsrc locks and all that, and I have also seen
> that it does not always work. But 'make parallel' shall never need to
> rely on the locking. All builds will be in different packages all the
> time.

  That is not sufficient.  If two builds install into the same
LOCALBASE (and thus into the same PKGDBDIR), the lock is needed to
prevent PKGDBDIR from getting corrupted.
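
  To illustrate why writers to a shared database need to be
serialized, here is a small Python sketch using an advisory file lock.
It does not show how the pkgsrc locks are implemented: the lock file
name, the installed.txt file and the temporary directory standing in
for PKGDBDIR are all assumptions made for the example, and fcntl is
only available on Unix-like systems.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def pkgdb_lock(pkgdbdir):
    """Hold an exclusive advisory lock while the caller modifies pkgdbdir."""
    lock_path = os.path.join(pkgdbdir, ".lock")   # hypothetical lock file name
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)     # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def register_package(pkgdbdir, pkgname):
    """Append pkgname to a toy database file while holding the lock."""
    with pkgdb_lock(pkgdbdir):
        with open(os.path.join(pkgdbdir, "installed.txt"), "a") as db:
            db.write(pkgname + "\n")


pkgdbdir = tempfile.mkdtemp()   # stand-in for the real PKGDBDIR
register_package(pkgdbdir, "bar-1.0")
register_package(pkgdbdir, "baz-2.1")
with open(os.path.join(pkgdbdir, "installed.txt")) as db:
    registered = db.read().split()
print(registered)
```

  Two builds in different package directories would still both call
something like register_package on the same directory, which is why
building in different places in the tree does not make the lock
unnecessary.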

> > I'm not sure your system can be used for bulk builds. The scripts in
> > pkgsrc/mk/bulk do much more than that (like making sure only the required
> > packages are installed)
> 
> The make rule could be adjusted to this:
> 
> (example for package bar in category foo, which has no un-built
> dependencies)
> 
> foo/bar: <some already built and packaged dependencies>
> 	<somehow pick a free sandbox and use it>
> 	<install required packages from package repository>
> 	make package clean
> 	<copy package to repository>
> 	<deinstall all packages in this sandbox>

  Which is pretty similar to what the bulk builds do, with one
important optimization: packages are not deinstalled and reinstalled
needlessly.

> I am at this point not sure how to 'pick a free sandbox' could be
> implemented. Perhaps a wrapper script which does the rest of the above,
> and chroot it? Hmm. It is late, I have probably missed something
> obvious.. :-)

  I would implement it the other way around: After creating the
dependency graph, the following processes are started: one scheduler
process, and one build process per sandbox.

  Each build process asks the scheduler process for the next package
to build, builds it, and tells the scheduler whether the build
succeeded or failed.  It repeats this in a loop until no packages are
left to build.
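
  A rough Python sketch of this scheme follows.  It is an assumption
about how such a scheduler could look, not an existing implementation:
threads stand in for the build processes, a sandbox is just a number,
and try_build replaces the real build inside the sandbox.

```python
import threading


class Scheduler:
    """Hands out packages whose dependencies have all been built."""

    def __init__(self, deps):
        self.deps = deps              # package -> set of direct dependencies
        self.pending = set(deps)
        self.running = set()
        self.done = set()
        self.broken = set()
        self.cond = threading.Condition()

    def next_package(self):
        """Block until a package is ready; return None when nothing is left."""
        with self.cond:
            while True:
                ready = sorted(p for p in self.pending
                               if self.deps[p] <= self.done)
                if ready:
                    self.pending.remove(ready[0])
                    self.running.add(ready[0])
                    return ready[0]
                if not self.running:
                    return None       # nothing ready and nothing in progress
                self.cond.wait()      # wait for another build to finish

    def report(self, pkg, ok):
        """Record the result of a build and wake up waiting builders."""
        with self.cond:
            self.running.remove(pkg)
            if ok:
                self.done.add(pkg)
            else:
                self.broken.add(pkg)
                changed = True
                while changed:        # mark direct and indirect dependents broken
                    changed = False
                    for p in list(self.pending):
                        if self.deps[p] & self.broken:
                            self.pending.remove(p)
                            self.broken.add(p)
                            changed = True
            self.cond.notify_all()


def build_worker(scheduler, sandbox, try_build, log):
    """One builder per sandbox: ask, build, report, repeat."""
    while True:
        pkg = scheduler.next_package()
        if pkg is None:
            return
        ok = try_build(pkg, sandbox)
        log.append((sandbox, pkg, ok))
        scheduler.report(pkg, ok)


# Hypothetical graph; "b" fails, so "c" is never attempted.
deps = {"a": set(), "b": {"a"}, "c": {"b"}, "d": {"a"}, "e": set()}
scheduler, log = Scheduler(deps), []
workers = [threading.Thread(target=build_worker,
                            args=(scheduler, n, lambda p, s: p != "b", log))
           for n in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(sorted(scheduler.done), sorted(scheduler.broken))
```

  Which sandbox builds which package varies from run to run, but the
final sets of built and broken packages do not.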

  We moved away from tracking the build order with make because it
proved too unwieldy.  I cannot imagine how it would work better with
the added complexity of parallel builds.

  Your approach is clever, but it does not suit bulk builds well.

						yours,
						dillo