Subject: Re: SoC Part I: pbulk
To: None <tech-pkg@netbsd.org>
From: Joerg Sonnenberger <joerg@britannica.bec.de>
List: tech-pkg
Date: 05/20/2007 14:20:59
On Sat, May 19, 2007 at 07:01:06PM +0200, Hubert Feyrer wrote:
> >It is hard to do so. I want to get rid of the Perl parts for obvious
> >reasons.
> 
> What is that "obvious" reason? Because it requires perl, or anything else?

No, just the "requires Perl" part. One reason is that Perl is quite
big, another is that it has had some non-trivial issues breaking it on
new platforms.

> >The post-phase works quite a bit differently, as e.g. the check
> >for the restricted packages uses the output of the scan phase.
> 
> Are you saying that "restricted" packages aren't built in a bulk build now 
> any more?

No, just that it doesn't run a command for each package anymore to
determine whether it is restricted (we know that already). It just has
to compute the dependency hull to reduce the set.
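
To illustrate the hull computation, here is a minimal sketch in Python.
It is not the pbulk code. It assumes the scan output has already been
turned into a map from each package to the packages that depend on it,
and it reads "hull" as the transitive closure over that map. The
package names and the direction of the edges are made up for the
example.

```python
from collections import deque

def dependency_hull(start, edges):
    """Return `start` plus everything reachable from it via `edges`."""
    hull = set(start)
    queue = deque(start)
    while queue:
        pkg = queue.popleft()
        for nxt in edges.get(pkg, ()):
            if nxt not in hull:
                hull.add(nxt)
                queue.append(nxt)
    return hull

# Hypothetical scan result: package -> packages that depend on it.
dependents = {
    "libfoo": ["bar", "baz"],
    "bar": ["qux"],
}
restricted = {"libfoo"}  # already known from the scan phase

# Everything touched by the restricted set, found without running
# a command per package.
affected = dependency_hull(restricted, dependents)
```

The same function works on forward dependencies if the map is built the
other way round.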

> >>What is that filtering pattern - an addition to the list of directories in
> >>pkgsrc for pkgs to build, defined by the pkg builder?
> >
> >Variant I: www/ap-php PKG_APACHE=ap13
> >Variant II: www/ap-php ap13-*
> >
> >Both say "don't build all variants in www/ap-php, but only those
> >specified".
> 
> Ah - can we also say "please build all combinations of apache{1,2} and 
> php{4,5} with that?

Not without work. If we want to go that route, I'd prefer to have a
generic multi-version filter support. We have something like that for
Python already, it just has to be integrated. I don't honour that right
now in pbulk-index, as it is used only on Mac OS X (I think), but that
is not so difficult to do.
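
For the filter as it stands, Variant II quoted above can be read as a
directory followed by a shell-style glob on the package name. A rough
sketch of that reading in Python follows. It is not what pbulk-index
does internally; the function names and the package names in the
candidate list are hypothetical.

```python
from fnmatch import fnmatchcase

def parse_filter(line):
    """Split 'www/ap-php ap13-*' into (directory, glob pattern)."""
    directory, pattern = line.split(None, 1)
    return directory, pattern.strip()

def select_variants(line, candidates):
    """Keep only the candidates in the directory whose name matches."""
    directory, pattern = parse_filter(line)
    return [name for loc, name in candidates
            if loc == directory and fnmatchcase(name, pattern)]

# Hypothetical (directory, package name) pairs from a scan.
candidates = [
    ("www/ap-php", "ap13-php-4.4.7"),
    ("www/ap-php", "ap2-php-4.4.7"),
    ("www/apache", "apache-1.3.37"),
]

selected = select_variants("www/ap-php ap13-*", candidates)
```

Building every apache/php combination would need more than one pattern
per directory, which is where the generic multi-version support would
come in.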

> What's that thing about multiple directories from pkgsrcCon - you should 
> obviously know that I (and probably others on this list) were not there, 
> and silently implying that seems wrong. Please stop what's arriving here 
> as "it's your problem that you were not there" attitude.

I thought it was part of the slides :-) I don't want to imply anything
like that. The problem is that we currently have pattern:directory style
dependencies. As soon as more than one package from different locations
could match, this is a problem. Splitting pattern and directory helps to
solve this and allows the user to redirect to a local package as well.
More details to come soon.

> >>Nuking $PREFIX is fine & fast, please consider update builds e.g. from a
> >>pkgsrc stable branch. No need to rebuild everything there (which was about
> >>THE design criteria for the current bulk build code :).
> >
> >Incremental builds are not affected. The Python version currently has
> >the following rules:
> >- check if the list of dependencies changed -> rebuild
> >- check if any of the depending packages changed -> rebuild
> >- check if any of the recorded RCS ID changed -> rebuild
> 
> That's about what the current one has, too, AFAIK. Where's the difference?

The current one uses only the second rule.
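
Put together, the three rules amount to roughly the following. This is
a simplified sketch and not the prototype's code. The record layout
("depends", "rcsids") is invented for the example, and the second rule
is read here as "a package this one depends on is itself being
rebuilt".

```python
def needs_rebuild(old, new, rebuilt):
    """Decide whether a package has to be rebuilt.

    old, new: dicts with 'depends' (set of package names) and
              'rcsids' (set of RCS ID strings), recorded at the last
              build and at the current scan (hypothetical layout).
    rebuilt:  names of packages already scheduled for rebuild.
    """
    # Rule 1: the list of dependencies changed.
    if old["depends"] != new["depends"]:
        return True
    # Rule 2: one of the packages it depends on changed.
    if new["depends"] & rebuilt:
        return True
    # Rule 3: one of the recorded RCS IDs changed.
    if old["rcsids"] != new["rcsids"]:
        return True
    return False

last = {"depends": {"perl", "zlib"}, "rcsids": {"Makefile 1.10"}}
now_same = {"depends": {"perl", "zlib"}, "rcsids": {"Makefile 1.10"}}
now_bumped = {"depends": {"perl", "zlib"}, "rcsids": {"Makefile 1.11"}}
```

With only the second rule, the now_bumped case would go unnoticed
unless one of the dependencies was rebuilt as well.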

> And: Python version?
> (You're not telling me you're rewriting the new bulk build framework in 
> python, to escape perl, right? :-)

The Python version is the prototype implementation, which has been
running for a while now. I've used it to get an idea of how it should
work first :-)

The final version that will hit the pkgsrc tree will be C and a bit of
make.

> >>Also: will the bootstrap kit be mandatory on NetBSD systems, too?
> >>It should have all this in base, and while it's probably fast to build
> >>compared to the bulk build's time, but for a selective build it seems like
> >>overkill to me.
> >
> >Given the issues with requiring newer pkg_install versions, my answer is
> >"they most likely are". If not, it is easy to prepare a tarball without.
> >I haven't worried about that part too much yet.
> 
> I see... I guess the "install tarball" step would add workload on the 
> build admin, and thus automating that step would indeed be better.

s/install tarball/prepare tarball/ One very good reason why I am not
worrying that much about it now is that the tarball itself is pretty
stable. I normally update it once every few months, when one of the
bootstrap-related pieces changes. So it isn't too bad.

> >>How does this FTP/rsyncing before (yumm) play with the distributed
> >>environment (NFS) mentioned in answers to the proposal? Or is this for a
> >>setup with no common file system? (guessing)
> >
> >The latter. The way I'm currently running concurrent builds is with a
> >NFS for each of PACKAGES, DISTDIR and the log area.
> 
> Um, "the latter" would be "no common file system", thus "does not need 
> NFS". What now? :)

I am deploying it internally via NFS, because that is simple and fast.
It can be done without NFS using FTP, rsync or similar means.
Ultimately the build admin has to decide what to use.

> Or asked another way round: if someone has a SMP machine, will he be able 
> to use more than one CPU?

Yes. You set up one chroot per CPU and run a client in each.

Joerg