Subject: Re: bsd.bulk-pkg.mk and recursive broken marks
To: Todd Vierling <tv@duh.org>
From: Dan McMahill <dmcmahill@NetBSD.org>
List: pkgsrc-bulk
Date: 10/15/2005 14:24:22
Todd Vierling wrote:
> You two have the most infrastructure commits here, so I figured at least one
> of you would be able to provide some thoughts on these points, in case you
> aren't subscribed to pkgsrc-bulk.
> 
> One of the annoyances about Interix's fork() problems is that invoking bmake
> recursively is *expensive*.  I'd like to avoid doing that where feasible --
> particularly in deep loops.  Today I've been observing some really bad
> behavior in the "marking package that requires XXX as broken" phase; it can
> take many hours to complete if the list is very long.  This is mostly thanks
> to the following block:
> 
> =====
> pkgerr="-1"; \
> pkgignore=`(cd ${PKGSRCDIR}/$$pkgdir && ${MAKE} show-var VARNAME=PKG_FAIL_REASON)`; \
> pkgskip=`(cd ${PKGSRCDIR}/$$pkgdir && ${MAKE} show-var VARNAME=PKG_SKIP_REASON)`; \
> if [ ! -z "$${pkgignore}$${pkgskip}" -a ! -f ${PKGSRCDIR}/$$pkgdir/${BROKENFILE} ]; then \
>          ${ECHO_MSG} "BULK> $$pkgname ($$pkgdir) may not be packaged because:" >> ${PKGSRCDIR}/$$pkgdir/${BROKENFILE};\
>          ${ECHO_MSG} "BULK> $$pkgignore" >> ${PKGSRCDIR}/$$pkgdir/${BROKENFILE};\
>          ${ECHO_MSG} "BULK> $$pkgskip" >> ${PKGSRCDIR}/$$pkgdir/${BROKENFILE};\
>         if [ -z "`(cd ${PKGSRCDIR}/$$pkgdir && ${MAKE} show-var VARNAME=BROKEN)`" ]; then \
>                 pkgerr="0"; \
>         else \
>                 pkgerr="1"; \
>         fi; \
> fi; \
> =====
> 
> Is the extra check to see if the package is already PKG_FAIL, PKG_SKIP, or
> BROKEN necessary?  Note that there are three bmake invocations here, and
> that's killing me when something with a crapload of dependencies, such as
> groff, breaks.  For the currently running build, it started marking groff
> dependencies at 1:48AM this morning, and was only about half done when I
> ^C'd it and restarted the bulk build.
> 
> The only thing I can think that this does is change the "breaks" column of
> the final report, by lowering the numbers for packages that are themselves
> broken -- but is that really needed?  After all, it's dependency breakage,
> and the only real reason for the "breaks" column is to indicate a severity
> of the dependency breakage.  If that number includes packages that are
> themselves broken, I don't really see that as a vital problem.  Eliminating
> this looped check should speed up bulk builds on all platforms, not just
> Interix, but perhaps not as dramatically on others.
> 
> Now, I don't really understand the magic going on with the output
> $PKGSRCDIR/$BROKENFILE data from just this block of code, but maybe you
> understand its format more readily.  Are the values there used for anything
> other than the final bulk report?

It has been several years now, but I think it is only for the final 
report.  I vaguely recall that it had something to do with not wanting all 
the compat_linux packages to make the build report look bad on pmax or 
some such thing.

> =====
> 
> And one final note:  Does USE_BULK_CACHE=no still work, or should I expect
> that to break spectacularly now?  It's been forced =yes in mk/bulk/build for
> a long time, so I don't know if setting it =no in a full bulk build will
> work at all.  8-)

It will probably still "work" although my feeling was it never worked 
correctly to begin with.  The whole bulk cache thing started because:

a) the dependency ordering was wrong.  The bulk code would decide on a 
rebuild if dependencies had been repackaged, but unless you built in the 
right order, you could run the same build over and over on the same 
source tree and keep getting things rebuilt.

b) without the bulk cache, there were zillions (ok, hundreds of thousands) 
of calls to make while recursively processing the dependencies.  I 
wanted to call make once per package and then have the entire up/down 
flattened tree available for quick lookup.

c) the recursive broken marks are actually faster than the old way.  In 
the old way, if something like png were broken, you might go to build a 
package, install 27 dependencies, find out that #28 (png) was broken and 
then deinstall all 27.  That took forever on the pmax.

d) a restart of a bulk build was _painful_ and involved potentially 
thousands of make calls.

e) it became easier to deinstall packages not listed as dependencies at 
the start of the build of a particular package instead of deinstalling 
all packages at the end of the build of a particular package.  While 
somewhat hard to quantify, this led to fewer instances of deinstalling 
perl and promptly reinstalling it.  When you get into the kde or gnome 
packages on a slower system with a slower drive, this can be a big help.

So, not using the bulk cache probably still works, but you'll be left 
with these issues.
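
To make (b) and (c) a bit more concrete, here is a rough sketch of the 
idea in Python.  This is not the bulk code itself; the package names, the 
dictionary layout and the function names are all made up for 
illustration.  It assumes the direct dependencies of each package have 
already been collected once (one make call per package), so that finding 
everything hit by a broken package is an in-memory lookup instead of more 
make calls.

```python
from collections import defaultdict, deque


def build_reverse_index(direct_deps):
    """Invert {pkg: [deps]} into {dep: {packages that directly need it}}."""
    reverse = defaultdict(set)
    for pkg, deps in direct_deps.items():
        for dep in deps:
            reverse[dep].add(pkg)
    return reverse


def packages_broken_by(broken_pkg, reverse):
    """Return every package that needs broken_pkg, directly or indirectly."""
    affected = set()
    queue = deque([broken_pkg])
    while queue:
        current = queue.popleft()
        # .get() avoids creating empty entries in the defaultdict.
        for dependent in reverse.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical data standing in for the cached per-package dependency lists.
direct_deps = {
    "graphics/png": [],
    "graphics/gd": ["graphics/png"],
    "www/php-gd": ["graphics/gd"],
    "lang/perl5": [],
}

reverse = build_reverse_index(direct_deps)
broken = packages_broken_by("graphics/png", reverse)
```

With a table like this, everything that needs png can be marked broken up 
front, before any of its other dependencies get installed and then 
deinstalled again.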

-Dan