tech-pkg archive


Re: python woes in netbsd-10 bulk builds



> Date: Sun, 14 Jul 2024 08:53:29 -0400
> From: Greg Troxel <gdt%lexort.com@localhost>
> 
> The netbsd-10 official package builds for i386 and amd64 are troubled.
> There is a PR open about this, which I was not aware of.  I'm posting
> the link in the hopes that someone who understands python has a useful
> comment.
> 
> http://gnats.netbsd.org/58401

If we had a process with a well-defined criterion for when we switch
the symlink, and a well-defined target that people could
collaboratively focus on and work on fixing, maybe we would have
noticed this before switching the symlink and getting egg all over
our faces, as we seem to do every quarter.

Apparently the pkg_summary.gz for the amd64 10.0_2024Q2 directory has
also failed to update for weeks:

-rw-rw-r--  1 pkgmastr  netbsd     3315698 Jun 28 17:24 pkg_summary.gz
...
-rw-r--r--  1 pkgmastr  netbsd     2819684 Jul  2 19:46 squid-6.10.tgz
-rw-r--r--  1 pkgmastr  netbsd      952640 Jul  2 19:47 openssh-9.8p1.tgz
-rw-r--r--  1 pkgmastr  netbsd     1215220 Jul  6 16:58 py311-setuptools-70.0.0.tgz
...
-rw-r--r--  1 pkgmastr  netbsd        5432 Jul  8 04:38 py311-calver-2022.6.26.tgz
...
-rw-r--r--  1 pkgmastr  netbsd     5326556 Jul 12 03:28 apache-2.4.61.tgz
...
-rw-r--r--  1 pkgmastr  netbsd     3044212 Jul 12 05:02 ap24-ruby32-passenger-5.3.7nb17.tgz

It would also help if we had a process with an audit trail so that we
can always go from a repository to the bulk report for the build that
produced the content of that repository.


So, I have suggestions for how to proceed.

1. Endorse bulk-test-essential as a target.

   If it doesn't build, raise an alarm and don't publish binary
   packages.

   This way, we can either fix its dependencies, or discuss changing
   its dependencies if it can't be fixed -- but it's a conscious
   decision in reaction to an early alarm that we can collaborate on
   as a shared target, not a delayed reaction to weeks of festering
   problems that nobody noticed because we have no concretely testable
   quality assurance criteria.

   bulk-test-essential is already tailored to the set of packages that
   can plausibly build successfully on each architecture and operating
   system, e.g. it doesn't expect firefox to build on m68k but it does
   expect it to build on amd64 and aarch64.


2. Change

      /pub/pkgsrc/packages/NetBSD/<arch>/<version>/All

   to be a 302 Found redirect (via .bzredirect) to

      /pub/pkgsrc/packages/NetBSD/<arch>/<version>/<buildid>/All

   where <buildid> is a pbulk-chosen build id, like 20240708.0436, and
   <version> is (e.g.) 10.0 or 10.0_2024Q2 or whatever.

   Once a build has been uploaded, none of the files in it may be
   changed.

   This way, we eliminate any issue of stale CDN caches that we've
   constantly inflicted on ourselves by abusing symlinks, and any
   issue of mixing stale packages that have begun to fail to build
   with fresh packages that are incompatible.

   => To reduce storage on the server, we can use hard links via
      `rsync --link-dest=../../<prevbuildid>/All'.

   => Or, to reduce storage and bandwidth, we can set up redirects for
      packages that haven't changed so the CDN will continue to use
      the old and still-valid cache.

   (If this has to remain compatible with ftp to the same paths, we
   can jiggle things around another way, like having

      /pub/pkgsrc/packages/NetBSD/<arch>/<version>/All

   be a symlink to ../build/<version>/<buildid>/All, and having

      /pub/pkgsrc/packages/NetBSD/<arch>/<version>/.bzredirect

   point to ../build/<version>/<buildid>.)
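
For the ftp-compatible variant, the All symlink has to be switched in
one step so that clients never see a half-updated directory.  A
minimal sketch in Python, assuming the ../build/<version>/<buildid>/All
layout described above; the function name and the ".new" temporary
name are hypothetical, and updating the .bzredirect itself is left
out:

```python
import os

def switch_all_symlink(version_dir, build_id):
    """Atomically repoint <version_dir>/All at ../build/<version>/<build_id>/All."""
    version = os.path.basename(os.path.normpath(version_dir))
    target = os.path.join("..", "build", version, build_id, "All")
    link = os.path.join(version_dir, "All")
    tmp = link + ".new"  # hypothetical temporary name

    # Refuse to publish a build that has not been uploaded yet.
    if not os.path.isdir(os.path.join(version_dir, target)):
        raise FileNotFoundError(target)

    # Create the new symlink under a temporary name, then rename it over
    # the old one; the rename replaces the link itself in a single step.
    if os.path.lexists(tmp):
        os.unlink(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link)
    return target
```

For example, switch_all_symlink(".../NetBSD/amd64/10.0", "20240708.0436")
would leave All pointing at ../build/10.0/20240708.0436/All.  If All
is still a real directory rather than a symlink, the rename fails
instead of silently clobbering it.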


3. Include either
   (a) the bulklog directory, or
   (b) just the bulklog/meta/ directory, or
   (c) at least a redirect to the bulk report
   in the package upload at, say,

      /pub/pkgsrc/packages/NetBSD/<arch>/<version>/<buildid>/bulklog

   (It looks like bulklog/ is usually a few hundred megabytes, so it's
   not that space-intensive on top of the whole bulk build.  But maybe
   just a redirect to the builder's own URL is good enough.)

   This way, when examining a particular binary package set to see
   what went wrong, it is easy to find the report for exactly that
   build -- no need to guess at which message to pkgsrc-bulk might
   correspond with which binary packages, if any.


4. Have a program that updates the .bzredirect only if the new build
   passes certain QA criteria, such as:

   (a) All/pkg_summary.gz exists and is newer than all of the packages
   (b) All/SHA512 exists and is newer than pkg_summary.gz
   (c) bulklog/meta/success exists and each package listed is in All/
   (d) bulklog/meta/report.bz2 exists, is newer than All/SHA512, and
       contains bulk-test-essential
   (e) or, perhaps, pkgin is able to download pkg_summary.gz and query
       bulk-test-essential or something like that

   and otherwise sends an alert to a low-volume list that people pay
   attention to.
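
Criteria (a) through (d) are simple enough to check mechanically.  A
rough sketch in Python, not a finished tool: the function name is
hypothetical, it assumes bulklog/meta/success lists one PKGNAME per
line and that packages are stored as PKGNAME.tgz, and it treats equal
timestamps as passing:

```python
import bz2
from pathlib import Path

def qa_failures(build_dir):
    """Return a list of QA failures for one build directory (empty means OK)."""
    build = Path(build_dir)
    all_dir = build / "All"
    summary = all_dir / "pkg_summary.gz"
    sha512 = all_dir / "SHA512"
    success = build / "bulklog" / "meta" / "success"
    report = build / "bulklog" / "meta" / "report.bz2"
    failures = []

    # A missing file fails here and is skipped in the comparisons below.
    for path in (summary, sha512, success, report):
        if not path.is_file():
            failures.append(f"missing: {path.relative_to(build)}")

    packages = {p.name: p.stat().st_mtime for p in all_dir.glob("*.tgz")}

    # (a) pkg_summary.gz must be newer than every package.
    if summary.is_file():
        cutoff = summary.stat().st_mtime
        stale = sorted(n for n, m in packages.items() if m > cutoff)
        if stale:
            failures.append(f"pkg_summary.gz older than: {', '.join(stale)}")

    # (b) SHA512 must be newer than pkg_summary.gz.
    if summary.is_file() and sha512.is_file():
        if sha512.stat().st_mtime < summary.stat().st_mtime:
            failures.append("SHA512 older than pkg_summary.gz")

    # (c) Every package listed in meta/success must be present in All/.
    # Assumption: one PKGNAME per line, stored as PKGNAME.tgz.
    if success.is_file():
        for name in success.read_text().split():
            if name + ".tgz" not in packages:
                failures.append(f"built but not uploaded: {name}")

    # (d) report.bz2 must be newer than SHA512 and mention the target.
    if report.is_file():
        if sha512.is_file() and report.stat().st_mtime < sha512.stat().st_mtime:
            failures.append("report.bz2 older than SHA512")
        if b"bulk-test-essential" not in bz2.decompress(report.read_bytes()):
            failures.append("report.bz2 does not mention bulk-test-essential")

    return failures
```

The program would switch the .bzredirect only when the list comes back
empty, and otherwise put the list in the alert.  Note that (d) is
implemented literally as "the report mentions bulk-test-essential";
whether that is a strong enough signal that it built is a separate
question.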


5. Disable the cron job that regenerates pkg_summary, because the bulk
   builders already generate it and this is an unnecessary potential
   source of inconsistency.

