tech-pkg archive


Re: What to do about github (dynamic) downloads



John Klos <john%ziaspace.com@localhost> writes:

> It seems that some pkgsrc packages use github for some distfiles (via
> codeload.github.com).
>
> It appears that github generates these on the fly and has decided to
> change their method, seemingly arbitrarily, which makes checksums
> fail.

One of the core principles of pkgsrc and distfiles is that checksums
should not change.  This dates from the old days when software was
always released in some form of distfile, usually foo-x.y.tar.gz.  When
upstream changes a published distfile, that's considered bad behavior.
pkgsrc uses DIST_SUBDIR to work around this; see the pkgsrc guide for a
detailed explanation.

So, if github is returning a different bytestream for a given URL that
is supposed to be a release, that's broken, according to the pkgsrc
expectation of what a release is.
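To illustrate why a regenerated archive breaks a recorded checksum, here
is a minimal Python sketch.  It does not reproduce what github actually
changed; the payload, the names and the use of the gzip header timestamp
are assumptions chosen only to show that identical contents can arrive
as a different bytestream.

```python
import gzip
import hashlib


def sha512_hex(data: bytes) -> str:
    """Return the SHA-512 digest of a bytestream as hex."""
    return hashlib.sha512(data).hexdigest()


def matches(distfile: bytes, expected: str) -> bool:
    """True if the fetched bytes still match the recorded checksum."""
    return sha512_hex(distfile) == expected


# Hypothetical stand-in for the contents of foo-1.0.tar.
payload = b"contents of a hypothetical foo-1.0.tar\n" * 100

# The same payload compressed twice; only the timestamp stored in the
# gzip header differs, as could happen with on-the-fly generation.
first = gzip.compress(payload, mtime=0)
second = gzip.compress(payload, mtime=1)

# The checksum recorded when the package was first created.
recorded = sha512_hex(first)
```

Here matches(first, recorded) holds and matches(second, recorded) does
not, even though both decompress to the same payload.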

In these days of discussion of reproducible builds, changing what
amounts to distfiles seems like a serious problem.   I wonder if you are
able to communicate with upstream and have them complain to github to
fix this.

> Should it be decided, whether by consensus or a decision by
> pkgsrc-pmc, that NetBSD should avoid services such as github which do
> this kind of dynamic packaging?

We tend to go light on policy unless really necessary.  I would say:

 - people should use DIST_SUBDIR if upstream changes a release, whether
   by replacing the file or changing their process

 - when packaging, if there is a distfile available in a reliable way
   (like a file on an http/ftp server), I think it should be preferred
   over files that are generated on-the-fly, at least as long as the
   on-the-fly generation appears unreliable

 - Note that normally, distfiles are fetched and mirrored on
   ftp.netbsd.org.  However, this doesn't really address the issue
   because the DIST_SUBDIR approach is still needed when they change,
   whether because of changes in the generation process or because an
   upstream decided to replace the file with different contents.

 - Remember that changed distfiles can be an attack.   Diffing them like
   you did is good practice.
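
As a rough sketch of that kind of diffing, the following Python compares
two tarballs member by member rather than byte by byte.  The helper
names (make_tar, member_digests, changed_members) and the sample files
are hypothetical, and the comparison deliberately ignores metadata such
as modes, owners and timestamps, so it is a starting point and not a
complete audit.

```python
import hashlib
import io
import tarfile


def make_tar(files: dict[str, bytes]) -> bytes:
    """Build a small .tar.gz in memory (only used for the example)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def member_digests(archive: bytes) -> dict[str, str]:
    """Map each regular file in the archive to the SHA-256 of its data."""
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:*") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


def changed_members(old: bytes, new: bytes) -> list[str]:
    """Names that were added, removed, or whose contents differ."""
    a, b = member_digests(old), member_digests(new)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))


files = {"foo-1.0/README": b"hello\n",
         "foo-1.0/main.c": b"int main(void) { return 0; }\n"}
old = make_tar(files)
repacked = make_tar(files)  # same contents, packed a second time
tampered = make_tar({**files, "foo-1.0/main.c": b"int main(void) { return 1; }\n"})
```

With these inputs, changed_members(old, repacked) is empty, while
changed_members(old, tampered) reports foo-1.0/main.c.  An empty result
suggests a repackaging change; a non-empty one deserves a closer look.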

Overall, I'm not quite sure what you're asking for.  If you want to fix
a pkgsrc package to use a more reliable (and authorized by upstream)
distfile location, that seems fine, modulo the usual MAINTAINER/OWNER
issues.  If the upstream has no reliable distfile location, that's a bug
to be fixed in upstream, not a pkgsrc bug, but then pkgsrc has to work
around it.

If you're suggesting that everyone be aware of this issue and try to
make choices to have more reliable distfiles, balancing all the other
concerns, that sounds good (but also not very prescriptive :-).

If you're asking for a sweeping policy statement that github-generated
distfiles are banned, I don't see that happening.
