pkgsrc-Users archive


Re: distfile mirror site preferences and how to influence them?

On Thu, 22 Dec 2016, John D. Baker wrote:
That way, attempts to fetch from them fail right away and the fetch process moves on to the next candidate mirror site.

That is double-plus-good, IMHO.

That makes me wonder how easy it would be to do something similar to the way pacman works under Arch Linux (i.e., use a "best site picker"). Of course, that'd be without all the horrible repo-shuffling they've done and the number of times they've broken their entire distro with it. In other words, I realize it's non-trivial.

I'm sometimes intimidated by how great some of the kernel coders and the everyday pkgsrc coders are. However, it doesn't sound that hard to create a site-picker, because I've rolled my own on other platforms and it wasn't rocket science. The problem I've run into when I've made cursory efforts at this in the past is that (if I understand correctly) the program defined by FETCH_USING in one's mk.conf is still only passed one of the sites from the MASTER_SITES variable, rather than all of them.
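
To make "site-picker" concrete, here is a minimal Python sketch of one possible approach: time a TCP connect to each candidate and sort fastest first. It is only an illustration, not anything pkgsrc does today. The function names (tcp_connect_time, rank_sites) are made up for this example, and it assumes the full site list is already available to the script, which is exactly the piece that is missing.

```python
import socket
import time
from urllib.parse import urlsplit


def tcp_connect_time(url, timeout=3.0):
    """Return the seconds needed to open a TCP connection to the URL's host.

    Returns None if the host is missing or the connection fails.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        return None
    default_port = {"http": 80, "https": 443, "ftp": 21}.get(parts.scheme, 80)
    port = parts.port or default_port
    start = time.monotonic()
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def rank_sites(sites, probe=tcp_connect_time):
    """Order sites fastest first; sites whose probe returns None are dropped.

    `probe` is injectable so a different measurement (or a cached table)
    can be swapped in without touching the sorting logic.
    """
    timed = [(probe(site), site) for site in sites]
    reachable = [(t, site) for t, site in timed if t is not None]
    return [site for _, site in sorted(reachable, key=lambda pair: pair[0])]
```

Connect time is a crude proxy for download speed, so a real picker would probably want to measure throughput or remember past results instead.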

I'm green on the innards of pkgsrc, so I don't know "what you know" at the point the fetch command gets launched.

If that last bit could be changed a tad so that there is another environment variable or argument passed which contains the full list of potential sites, that would unlock some huge (to me) possibilities:

* Anybody could write their own script-based wrapper around something like aria2c that could download from _all_ the valid sites in parallel. From experience with Arch in this area, I can tell you that it gives you a jaw-dropping speedup. Aria2's network library takes care of sites that throttle and other such considerations (latency, block sizes, socket buffers, sliding windows, etc.).

* Yes, aria2c, curl, etc. aren't in the base dist. So, I'd assert the existing "way" can be kept, but we could make it awfully easy to enable a more turbocharged download configuration that'd also benefit bulk builds quite dramatically.

* Even without something as fancy as aria2c, I can think of at least four other strategies for determining site preference. Most can be easily scripted without any additional C code or binaries.

* If I were wishing hard, I'd also wish for some kind of metadata to be passed with MASTER_SITES to indicate that someone had flagged a given site as a traffic shaper/limiter. That could be factored into the algorithm.
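
As a rough sketch of what such a wrapper might look like, the Python below builds an aria2c command line from a full site list. Everything about the interface is an assumption for illustration: the environment variable name PKGSRC_ALL_SITES and the "!throttled" suffix for flagged sites are hypothetical and do not exist in pkgsrc. The only aria2c behaviour relied on is that several URIs for the same file may be given on one command line, together with its --dir, --out and --split options.

```python
import os

THROTTLED_SUFFIX = "!throttled"  # hypothetical marker, not a pkgsrc convention


def parse_site_list(raw):
    """Split a whitespace-separated site list into (url, flagged) pairs."""
    sites = []
    for token in raw.split():
        flagged = token.endswith(THROTTLED_SUFFIX)
        url = token[: -len(THROTTLED_SUFFIX)] if flagged else token
        sites.append((url, flagged))
    return sites


def build_aria2c_command(raw_sites, distfile, distdir):
    """Return an argv list asking aria2c to fetch distfile from every site."""
    sites = parse_site_list(raw_sites)
    if not sites:
        raise ValueError("no candidate sites were supplied")
    # Stable sort: unflagged sites keep their order and come first,
    # sites flagged as traffic shapers are listed last.
    ordered = sorted(sites, key=lambda site: site[1])
    urls = [url.rstrip("/") + "/" + distfile for url, _ in ordered]
    return [
        "aria2c",
        f"--dir={distdir}",
        f"--out={distfile}",
        f"--split={len(urls)}",
    ] + urls


def command_from_env(distfile, distdir, environ=os.environ):
    """Build the command from the hypothetical PKGSRC_ALL_SITES variable."""
    return build_aria2c_command(
        environ.get("PKGSRC_ALL_SITES", ""), distfile, distdir
    )
```

The resulting list could be handed to subprocess.call() from whatever script the fetch step launches. Putting flagged sites last is only a hint; how the downloader actually spreads work across the URIs is up to the tool.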

