Subject: Re: Antispamming of make fetch ?
To: Martin Weber <Ephaeton@gmx.net>
From: None <wulf@ping.net.au>
List: tech-pkg
Date: 02/01/2003 12:32:02
> 
> One thing I noticed some time ago is that when we need a couple
> of distfiles, we try to fetch them all sequentially. Well, that's fine,
> but I want to suggest something about how distfiles that originate
> from the same site are handled. (No idea how doable that is :)
> 
> Currently we fetch each file from a list of files, and try a list of
> (potentially different) servers for each file. My suggestion is to
> order the list of files by the sets of master sites they try, look up
> whether anything else outstanding can be fetched from the server we
> are already connected to, and get it over the same connection with the
> appropriate 'cd's and 'get's, instead of closing the connection and
> reconnecting to the (same) server to start from the beginning.
> 
> This would (I suppose) cost a few cycles on the client end, but it
> would also reduce "spam" significantly...
> 
> df := distfile
> rs := resource-site
> 
> df1 on rs1 rs2 rs3  <-- ordered
> df2 on rs2 rs3 rs4  <-- by master
> df3 on rs4 rs7 rs9  <-- sort ?
> df4 on rs2 rs1 rs3
> 
> now: 
>    o fetch df1 from (rs's - starting rs1) (bye) then
>    o fetch df2 from (rs's - starting rs2) (bye) then
>    o fetch df3 from (rs's - starting rs4) (bye) then
>    o fetch df4 from (rs's - starting rs2) (bye)
> 
> proposed:
>    o sort the df-list by membership (memq) of each df's rs's in the
>      other dfs' rs lists, and potentially resort the rs's (needed ?)
>    o loop over the df-list: try to fetch each df from its (rs); on
>      success then
>    o loop over the rest: search for other dfs gettable from the current
>      rs; if found, get those dfs too, then (bye)
> 
> Let's say it finds df1 on rs2; then it would try (without logging out,
> and ignoring all the other rs's on the other dfs) to get df2 and df4
> from there (or df4 first, as the original list had rs2 earlier ... who
> cares ... *duck*).
> 
> Hmm, I hope that was not too confusing? :) You might say that this is
> only a cosmetic change ... on the other hand, I often watch a LOT of
> potential rs's scroll through my terminal/log where a file is not
> found, just to end up at the 10th one, which I already got a preceding
> df from...
> 
> And yes, as I said, I suppose this would burn "some" more cycles, but I
> don't think it's massive - just convenient.
> 
> Fine, and now you may beat me :)

This is a very good idea. As a side effect, the above proposal will also
eliminate retries for sites that have already failed due to
non-availability.
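
Something like this minimal sketch (plain Python, purely illustrative;
fetch_all and the (host, dir) mirror lists are hypothetical names of
mine, not pkgsrc code) captures the proposed loop: connect once per
reachable site and drain every outstanding distfile that site carries
before saying bye. Note it skips the pre-sort step and simply rescans
the outstanding list after each successful connect, so the "(needed ?)"
sort is probably not essential:

    # Illustrative sketch only -- hypothetical names, not pkgsrc code.
    # distfiles maps a distfile name to an ordered list of
    # (host, remote_dir) mirror candidates, like the df/rs table above.
    from ftplib import FTP, error_perm

    def fetch_all(distfiles):
        outstanding = dict(distfiles)       # files still to be fetched
        while outstanding:
            name, sites = next(iter(outstanding.items()))
            for host, _ in sites:
                try:
                    ftp = FTP(host, timeout=30)
                    ftp.login()             # anonymous login
                except (OSError, error_perm):
                    continue                # site down: try next mirror
                try:
                    # Drain every outstanding file this host carries over
                    # the one connection: 'cd' + 'get', no reconnecting.
                    for other, mirrors in list(outstanding.items()):
                        for h, d in mirrors:
                            if h != host:
                                continue
                            try:
                                ftp.cwd(d)
                                with open(other, 'wb') as out:
                                    ftp.retrbinary('RETR ' + other,
                                                   out.write)
                                del outstanding[other]
                            except error_perm:
                                pass        # not on this mirror after all
                            break           # done with this df for host
                finally:
                    try:
                        ftp.quit()          # a single (bye) per site
                    except OSError:
                        ftp.close()
                if name not in outstanding:
                    break                   # the df we started with is in
            else:
                raise RuntimeError('no mirror had ' + name)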

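Fed the example table from above (with the rs's as stand-in FTP
hostnames and '/pub' as a made-up directory), the effect would be:

    distfiles = {
        'df1': [('rs1', '/pub'), ('rs2', '/pub'), ('rs3', '/pub')],
        'df2': [('rs2', '/pub'), ('rs3', '/pub'), ('rs4', '/pub')],
        'df3': [('rs4', '/pub'), ('rs7', '/pub'), ('rs9', '/pub')],
        'df4': [('rs2', '/pub'), ('rs1', '/pub'), ('rs3', '/pub')],
    }
    fetch_all(distfiles)
    # If rs1 answers, df1 and df4 come over that one connection; if it
    # is down and df1 lands on rs2, that connection also serves df2 and
    # df4, leaving only df3 for a second connection (rs4).
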
cheerio Berndt