pkgsrc-Users archive


Re: cdn.netbsd.org incorrect cacheing setup wrt. pkgsrc INDEX



This file was getting the very long cache settings because it is big,
and I have a rule that looks for a large Content-Length and inserts a
very long TTL.

I have added a match on "/INDEX" to the 1-day TTL list ( req.url ~
"pkgsrc\.tar\." || req.url ~ "/vulns/" || req.url ~ "/INDEX" )
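
To make the interaction between the two rules concrete, here is a minimal Python model of the TTL choice. It is only a sketch, not the actual VCL: the size threshold, the default TTL, the 30-day value, and the order in which the rules are checked are all assumptions, and `choose_ttl` is a hypothetical name. Only the three URL patterns come from the rule quoted above.

```python
import re

ONE_DAY = 24 * 60 * 60

# Assumed values: the message does not state the size threshold,
# the exact "very long" TTL, or the default TTL.
LARGE_BODY_BYTES = 10 * 1024 * 1024
VERY_LONG_TTL = 30 * ONE_DAY
DEFAULT_TTL = 60 * 60

# The three patterns from the 1-day TTL list.
SHORT_TTL_PATTERNS = [r"pkgsrc\.tar\.", r"/vulns/", r"/INDEX"]


def choose_ttl(url, content_length):
    """Return a TTL in seconds for a response (hypothetical helper)."""
    # Assumption: the 1-day list wins over the size rule, so a large
    # INDEX no longer picks up the very long TTL.
    if any(re.search(pattern, url) for pattern in SHORT_TTL_PATTERNS):
        return ONE_DAY
    # Size-based rule: big objects are cached for a long time.
    if content_length >= LARGE_BODY_BYTES:
        return VERY_LONG_TTL
    return DEFAULT_TTL
```

With these assumed values, a large INDEX gets the 1-day TTL, while any other large file still gets the long one.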

It's funny because I don't even get an Age header for that file.
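
Since the Age header may be absent, a check that indexes `headers["Age"]` directly would raise a KeyError in that case. A small sketch of a more tolerant check follows; `age_in_days` is a hypothetical helper and takes any mapping of response headers.

```python
SECONDS_PER_DAY = 86400


def age_in_days(headers):
    """Return the Age header converted to days, or None if it is absent."""
    raw = headers.get("Age")
    if raw is None:
        # No Age header at all, as observed for this file.
        return None
    # Age is expressed in whole seconds.
    return int(raw) / SECONDS_PER_DAY
```

It could be called as `age_in_days(requests.get(url).headers)`, assuming the requests library is installed; the header mapping that requests returns is case-insensitive, whereas a plain dict is not.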

On Tue, Apr 24, 2018 at 2:50 PM, Dmitry Marakasov
<amdmi3@hades.panopticon> wrote:
> Hi!
>
> I'm the maintainer of Repology.org, which regularly fetches pkgsrc INDEX [1]
> in order to update data on pkgsrc packages, and I've run into a
> problem with the way the Fastly-powered cdn.netbsd.org caches the file.
> The problem is that it returns an old file to me, and it may be up to
> a month old. The results may be seen on the graphs [2]: no updates are
> seen until the file finally expired some days ago. There are also rare
> cases when a new file is served.
>
> The problem is rather hard to reproduce, as Fastly seems to pick a
> different node when the file is requested from different hosts and
> even different clients, and the file has to be stuck in the cache
> of the node you're requesting it from for the problem to be seen.
> However, it reproduces reliably for Repology with this command, which
> is close to what Repology itself does:
>
> python3 -c 'import requests; print(requests.get("https://cdn.netbsd.org/pub/pkgsrc/current/pkgsrc/INDEX").headers["Age"])'
>
> Currently that returns 427839 seconds, which is around five days. I guess
> INDEX has been updated many times in that period.
>
> My guess is that the problem may be reproduced from any host by issuing
> the above command repeatedly for a long enough time: Age will grow up to
> 1 month, and if you look into the contents, you'll see that this really
> is an old INDEX.
>
> I've written to Fastly [3], and they say that this behavior is expected
> and suggest that this may be a misconfiguration which can be fixed by
> NetBSD admins by reducing the TTL setting. I've found a way to fix this for
> Repology, which is adding an extra argument that changes with each request
> (e.g. INDEX?time=<current timestamp>); this bypasses the cache and serves
> the fresh file, but I worry that other pkgsrc consumers may hit this.
>
> [1] https://cdn.netbsd.org/pub/pkgsrc/current/pkgsrc/INDEX
> [2] https://repology.org/repository/pkgsrc_current
> [3] https://community.fastly.com/t/cache-incorrectly-serving-old-content-under-specific-conditions/1249/3
>
> --
> Dmitry Marakasov   .   55B5 0596 FF1E 8D84 5F56  9510 D35A 80DD F9D2 F77D
> amdmi3%amdmi3.ru@localhost  ..:  jabber: amdmi3%jabber.ru@localhost      http://amdmi3.ru
>
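
The workaround described in the quoted message (an extra argument that changes with each request) can be sketched roughly as follows. The `time` parameter name is taken from the message; the `cache_busted` function and its `now` argument are hypothetical, and whether a changed query string actually bypasses the cache depends on how the CDN is configured.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted(url, now=None):
    """Return url with a time=<timestamp> query argument appended."""
    # Use the current Unix time unless a fixed value is supplied.
    stamp = int(time.time() if now is None else now)
    parts = urlsplit(url)
    # Preserve any query arguments already present on the URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("time", str(stamp)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Each call made at a different second yields a distinct URL, so an edge node that keys its cache on the full URL treats it as a new object. This also means every such request goes back to the origin, which is why fixing the TTL on the CDN side is the better long-term answer.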

