tech-net archive

Re: nd6 'stale' timer unreasonably long?



> Many other systems tend to drop stale entries after 20 minutes, but
> some keep them for several hours.

I've (with no good reason) set it to 10 minutes on our gateways (where the patch is applied).

> The other question is: why do you see packet loss ?

To be honest, while I had a plausible-sounding theory initially, I'm not so sure anymore (Edgar Fuß raised some valid counterpoints).

What is notable, however, is that the packet loss is **much** more pronounced for clients that previously went through wifi, even though from the gateway's perspective the two cases should be indistinguishable (our gateway itself does not do wifi).  Something like 0.5% loss for wired clients vs 60% for wifi clients.

Anyway I'll have to reproduce the lossy situation to get another look at what's happening there, but it does seem as if neighbors get removed from the cache willy-nilly (and a comment in the source confirms there is no LRU or anything).

One observation I actually made was that an outbound packet (an ICMP reply in this case) went missing *after* being passed through ipf (as confirmed with ipf -l pass) but *before* leaving the interface (i.e. it did not show up in a tcpdump that was running concurrently).

> There is no cache limit, the garbage collection just tries to keep
> the cache size below the threshold. The problem seems to be that
> the lists are truncated from the head, which is the newest entries.
> So you often have to resolve them again, and that loses packets.

Yeah, that sounds in line with what seems to be happening.  I guess ideally the limit wouldn't be reached in the first place under normal circumstances, so I think a somewhat wonky algorithm is fine (as long as it does something least-recently-used-ish).  (IIRC the whole reason for this is DoS prevention of sorts, right?)
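
To illustrate why truncating from the head hurts, here is a toy model in Python. It is not the nd6 code: the function name, the limit of 8, the GC interval and the trace are all made up for illustration. It just replays a list of neighbor addresses against a cache that is trimmed periodically, and counts how often an address has to be resolved again.

```python
from collections import OrderedDict

def count_resolutions(trace, limit, gc_every, policy):
    """Replay `trace` and count how often an address must be (re-)resolved.

    Hypothetical toy model, not kernel code.
    policy "newest": GC trims from the head of the list (newest entries).
    policy "lru":    entries move to the head on use, GC trims from the tail.
    """
    cache = OrderedDict()  # first key = head of the list
    resolutions = 0
    for i, addr in enumerate(trace, start=1):
        if addr not in cache:
            resolutions += 1                     # would need a new resolution
            cache[addr] = None
            cache.move_to_end(addr, last=False)  # new entries go to the head
        elif policy == "lru":
            cache.move_to_end(addr, last=False)  # refresh on use
        if i % gc_every == 0:                    # periodic garbage collection
            while len(cache) > limit:
                # last=False pops the head, last=True pops the tail
                cache.popitem(last=(policy == "lru"))
    return resolutions

# Eight hosts seen once, then two hosts that keep talking.
trace = [f"idle{n}" for n in range(8)] + ["a", "b"] * 20

newest = count_resolutions(trace, limit=8, gc_every=4, policy="newest")
lru = count_resolutions(trace, limit=8, gc_every=4, policy="lru")
print(newest, lru)
```

In this model the two active hosts get thrown out on every GC pass under the "newest" policy, while the idle entries stay forever. With the LRU-ish variant they are resolved once and then stay cached. Every extra resolution is a window in which packets can get lost.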

