Subject: Re: kernel ip_randomid() and libc randomid(3) still "broken"
To: Dennis Ferguson <dennis@juniper.net>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-net
Date: 11/26/2003 15:25:20
In message <200311262253.hAQMrGX31319@merlot.juniper.net>,
Dennis Ferguson writes


>Jonathan,
>That is, since the degree of insufficiency of the ID space is inversely
>related to bandwidth,

Sure.

> then if a 12k sequence space is insufficient now
>for engineering reasons, then a 64k sequence space is certain to be
>insufficient 3 years from now for the same reasons.

I would phrase it differently: `insufficient' as applied specifically
to sustained, fragmented UDP traffic; and rather than a specific
3 years, I would say "sustained, fragmented UDP over 10GbE".
I don't have any qualms about using nonfragmented TCP over 10GbE.

Otherwise we seem to be in agreement :-/.


>I even know people who have managed to get about 4 Gbps out of a single
>stream (Fast)  [...]

My understanding is they wouldn't even need FAST TCP, if they weren't
using routers from That Other Vendor, with woefully less than BW*RTT
buffering; but I defer to your judgement on that.

I also have a not-so-vague recollection that this setup used Intel
10GbE NICs running 9000-byte jumbo frames into an ATM subnet, which
doesn't require fragmentation. In which case I don't see any relevance
to my point (which, again, is solely about sustained, fragmented, large
(~32k) UDP traffic).

> session on a 10 Gbps path from Sunnyvale to Geneva, and
>I know the routers they were using were easily capable of doing fragmentation
>at this rate, so if 12k is insufficient at 1 Gbps then 64k is already nearly
>insufficient as well.

Again, in my little corner of the world, nonfragmented TCP and
fragmented UDP put completely different constraints on IP-ID space.

For non-fragmented wide-area TCP, we've all been violating the
original IP design criteria on IP-IDs and time-to-live since about
1993/1994, and I don't lose sleep over that.




>Again, if 12k breaks IP now then we won't have to wait long for 64k to
>break IP as well.
>
>I don't think either of those numbers necessarily breaks anything.  What
>I would agree with is that 64k is better than 12k, so the reduction to
>12k had better be buying me something really important.  In this case
>I don't believe it is worth it, but I think that view needs to be supported
>on a cost-versus-benefit basis, rather than a this-is-broken-that-isn't
>basis, since the latter either isn't true now or won't be true shortly.

For the specific case of NFS traffic over UDP, I think the claim that
12k IP-ID space is "broken" can be defended on purely technical grounds.
Otherwise I think we're pretty much in violent agreement again.

The one thing I would add is that (in my corner of the world) the
distinction between 12k and 64k *is* important. Anyone using
NFS-over-UDP can quite reasonably rely on IP-IDs not cycling in less
than 64k.  (Anyone who did any conscious consideration, even BotE,
would assume that, and I don't agree that a factor of 6 is negligible.
That's where the broken-ness really kicks in: violating a
reasonable person's assumption.)
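
A back-of-the-envelope sketch of the wrap interval under discussion,
for illustration only. It assumes one IP-ID per datagram, 32 KiB UDP
datagrams sent back to back at full link rate, "12k" = 12*1024 and
"64k" = 64*1024 IDs, and it ignores header overhead. The function
name id_wrap_seconds is hypothetical, not anything in the kernel or
libc.

```python
def id_wrap_seconds(id_space, datagram_bytes, link_bits_per_sec):
    """Seconds until a sender using one IP-ID per datagram has used
    id_space IDs, sending datagram_bytes-sized datagrams at line rate.
    Assumption: header and framing overhead are ignored."""
    datagrams_per_sec = link_bits_per_sec / (datagram_bytes * 8)
    return id_space / datagrams_per_sec


DATAGRAM_BYTES = 32 * 1024  # assumed ~32k NFS-over-UDP datagram

for space_name, id_space in (("12k", 12 * 1024), ("64k", 64 * 1024)):
    for rate_name, rate in (("1 Gbps", 1e9), ("10 Gbps", 1e10)):
        secs = id_wrap_seconds(id_space, DATAGRAM_BYTES, rate)
        print(f"{space_name} IDs at {rate_name}: wrap in {secs:.2f} s")
```

Under these assumptions the 12k space wraps in a little over 3
seconds at 1 Gbps, against roughly 17 seconds for 64k. What matters
is how each figure compares with the reassembly timeout on the
receiving host.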

Therefore, I think the burden of justification lies most definitely with
those who want to change from 64k to 12k. I can live with
RANDOM_IP_IDs being ifdef'ed or even sysctl'd (though I would strongly
prefer the ability to ifdef it out altogether, to prevent SIWs).

Aside from that, again we seem to be pretty much in agreement.