tech-net: Re: kernel ip_randomid() and libc randomid(3) still "broken"

Subject: Re: kernel ip_randomid() and libc randomid(3) still "broken"
To: Jonathan Stone <jonathan@DSG.Stanford.EDU>
From: Robert Elz <kre@munnari.OZ.AU>
List: tech-net
Date: 11/17/2003 12:57:28
    Date:        Sun, 16 Nov 2003 20:31:07 -0800
    From:        Jonathan Stone <jonathan@DSG.Stanford.EDU>
    Message-ID:  <200311170431.UAA24814@Pescadero.DSG.Stanford.EDU>

  | So unless someone offers a good
  | reason not to, I propose to commit the patch I posted within the next
  | day or so; plus any further feedback.

Please do.   Since this discussion re-surfaced, no-one has argued against it.

  | Compared to the routing table, you care about both local and remote
  | addresses, and about *per-IP-address* state not per subnet (or per
  | route state).

Yes, that's so, but perfection isn't required.   That is, we've been
doing "OK" with all destinations mapping into one ID counter.  We could
split the destinations into groups, and apply a separate counter to
each - that's fairly easy.   The only decisions left are how many groups
(since they're just 2-byte counters, and can be densely packed, 1K
groups, i.e. a 2K buffer, seems a modest requirement), and how to make the
split.   One easy way would be to just use the bottom 10 bits of the
destination address, or one could use some kind of hash function for a
more even split (since destination address low bits aren't evenly distributed).
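
A minimal sketch of that split, assuming 1K groups and an IPv4 destination
held as a host-order 32-bit value.   The names (ipid_counter,
ipid_group_lowbits, ipid_group_hash, ipid_next) and the XOR-fold hash are
made up for illustration - this is not the kernel's code, and a real
implementation would also need whatever locking the caller requires:

```c
#include <stdint.h>

#define IPID_GROUPS 1024u	/* 1K groups; must be a power of two */

/* 1024 two-byte counters, densely packed: a 2K buffer. */
static uint16_t ipid_counter[IPID_GROUPS];

/* Option 1: just the bottom 10 bits of the destination address. */
static unsigned
ipid_group_lowbits(uint32_t dst)
{
	return dst & (IPID_GROUPS - 1);
}

/*
 * Option 2 (hypothetical hash): fold the high bits down before
 * masking, so the group does not depend on the low bits alone.
 */
static unsigned
ipid_group_hash(uint32_t dst)
{
	dst ^= dst >> 16;
	dst ^= dst >> 10;
	return dst & (IPID_GROUPS - 1);
}

/*
 * Next ID for a destination.  Every destination always maps to the
 * same counter, but one counter may serve many destinations.
 */
static uint16_t
ipid_next(uint32_t dst)
{
	return ipid_counter[ipid_group_hash(dst)]++;
}
```

With the low-bits split, 192.168.0.1 and 192.168.4.1 land in the same
group (both have bottom 10 bits equal to 1), which is the kind of
clustering a hash is meant to reduce.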

It isn't worth attempting to include the source address - using different
source addresses to communicate with the same destination happens so rarely
that the improvement, considered against the extra work, just wouldn't be
worth the bother.

  | Bad Things happen to the ip-id space for the merged node.

Yes, merging can't happen - there needs to be one counter used for any
destination - but it doesn't need to be used only for that one destination.
Obviously the more counters, the better, but we don't need 2^32 of them.
Having more than 1 would be nice though.

  | And T/TCP relies on a 32-bit TCP sequence number and the "fact" that
  | TCPs are supposed not to reuse sequence numbers within the 2*MSL `grey
  | area'

Actually, as I recall it, one of T/TCP's main objectives is to relax
that restriction, by keeping state.    T/TCP isn't a relevant example
here though: all of its state is soft, and it can be discarded at any time
with no greater consequences than a few extra packets when the next
connection is attempted (ie: revert to TCP packet exchange).

  | Using a cache of `recently seen IP addresses' and per-IP-address
  | ``next id'' has the flaw that a DoS, or suitably disperse IP-address
  | reference patterns (webcrawler) will blow away the cache.

Agreed - too much complexity, and far too hard to get correct.

  | If one of
  | those blown-away addresses comes back, you have to pick a new IP id.
  | But you no longer know the last-used id sent to that IP address, so
  | any guess *might* be Really Bad.

But probably won't be - consider the current code (I mean, the traditional
code, not using the random ID generator, though I don't think it changes
anything much for this) - if we have not talked to a remote destination
for some time, the ID we use when we next talk is essentially arbitrary
(depends entirely upon how much we have talked to others in the interim).
You're postulating enough other talking to have blown away the cached
ID, so clearly there's been a non-trivial amount.  The ID that will get
used could be almost anything.

But this still isn't worth the bother - the implementation is too complex
for the gain.

wrt...

jonathan@DSG.Stanford.EDU said in an earlier message:
  | This patch also stops applying htons() to random IP IDs. I can't see any
  | point in it at all.

No, of course not .. clearly the bits in the ID should be sorted, not
just swapped around arbitrarily!
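
Sarcasm aside, the reason the swap buys nothing for random IDs can be
checked directly: a 16-bit byte swap (which is what htons() amounts to on
a little-endian host; on a big-endian host it does nothing) is a
one-to-one mapping of the 16-bit space onto itself, so swapped random IDs
are exactly as unique and as random as unswapped ones.   A small sketch,
with swap16 and swap16_is_permutation as illustrative names only:

```c
#include <stdint.h>

/* Exchange the two bytes of a 16-bit value. */
static uint16_t
swap16(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

/* Returns 1 if swap16() hits every 16-bit value exactly once. */
static int
swap16_is_permutation(void)
{
	static uint8_t seen[65536];
	uint32_t i;

	for (i = 0; i < 65536; i++)
		seen[i] = 0;
	for (i = 0; i < 65536; i++) {
		uint16_t s = swap16((uint16_t)i);

		if (seen[s])
			return 0;	/* two inputs gave the same output */
		seen[s] = 1;
	}
	return 1;
}
```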

kre

ps: what point can you see in applying htons() to the ID in the
non-random case?