Re: Regarding summer of code 2008(writing device drivers)

To: David Young <dyoung%pobox.com@localhost>
Subject: Re: Regarding summer of code 2008(writing device drivers)
From: "Steven M. Bellovin" <smb%cs.columbia.edu@localhost>
Date: Fri, 28 Mar 2008 21:37:40 +0000

On Fri, 28 Mar 2008 15:53:54 -0500
David Young <dyoung%pobox.com@localhost> wrote:

> Delivery of IP packets is not guaranteed.  It seems that you will need
> some feedback from the receiver, in order to know that the sender and
> receiver have the same contents in their cache.  Is that so?
> 
I'd think that ordinary TCP retransmissions would take care of that.
> 
> Do you use only the checksum to detect duplicate packets?  It seems
> that there is a risk of a stream being corrupted by chance.

If the cached packet is fed to TCP, the ordinary TCP checksum would be
no worse than we have today.  Also, we're working with MD5, which is a
very strong checksum; if the sender and receiver have the same MD5 hash
stored, the only possible area for corruption is the stored packet
corresponding to the MD5 hash on the receiver.  But that's why I want
to send it to TCP's input routine for normal processing, to preserve
the end-to-end -- well, transport layer to transport layer -- checksum
semantics.
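
As a rough illustration of the receive path described above, here is a minimal Python sketch. It is not the proposed kernel code: `PacketCache`, `receive_reference`, and the `max_entries` bound are hypothetical names, and the checksum is computed over the payload alone, whereas the real TCP checksum also covers the pseudo-header and the TCP header. The point is only that a cached copy which has been corrupted fails the ordinary checksum and is dropped, so normal retransmission takes over.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


class PacketCache:
    """Hypothetical bounded cache mapping an MD5 digest to payload bytes."""

    def __init__(self, max_entries: int = 1024):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def store(self, payload: bytes) -> bytes:
        """Remember a payload and return its 16-byte MD5 digest."""
        digest = hashlib.md5(payload).digest()
        self._entries[digest] = payload
        self._entries.move_to_end(digest)
        # Evict the oldest entries once the (assumed) RAM budget is exceeded.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
        return digest

    def lookup(self, digest: bytes) -> Optional[bytes]:
        return self._entries.get(digest)


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071.

    Simplification: computed over the payload only.
    """
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def receive_reference(cache: PacketCache, digest: bytes,
                      expected_checksum: int) -> Optional[bytes]:
    """Resolve a digest sent in place of a payload.

    Returns the payload to hand on for normal processing, or None when
    the entry is missing or fails the checksum (the segment is then
    treated as lost and the sender retransmits).
    """
    payload = cache.lookup(digest)
    if payload is None:
        return None
    if internet_checksum(payload) != expected_checksum:
        return None
    return payload
```

In this sketch a cache miss and a corrupted entry look the same to the caller; both fall back on retransmission rather than on any extra feedback channel.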

> Also,
> the technique seems susceptible to data injection.  What do you think?

I don't see why there's any more chance of it with this scheme than
with normal TCP.
> 
> Do endpoints who are using the packet-cache technique automatically
> detect each other?
> 
> Have you thought about getting routers involved?
> 
Beware packets taking different paths.  Also, when do routers discard
their caches?  What do they cache?

All this said, I'm not convinced this will work particularly well, for
several reasons.  First, how much data can be cached in RAM on the
receiving machine?  What are the odds that some other application will
want the same data within the lifetime of the cached copy?  Web
graphics are the most likely case, but the browser's cache generally
takes care of that.  Second, how expensive is the cache consistency
protocol?  Will there be more traffic maintaining the MD5 state than is
saved?  Besides, a typical web server can't maintain data very long
(especially in the kernel) for any one web client.  Finally, on many
links the cost is per-packet, rather than per-bit.
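
The traffic question above can be put as back-of-the-envelope arithmetic. The sketch below is only that: `hit_rate` and `control_bytes_per_packet` are hypothetical parameters standing in for numbers that would have to be measured, and the model assumes every cache hit replaces a full payload with a 16-byte MD5 digest.

```python
MD5_DIGEST_BYTES = 16  # size of an MD5 digest


def net_bytes_saved(packets: int, payload_bytes: int, hit_rate: float,
                    control_bytes_per_packet: float) -> float:
    """Bytes saved by sending digests on hits, minus consistency overhead.

    hit_rate: assumed fraction of packets already cached at the receiver.
    control_bytes_per_packet: assumed average cost of keeping both ends'
    MD5 state in sync, amortized over every packet.
    """
    saved = packets * hit_rate * (payload_bytes - MD5_DIGEST_BYTES)
    overhead = packets * control_bytes_per_packet
    return saved - overhead


def break_even_hit_rate(payload_bytes: int,
                        control_bytes_per_packet: float) -> float:
    """Hit rate at which savings exactly cancel the overhead."""
    if payload_bytes <= MD5_DIGEST_BYTES:
        raise ValueError("payload no larger than the digest; nothing to save")
    return control_bytes_per_packet / (payload_bytes - MD5_DIGEST_BYTES)
```

Note that this counts bytes only. A hit still costs one packet on the wire, so on links where the cost is per-packet the scheme saves nothing under this model.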

Here's a suggestion, though: only save the first ~10 packets of any
connection.  First, they're slow to arrive, because of slow start.
Second, on big files you can't save that much because (a) you can't
afford to keep much in RAM, per the above; (b) once TCP gets past the
slow start phase, it will work at line speed minus the effects of
upstream congestion; (c) most connections are pretty short anyway.
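
One way to picture the "first ~10 packets" rule is as a per-connection counter. This is a minimal sketch under assumptions: `FirstPacketsPolicy`, `should_cache`, and `connection_closed` are made-up names, and a connection is identified by whatever hashable key the caller supplies (for example an address/port 4-tuple).

```python
from collections import defaultdict


class FirstPacketsPolicy:
    """Hypothetical policy: cache only the first `limit` packets per connection."""

    def __init__(self, limit: int = 10):
        self.limit = limit
        self._seen = defaultdict(int)  # connection key -> packets seen so far

    def should_cache(self, conn_id) -> bool:
        """Count this packet and report whether it is within the first `limit`."""
        count = self._seen[conn_id]
        self._seen[conn_id] = count + 1
        return count < self.limit

    def connection_closed(self, conn_id) -> None:
        """Forget the counter so per-connection state does not accumulate."""
        self._seen.pop(conn_id, None)


# Usage: the key here is an illustrative (src, sport, dst, dport) tuple.
policy = FirstPacketsPolicy(limit=10)
conn = ("192.0.2.1", 40000, "192.0.2.2", 80)
decisions = [policy.should_cache(conn) for _ in range(12)]
```

With this shape the memory cost per connection is one small counter plus at most `limit` cached payloads, which keeps the RAM concern bounded regardless of file size.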

                --Steve Bellovin, http://www.cs.columbia.edu/~smb

Follow-Ups:
- Re: Regarding summer of code 2008(writing device drivers)
  - From: der Mouse

References:
- Regarding summer of code 2008(writing device drivers)
  - From: pankaj gupta
- Re: Regarding summer of code 2008(writing device drivers)
  - From: David Young
