Subject: Re: fixing send(2) semantics (kern/29750)
To: None <tech-kern@NetBSD.org>
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
List: tech-kern
Date: 03/26/2005 23:40:15
>> [...], one need say no more than repeat the observation that, under
>> heavy network load as evidenced by full queues, one is better off to
>> drop packets at their source than to waste resources sending them
>> into the network, only to have them dropped later.
> This is a different scenario. The CPU is a lot faster than the NIC,
> and the NIC cannot absorb packets quickly enough to send them out to
> the network. It is not congestion in the network fabric, but
> internal congestion.
This is not theoretically different from an infinitely fast NIC (or
"faster than any other device", if you don't like infinities) feeding a
switch that steps down to the actual wire speed. In particular, the
theoretical congestion results that tell you what you should do as a
router when your buffers fill up apply equally well here.
> There is also currently no way to rate limit send so that it does not
> return ENOBUFS from the application side, [...]. I.e. I cannot even
> select or poll before I send, in order to avoid getting ENOBUFS.
Yes.
> and this is clearly broken.
No. At least, it's not clear to me.
You can't *guarantee* to avoid ENOBUFS, even if we made SOCK_DGRAM
sockets poll()able for write, since between the time poll() reports the
socket writable and the time you call send(), other processes could
fill up the interface's output queue - and they likely will, if you
have multiple processes using this technique to wait for space on the
same interface's output queue.
/~\ The ASCII der Mouse
\ / Ribbon Campaign
X Against HTML mouse@rodents.montreal.qc.ca
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B