Subject: Re: sendto() and ENOBUFS question..
To: None <tech-net@netbsd.org>
From: None <sudog@telus.net>
List: tech-net
Date: 05/15/2002 00:41:05
On Tue, 14 May 2002, Justin C. Walker wrote:

> > So, the question is, does select() or poll() make a check to ensure
> > that ENOBUFS won't show up when udp_output() is called? Or should it?
>
> Nope; select/poll don't have any way to determine this, since it isn't
> recorded as "state" anywhere.  UDP is "best effort" delivery, which
> means that if the packet can't be delivered for any reason, it gets
> dropped, and very little is done to note that fact.

Then the select() check for writable status on the SOCK_DGRAM is
meaningless in its current form. In that case, maybe it should either
fail with an error (something other than what is listed in the
manpage) or be modified to clearly indicate that the path is clear down
to the network interface at that particular moment. I guess that would
mean reserving a small amount of memory for each SOCK_DGRAM socket that
select() is watching, wouldn't it.. and a queuing mechanism.. which
again would be better implemented in userland. Hrm.

Buffered UDP would be what I'm looking for..  but isn't there a send
buffer associated with a SOCK_DGRAM that can be adjusted?
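
For reference, the adjustable send buffer is the SO_SNDBUF socket option.
A minimal sketch of reading and raising it follows; the helper name
grow_send_buffer and the 256 KB figure are only illustrative. Whether a
bigger buffer helps with ENOBUFS is a separate question, since the error
may originate further down (for example at the interface queue) rather
than in the socket buffer.

```python
import socket

def grow_send_buffer(sock, wanted_bytes):
    """Ask the kernel for a larger send buffer; return (before, after) sizes."""
    before = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, wanted_bytes)
    except OSError:
        # The kernel may refuse a request above its configured limit;
        # in that case the old size simply stays in effect.
        pass
    # Read the value back: the kernel is free to round or clamp the request.
    after = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return before, after

# Example use on a fresh UDP socket:
#   s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   print(grow_send_buffer(s, 256 * 1024))
```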

Anyway, the fact that it's unreliable doesn't mean the local machine
shouldn't do something to help it become as reliable as possible. The
socket-full errors that crop up for receivers are perfectly natural
and logical;  after all if the receiver can't cope with the data, it's
normal to start dropping packets -- *on the receiver's end*.

> > If not, how would I, in userland, cleanly wait until I can do another
> > udp_output safely without ENOBUFS showing up?
>
> As someone else mentioned, application-layer flow control is your
> answer.  Think about the underlying transport, and what it provides:
> some probability that a datagram you hand to the kernel with "send()"
> will get to the other end.  There's no guarantee about order or even
> eventual delivery.

Right--but that's the assumption *once it's out on the wire.* Getting
it to the wire to begin with shouldn't be an exercise in futility.
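
In userland, the "wait until it's safe" part can be approximated by
treating ENOBUFS as "try again shortly". The sketch below is one way to
do that, not the only one; send_datagram, the retry count, and the delay
are made-up names and values for illustration.

```python
import errno
import select
import time

def send_datagram(sock, data, addr, retries=5, delay=0.001):
    """Send one datagram, backing off and retrying on ENOBUFS.

    Returns True if sendto() eventually succeeded, False if every
    attempt hit ENOBUFS. Any other error propagates to the caller.
    """
    for attempt in range(retries):
        # select() only reports that the socket looks writable; it
        # cannot promise that sendto() will not fail with ENOBUFS.
        select.select([], [sock], [], delay)
        try:
            sock.sendto(data, addr)
            return True
        except OSError as e:
            if e.errno != errno.ENOBUFS:
                raise
            # Give the lower layers time to drain, waiting longer
            # after each failed attempt (exponential backoff).
            time.sleep(delay * (2 ** attempt))
    return False
```

A False return is then just another lost datagram as far as the
application's own flow control is concerned.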

> If you are using UDP, your application is, in essence, accepting
> the fact that delivery of data is up to it, not the system.  You
> choose UDP because the terms are acceptable.  If you want to
> guarantee delivery, using UDP, you have to resort to timeouts and
> positive acknowledgements.  There's no other way that I know of.

Timeouts and positive acknowledgements are fine--I can easily build
that logic into the system. But when select() tells me I can write to
the socket, I should be able to write to the socket unless someone
else gets there first and eats the last mbufs. If the write fails,
then select() is unreliable and we're back around to basically useless
system call functionality that pretends to run fine on a SOCK_DGRAM
but really doesn't.
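
The timeout-and-acknowledgement logic mentioned above can be sketched as
a simple stop-and-wait sender. Everything about the wire format here is
an assumption for illustration: a 4-byte big-endian sequence number in
front of the data, and an ACK that is just that sequence number echoed
back. A local ENOBUFS is handled the same way as a datagram lost on the
wire: wait a bit and retransmit.

```python
import errno
import socket
import time

def send_reliably(sock, payload, addr, seq, timeout=0.2, max_tries=5):
    """Send payload with a sequence number and wait for a matching ACK.

    Returns True once the peer echoes the sequence number back, False
    if max_tries transmissions all go unacknowledged.
    """
    header = seq.to_bytes(4, "big")
    sock.settimeout(timeout)
    for _ in range(max_tries):
        try:
            sock.sendto(header + payload, addr)
            reply, _peer = sock.recvfrom(64)
        except socket.timeout:
            continue                  # no ACK in time: retransmit
        except OSError as e:
            if e.errno != errno.ENOBUFS:
                raise
            time.sleep(timeout)       # local drop: pause, then retransmit
            continue
        if reply[:4] == header:
            return True               # positive acknowledgement received
    return False
```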

> It's up to your app.  "patching" the kernel would "break" UDP.

Making select() work properly would mean that the kernel would have to
deal somehow with who should get woken up first.

Well at least now I can better see why it was done the way it was
done. I think there's a better way. But for people like myself who
think TCP performance is quite pathetic and would like finer-grained
control over our network communications, what alternative is there
except to implement another protocol? And if UDP won't cut it as a
base to work from, is there something else that might? SOCK_RAW,
listening on a custom protocol number, seems a better method. Ah,
there's one. Protocol number 68!

For the record:

. Yes, I have the time to fiddle around with this.
. Yes, I like doing this.
. No, TCP isn't cutting it because of all the compatibility baggage,
the poor packet loss recovery, and the crappy in-order error correction
it has to enforce, which is, IMHO, less than suited for finite chunks of
data like files.

-Marc