Re: New class of receive error

To: Robert Elz <kre%munnari.OZ.AU@localhost>
Subject: Re: New class of receive error
From: Roy Marples <roy%marples.name@localhost>
Date: Sun, 13 May 2018 17:40:30 +0100



On 13/05/2018 16:30, Robert Elz wrote:

     Date:        Sun, 13 May 2018 15:01:42 +0100
     From:        Roy Marples <roy%marples.name@localhost>
     Message-ID:  <12e84d93-9f83-359c-3fd4-17f359f289de%marples.name@localhost>

   | Other OS's document ENOBUFS on recv calls.

That other OS's (by which I assume you mean linux) are broken is no reason
that we should be too.


By other OS's I mean AIX, HPUX, Solaris and the POSIX specification.
I stopped checking others at this point but it's clearly not just Linux.

Datagram protocols inherently lose packets - aside from in this one very
special case, there's no way at the network level to inform the receiver of
a lost packet, as there's no way to know one was ever sent.


I'll stop right here.

The network isn't involved in a few cases. AF_LOCAL and PF_ROUTE don't
go over it. Also, even when the network is involved, once it's gotten
into the kernel we should be dealing with it on a best case, including
noting that whatever it got cannot be delivered.

The application
level needs to recover using its own mechanism - which could be using sequence
numbers in the packets as Christos suggested - but here I do not think that
can work, as the messages come from a variety of sources (including other
processes) and there's no way to synchronise the (current) sequence numbers
(rtm_seq).

But we could add a kernel generated seq number - set whenever a routing
packet is generated, and delivered to the receiver - that at least would be an
application layer recovery mechanism.
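
A minimal sketch of how a receiver might use such a kernel generated
sequence number to detect loss. No such counter exists in routing messages
today; `struct seq_tracker` and `seq_tracker_update` are hypothetical names
used only for illustration, and the sketch assumes the kernel stamps each
message with a counter that increases by one and is never duplicated or
reordered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver-side state for a kernel generated counter. */
struct seq_tracker {
	bool     primed;	/* false until the first message is seen */
	uint32_t last;		/* last kernel sequence number received */
};

/*
 * Record a newly received sequence number and return how many messages
 * were missed since the previous one (0 if none). Unsigned arithmetic
 * keeps the result correct when the counter wraps around.
 */
static uint32_t
seq_tracker_update(struct seq_tracker *t, uint32_t seq)
{
	uint32_t lost = 0;

	assert(t != NULL);
	if (t->primed)
		lost = seq - t->last - 1;
	t->primed = true;
	t->last = seq;
	return lost;
}
```

A non-zero return only tells the application that something was lost, not
what; it would still have to re-read the state it cares about.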

Aside from dhcpcd every instance of "handling" this error is to (at most)
log it and ignore it - it really is pointless.

The routing socket is something special - it arguably should not be using the
socket interface at all - as it is purely a local host communication mechanism,
so the local host OS knows when a packet is sent, and when one is received,
and when one is lost, and so can (reliably) inform the receiver that a packet
has been lost - but that one is a very special case (along with the mobile IP
socket, which is just the routing socket with a different name (and purpose)).

When I originally worked out what the issue was that some people were
having on NetBSD, I proposed (not on email) that we adopt RTM_DESYNC,
which OpenBSD implemented for route(4). That however was shot down in
flames (oddly enough by one person now complaining on this list) as "we
don't want another magical message on route(4)". After thinking about it
more, I agreed, as it makes the situation worse by trying to send more data.


joerg then suggested "What about a KNOTE in kqueue(2)?"
I did actually post a working implementation here:
https://mail-index.netbsd.org/tech-net/2018/03/15/msg006749.html

But it got no feedback.

And as you pointed out in another email it's a different way of doing
something just to be different.

Then I thought - route(4) behaviour shouldn't be anything special. What
should any socket do when its internal buffer overflows? A quick search
shows that this error case is documented by POSIX, implemented by other
OS's (plural) and our code XXX commentary says we should be doing
something about it but currently wasn't. Now it is.
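
As a rough sketch of what this means for an application, the receive path
only needs to tell ENOBUFS apart from other failures. `rt_read` and
`enum rt_read_result` are hypothetical names for illustration, not part of
any existing API; only recv(2) and the errno values are real.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical classification of one receive attempt. */
enum rt_read_result {
	RT_MSG,		/* a message was read into buf */
	RT_OVERFLOW,	/* ENOBUFS: the socket buffer overflowed, messages lost */
	RT_RETRY,	/* interrupted or nothing to read yet, try again */
	RT_ERROR	/* any other failure, errno is left set */
};

/*
 * Read one message from fd and classify the outcome.
 * On RT_MSG, *nread holds the message length.
 */
static enum rt_read_result
rt_read(int fd, void *buf, size_t len, ssize_t *nread)
{
	assert(buf != NULL && nread != NULL);

	*nread = recv(fd, buf, len, 0);
	if (*nread >= 0)
		return RT_MSG;

	switch (errno) {
	case ENOBUFS:
		return RT_OVERFLOW;
	case EINTR:
	case EAGAIN:
		return RT_RETRY;
	default:
		return RT_ERROR;
	}
}
```

On RT_OVERFLOW the caller cannot know which messages were dropped, so the
sensible reaction is presumably to discard any cached view and re-read the
current interfaces, addresses and routes rather than just log and carry on.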


In my view, the correct solution is to use ENOBUFS.

Roy
