Re: so_rerror

To: Robert Elz <kre%munnari.OZ.AU@localhost>
Subject: Re: so_rerror
From: Roy Marples <roy%marples.name@localhost>
Date: Wed, 7 Nov 2018 12:10:53 +0000

On 07/11/2018 05:50, Robert Elz wrote:

     Date:        Sun, 4 Nov 2018 21:02:26 +0000
     From:        Roy Marples <roy%marples.name@localhost>
     Message-ID:  <d3c2a03d-357a-89d1-c4b4-bbc8ba9c26bd%marples.name@localhost>

   | Whether it arrived at the kernel by UDP or carrier pigeon and could not
   | be delivered for reason X, we should not be discarding this silently.

The error was never silently discarded, it was always counted, and
available from netstat -m.


It's clear that you don't care about auditability or traceability.
If I have many things running, which one was it discarded for and why?


The pushback is against informing applications which cannot rationally
do anything about it. In general ... there is no way to prevent an
occasional overflow when circumstances conspire to make a very
large number of messages all arrive at once - all making the buffer
bigger does is to allow more of all of that to be queued (removing the
logging from the time of event more than it should be) and meaning that
an even bigger buffer perhaps gets allocated, to handle once in a
century type events.


There isn't much I can do about an ENOSYS error either.
Maybe we should silently discard those too?


   | However, I do buy into the argument that syslogd can't keep up with
   | incoming data in all situations. To facilitate this, I've added the -B
   | option so you can specify a large buffer.

That's usually going to be the wrong thing to do.   There might be some
very busy syslog servers where the default buffer size is simply not
enough, and for those, this is a reasonable solution.   But for most,
overreacting to an occasional spike is not a good solution.

   | Also, it might be that the system is just too slow to log the amount of
   | incoming data so I've added the -X option so ENOBUFS can be silently
   | discarded.

That's good, but even better would be to not bother syslogd with the
"error" in the first place.

   | So as of right now, the admin can see overflow and they can make the
   | choice about how to handle it. Surely you must agree that this is a good
   | default rather than leaving the admin to worry if their logger actually
   | logged everything or not.

There is no way not to worry - syslog messages can be relayed over
normal udp from host to host - and can be dropped anywhere.   All
this is doing is catching one odd case of lost messages - allowing the
admin to believe that if they see no "buffer overflow" messages then
that means that they're not losing any messages is irresponsible.

   | And still dhcpcd reports overruns before we increased the size of the
   | buffers. It still does, but only on my router and only at boot time, but
   | thankfully it now has the code resync itself to the real system state.

The routing socket (and its clone, when we get it, the mobile-ip socket)
is special - probably should not be a socket at all, but some other kind
of entity (though inventing something new is also not necessarily the
best thing to do.)   For that one, the buffer overflow message is useful,
both because that is (I suspect) the only way that messages can be lost,
and because the recipient has a way to recover (expensive, but possible)
when it happens.   Had all of this mechanism been confined to the
routing socket, there never would have been a problem.
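
For readers following the thread, the distinction being argued over can be
sketched in C. This is only an illustration, not code from syslogd, dhcpcd
or the kernel: the names recv_action, classify_recv_error,
enable_overflow_reporting and recv_once are hypothetical, and the
SO_RERROR socket option is assumed to exist only where the platform
headers define it (hence the #ifdef). A reader of a routing-style socket
would treat ENOBUFS as "messages were dropped, re-read the full state",
while other errors are retried or treated as fatal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* What the caller should do after one recv() attempt (hypothetical names). */
enum recv_action {
    RECV_OK,      /* data (or EOF) was read */
    RECV_RETRY,   /* transient: interrupted or nothing to read yet */
    RECV_RESYNC,  /* ENOBUFS: the receive buffer overflowed, messages lost */
    RECV_FATAL    /* any other error */
};

/* Map an errno value from a failed recv() to an action. */
static enum recv_action
classify_recv_error(int err)
{
    switch (err) {
    case EINTR:
    case EAGAIN:
#if defined(EWOULDBLOCK) && EWOULDBLOCK != EAGAIN
    case EWOULDBLOCK:
#endif
        return RECV_RETRY;
    case ENOBUFS:
        return RECV_RESYNC;
    default:
        return RECV_FATAL;
    }
}

/*
 * Ask for receive-buffer overflow to be reported on this socket.
 * Assumption: the platform provides SO_RERROR at SOL_SOCKET level;
 * where it does not, this simply fails with ENOPROTOOPT.
 */
static int
enable_overflow_reporting(int fd)
{
#ifdef SO_RERROR
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_RERROR, &on, sizeof(on));
#else
    (void)fd;
    errno = ENOPROTOOPT;
    return -1;
#endif
}

/* Call recv() once and tell the caller what to do next. */
static enum recv_action
recv_once(int fd, void *buf, size_t len, ssize_t *nread)
{
    assert(buf != NULL && nread != NULL);
    *nread = recv(fd, buf, len, 0);
    if (*nread >= 0)
        return RECV_OK;
    return classify_recv_error(errno);
}
```

On RECV_RESYNC a routing-socket client would rebuild its view from the
real system state, which is the expensive but possible recovery described
above; a logger has no equivalent recovery and could at most count or
report the loss.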


You know what?
I no longer care.

I no longer care that I spent months figuring out a long standing issue
around stuff that packets were being lost in a self contained system
because the important error WAS NOT BEING REPORTED.

I no longer care to inform the admin that their logging server just
can't keep up with demand and pretend that life is dandy.

Feel free to rip my code out and put back all the comments saying XXX
report this to userland.
I'm done with this shit that wants to make a developer's life harder and
not easier.

Roy

Follow-Ups:
- Re: so_rerror
  - From: Christos Zoulas

References:
- Re: so_rerror
  - From: Roy Marples
- Re: so_rerror
  - From: Christos Zoulas
- Re: so_rerror
  - From: Robert Elz
