Re: so_rerror

To: Christos Zoulas <christos%zoulas.com@localhost>, tech-net%netbsd.org@localhost
Subject: Re: so_rerror
From: Roy Marples <roy%marples.name@localhost>
Date: Sun, 4 Nov 2018 21:02:26 +0000

On 04/11/2018 15:18, Christos Zoulas wrote:

| Can you explain how it was broken and what you did to make it work again?

I turned off logging completely to fix it. Logging ended up taking
up all the I/O cycles because each time logging overflowed syslogd
ended up logging that logging overflowed... This worked just fine
before the changes.

| Which is why we need a better solution than what we have.
| dynamically increasing/decreasing buffer size is a good solution for
| this, which should make everyone happy.

That will never fix the problem; in fact it will make the situation
worse because of bufferbloat, resource consumption on low resource
systems, and increased latency. As people have explained numerous
times before, this is UDP and you should be prepared to lose packets
(the transport is unreliable). If you want to build a reliable
transport on top of UDP, rerror is not enough; you need to use a
packet sequence number or something similar to detect lost packets.
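
To make the sequence-number idea concrete, here is a minimal sketch of gap detection on the receiving side. It assumes the sender stamps each datagram with a 32-bit counter; struct seq_tracker and seq_track() are hypothetical names for illustration, not part of syslogd, dhcpcd or any kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical receiver state; not taken from any existing program. */
struct seq_tracker {
	uint32_t next_expected;	/* sequence number we expect next */
	uint64_t lost;		/* running count of datagrams presumed lost */
	int	 started;	/* non-zero once the first datagram was seen */
};

/*
 * Record an arriving sequence number and return how many datagrams
 * were skipped since the previous one (0 if none). Late or duplicate
 * datagrams are ignored to keep the sketch short.
 */
static uint32_t
seq_track(struct seq_tracker *t, uint32_t seq)
{
	uint32_t gap = 0;

	assert(t != NULL);
	if (!t->started) {
		t->started = 1;
	} else {
		/* Unsigned subtraction keeps working across wraparound. */
		uint32_t delta = seq - t->next_expected;

		if (delta >= 0x80000000u)
			return 0;	/* older than expected: late or duplicate */
		gap = delta;
	}
	t->lost += gap;
	t->next_expected = seq + 1;
	return gap;
}
```

A receiver would call seq_track() once per datagram and treat a non-zero return as "something was dropped", regardless of whether the kernel reported it.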

Yes, it is good to detect lost packets when you can, so rerror is
generally a good thing, and if it had been done on day one it would
probably be fine to keep. It would also be nice to have on by default
eventually, but right now it makes the situation worse than before.

Whether it arrived at the kernel by UDP or carrier pigeon and could not be delivered for reason X, we should not be discarding this silently.

However, I do buy into the argument that syslogd can't keep up with incoming data in all situations. To facilitate this, I've added the -B option so you can specify a large buffer.

Also, it might be that the system is just too slow to log the amount of incoming data, so I've added the -X option so ENOBUFS can be silently discarded.
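
As an illustration of the two knobs described above, a receiver could handle them roughly as in the sketch below. This is not the syslogd code: rx_set_bufsize(), rx_classify() and the ignore_overflow flag are hypothetical names, and the mapping onto -B and -X is an assumption. Only setsockopt(2)/getsockopt(2) with SO_RCVBUF and the ENOBUFS errno are standard.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

enum rx_action { RX_DATA, RX_RETRY, RX_OVERFLOW, RX_FATAL };

/*
 * Ask for a larger receive buffer (the idea behind a "-B"-style option).
 * Returns the size the kernel reports afterwards, or -1 on error.
 */
static int
rx_set_bufsize(int fd, int bytes)
{
	socklen_t len = sizeof(bytes);

	assert(bytes > 0);
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) == -1)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, &len) == -1)
		return -1;
	return bytes;
}

/*
 * Decide what to do with the result of recv(2)/recvfrom(2).
 * n is the return value and err the errno captured right after the call.
 * With ignore_overflow set (the idea behind a "-X"-style option),
 * ENOBUFS is dropped silently and the caller just reads again.
 */
static enum rx_action
rx_classify(ssize_t n, int err, int ignore_overflow)
{
	if (n >= 0)
		return RX_DATA;
	if (err == EINTR || err == EAGAIN)
		return RX_RETRY;
	if (err == ENOBUFS)
		return ignore_overflow ? RX_RETRY : RX_OVERFLOW;
	return RX_FATAL;
}
```

On RX_OVERFLOW the caller would log or count the loss once and carry on reading; the point is only that the admin gets to choose between seeing the overflow and ignoring it.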

So as of right now, the admin can see overflow and they can make the choice about how to handle it. Surely you must agree that this is a good default rather than leaving the admin to worry if their logger actually logged everything or not.


| > Nevertheless, now everyone can have it the way they like... There is
| > a sysctl to turn it on globally and a per-socket setsockopt to override.
|
| And we want a secure system where a lot of useful programs don't run and
| sweeps overflow issues under the carpet by default? Not me!
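
For reference, the per-socket override mentioned above might look like the following sketch. It assumes the option is exposed as SO_RERROR at the SOL_SOCKET level, as in NetBSD's <sys/socket.h>; set_rerror() is a hypothetical helper, and the #else branch exists only so the example compiles on systems without the option.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/*
 * Turn receive-overflow reporting on (1) or off (0) for one socket.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
set_rerror(int fd, int on)
{
	assert(on == 0 || on == 1);	/* boolean socket option */
#ifdef SO_RERROR
	return setsockopt(fd, SOL_SOCKET, SO_RERROR, &on, sizeof(on));
#else
	(void)fd;
	errno = ENOPROTOOPT;		/* option not available on this system */
	return -1;
#endif
}
```

A program that wants to hear about dropped messages would call set_rerror(fd, 1) right after creating the socket; one that prefers the old silent behavior would pass 0.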

Yes, for the programs that want this behavior. Let us not forget that
this started because of the aberrant behavior of the routing socket
where because of the compatibility messages we ended up overflowing
and losing. Instead of fixing the root cause (don't send compat
stuff to the programs that don't need them -- programs understand only
one version of the messages and throw away the rest), we decided to
detect the dropped packet problem by introducing so_rerror. This
detection could also have been done by using the sequence number, or
a similar ID-based protocol.


This is actually untrue.

All programs that care about the routing socket have already set socket filters to avoid the messages (i.e. compat versions) they don't care about. These filters run before overflow can happen.

And still dhcpcd reported overruns before we increased the size of the buffers. It still does, but only on my router and only at boot time, and thankfully it now has the code to resync itself to the real system state.

Roy

Follow-Ups:
- Re: so_rerror
  - From: Havard Eidnes
- Re: so_rerror
  - From: Christos Zoulas

References:
- Re: so_rerror
  - From: Christos Zoulas
