tech-net archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: New class of receive error





On 15/05/2018 23:43, Robert Elz wrote:
     Date:        Tue, 15 May 2018 20:45:20 +0100
     From:        Roy Marples <roy%marples.name@localhost>
     Message-ID:  <a5b1981d-789d-17c7-bab5-c07afb6c24a7%marples.name@localhost>

   | On 13/05/2018 13:13, Jason Thorpe wrote:
   | > If they get a new error code in a new situation, they don’t.
   |
   | I've thought about this for some time, and respectfully I disagree.

If I am not mistaken, was not the issue in the PR that started this
discussion (53247) exactly that dhcpcd did not correctly handle this
new error code?

Isn't that, of itself, evidence that Jason is correct?

That is not true.
It reported the error condition and re-synced state.
The fact it had to re-sync state is not optimal in the slightest, so it logged a diagnostic.


   | A new error code (remember, it's just errno being set not the return
   | value) is no different from a new RTM_* message being added and no-one
   | has claimed thus far that breaks binary or source compatibility.

No, but that's because getting unknown message types has always been
something that applications have expected - they explicitly look for the
ones they want to process, and ignore all others.

That's false.

I've modified at least ifwatchd, wpa_supplicant and maybe racoon and a handful of packages in pkgsrc to ignore RTM_* messages it didn't want or understand - the default was to log unknown message. Even before I joined the project, apps were logging RTM_OINFO as not understood. Check cvs history.


If appliucations explicitly looked for errno values they understand, and
ignored all others, then the analogy might be germane, but they don't.

In fact, as I said in an earlier message, unless the application deliberately
enabled non-blocking mode, or is catching signals and continuing, none of
the errors from recv() were things that an application would normally
expect to continue from - if any of the previously defined errors occurred
(other than in those two cases) it is entirely reaosnable for the application
to simply log an error and quit.   That's not appropriate handling for your
new error code, rather an app that has nothing better to do with it should
simply ignore it and retry the recv*().

This is false.

Firstly, you're basing this around the premise that an applications sole purpose is to read from one socket. Speaking for dhcpcd at least, if the DHCPv6 socket fails the others for DHCPv6 and ICMP6 will keep chugging away.

Secondly, you're assuming that just like with RTM_* above, the application will abort on stuff it isn't exepcting. From what I've seen this isn't the case - mostly things will just log a diagnostic and continue working. This is true of every RTM_* message I've fixed and of every reported overflow message thus far (with ironically the exception of dhcpcd which is now fixed).

Thirdly, your entire premise stems around the fact that no new error codes will ever be added to any public funciton which I'm pretty sure has happened in the past and will happen in the future.

Good code should cater for the unexpected rather than aborting.
After all, who knows what the future will bring?

ps: has the fix for 53247 (the real fix, not the buffer size increases) been
committed to -current yet?   If not, why?  If it has, has a pullup for -8 been
requested yet?

Yes and yes. I just forgot to note the PR in the dhcpcd import so it's not logged against the ticket.

Roy


Home | Main Index | Thread Index | Old Index