Re: ping: sendto: No buffer space available

To: tech-net%NetBSD.org@localhost
Subject: Re: ping: sendto: No buffer space available
From: "Erik E. Fair" <fair%netbsd.org@localhost>
Date: Mon, 10 Oct 2011 10:05:14 -0700

I'd like to clarify my thinking here. We're talking both about networking, in which we commonly lose data (sometimes by design, e.g. in network congestion situations at IP routers), and about the UNIX kernel system call API.

In my opinion, when the kernel is interacting with a userland application, it should *never* silently drop (lose) data.

The UNIX kernel should endeavor to provide the userland with as much information about what's going on as possible, so that the application programmer can decide what to do, as appropriate to his application. Defaults should be set to match reasonable expectations, and exceptions allowed for programs that are prepared to handle a wider range of error conditions themselves.

In most resource limitation situations (whether the resource is limited a priori, or simply exhausted), only the kernel has the global view of the resource (if there is a global view to be had). So the question is: what does it do (or tell the application) in that situation?

There are different models for this, depending upon the resource and the nature of the limitation.

When a filesystem fills up, generally write(2) returns ENOSPC. Usually, these situations require operator (human) intervention to clean up, i.e. someone has to start freeing up disk space, based, one presumes, on some local policy about which files or data are expendable (or transferable elsewhere). ENOSPC is the kernel saying, "there is no more data storage space, and I have no expectation that any will become available 'soon' - deal with it, application program."

Most programs don't bother to check for that error, and have no code to handle it - they just keep blithely banging away at write(2), or they fail outright on any error without specifically handling that one.
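
For illustration only, here is a minimal sketch of what checking for that error might look like. The helper name write_all() is hypothetical (it is not a library function); it only shows how a program can hand the ENOSPC decision back to its caller instead of ignoring it or retrying forever.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from any library): write the whole buffer,
 * continuing after short writes and retrying when interrupted.
 * Returns 0 on success, or -1 with errno left as set by write(2),
 * so the caller can tell ENOSPC apart from other failures.
 */
int
write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal; try again */
			return -1;		/* ENOSPC, EIO, ...: caller decides */
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}
```

A caller would then test for the specific case, e.g. "if (write_all(fd, buf, len) == -1 && errno == ENOSPC)", and report or clean up rather than loop on write(2).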

If the kernel simply blocked userland programs until filesystem space became available, RAM could potentially fill up with buffers (inside programs) waiting to be written. Possibly the system could seize up completely, preventing effective operator response. There is a presumption in UNIX that there should always be some disk space available. NetBSD provides newsyslog(8) and a default /etc/newsyslog.conf for this reason.

By contrast, when a TCP stream is flow controlled, that's temporary, and the resource can reasonably be expected to be available again shortly, as a matter of course in normal operation. So, the kernel will block a write(2) call until flow is allowed again, rather than return an error.

In UDP (or other datagram protocol situation, e.g. ICMP), there are no flow controls as such - IP network routers are designed (and thus expected) to drop packets when they're congested, and applications using those protocols are expected to deal with that as appropriate (some handle it, some ignore it). Inside the UNIX kernel, we limit the number of packets the kernel will handle both in the maximum size of the global mbuf pool and in the maximum output packet queue length for each network interface.

I complained ten years ago that there was no distinction being made in error messages from the kernel between "no more mbufs" and "network interface output queue full" mostly from a concern for user confusion as to the specific error situation being encountered and reported to users. This E-mail thread started (I believe) from precisely this confusion.

In both cases, an application is extremely unlikely to have the global view of whichever resource (mbufs or output queues) is temporarily exhausted. What is an application programmer to do in that case? How much backoff or waiting should the program apply?

Given the dynamic nature of networking, it is likely that either resource exhaustion (or limitation) is very temporary in nature. That's why I'm suggesting that the kernel should block by default, rather than return ENOBUFS in a network output queue limit situation, and return ENOBUFS only to those applications which have requested non-blocking behavior (i.e. have explicitly indicated to the kernel that they're prepared to handle that error condition). Blocking gets the program to shut up for a time (flow control in the face of local resource exhaustion, but of a resource that is reasonably expected to be available again very soon).
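
To make the application-side burden concrete, here is a rough sketch of what a program has to do today if it wants to ride out ENOBUFS from sendto(2) on its own. The function name sendto_backoff(), the retry limit, and the delay values are all assumptions made up for this example; the application has no real basis for choosing them, which is the point.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <time.h>

/*
 * Hypothetical wrapper: retry sendto(2) while it fails with ENOBUFS,
 * sleeping between attempts with a delay that doubles up to a cap.
 * Any other result (success or a different error) is returned at once.
 * If max_tries is exhausted (or <= 0), returns -1 with errno = ENOBUFS.
 */
ssize_t
sendto_backoff(int s, const void *buf, size_t len,
    const struct sockaddr *to, socklen_t tolen, int max_tries)
{
	struct timespec delay = { 0, 1000000L };	/* 1 ms: an arbitrary guess */
	int i;

	for (i = 0; i < max_tries; i++) {
		ssize_t n = sendto(s, buf, len, 0, to, tolen);

		if (n >= 0 || errno != ENOBUFS)
			return n;
		nanosleep(&delay, NULL);		/* guessing how long to wait */
		if (delay.tv_nsec < 500000000L)
			delay.tv_nsec *= 2;		/* stays below one second */
	}
	errno = ENOBUFS;
	return -1;
}
```

If the kernel blocked by default as suggested above, a plain sendto(2) call would do the waiting, and only programs that asked for non-blocking behavior would need code like this.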

Ping(8) is a network test and measurement tool. Clearly, it falls into the "will request non-blocking I/O" class, in that in order for it to properly report where things are working (or not working), it has to know what's going on.

I want to make clear that the IP router response is different, even though UNIX can and does act as an IP router. An IP router has a tenuous relationship at best to the other systems on its attached networks. When it hits resource exhaustion, all it can do is drop IP packets - sending a response in a congestion situation only makes congestion worse; TCP is designed to deal with detected packet drops by reducing data flow. That's a very different relationship than the one between an application program and an OS kernel within a single system.


        Erik <fair%netbsd.org@localhost>

Follow-Ups:
- Re: ping: sendto: No buffer space available
  - From: Mouse
- Re: ping: sendto: No buffer space available
  - From: Thor Lancelot Simon
- Re: ping: sendto: No buffer space available
  - From: Thor Lancelot Simon

References:
- Re: ping: sendto: No buffer space available
  - From: David Laight
- Re: ping: sendto: No buffer space available
  - From: Thor Lancelot Simon
