Subject: sendto(2) EINVAL / pacing problem
To: None <tech-net@netbsd.org>
From: Christoph Kaegi <kgc@zhwin.ch>
List: tech-net
Date: 02/09/2004 14:46:57
Dear NetBSDers

I have been running a small, homegrown distributed monitoring system 
on my (mostly NetBSD) unix hosts for several months now.
It consists of a master on one host and an agent on every watched host.

A state query client asks for the current state over a Unix domain
datagram socket and receives state replies, one datagram for every
state monitored.

Today I added the 172nd probe, and now, when the master sends
the states down the socket, sendto(2) returns EINVAL when trying to
send the 172nd packet.

The sendto(2) manpage says:

     [EINVAL]      The total length of the I/O is more than can be expressed
                   by the ssize_t return value.

... which doesn't seem to be my problem, as the messages have a
constant size (321 bytes).

I suspect the problem is that the master simply sends all the states
out the (non-blocking) socket back to back and is hitting
a resource limit somewhere.
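
For scale: 171 packets of 321 bytes are 54891 bytes in total. One way to
compare that with the socket buffer limits would be to query them with
getsockopt(2). This is only a rough sketch; sock_bufsize() is a
hypothetical helper that is not part of my code, and I do not know
whether these limits are really what triggers the EINVAL.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return the value of a socket buffer option
 * (SO_SNDBUF or SO_RCVBUF) for fd, or -1 if getsockopt(2) fails.
 */
static int
sock_bufsize(int fd, int optname)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, optname, &val, &len) == -1)
		return -1;
	return val;
}
```

Calling sock_bufsize(lsockfd, SO_SNDBUF) in the master and the same with
SO_RCVBUF on the client's socket would show how the configured limits
relate to the 54891 bytes above.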

So I have the following questions:

- How do I go about pacing sendto(2) on a non-blocking socket?

- Is sendto(2) giving me a misleading/wrong error message? 
  Or is the sendto(2) manpage not precise enough?
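
To make the first question concrete, this is roughly the kind of pacing
I have in mind: retry the send after waiting for the socket to become
writable with poll(2). It is a minimal sketch, not tested against my
setup. paced_sendto() and MAX_TRIES are made-up names, and I am only
assuming that a transient shortage would show up as EAGAIN or ENOBUFS.
It would not help if EINVAL is what the kernel really means to return.

```c
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <sys/socket.h>

#define MAX_TRIES 50	/* arbitrary cap so the loop cannot spin forever */

/*
 * Hypothetical helper: send one datagram, and if the socket reports a
 * transient shortage (EAGAIN/EWOULDBLOCK/ENOBUFS), wait up to timeout_ms
 * for POLLOUT and try again. Any other error is returned to the caller.
 * Returns the number of bytes sent, or -1 with errno set.
 */
static ssize_t
paced_sendto(int fd, const void *buf, size_t len,
    const struct sockaddr *to, socklen_t tolen, int timeout_ms)
{
	int attempt;

	for (attempt = 0; attempt < MAX_TRIES; attempt++) {
		ssize_t n = sendto(fd, buf, len, 0, to, tolen);

		if (n >= 0)
			return n;
		if (errno == EINTR)
			continue;
		if (errno != EAGAIN && errno != EWOULDBLOCK &&
		    errno != ENOBUFS)
			return -1;	/* hard error, e.g. EINVAL */

		struct pollfd pfd = { .fd = fd, .events = POLLOUT };
		int r = poll(&pfd, 1, timeout_ms);

		if (r == 0) {		/* timed out waiting for space */
			errno = EAGAIN;
			return -1;
		}
		if (r == -1 && errno != EINTR)
			return -1;
	}
	errno = EAGAIN;			/* gave up after MAX_TRIES attempts */
	return -1;
}
```

In the loop below, the sendto() call would then become
paced_sendto(lsockfd, dgdata, LEN_MSG_STATUSREPLY-1,
(struct sockaddr *) &dg->sa.client_sau, dg->dg_salen, 100), with the
100 ms timeout picked arbitrarily.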

Thanks a lot

Chris


Here is the code snippet in question:

-------------------------------------- 8< --------------------------------------

	/* Walk through the monProbes linked list and send every probe's status */

	for(mpc = monProbes; mpc != NULL; mpc = mpc->next) {
		if (mpc->Enabled == PROBE_ENABLED) {

			snprintf(dgdata, LEN_MSG_STATUSREPLY, 
				"%2.2s%4d%4d%4d%-64.64s%-25.25s%-64.64s%1d%10d%1d%10d%-128.128s%4d",
				MSG_STATUSREPLY,
				mpc->ProbeID,
				ProbeNumber,
				TotalProbes,
				mpc->HostName,
				mpc->Probe,
				mpc->Arguments,
				mpc->CurrentState,
				(int) mpc->ts_CurrentState,
				mpc->LastState,
				(int) mpc->ts_LastState,
				mpc->StateDescription,
				mpc->FailedCount
			);

			CharsSent = sendto(lsockfd, dgdata, LEN_MSG_STATUSREPLY-1, 0, 
				(struct sockaddr *) &dg->sa.client_sau, dg->dg_salen);

			if (CharsSent == -1) {
				/* Execution ends up HERE after 171 packets were sent, when trying to send the 172nd */
				ErrMsg(ERR_RETURN, ERR_PRINTERRNO, 
					"sendStatus: could not send probe status %d of %d (ProbeID=%d)",
					ProbeNumber, TotalProbes, mpc->ProbeID);
				return;
			}

			ProbeNumber++;

		}
	}

-------------------------------------- 8< --------------------------------------

-- 
----------------------------------------------------------------------
Christoph Kaegi                                           kgc@zhwin.ch
----------------------------------------------------------------------