Subject: Re: bin/11087: syslogd not working, HUP doesn't fix, requires hard restart
To: None <gnats-bugs@gnats.netbsd.org, netbsd-bugs@netbsd.org>
From: Dave Olson <olson@bengaltech.com>
List: netbsd-bugs
Date: 09/26/2000 23:31:48
Greg A. Woods wrote:
| Long-running local processes that only call openlog() once at their
| initialisation will not continue to log after syslogd has been restarted
| because they use local-domain sockets to communicate with syslogd and it
| would seem that there's no existing mechanism in use to signal when
| syslogd has gone away, and since syslogd does the further injustice
| of re-creating the pathname representing the local-domain socket
| (i.e. _PATH_LOG, or /var/run/log) every time it starts up, it's probably
| necessary to re-open the client-side socket too. Note how different
| this is semantically from normal "datagram" style sockets where you can
| always just blindly spew stuff out and hope someone is still listening!
|
| Maybe fixing the latter could be as simple as this (this is pure
| un-tested speculation though):
Here's the change we implemented at Geocast for programs that
call openlog() once and keep running while syslogd may be
restarted underneath them. It's against the 1999-11-03 version of
syslog.c, and we have a fair amount of testing on it. It's
similar to yours, but a bit different.
Index: syslog.c
===================================================================
RCS file: lib/libc/gen/syslog.c,v
retrieving revision 1.1.1.8
retrieving revision 1.2
diff -u -r1.1.1.8 -r1.2
--- syslog.c 1999/11/04 20:21:51 1.1.1.8
+++ syslog.c 2000/07/19 02:16:56 1.2
@@ -135,7 +135,10 @@
char *stdp = NULL; /* pacify gcc */
char tbuf[TBUF_LEN], fmt_cpy[FMT_LEN];
size_t tbuf_left, fmt_left, prlen;
+ int firsttry;
+ firsttry = 1;
+
#define INTERNALLOG LOG_ERR|LOG_CONS|LOG_PERROR|LOG_PID
/* Check for invalid bits. */
if (pri & ~(LOG_PRIMASK|LOG_FACMASK)) {
@@ -246,12 +249,25 @@
/* Get connected, output the message to the local logger. */
mutex_lock(&syslog_mutex);
- if (!connected)
+retry:
+ if (!connected) {
openlog_unlocked(LogTag, LogStat | LOG_NDELAY, 0);
+ firsttry--; /* no point in retrying open if we just did it */
+ }
if (send(LogFile, tbuf, cnt, 0) >= 0) {
mutex_unlock(&syslog_mutex);
return;
}
+ if(firsttry>0 && connected) {
+ /* if send() failed, and we already had a connection open, one likely
+ * cause is that syslogd exit'ed. It may have been restarted since, so
+ * try once to open a new connection. This could be made conditional
+ * on particular errno values, but the potential list is long, and changes
+ * over time even within OSes. (Geocast bug 1884) */
+ closelog_unlocked();
+ firsttry = 0;
+ goto retry;
+ }
mutex_unlock(&syslog_mutex);
/*
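
For anyone who wants to try the idea outside libc, here is a minimal
standalone sketch of the same retry-once logic as the patch above. It is
not the libc code: struct log_client and the log_client_*() functions are
hypothetical names made up for illustration, and it assumes a datagram
local-domain socket at a caller-supplied path. As in the patch, any
send() failure on an already-open socket triggers exactly one
close/reopen/resend, with no errno filtering.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical client state, standing in for libc's LogFile/connected. */
struct log_client {
	int fd;			/* -1 when not connected */
	const char *path;	/* socket pathname, e.g. _PATH_LOG */
};

/* Create a datagram socket and connect it to c->path. 0 on success. */
static int
log_client_open(struct log_client *c)
{
	struct sockaddr_un sa;
	int fd;

	if ((fd = socket(AF_UNIX, SOCK_DGRAM, 0)) < 0)
		return -1;
	memset(&sa, 0, sizeof(sa));
	sa.sun_family = AF_UNIX;
	strncpy(sa.sun_path, c->path, sizeof(sa.sun_path) - 1);
	if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	c->fd = fd;
	return 0;
}

static void
log_client_close(struct log_client *c)
{
	if (c->fd >= 0) {
		close(c->fd);
		c->fd = -1;
	}
}

/*
 * Send one message. If send() fails on a socket that was already open,
 * assume the listener may have been restarted: reopen and resend once.
 * Returns 0 on success, -1 if the message could not be delivered.
 */
static int
log_client_send(struct log_client *c, const char *msg, size_t len)
{
	int may_retry = 1;

	assert(c != NULL && c->path != NULL);
	for (;;) {
		if (c->fd < 0) {
			if (log_client_open(c) < 0)
				return -1;
			may_retry = 0;	/* just opened; reopening again is pointless */
		}
		if (send(c->fd, msg, len, 0) >= 0)
			return 0;
		if (!may_retry)
			return -1;
		log_client_close(c);	/* stale socket: drop it and go around once */
		may_retry = 0;
	}
}
```

A caller would set fd to -1 and path to the socket pathname, then call
log_client_send() for each message. The first call opens the socket
lazily, and later calls recover on their own after the listener has been
restarted and has re-created the pathname.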