Source-Changes-HG archive


[src/netbsd-6]: src/usr.sbin/ypbind Pull up following revision(s) (requested ...



details:   https://anonhg.NetBSD.org/src/rev/f811317aa365
branches:  netbsd-6
changeset: 776696:f811317aa365
user:      msaitoh <msaitoh%NetBSD.org@localhost>
date:      Tue Sep 09 08:24:29 2014 +0000

description:
Pull up following revision(s) (requested by dholland in ticket #1083):
        usr.sbin/ypbind/ypbind.c: revision 1.91
        usr.sbin/ypbind/ypbind.c: revision 1.92
        usr.sbin/ypbind/ypbind.c: revision 1.93
        usr.sbin/ypbind/ypbind.c: revision 1.94
        usr.sbin/ypbind/ypbind.c: revision 1.95
        usr.sbin/ypbind/ypbind.c: revision 1.96
        usr.sbin/ypbind/ypbind.c: revision 1.97
        usr.sbin/ypbind/ypbind.c: revision 1.98
        usr.sbin/ypbind/ypbind.8: revision 1.20
        usr.sbin/ypbind/ypbind.8: revision 1.19
Don't store the default domain name in a global. While running we
really don't care which domain is the system's default domain.
Factor out some rpc validation code.
While there are times it's appropriate to call a state variable
"evil", this isn't one of them. Since the logic involved is to wait
until the default domain binds before backgrounding, call the variable
"started" instead.
Don't rake up the default domain until after processing arguments.
Processing arguments just sets flags -- may as well do it first, and
this way detection of silly errors isn't contingent on having things
fully configured and operating.
Load up with comments.
Instead of using magic numbers in what looks like a boolean
(dom_alive), create a state enumeration (domainstates) and use it
instead.
Instead of three states (new, alive, and, effectively, 'troubled') go
to five: new, alive, pinging, lost, and dead.
Domains start in the NEW state. When we get a reply from a server, the
state goes to ALIVE. The state is set to PINGING when we ping the
server (once a minute normally) and if the ping times out, it goes to
LOST. If we stay lost for a minute, go to DEAD, and in DEAD, do
exponential backoff of nag_servers calls.
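
For readers skimming the description, the five-state scheme above can be
sketched as a pure transition function. This is an illustration only, not
the committed code: the state names match the enumeration in the diff, but
the event enumeration and the function domain_next_state() are hypothetical
names invented for this sketch, and the real ypbind.c drives these
transitions from timers and RPC replies rather than from explicit events.

```c
#include <assert.h>

/* State names as they appear in the diff. */
enum domainstates {
	DOM_NEW,		/* not yet bound */
	DOM_ALIVE,		/* bound and healthy */
	DOM_PINGING,		/* ping outstanding */
	DOM_LOST,		/* binding timed out, looking for a new one */
	DOM_DEAD,		/* long-term lost, in exponential backoff */
};

/* Hypothetical event names; ypbind.c has no such enumeration. */
enum domainevents {
	EV_REPLY,		/* a server answered */
	EV_PING_SENT,		/* the periodic ping was sent */
	EV_PING_TIMEOUT,	/* the ping went unanswered */
	EV_LOST_TIMEOUT,	/* the domain stayed lost for a minute */
};

/*
 * Return the state that follows `state` when `ev` happens.
 * Combinations the description does not mention leave the state unchanged.
 */
static enum domainstates
domain_next_state(enum domainstates state, enum domainevents ev)
{
	/* A reply from a server makes the domain ALIVE from any state. */
	if (ev == EV_REPLY)
		return DOM_ALIVE;

	switch (state) {
	case DOM_ALIVE:
		return ev == EV_PING_SENT ? DOM_PINGING : state;
	case DOM_PINGING:
		return ev == EV_PING_TIMEOUT ? DOM_LOST : state;
	case DOM_LOST:
		return ev == EV_LOST_TIMEOUT ? DOM_DEAD : state;
	default:
		/* NEW and DEAD only leave their state on a reply. */
		return state;
	}
}
```

In this sketch a domain that never hears from a server stays in NEW, and the
only way out of DEAD is a reply, which matches the description: DEAD merely
changes how often nag_servers is called.
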
Getting rid of the broken logic attached to the 'troubled' state fixes
PR 15355 (ypbind defeats disk idle spindown) -- it will now only
rewrite the binding file when the binding changes.
Also, fix the HEURISTIC code so it doesn't trigger except in ALIVE
state. I think this was the source of a lot of the spamming behavior
seen in PR 32519, which is now fixed.
Might also fix PR 23135 (broadcast ypbind sometimes fails to find
servers).
Add a SIGHUP handler; upon SIGHUP do an extra nag_servers on any
domain that's in DEAD state. This lets you explicitly rescue ypbind
from its exponential backoff when you know the world's back up.
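
The SIGHUP behavior described above follows the usual pattern of setting a
flag in the handler and acting on it from the main loop. The sketch below
shows that pattern under stated assumptions: the flag name hupped and the
state names come from the diff, but hup_handler(), install_hup_handler(),
service_hup(), and the dom_nagged counter are hypothetical stand-ins (the
real code would call nag_servers instead of bumping a counter), and whether
ypbind.c installs its handler with signal() or sigaction() is not visible in
the truncated diff.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

enum domainstates { DOM_NEW, DOM_ALIVE, DOM_PINGING, DOM_LOST, DOM_DEAD };

/* Cut-down stand-in for struct domain. */
struct domain {
	struct domain *dom_next;
	enum domainstates dom_state;
	int dom_nagged;		/* hypothetical: counts simulated nag_servers calls */
};

/* Set if we get SIGHUP; only ever written with plain stores. */
static volatile sig_atomic_t hupped;

/* The handler does nothing but record that the signal arrived. */
static void
hup_handler(int sig)
{
	(void)sig;
	hupped = 1;
}

/* Returns 0 on success, -1 if the handler could not be installed. */
static int
install_hup_handler(void)
{
	return signal(SIGHUP, hup_handler) == SIG_ERR ? -1 : 0;
}

/*
 * Called from the main loop. If a SIGHUP arrived since the last call,
 * give every DEAD domain one extra retry and return how many were retried.
 */
static int
service_hup(struct domain *domains)
{
	struct domain *dom;
	int retried = 0;

	if (!hupped)
		return 0;
	hupped = 0;

	for (dom = domains; dom != NULL; dom = dom->dom_next) {
		if (dom->dom_state == DOM_DEAD) {
			dom->dom_nagged++;	/* stands in for nag_servers(dom) */
			retried++;
		}
	}
	return retried;
}
```

Keeping the handler down to a single store avoids calling anything that is
not async-signal-safe; all list walking happens later in normal context.
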
Log state transitions.
Document exponential backoff behavior and SIGHUP support, plus a couple
other minor edits.
Use more markup.
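
As a worked example of the exponential backoff mentioned in this
description, the following sketch reproduces the backoff() helper from the
ypbind.c diff and adds a small routine that records the sequence of retry
periods. backoff_schedule() is a hypothetical helper written for this
illustration; the starting value of 10 seconds is the initial
dom_backofftime set in domain_create().

```c
#include <assert.h>
#include <stddef.h>

/* Same logic as backoff() in the diff: double, with caps at 60s and 15min. */
static void
backoff(unsigned *psecs)
{
	unsigned secs;

	secs = *psecs;
	if (secs < 60) {
		secs *= 2;
		if (secs > 60) {
			secs = 60;
		}
	} else if (secs < 60 * 15) {
		secs *= 2;
		if (secs > 60 * 15) {
			secs = 60 * 15;
		}
	} else if (secs < 60 * 60) {
		secs *= 2;
	}
	*psecs = secs;
}

/*
 * Hypothetical helper: store successive backoff periods, beginning with
 * `start`, into out[] until the period stops growing or `max` entries
 * have been written. Returns the number of entries stored.
 */
static size_t
backoff_schedule(unsigned start, unsigned *out, size_t max)
{
	size_t n = 0;
	unsigned secs = start;
	unsigned prev;

	while (n < max) {
		out[n++] = secs;
		prev = secs;
		backoff(&secs);
		if (secs == prev)
			break;		/* reached the 60 minute ceiling */
	}
	return n;
}
```

Starting from 10, the recorded periods are 10, 20, 40, 60, 120, 240, 480,
900, 1800, and 3600 seconds, that is, 10 -> 20 -> 40 -> 60 seconds and then
2 -> 4 -> 8 -> 15 -> 30 -> 60 minutes, as the comment in the diff states.
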

diffstat:

 usr.sbin/ypbind/ypbind.8 |   32 +-
 usr.sbin/ypbind/ypbind.c |  660 ++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 616 insertions(+), 76 deletions(-)

diffs (truncated from 1146 to 300 lines):

diff -r 67e7bbc5349e -r f811317aa365 usr.sbin/ypbind/ypbind.8
--- a/usr.sbin/ypbind/ypbind.8  Wed Aug 27 15:09:57 2014 +0000
+++ b/usr.sbin/ypbind/ypbind.8  Tue Sep 09 08:24:29 2014 +0000
@@ -1,4 +1,4 @@
-.\"    $NetBSD: ypbind.8,v 1.18 2008/04/30 13:11:03 martin Exp $
+.\"    $NetBSD: ypbind.8,v 1.18.22.1 2014/09/09 08:24:29 msaitoh Exp $
 .\"
 .\" Copyright (c) 1996 The NetBSD Foundation, Inc.
 .\" All rights reserved.
@@ -27,7 +27,7 @@
 .\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 .\" POSSIBILITY OF SUCH DAMAGE.
 .\"
-.Dd February 26, 2005
+.Dd June 14, 2014
 .Dt YPBIND 8
 .Os
 .Sh NAME
@@ -94,9 +94,9 @@
 If the binding is somehow lost, e.g by server reboot,
 .Nm
 marks the domain as unbound and attempts to re-establish the binding.
-When the binding is once again successful,
+If a binding cannot be re-established within 60 seconds,
 .Nm
-marks the domain as bound and resumes its periodic check.
+backs off exponentially to trying only once per hour.
 .Pp
 The options are as follows:
 .Bl -tag -width "-broadcast"
@@ -114,7 +114,7 @@
 servers.
 .It Fl ypset
 .Xr ypset 8
-may be used to change the server to which a domain is bound.
+may be used from anywhere to change the server to which a domain is bound.
 .It Fl ypsetme
 .Xr ypset 8
 may be used only from this machine to change the server
@@ -122,11 +122,22 @@
 .El
 .Pp
 The
-.Fl broadcast
+.Fl broadcast ,
 .Fl ypset ,
 and
-.Fl ypsetme ,
+.Fl ypsetme
 options are inherently insecure and should be avoided.
+.Sh SIGNALS
+.Nm
+responds to the following signals:
+.Bl -tag -width TERM -compact
+.It Dv HUP
+causes
+.Nm
+to immediately retry any unbound domains that are currently in
+exponential backoff.
+Use this to resume immediately after a long network outage is
+resolved.
 .Sh FILES
 .Pa /var/yp/binding/\*[Lt]domain\*[Gt].version
 - binding file for \*[Lt]domain\*[Gt].
@@ -147,7 +158,10 @@
 .Xr yppoll 8 ,
 .Xr ypset 8
 .Sh AUTHORS
+.An -nosplit
 This version of
 .Nm
-was originally implemented by Theo de Raadt.
-The ypservers support was implemented by Luke Mewburn.
+was originally implemented by
+.An Theo de Raadt .
+The ypservers support was implemented by
+.An Luke Mewburn .
diff -r 67e7bbc5349e -r f811317aa365 usr.sbin/ypbind/ypbind.c
--- a/usr.sbin/ypbind/ypbind.c  Wed Aug 27 15:09:57 2014 +0000
+++ b/usr.sbin/ypbind/ypbind.c  Tue Sep 09 08:24:29 2014 +0000
@@ -1,4 +1,4 @@
-/*     $NetBSD: ypbind.c,v 1.90 2011/08/30 17:06:22 plunky Exp $       */
+/*     $NetBSD: ypbind.c,v 1.90.4.1 2014/09/09 08:24:29 msaitoh Exp $  */
 
 /*
  * Copyright (c) 1992, 1993 Theo de Raadt <deraadt%fsa.ca@localhost>
@@ -28,7 +28,7 @@
 
 #include <sys/cdefs.h>
 #ifndef LINT
-__RCSID("$NetBSD: ypbind.c,v 1.90 2011/08/30 17:06:22 plunky Exp $");
+__RCSID("$NetBSD: ypbind.c,v 1.90.4.1 2014/09/09 08:24:29 msaitoh Exp $");
 #endif
 
 #include <sys/types.h>
@@ -50,6 +50,7 @@
 #include <ifaddrs.h>
 #include <limits.h>
 #include <netdb.h>
+#include <signal.h>
 #include <stdarg.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -84,16 +85,26 @@
        YPBIND_DIRECT, YPBIND_BROADCAST,
 } ypbind_mode_t;
 
+enum domainstates {
+       DOM_NEW,                /* not yet bound */
+       DOM_ALIVE,              /* bound and healthy */
+       DOM_PINGING,            /* ping outstanding */
+       DOM_LOST,               /* binding timed out, looking for a new one */
+       DOM_DEAD,               /* long-term lost, in exponential backoff */
+};
+
 struct domain {
        struct domain *dom_next;
 
        char dom_name[YPMAXDOMAIN + 1];
        struct sockaddr_in dom_server_addr;
        long dom_vers;
-       time_t dom_checktime;
-       time_t dom_asktime;
+       time_t dom_checktime;           /* time of next check/contact */
+       time_t dom_asktime;             /* time we were last DOMAIN'd */
+       time_t dom_losttime;            /* time the binding was lost, or 0 */
+       unsigned dom_backofftime;       /* current backoff period, when DEAD */
        int dom_lockfd;
-       int dom_alive;
+       enum domainstates dom_state;
        uint32_t dom_xid;
        FILE *dom_serversfile;          /* /var/yp/binding/foo.ypservers */
        int dom_been_ypset;             /* ypset been done on this domain? */
@@ -102,26 +113,36 @@
 
 #define BUFSIZE                1400
 
-static char *domainname;
-
+/* the list of all domains */
 static struct domain *domains;
 static int check;
 
+/* option settings */
 static ypbind_mode_t default_ypbindmode;
-
 static int allow_local_ypset = 0, allow_any_ypset = 0;
 static int insecure;
 
+/* the sockets we use to interact with servers */
 static int rpcsock, pingsock;
+
+/* stuff used for manually interacting with servers */
 static struct rmtcallargs rmtca;
 static struct rmtcallres rmtcr;
 static bool_t rmtcr_outval;
 static unsigned long rmtcr_port;
+
+/* The ypbind service transports */
 static SVCXPRT *udptransp, *tcptransp;
 
+/* set if we get SIGHUP */
+static sig_atomic_t hupped;
+
 ////////////////////////////////////////////////////////////
 // utilities
 
+/*
+ * Combo of open() and flock().
+ */
 static int
 open_locked(const char *path, int flags, mode_t mode)
 {
@@ -138,6 +159,39 @@
        return fd;
 }
 
+/*
+ * Exponential backoff for pinging servers for a dead domain.
+ *
+ * We go 10 -> 20 -> 40 -> 60 seconds, then 2 -> 4 -> 8 -> 15 -> 30 ->
+ * 60 minutes, and stay at 60 minutes. This is overengineered.
+ *
+ * With a 60 minute max backoff the response time for when things come
+ * back is not awful, but we only try (and log) about 60 times even if
+ * things are down for a whole long weekend. This is an acceptable log
+ * load, I think.
+ */
+static void
+backoff(unsigned *psecs)
+{
+       unsigned secs;
+
+       secs = *psecs;
+       if (secs < 60) {
+               secs *= 2;
+               if (secs > 60) {
+                       secs = 60;
+               }
+       } else if (secs < 60 * 15) {
+               secs *= 2;
+               if (secs > 60 * 15) {
+                       secs = 60 * 15;
+               }
+       } else if (secs < 60 * 60) {
+               secs *= 2;
+       }
+       *psecs = secs;
+}
+
 ////////////////////////////////////////////////////////////
 // logging
 
@@ -150,6 +204,9 @@
 
 static void yp_log(int, const char *, ...) __printflike(2, 3);
 
+/*
+ * Log some stuff, to syslog or stderr depending on the debug setting.
+ */
 static void
 yp_log(int pri, const char *fmt, ...)
 {
@@ -187,6 +244,34 @@
 ////////////////////////////////////////////////////////////
 // struct domain
 
+/*
+ * The state transitions of a domain work as follows:
+ *
+ * in state NEW:
+ *    nag_servers every 5 seconds
+ *    upon answer, state is ALIVE
+ *
+ * in state ALIVE:
+ *    every 60 seconds, send ping and switch to state PINGING
+ *
+ * in state PINGING:
+ *    upon answer, go to state ALIVE
+ *    if no answer in 5 seconds, go to state LOST and do nag_servers
+ *
+ * in state LOST:
+ *    do nag_servers every 5 seconds
+ *    upon answer, go to state ALIVE
+ *    if no answer in 60 seconds, go to state DEAD
+ *
+ * in state DEAD
+ *    do nag_servers every backofftime seconds (starts at 10)
+ *    upon answer go to state ALIVE
+ *    backofftime doubles (approximately) each try, with a cap of 1 hour
+ */
+
+/*
+ * Look up a domain by the XID we assigned it.
+ */
 static struct domain *
 domain_find(uint32_t xid)
 {
@@ -198,6 +283,11 @@
        return dom;
 }
 
+/*
+ * Pick an XID for a domain.
+ *
+ * XXX: this should just generate a random number.
+ */
 static uint32_t
 unique_xid(struct domain *dom)
 {
@@ -210,6 +300,10 @@
        return tmp_xid;
 }
 
+/*
+ * Construct a new domain. Adds it to the global linked list of all
+ * domains.
+ */
 static struct domain *
 domain_create(const char *name)
 {
@@ -230,8 +324,10 @@
        dom->dom_vers = YPVERS;
        dom->dom_checktime = 0;
        dom->dom_asktime = 0;
+       dom->dom_losttime = 0;
+       dom->dom_backofftime = 10;
        dom->dom_lockfd = -1;
-       dom->dom_alive = 0;
+       dom->dom_state = DOM_NEW;
        dom->dom_xid = unique_xid(dom);
        dom->dom_been_ypset = 0;
        dom->dom_serversfile = NULL;
@@ -265,6 +361,10 @@
 ////////////////////////////////////////////////////////////
 // locks
 
+/*
+ * Open a new binding file. Does not write the contents out; the
+ * caller (there's only one) does that.


