Subject: pkg/34084: snmpd enters endless loop when using proc, procFix, prErrFix
To: None <pkg-manager@netbsd.org, gnats-admin@netbsd.org,>
From: Scott Presnell <srp@tworoads.net>
List: pkgsrc-bugs
Date: 07/25/2006 17:15:00
>Number: 34084
>Category: pkg
>Synopsis: snmpd enters endless loop when trying to restart a process via proc, procFix, and prErrFix
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: pkg-manager
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Jul 25 17:15:00 +0000 2006
>Originator: Scott Presnell
>Release: NetBSD 3.0_STABLE
>Organization:
Self
>Environment:
System: NetBSD dirt.tworoads.net 3.0_STABLE NetBSD 3.0_STABLE (SAAR.MP) #2: Sun Jul 16 13:46:10 PDT 2006 srp@dirt.tworoads.net:/usr/src/sys/arch/i386/compile/SAAR.MP i386
Architecture: i386
Machine: i386
Package: net-snmp-5.3.0.1nb2 Extensible SNMP implementation from pkgsrc-2006Q1, but
also tried net-snmp-5.3.1 from external source distribution. Last
known working version: net-snmp-5.2.1.2nb1 (with NetBSD 3.0_STABLE).
>Description:
If snmpd is configured to watch a process and use a procfix command, then
when that process dies and snmpd attempts a restart, snmpd loops forever
over a select call (i.e. it blocks in that loop and does not respond to
other clients). From the snmpd full debugging log:
exec:get_exec_output: calling /etc/rc.d/x10s restart
trace: run_exec_command(): mibgroup/utilities/execute.c, 237:
run:exec: running '/etc/rc.d/x10s restart'
trace: run_exec_command(): mibgroup/utilities/execute.c, 319:
verbose:run:exec: waiting for child 25939...
trace: run_exec_command(): mibgroup/utilities/execute.c, 331:
verbose:run:exec: calling select
trace: run_exec_command(): mibgroup/utilities/execute.c, 357:
verbose:run:exec: read 45 bytes
trace: run_exec_command(): mibgroup/utilities/execute.c, 392:
verbose:run:exec: 15954 left in buffer
trace: run_exec_command(): mibgroup/utilities/execute.c, 331:
verbose:run:exec: calling select
trace: run_exec_command(): mibgroup/utilities/execute.c, 357:
verbose:run:exec: read 15 bytes
trace: run_exec_command(): mibgroup/utilities/execute.c, 392:
verbose:run:exec: 15939 left in buffer
trace: run_exec_command(): mibgroup/utilities/execute.c, 331:
verbose:run:exec: calling select
trace: run_exec_command(): mibgroup/utilities/execute.c, 343:
verbose:run:exec: timeout
trace: run_exec_command(): mibgroup/utilities/execute.c, 331:
verbose:run:exec: calling select
trace: run_exec_command(): mibgroup/utilities/execute.c, 343:
verbose:run:exec: timeout
trace: run_exec_command(): mibgroup/utilities/execute.c, 331:
verbose:run:exec: calling select
trace: run_exec_command(): mibgroup/utilities/execute.c, 343:
verbose:run:exec: timeout
trace: run_exec_command(): mibgroup/utilities/execute.c, 331:
verbose:run:exec: calling select
trace: run_exec_command(): mibgroup/utilities/execute.c, 343:
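The pattern in the log (two short reads, then nothing but timeouts) is what a reader of a pipe sees when the pipe never reaches end-of-file. One possible explanation, which this report does not confirm, is that the daemon started by the rc.d script inherits the write end of the output pipe and keeps it open after the script itself has exited. The sketch below illustrates only that general mechanism in plain sh; `sleep` stands in for the daemon, and the function names are hypothetical rather than taken from net-snmp.

```sh
#!/bin/sh
# Sketch only: a background process that inherits a pipe keeps the reader
# waiting for EOF. 'sleep' is a stand-in for a daemon such as x10; nothing
# here is taken from net-snmp itself.

# Run "$@", capture its output, and store the elapsed whole seconds in $elapsed.
time_capture() {
    start=$(date +%s)
    out=$("$@")              # returns only when every writer has closed the pipe
    end=$(date +%s)
    elapsed=$((end - start))
}

# The "script" returns at once, but its background child still holds stdout.
leaky_start() {
    sleep 3 &
    echo "started"
}

# Same, but the background child is detached from the inherited descriptors.
detached_start() {
    sleep 3 >/dev/null 2>&1 &
    echo "started"
}

time_capture leaky_start;    echo "leaky:    ${elapsed}s"
time_capture detached_start; echo "detached: ${elapsed}s"
```

If this mechanism is what is happening here, redirecting the daemon's stdout and stderr inside the rc.d script may be worth trying as a workaround; that is a guess to test, not a verified fix.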
>How-To-Repeat:
1) configure snmpd.conf with a proc and procfix line like so:
proc x10
procfix x10 /etc/rc.d/x10s start
2) start snmpd, start script (x10s).
3) Kill off watched proc (x10).
4) Run script that checks for listed processes and attempts to set
prErrFix to 1 for a given instance of prIndex (see attached).
Setting prErrFix to 1 asks snmpd to attempt a restart via the procfix
directive.
Results:
a) snmpd will be looping and unresponsive.
b) sometimes multiple procs get started.
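Step 4 can also be driven by hand. The sketch below assembles a net-snmp `snmpset` command for one prIndex instance and prints it unless `RUN=1` is set. The host name, community string, and index value are placeholders, and it assumes the agent grants write access to that community.

```sh
#!/bin/sh
# Sketch: ask snmpd to run the procfix command for one proc entry by
# setting prErrFix.<index> to 1. Host, community and index are placeholders.

prerrfix_cmd() {
    host=$1 community=$2 index=$3
    # snmpset AGENT OID TYPE VALUE; type 'i' means INTEGER
    echo "snmpset -v 1 -c $community $host UCD-SNMP-MIB::prErrFix.$index i 1"
}

cmd=$(prerrfix_cmd localhost private 1)
if [ "${RUN:-0}" = 1 ]; then
    $cmd                 # actually send the SET request
else
    echo "$cmd"          # dry run: show what would be executed
fi
```

Running it with `RUN=1` against an agent configured as in step 1 should trigger the same procfix path that the attached script exercises.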
>Fix:
Unknown.
Attached script for tracking and restarting procs via snmpd:
#!/usr/pkg/bin/perl
#
# Monitor processes via SNMP
# (based on process.monitor by Brian Moore)
#
# Modified Oct 2001 by Dan Urist <durist@world.std.com>
# Changes: added usage, SNMP v.3 support, -P processes option
# unique-ified errors
#
# Modified Feb 2002 by Dan Urist <durist@world.std.com>
# Changes: added -C config file option; cleaned up code
#
# This script will exit with value 1 if any prErrorFlag is greater
# than 0. The summary output line will be the host names and
# processes that failed in the format host1:proc1,proc2;host2:proc3...
# The detail lines are what UCD snmp returns for a prErrMessage. If
# there is an SNMP error (either a problem with the SNMP libraries, or
# a problem communicating via SNMP with the destination host), this
# script will exit with a warning value of 2. If the -P process list
# option is used, only the listed processes will be monitored. If a
# process given with -P is not being monitored, the script will exit
# with a warning and a value of 2.
#
#
# Copyright (C) 2001 Daniel J. Urist <durist@world.std.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
use SNMP;
use Getopt::Std;
$ENV{'MIBS'} = "UCD-SNMP-MIB";
getopts("hP:R" . &SNMPconfig("getopts"));
my $VERSION = "0.3";
if( $opt_h || (scalar @ARGV == 0) ){
print join("\n",
"$0 Version $VERSION; original version by Brian Moore,",
"SNMP v.3 support by Daniel J. Urist <durist\@world.std.com>.",
"\n",
);
print "Usage: $0 OPTIONS host [host ...]\n";
print "Options:\n";
print join("\n\t",
"\t-h # Usage",
"[-P proc[,proc...]] # Processes to look for",
"[-R] # Attempt a restart via prErrFix for failed processes",
&SNMPconfig("usage"), "\n");
exit 2;
}
# Get SNMP options
my %SNMPARGS = &SNMPconfig;
# Get process list
# Avoid "my ... if ...": its behavior is undefined in Perl
my @Processes = defined $opt_P ? split(',', $opt_P) : ();
my $restart = defined $opt_R ? 1 : 0;
my $RETVAL = 0;
my %Failures;
my %Longerr;
my %Restarts;
my $Session;
foreach $host (@ARGV) {
$Session = new SNMP::Session(
DestHost => $host,
%SNMPARGS,
);
unless( defined($Session) ) {
$RETVAL = 2 if $RETVAL == 0; # Other errors take precedence over SNMP error
push @{$Failures{$host}}, "session error";
$Longerr{"$host could not get SNMP session"} = "";
next;
}
my $v = new SNMP::Varbind (["prIndex"]);
$Session->getnext ($v);
my @Found;
while (!$Session->{"ErrorStr"} && $v->tag eq "prIndex") {
my @q = $Session->get ([
["prNames", $v->iid], # 0
["prMin", $v->iid], # 1
["prMax", $v->iid], # 2
["prCount", $v->iid], # 3
["prErrorFlag", $v->iid], # 4
["prErrMessage", $v->iid], # 5
["prErrFix", $v->iid], # 6
["prErrFixCmd", $v->iid], # 7
]);
last if ($Session->{"ErrorStr"});
if(@Processes){
if( grep { $_ eq $q[0] } @Processes ){ # exact match; names may contain regex metacharacters
# Keep track of which processes from the list we actually found
push(@Found, $q[0]);
}
else{
$Session->getnext ($v);
next;
}
}
if ($q[4] > 0) {
$RETVAL = 1;
push @{$Failures{$host}}, $q[0];
$Longerr{"$host:$q[0] Count=$q[3] Min=$q[1] Max=$q[2]"} = "";
# If there's a restart command, and we've asked to restart:
if ($q[7] && $restart) {
if (! $Session->set([
["prErrFix", $v->iid , '1', 'INTEGER']
])) {
print $Session->{ErrorStr}, ":", $Session->{ErrorInd}, "\n";
} else {
$Restarts{"$host:$q[0] issued restart ($q[7])"} = "";
}
}
}
$Session->getnext ($v);
}
if ($Session->{"ErrorStr"}) {
$RETVAL = 2 if $RETVAL == 0; # Other errors take precedence over SNMP error
push @{$Failures{$host}}, "SNMP error";
$Longerr{"$host returned an SNMP error: " . $Session->{"ErrorStr"}} = "";
}
if(@Processes){
my $p;
foreach $p (@Processes){
if( !grep { $_ eq $p } @Found ){
$RETVAL = 2 if $RETVAL == 0;
push @{$Failures{$host}}, "process \"$p\" not monitored";
$Longerr{"process \"$p\" not monitored on host $host"} = "";
}
}
}
}
if (scalar keys %Failures) {
# my $f;
# my @m;
# foreach $f (keys %Failures){
# push(@m, $f . ":" .join(",", @{$Failures{$f}}));
# }
# print join(";", @m), "\n\n";
print join ("\n", sort keys %Longerr), "\n" if (%Longerr);
print join ("\n", sort keys %Restarts), "\n" if (%Restarts);
}
exit $RETVAL;
#
# Manage the standard SNMP options
# Arguments are same as netsnmp utils
#
# If called with "getopts", returns a string for "getopts"
# If called with "usage", returns an array of usage information
# Otherwise, returns a hash of SNMP config vars
#
# Overloading this sub like this is kinda hokey,
# but keeps everything in one place
sub SNMPconfig {
my($action) = @_;
if($action eq "getopts"){
return "c:C:t:r:p:v:u:l:A:e:E:n:a:x:X:";
}
elsif($action eq "usage"){
return(
"[-C configfile] # SNMP vars config file",
"[-t Timeout] # Timeout in microseconds (default: 1000000)",
"[-r Retries] # Retries before failure (default: 5)",
"[-p RemotePort] # Remote UDP port (default 161)",
"[-v Version] # 1,2,2c or 3 (default: 1)",
"[-c Community] # v.1,2,2c Community Name (default: public)",
"[-u SecName] # v.3 Security Name (default: initial)",
"[-l SecLevel] # v.3 Security Level (default: noAuthNoPriv)",
"[-A AuthPass] # v.3 Authentication Passphrase (default: none)",
"[-e SecEngineId] # v.3 security engineID (default: none)",
"[-E ContextEngineId] # v.3 context engineID (default: none)",
"[-n Context] # v.3 context name (default: none)",
"[-a AuthProto] # authentication protocol (MD5|SHA; default MD5)",
"[-x PrivProto] # privacy protocol (DES)",
"[-X PrivPass] # privacy passphrase (default: none)",
);
}
# Read config file
my %Conf;
if($opt_C){
unless( open(CONF, '<', $opt_C) ){
print "$0: Could not open config file $opt_C: $!\n";
exit 2;
}
my $line;
my @fields;
foreach $line (<CONF>){
chomp $line;
@fields = split(/=/, $line, 2); # split on the first "=" only, so values may contain "="
$Conf{ lc $fields[0] } = $fields[1];
}
close CONF;
}
my %SNMPARGS;
# Common options
$SNMPARGS{Timeout} = $opt_t || $Conf{timeout} || 1000000;
$SNMPARGS{Retries} = $opt_r || $Conf{retries} || 5;
$SNMPARGS{RemotePort} = $opt_p || $Conf{remoteport} || 161;
$SNMPARGS{Version} = $opt_v || $Conf{version} || 1;
# v. 3 options
if ($SNMPARGS{Version} eq "3"){
$SNMPARGS{SecName} = $opt_u || $Conf{secname} || 'initial';
$SNMPARGS{SecLevel} = $opt_l || $Conf{seclevel} || 'noAuthNoPriv';
$SNMPARGS{AuthPass} = $opt_A || $Conf{authpass} || '';
$SNMPARGS{SecEngineId} = $opt_e || $Conf{secengineid} || '';
$SNMPARGS{ContextEngineId} = $opt_E || $Conf{contextengineid} || '';
$SNMPARGS{Context} = $opt_n || $Conf{context} || '';
$SNMPARGS{AuthProto} = $opt_a || $Conf{authproto} || '';
$SNMPARGS{PrivProto} = $opt_x || $Conf{privproto} || '';
$SNMPARGS{PrivPass} = $opt_X || $Conf{privpass} || '';
}
# v. 1,2 options
else{
$SNMPARGS{Community} = $opt_c || $Conf{community} || 'public';
}
return %SNMPARGS;
}