Subject: bin/31502: rpc.statd doesn't save failed notifies
To: None <gnats-admin@netbsd.org, netbsd-bugs@netbsd.org>
From: None <xcc98be0c43465684@f4n.org>
List: netbsd-bugs
Date: 10/07/2005 11:34:00
>Number:         31502
>Category:       bin
>Synopsis:       rpc.statd doesn't save failed notifies
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Oct 07 11:34:00 +0000 2005
>Originator:     John
>Release:        2.0.2
>Organization:
>Environment:
NetBSD x 2.0.2 NetBSD 2.0.2 (GENERIC) #1: Thu Oct  6 16:11:33 CEST 2005  root@x:/usr/src/sys/arch/i386/compile/GENERIC i386
>Description:
There appears to be a bug in rpc.statd/statd.c, notify_one(): 
if (notify_one_host(name)) {
    /* ... */
}
else {
    /* ... */
    if (hi->attempts++ >= 44)
        goto give_up;
    else if (hi->attempts < 10)
        hi->notifyReqd += 5;
    else if (hi->attempts < 20)
        hi->notifyReqd += 60;
    else
        hi->notifyReqd += 60 * 60;
    return -1;
}

hi->attempts and hi->notifyReqd are updated in memory but, unless I'm mistaken, never written back to the database.
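
As an illustration only, the backoff schedule in the excerpt can be sketched as a standalone function. The name next_delay() and its calling convention are made up for this report and do not exist in statd.c; only the branch structure is copied from the excerpt.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of statd.c.
 * Mirrors the if/else chain from notify_one(): the counter is
 * post-incremented in the first test, exactly as in the excerpt.
 * Returns the number of seconds that would be added to notifyReqd,
 * or -1 where the original code does "goto give_up".
 */
static int
next_delay(int *attempts)
{
    assert(attempts != NULL);

    if ((*attempts)++ >= 44)
        return -1;          /* original: goto give_up */
    else if (*attempts < 10)
        return 5;           /* first 9 failures: retry after 5 s */
    else if (*attempts < 20)
        return 60;          /* next 10 failures: retry after 60 s */
    else
        return 60 * 60;     /* after that: retry hourly */
}
```

If the counter carried over between passes, this schedule would allow 44 failed attempts before giving up. If every pass starts again from a stored count of 0 (which is what the unchanged database record suggests), the function always returns 5, so the retry interval never grows.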

Even if my analysis is wrong, the symptoms are there:
Oct  6 16:13:36 x rpc.statd: Failed to contact host y: RPC: Unknown host
Oct  6 16:14:11 x last message repeated 7 times
Oct  6 16:16:16 x last message repeated 25 times
Oct  6 16:26:23 x last message repeated 88 times
Oct  6 16:36:26 x last message repeated 82 times
Oct  6 16:46:27 x last message repeated 86 times
Oct  6 16:56:29 x last message repeated 85 times
Oct  6 17:06:34 x last message repeated 82 times
Oct  6 17:16:41 x last message repeated 86 times

and:
# db -S v hash /var/db/statd.status
y    )&EC\000\000\000\000\000\000\000\000\000\000\000\^M\000\000

The record always stays the same (the attempts counter never increases).

It also floods the resolver nicely.
>How-To-Repeat:
Add a host which never resolves/responds to rpc.statd's notify list.
>Fix:
If I'm right, the fix would be to add a change_host() call in the else-clause so that the updated attempts/notifyReqd are saved (and, as a cleanup, change the db->put() in the if-clause to a change_host()), and to remove the "goto give_up". Sorry, no patch; I'm brand new to NetBSD.
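
In lieu of a patch, here is a toy model of the suggested write-back. Everything in it is hypothetical: struct host_rec, load_rec(), save_rec() and failed_pass() are stand-ins invented for this sketch, the "database" is a single static record rather than the real hash file, and it assumes each notify pass works on a copy loaded from storage. The give-up limit is left out for brevity.

```c
/* Hypothetical stand-in for the per-host record; not the real HostInfo. */
struct host_rec {
    long notify_reqd;   /* time at which the next notify is due */
    int  attempts;      /* failed notify attempts so far */
};

/* Toy "database" holding exactly one record. */
static struct host_rec stored;

static void load_rec(struct host_rec *hi)       { *hi = stored; }
static void save_rec(const struct host_rec *hi) { stored = *hi; }

/*
 * One failed notify pass on a copy loaded from storage.
 * write_back == 0 models the reported behaviour (update lost);
 * write_back != 0 models the suggested fix (update saved).
 */
static void
failed_pass(int write_back)
{
    struct host_rec hi;

    load_rec(&hi);
    hi.attempts++;
    if (hi.attempts < 10)
        hi.notify_reqd += 5;
    else if (hi.attempts < 20)
        hi.notify_reqd += 60;
    else
        hi.notify_reqd += 60 * 60;

    if (write_back)
        save_rec(&hi);  /* the step that appears to be missing */
}
```

Calling failed_pass(0) any number of times leaves the stored record untouched, which matches the unchanging db output above. With failed_pass(1) the stored attempts count grows on every pass and the delay steps up from 5 to 60 seconds once it reaches 10.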

There have been earlier patches of this nature (forgetting to save/restore), so perhaps a quick review of the code is in order.