Subject: NetBSD-3 NIS-compat getpwnam()/getpwuid iterate entire map [was Re: 3.0 YP lookup latency]
To: Brian Ginsbach <ginsbach@NetBSD.org>
From: None <jonathan@dsg.stanford.edu>
List: tech-net
Date: 06/21/2006 13:02:40
In message <20060621190837.GA8936@NetBSD.org>, Brian Ginsbach writes:

>On Wed, Jun 21, 2006 at 09:42:15AM -0700, jonathan@dsg.stanford.edu wrote:
>> I'm pretty sure the bug is  the client NIS library. If you look at
>> 	lib/libc/gen/getpwent.c
>> 
>> you will notice that file changed radically between (CVS branches)
>> netbsd-2 and netbsd-3.  getpwent.c only calls yp_match() or
>> yp_first()/yp_next() in a couple of places.  Late last night I emailed
>> you and Stephen and Soda-san a walk through the relevant code-paths. I
>> think I've identified the problem, and suggested a workaround: don't
>> use the supplied default nsswitch.conf
>> 
>> 	passwd:  compat
>> 
>> line, but instead use
>> 
>> 	passswd: nis [notfound=return] files
>> 
>> which avoids the compat_ parsing routine where I'm pretty this bug
>> resides.  I also suggested a fix (assuming the workaround does fix
>> Stephen's problem).
>> 
>
>Where exactly do you see the problem?  I think that _compat_pwscan()
>is broken.  Specifically the call to _passwdcompat_pwscan() for
>COMPAT_FULL should probably pass search rather than _PW_KEYBYNUM.
>Was this your suggested fix?


Nope. The analysis I emailed to Christos was _compat_pwscan() does not
meet the contract expected by its callers, as described in the
function-header comment of _compat_pwscan().

_compat_pwscan is called to handle "passwd: compat" handling for each
of getpwent(), getpwnam(), and getpwuid().  Those three callers pass a
`search' (third-to-last) argument of _PW_KEYBYNUM, _PW_KEYBYNAME, and
_PW_KEYBYUID, respectively.
_compat_pwscan is expected to do the appropriate thing. However,
as part of implementing the "compat" local-passwd redirection for
	-
	-name
	-@netgroup
	+
	+name
	+@netgroup

entries, _compat_pwscan() uses its compat_state->mode argument to
determine how to proceed.  In the switch for compat_state->mode ==
_PW_FULL (a "+" line), _compat_pwscan() is unconditionally calling
_passwdcompat_pwscan() with _PW_KEYBYNUM.  *THAT* is what forces
iteration through the entire map, via yp_first()/yp_next(). The
surrounding state machinery (at the end of the loop; terrible style)
ensures that when compat_state is set to _PW_FULL, we also call
_passwdcompat_setpassent(). the for(;;) loop then iterates over the entire compat map.

Given that analysis, my suggested fix is to rework the code

                        switch (state->mode) {

                        case COMPAT_FULL:
                                        /* get next user */
                                rv = _passwdcompat_pwscan(&cpw,
                                    cbuf, sizeof(cbuf),
                                    _PW_KEYBYNUM, NULL, 0);
                                
to instead dispatch on the `search' argument, and handle each of
_PW_KEYBYNUM, _PW_KEYBYNAME, and _PW_KEYBYUID appropriately.

For the first case, we follow the current code. For the latter two, we
have the key for passwd.byname or passwd.byuid, respectively, so w can
just yp_match(), process the entry if found and update the loop state
to proceed to the next line.

The non-"compat" lines shown in nsswitch.conf(5) don't go through
_compat_pwscan; so Stephen should be able to work around the
map-enumeration by reworking his nsswitch.conf to not use the
"passwd" compat".  Hmm, whilst I typed this, Stephen confirmed that my
workaround masked the problem.


Oh... and I noticed a bug in clnt_generic(): for RPC-over-TCP, we
disable RFC-896 (Nagle) processing for TCP over IPv4, but not for TCP
over IPv6[1].  Makes a grown man cry.

[1] R. W. Scheifler and J. Gettys, The X Window System, ACM Transactions on Graphics 16:8 (Aug. 1983),
    pp. 57-69. Citation index  online at: http://citeseer.ist.psu.edu/context/12617/0