Subject: Re: NetBSD-3 NIS-compat getpwnam()/getpwuid iterate entire map [was Re: 3.0 YP lookup latency]
To: Brian Ginsbach <ginsbach@NetBSD.org>
From: Jonathan Stone <jonathan@Pescadero.dsg.stanford.edu>
List: tech-net
Date: 06/21/2006 14:49:23
In message <20060621205606.GC8444@NetBSD.org>Brian Ginsbach writes
>On Wed, Jun 21, 2006 at 01:02:40PM -0700, jonathan@dsg.stanford.edu wrote:
>> 
>> In message <20060621190837.GA8936@NetBSD.org>, Brian Ginsbach writes:
>> 
>> >
>> >Where exactly do you see the problem?  I think that _compat_pwscan()
>> >is broken.  Specifically the call to _passwdcompat_pwscan() for
>> >COMPAT_FULL should probably pass search rather than _PW_KEYBYNUM.
>> >Was this your suggested fix?
>> 
>> 
>> Nope. The analysis I emailed to Christos was _compat_pwscan() does not
>> meet the contract expected by its callers, as described in the
>> function-header comment of _compat_pwscan().
>
>Hmm, I'd say "nope" is a bit harsh.  I just simplified things a bit
>(and over looked that it needs the search arguments passed too).

Hi Brian, 

The "nope" was merely, "nope, that wasn't what I had suggested".
I did not mean to imply any judgment  about your suggested fix.

>It should be as simple as changing the above call to
>_passwdcompat_pwscan().
>
>                                rv = _passwdcompat_pwscan(&cpw,
>                                    cbuf, sizeof(cbuf),
>                                    search, name, uid);

Nope (again, no value-judgment implied). See the comment one line
above your framgment which says

	/* get next user */ 


For a request where we have the key, we should *not* "get the next user":
we sohuld do a lookup by key, rather than   iterate over all
users in the compat map.

Your suggested fix probably does cure the immedaite problem; but (on
several aspects) it leaves the code in a state well below our par for
code quality. One aspect is that that the comments don't match
reality. Another is that we're relying on an (undocumented/uncommented) implementation
artifact of the internal helper functions.  Yet another is that the (single) loop
in _compat_pwscan() is used to both loop the local file with +/= entries;
and also (mis)used to loop over the entries in
the compat source, for non-key lookups.

I submit that those flaws have contributed to the bug under discussion.

I also submit that (given this bug!) the control-flow inside
_compat_pwscan(), and the coupling between it and
_passwdcompat_pwscan(), is too hairy (or too hairy for the available
time) of the average NetBSD developer.  Therefore the flaws should be
remedied, if possible. I had thought that went without saying, but
(given off-list comments), apparently it doesn't.  Brian, that's not
in any way aimed at you, and I'm sorry to be so long-winded, but
apparently that all *does* need to be said.

I also agree with Christos' subsequent observation that a deeper
rework is riskier than your approach.  So perhaps your suggested fix
is acutally the best candidate for a pullup to NetBSD-3.

Brian: if you can verify that your fix works, please commit it,
co-ordinate pullups, and let me know when it's safe to replace with a
reworked version.