Subject: Re: 3.0 YP lookup latency
To: None <>
From: Christos Zoulas <>
List: tech-net
Date: 06/21/2006 12:52:49
On Jun 21,  9:42am, ( wrote:
-- Subject: Re: 3.0 YP lookup latency

| In message <>Christos Zoulas writes
| >On Jun 20,  2:48pm, (Stephen Jones) wrote:
| >-- Subject: Re: 3.0 YP lookup latency
| >
| >| Okay, this daily build (netbsd-3-0/200605220000Z) is displaying the  
| >| symptoms.
| >| It doesn't appear to be in rpcbind or ypbind itself, but an  
| >| associated library.
| >| It seems when the query is made the ypserv process on the server  
| >| will log each
| >| user id in the password file when the -l flag is on.  Interestingly  
| >| the userids are
| >| sorted in alphabetical order .. any reason for this? ;-)
| >
| >My guess is that it is an artifact of the passwd file being hashed into
| >a db file. I guess I'll have to set up a yp domain myself and test.
| hi Christos,
| I don't buy your guess. As Stephen clarified (after the message to
| which you replied):
| On Stephen's NetBSD-2.1 NIS client hosts, the client is issuing a
| single yp_match() call to the server.  But on Stephen's 3.0 clients,
| running the same userland tool (which, barring explicit size_t casts,
| hasn't changed between NetBSD-2 and NetBSD-3), the client iterates over
| Stephen's entire 27,000-entry NIS passwd.byname map, via yp_first()/yp_next().
| That's where the tens of seconds bites: not one individual RPC call
| but the 27,000-odd yp_next() calls.
| I'm pretty sure the bug is in the client NIS library. If you look at
| 	lib/libc/gen/getpwent.c
| you will notice that file changed radically between (CVS branches)
| netbsd-2 and netbsd-3.  getpwent.c only calls yp_match() or
| yp_first()/yp_next() in a couple of places.  Late last night I emailed
| you and Stephen and Soda-san a walk through the relevant code-paths. I
| think I've identified the problem, and suggested a workaround: don't
| use the supplied default nsswitch.conf
| 	passwd:  compat
| line, but instead use
| 	passwd: nis [notfound=return] files
| which avoids the compat_ parsing routine where I'm pretty sure this bug
| resides.  I also suggested a fix (assuming the workaround does fix
| Stephen's problem).
| In message Message-Id: <>,
| Christos Zoulas continued:
| >Sure I would be happy to work with you to resolve this. The first thing
| >to do is to ktrace both the server and the client process and then do
| >a kdump -R to see between which 2 system calls we have the most delay.
| I understand why you ask (I initially asked for a libpcap trace).
| But, given Stephen's observation about his NIS-server logs I don't
| think either one would help.  yp_match() and yp_first()/yp_next() are
| libc functions, not system calls. So a ktrace would show the
| read()/write() calls for the 27,000-odd yp_next() calls which we know
| (from Stephen's server-side logs) the NetBSD-3.0 NIS client is issuing.

Yes, it should show the difference between the # of calls in 3.x and
the number of calls in 2.x. But I am convinced that the reason is what
you mentioned; it should not iterate through the whole map.
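
To make the cost difference concrete, here is a minimal sketch in C. It is a simulation, not the real ypclnt API and not the code in getpwent.c: mock_match(), find_by_walk(), struct mock_map and the "userNNNNN" key scheme are all hypothetical names made up for illustration. It only counts how many simulated round trips a keyed lookup (in the spirit of yp_match()) needs compared with a walk over a 27,000-entry map (in the spirit of yp_first()/yp_next()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAP_SIZE 27000   /* roughly the size of the map discussed above */
#define KEY_LEN  16

/* Hypothetical stand-in for a NIS map; only counts simulated round trips. */
struct mock_map {
    size_t rpc_calls;
};

/* Assumed key scheme: entry i has the key "user" plus i zero-padded to 5 digits. */
static void entry_key(size_t i, char *buf, size_t len)
{
    assert(i < MAP_SIZE);
    snprintf(buf, len, "user%05zu", i);
}

/* Keyed lookup in the spirit of yp_match(): one round trip, found or not. */
static int mock_match(struct mock_map *m, const char *key)
{
    unsigned long i;
    char buf[KEY_LEN];

    m->rpc_calls++;
    if (sscanf(key, "user%lu", &i) != 1 || i >= MAP_SIZE)
        return 0;
    entry_key(i, buf, sizeof(buf));
    return strcmp(buf, key) == 0;
}

/* Walk in the spirit of yp_first()/yp_next(): one round trip per entry visited. */
static int find_by_walk(struct mock_map *m, const char *key)
{
    char buf[KEY_LEN];
    size_t i;

    for (i = 0; i < MAP_SIZE; i++) {
        m->rpc_calls++;
        entry_key(i, buf, sizeof(buf));
        if (strcmp(buf, key) == 0)
            return 1;
    }
    return 0;
}
```

Calling mock_match() on a fresh struct mock_map always costs one simulated call. Calling find_by_walk() for the last key, or for a key that is not in the map, costs MAP_SIZE calls. That ratio is what the server-side log shows, and a ktrace of the client should show the same thing as a large number of read()/write() pairs.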