Re: lib/46111: yplib will hang forever if no server can be found

The following reply was made to PR lib/46111; it has been noted by GNATS.

From: (Christos Zoulas)
To: Wolfgang Stukenbrock <>,
Subject: Re: lib/46111: yplib will hang forever if no server can be found
Date: Wed, 29 Feb 2012 11:14:56 -0500

 On Feb 29,  4:35pm, (Wolfgang Stukenbrock) wrote:
 -- Subject: Re: lib/46111: yplib will hang forever if no server can be found
 | Hi,
 | I'm not sure if a function is a good idea, because there is already a
 | global variable _yplib_nerrs that is set to 5 by default.
 | As far as I can tell, _yplib_nerrs is used in some files in
 | src/lib/libc/yp and is not documented in any header file or in the
 | manual. It seems to be a global libc-internal variable.
 | A disadvantage of a function is the missing ability to query the current 
 | state.
 | So I would prefer a variable, but I can implement a function too, if you 
 | prefer this.
 make it return the current state then, and ignore the setting if the setting
 is negative.
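
 A minimal sketch of what such a function might look like. The names
 yp_max_retries and yp_set_max_retries() are hypothetical and not part of
 libc, and the meaning of 0 as "retry forever" is an assumption:

```c
/* Hypothetical retry limit; 0 is assumed to mean "retry forever". */
static int yp_max_retries = 0;

/*
 * Hypothetical setter that doubles as a query.
 * Returns the value that was in effect before the call.
 * A negative argument is ignored, so yp_set_max_retries(-1)
 * only reports the current state without changing it.
 */
int
yp_set_max_retries(int n)
{
	int old = yp_max_retries;

	if (n >= 0)
		yp_max_retries = n;
	return old;
}
```

 With this shape a caller can save the old value, set a limit for one
 lookup, and restore the old value afterwards.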
 | Currently, after _yplib_nerrs retries the message "YP server
 | for ... still trying" is printed, but only if the domain is already
 | found in the _ypbindlist. No printing is done for the first contact with
 | ypbind, unless I've overlooked something.
 | It looks like someone has changed the original Sun behaviour in the
 | past, which, as far as I remember, printed this message even for the
 | first connection attempt.
 | Remark: an entry is added to the _ypbindlist after successfully
 | contacting ypbind, or if a binding file is present and valid.
 | If there is a binding file present for the domain, and the file locking 
 | fails, YPERR_YPBIND is returned without contacting ypbind.
 | When contacting ypbind, YPERR_YPBIND is returned too if clnttcp_create() 
 | fails.
 | These two situations should be very rare, but even now the library
 | does not wait for a server to become available in either case.
 | In e.g. yp_first and yp_next the variable _yplib_nerrs is also used to
 | print a message if the binding has succeeded but the clnt_call() has
 | failed the configured number of times. This too will loop endlessly
 | if the binding succeeds every time.
 | Should this behaviour be changed too? If the binding fails, YPERR_DOMAIN 
 | is returned.
 | I agree not to change the default behaviour of the lib, but to allow
 | changing it under program control. At least I currently need the
 | ability to catch the problem with ypbind, fail my request, and
 | continue to work.
 | My "current" patch will return YPERR_YPBIND only if there is no
 | previously set up entry in the _ypbindlist.
 | Should I also terminate the processing after <n> retries if there is an
 | entry present in _ypbindlist? I think yes.
 | In that case, should the "still trying" message be written to stderr, or
 | should the lib be silent? I think no printing should be done.
 | What about the following solution:
 | I reuse _yplib_nerrs for the new functionality.
 |    _yplib_nerrs > 0 - current behaviour - wait endlessly and print a
 | message every _yplib_nerrs retries. The default value will be 5 as before.
 |    _yplib_nerrs == 0 - wait endlessly without printing to stderr
 |    (this is something that would already happen now when _yplib_nerrs
 | gets set to 0, because the print check starts with 1 and the first
 | printout will happen on the integer wrap ...)
 |    _yplib_nerrs < 0 - number of retries (as negative count) with ypbind,
 | after which YPERR_YPBIND is returned. The lib will return regardless
 | of whether the domain name has been queried before or not.
 | I will add it to the rpcsvc/ypclnt.h header file as an extern declaration
 | and add it to the ypclnt(3) manual.
 | This implies disabling the printing in the other routines (yp_first etc.)
 | too for non-positive retry numbers.
 | Is this OK?
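
 For reference, the three cases proposed above could be sketched as a
 single decision function. This is only an illustration of the proposal,
 not code from libc: retry_decision(), the enum, and the 1-based attempt
 counter are hypothetical, and the exact point at which the negative case
 gives up is an assumption.

```c
/* Hypothetical outcome of one failed attempt to reach ypbind. */
enum retry_action { RETRY_SILENT, RETRY_AND_PRINT, GIVE_UP };

/*
 * nerrs:   value of the overloaded _yplib_nerrs as proposed above
 * attempt: number of failed attempts so far, counted from 1
 */
enum retry_action
retry_decision(int nerrs, int attempt)
{
	if (nerrs > 0)	/* wait forever, print every nerrs retries */
		return (attempt % nerrs == 0) ? RETRY_AND_PRINT : RETRY_SILENT;
	if (nerrs == 0)	/* wait forever, never print */
		return RETRY_SILENT;
	/* nerrs < 0: caller returns YPERR_YPBIND after -nerrs attempts */
	return (attempt + nerrs >= 0) ? GIVE_UP : RETRY_SILENT;
}
```
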
 I think it is cleaner not to overload the variable this way and use a new one.
 Altering the behavior of internal variables that are present in most reference
 implementations is not a good idea, because if you take the code from NetBSD
 to another OS it will not behave as expected. With the new function at least
 you get a link error that tells you that you need to do something else.

