NetBSD-Bugs archive

bin/50439: rpcbind follies with nis down

>Number:         50439
>Category:       bin
>Synopsis:       rpcbind follies with nis down
>Confidential:   no
>Severity:       critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Nov 17 09:40:00 +0000 2015
>Originator:     David A. Holland
>Release:        NetBSD 7.99.20 (20150727)
System: NetBSD macaran 7.99.20 NetBSD 7.99.20 (MACARAN) #30: Mon Jul 27 20:25:15 EDT 2015  dholland@macaran:/usr/src/sys/arch/amd64/compile/MACARAN amd64
Architecture: x86_64
Machine: amd64
>Description:
	Now that ypbind has been fixed to not explode the world when
	the network goes down, it seems that rpcbind has taken over
	that role.

	When the NIS server goes down, the libc NIS code contacts
	rpcbind, producing this message:

Nov 13 19:00:00 macaran rpcbind: connect from to getport/addr(ypbind)

	Each time this happens it seems to produce another fork of
	rpcbind. In the course of a ~1h30 network outage a couple of
	days ago, process accounting logged 1449403 rpcbind processes
	exiting. This (and/or possibly related phenomena occurring in
	the libc NIS code) was sufficient to run through 12G of RAM
	and swap and then OOM. This took out the X server, of course,
	and thus I don't have as much information as I'd like about
	what actually happened.
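
	For scale, the figures above average out to roughly 268
	rpcbind exits per second over the outage. The arithmetic is
	sketched below; treating "~1h30" as exactly 90 minutes is an
	approximation, and the macro and function names are
	illustrative only.

```c
#include <assert.h>

/* Figures taken from the report above. */
#define RPCBIND_EXITS   1449403L     /* rpcbind exits logged by process accounting */
#define OUTAGE_SECONDS  (90L * 60L)  /* "~1h30" of downtime, assumed to be 90 minutes */

/* Average number of process exits per second over an interval. */
static double
exits_per_second(long exits, long seconds)
{
	assert(seconds > 0);
	return (double)exits / (double)seconds;
}
```

	Calling exits_per_second(RPCBIND_EXITS, OUTAGE_SECONDS) gives
	the average rate; the peak rate may well have been higher.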
>How-To-Repeat:
	Be using NIS; disconnect the network with a lot of stuff
	running.
>Fix:
	rpcbind apparently forks every time it wants to log a message.
	This is silly; it shouldn't need to fork more than once.
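
	A minimal sketch of the fork-once idea follows. It is not
	rpcbind's actual code, and it assumes the reason for forking is
	to keep the daemon from blocking while a message is logged. The
	names logger_start, logger_write and logger_stop and the
	"rpcbind-sketch" syslog tag are hypothetical. One helper process
	is forked up front and every later message is handed to it over
	a pipe.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <syslog.h>
#include <unistd.h>

static int   log_fd  = -1;   /* write end of the pipe to the helper */
static pid_t log_pid = -1;   /* pid of the single logging helper */

/* Fork one long-lived helper that forwards pipe data to syslog. */
static int
logger_start(void)
{
	int p[2];

	if (pipe(p) == -1)
		return -1;
	log_pid = fork();
	if (log_pid == -1) {
		close(p[0]);
		close(p[1]);
		return -1;
	}
	if (log_pid == 0) {
		/* Child: read until the parent closes its end, then exit. */
		char buf[512];
		ssize_t n;

		close(p[1]);
		openlog("rpcbind-sketch", LOG_PID, LOG_DAEMON);
		/*
		 * One read() may return several queued messages at once;
		 * real code would need to frame them (for example, by line).
		 */
		while ((n = read(p[0], buf, sizeof(buf) - 1)) > 0) {
			buf[n] = '\0';
			syslog(LOG_INFO, "%s", buf);
		}
		_exit(0);
	}
	close(p[0]);
	log_fd = p[1];
	return 0;
}

/* Queue one message for the helper; no fork happens here. */
static int
logger_write(const char *msg)
{
	size_t len = strlen(msg);

	assert(log_fd != -1);
	return write(log_fd, msg, len) == (ssize_t)len ? 0 : -1;
}

/* Close the pipe and reap the helper; returns its exit status. */
static int
logger_stop(void)
{
	int status;

	close(log_fd);
	log_fd = -1;
	if (waitpid(log_pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

	With this shape the number of processes created for logging
	stays at one regardless of how many messages are written. A
	real daemon would also have to handle SIGPIPE and a full pipe,
	which the sketch ignores.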

	However, I think the real problem lies in the libc NIS code; I
	think it is probably doing something stupid that leads it to
	blast rpcbind unnecessarily. I had a fair amount of stuff
	running when the network went plop, but not 1.4 million
	processes or even 14,000.

	Unfortunately, nuking NIS from orbit isn't an option.
