NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

kern/50298: Questionable code in kern_exec.c



>Number:         50298
>Category:       kern
>Synopsis:       Questionable code in kern_exec.c
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Oct 04 13:25:00 +0000 2015
>Originator:     Robert Elz
>Release:        NetBSD 7.99.21 (current as of 2015-10-01)
>Organization:
	Prince of Songkla University
>Environment:
System: NetBSD andromeda.noi.kre.to 7.99.21 NetBSD 7.99.21 (VBOX64-1.1-20150829) #3: Sun Aug 30 07:16:17 ICT 2015 kre%andromeda.noi.kre.to@localhost:/home/kre/src/current-kernel/usr/src/sys/arch/amd64/compile/VBOX64 amd64
Architecture: x86_64
Machine: amd64
>Description:
	While looking into Paul Goyette's unkillable zombie
	problem (and p_nstopchild in particular), I came across
	the folowing code in kern_exec.c (in execve_runproc()) ...

        if (p->p_sflag & PS_STOPEXEC) {
                ksiginfoq_t kq;

                KERNEL_UNLOCK_ALL(l, &l->l_biglocks);
                p->p_pptr->p_nstopchild++;
                p->p_pptr->p_waited = 0;
		/* .... */

	I doubt this is in any way related to the problem
	under investigation, but it looks obviously wrong.

	The p_waited that is cleared should be p->p_waited
	not the flag in the parent process.

	That is, p_nstopchild (in the parent) is a count of
	its child processes on which a wait() should succeed.
	A child that is stopped (rather than dead) as is the
	case here, should be wait()'d for exactly once.
	p_waited (in the child) determines whether that
	has happened or not, for the qhile in question
	(when p_waited is set to 1, p->p_pptr->p_nstopchild
	gets decremented by one).

	Clearing the parent's p_waited flag (without possibly
	adjusting its parent's p_nstopchild) is wrong.

	Incrementing our parent's p_nstopchild without
	verifying (or forcing) our p_waited to 0 is also
	wrong (if p->p_waited != 0) then this process can
	never be waited for.

>How-To-Repeat:
	By inspection.   I'm not aware of any way to actually
	force a problem to occur here (which doesn't mean there
	isn't one).

>Fix:
	s/p->p_pptr->p_waited/p->p_waited/

	(probably).

	If someone can explain why the code as it is now is
	actually correct, please just close this PR.



Home | Main Index | Thread Index | Old Index