tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: Anomalies while handling p_nstopchild count



The attached patch should handle this.  It takes out the proc_lock and
then updates the process's p_stat, p_waited, and parent's p_nstopchild
with the lock held.  It then releases the lock.

In the normal exit case, it reacquires the lock and restores p_stat to
its previous value (which almost certainly is SACTIVE, since that's
the value that do_posix_spawn() set, before dispatching the new lwd
for scheduling).

In the several error cases, we go to a new error-exit code where we
again reacquire proc_lock and restore p_stat and p_nstopchild before
continuing with the rest of the error path.

Note that this change has the effect of restoring p_stat in the error
branch.  This should not be an issue, since the error code will very
soon call exit1() where the process will be properly disposed.

Note that I have not tested this patch yet, not even compile-test. I
would appreciate additional eyes to review this.

:)




On Sun, 11 Oct 2015, Martin Husemann wrote:

On Sun, Oct 11, 2015 at 04:39:55AM +0700, Robert Elz wrote:
This in spawn_return() [kern_exec.c]

There the status of the process is set to SSTOP without incrementing
the parent's p_nstopchild, violating the definition of that field.
But then it is (fairly soon after) set back again .. if no error occurred.

If there was an error, spawn_return() calls exit1(), which (eventually)
sets the process state to SDEAD and increments the parent's p_nstopchild
(regardless of the state of the process when exit1() was called).
(Before SDEAD the state is set to SDYING, where p_nstopchild should not
count it.)

Hence if p_nstopchild had been incremented in spawn_return() and not
decremented again (as the state is left at SSTOP in the error case),
then exit1() would cause the process to be counted twice.

We should still fix this error path!

Martin


+------------------+--------------------------+-------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org  |
+------------------+--------------------------+-------------------------+
Index: kern_exec.c
===================================================================
RCS file: /cvsroot/src/sys/kern/kern_exec.c,v
retrieving revision 1.418
diff -u -p -r1.418 kern_exec.c
--- kern_exec.c	2 Oct 2015 16:54:15 -0000	1.418
+++ kern_exec.c	11 Oct 2015 09:26:04 -0000
@@ -1983,6 +1983,7 @@ spawn_return(void *arg)
 	struct spawn_exec_data *spawn_data = arg;
 	struct lwp *l = curlwp;
 	int error, newfd;
+	int ostat;
 	size_t i;
 	const struct posix_spawn_file_actions_entry *fae;
 	pid_t ppid;
@@ -2055,7 +2056,6 @@ spawn_return(void *arg)
 
 	/* handle posix_spawnattr */
 	if (spawn_data->sed_attrs != NULL) {
-		int ostat;
 		struct sigaction sigact;
 		sigact._sa_u._sa_handler = SIG_DFL;
 		sigact.sa_flags = 0;
@@ -2064,8 +2064,12 @@ spawn_return(void *arg)
 		 * set state to SSTOP so that this proc can be found by pid.
 		 * see proc_enterprp, do_sched_setparam below
 		 */
+		mutex_enter(proc_lock);
 		ostat = l->l_proc->p_stat;
 		l->l_proc->p_stat = SSTOP;
+		l->l_proc_p_waited = 0;
+		l->l_proc->p_pptr->p_nstopchild++
+		mutex_exit(proc_lock);
 
 		/* Set process group */
 		if (spawn_data->sed_attrs->sa_flags & POSIX_SPAWN_SETPGROUP) {
@@ -2078,7 +2082,7 @@ spawn_return(void *arg)
 			error = proc_enterpgrp(spawn_data->sed_parent,
 			    mypid, pgrp, false);
 			if (error)
-				goto report_error;
+				goto report_error_stopped;
 		}
 
 		/* Set scheduler policy */
@@ -2092,7 +2096,7 @@ spawn_return(void *arg)
 			    SCHED_NONE, &spawn_data->sed_attrs->sa_schedparam);
 		}
 		if (error)
-			goto report_error;
+			goto report_error_stopped;
 
 		/* Reset user ID's */
 		if (spawn_data->sed_attrs->sa_flags & POSIX_SPAWN_RESETIDS) {
@@ -2100,12 +2104,12 @@ spawn_return(void *arg)
 			     kauth_cred_getgid(l->l_cred), -1,
 			     ID_E_EQ_R | ID_E_EQ_S);
 			if (error)
-				goto report_error;
+				goto report_error_stopped;
 			error = do_setresuid(l, -1,
 			    kauth_cred_getuid(l->l_cred), -1,
 			    ID_E_EQ_R | ID_E_EQ_S);
 			if (error)
-				goto report_error;
+				goto report_error_stopped;
 		}
 
 		/* Set signal masks/defaults */
@@ -2115,7 +2119,7 @@ spawn_return(void *arg)
 			    &spawn_data->sed_attrs->sa_sigmask, NULL);
 			mutex_exit(l->l_proc->p_lock);
 			if (error)
-				goto report_error;
+				goto report_error_stopped;
 		}
 
 		if (spawn_data->sed_attrs->sa_flags & POSIX_SPAWN_SETSIGDEF) {
@@ -2136,7 +2140,10 @@ spawn_return(void *arg)
 					    0);
 			}
 		}
+		mutex_enter(proc_lock);
 		l->l_proc->p_stat = ostat;
+		l->l_proc->p_pptr->p_nstopchild--
+		mutex_exit(proc_lock);
 	}
 
 	/* now do the real exec */
@@ -2163,6 +2170,11 @@ spawn_return(void *arg)
 	/* NOTREACHED */
 	return;
 
+report_error_stopped:
+	mutex_enter(proc_lock);
+	l->l_proc->p_stat = ostat;
+	l->l_proc->p_pptr->p_nstopchild--
+	mutex_exit(proc_lock);
  report_error:
 	if (have_reflock) {
 		/*


Home | Main Index | Thread Index | Old Index