tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

pg_jobc going negative?



pg_jobc is a process group struct member counting the number of
processes with a parent capable of doing job control. Once reaching 0,
the process group is orphaned.

With the addition of asserts checking for pg_jobc > 0 (by maxv@), we
caught issues that pg_jobc can go negative. I have landed new ATF tests
that trigger this reliably with ptrace(2)
(src/tests/lib/libc/sys/t_ptrace_fork_wait.h r.1.7). The problem was
originally triggered with GDB.

There are also other sources of these asserts due to races
The ptrace(2) ATF tests are reliable triggering a negative pg_jobc,
however there are racy tests doing the same as triggered by syzkaller
(mentioned at least in [1]).

Should we allow pg_jobc going negative? (Other BSDs allow this.)

Is going negative in the first place a real bug?

Are the races triggered by syzkaller real bugs?

[1] http://mail-index.netbsd.org/current-users/2020/05/04/msg038511.html

On 20.04.2020 18:32, Maxime Villard wrote:
> Module Name:	src
> Committed By:	maxv
> Date:		Mon Apr 20 16:32:03 UTC 2020
> 
> Modified Files:
> 	src/sys/kern: kern_proc.c
> 
> Log Message:
> Add three KASSERTs, to detect refcount bugs.
> 
> This narrows down an unknown bug in some place near, that has manifested
> itself in various forms (use-after-frees, uninit accesses, page faults,
> segmentation faults), all pointed out by syzbot.
> 
> The first KASSERT in fixjobc() fires when the bug is encountered.
> 
> 
> To generate a diff of this commit:
> cvs rdiff -u -r1.244 -r1.245 src/sys/kern/kern_proc.c
> 
> Please note that diffs are not public domain; they are subject to the
> copyright notices on the relevant files.
> 
> 
> Modified files:
> 
> Index: src/sys/kern/kern_proc.c
> diff -u src/sys/kern/kern_proc.c:1.244 src/sys/kern/kern_proc.c:1.245
> --- src/sys/kern/kern_proc.c:1.244	Sun Apr 19 20:31:59 2020
> +++ src/sys/kern/kern_proc.c	Mon Apr 20 16:32:03 2020
> @@ -1,4 +1,4 @@
> -/*	$NetBSD: kern_proc.c,v 1.244 2020/04/19 20:31:59 thorpej Exp $	*/
> +/*	$NetBSD: kern_proc.c,v 1.245 2020/04/20 16:32:03 maxv Exp $	*/
>  
>  /*-
>   * Copyright (c) 1999, 2006, 2007, 2008, 2020 The NetBSD Foundation, Inc.
> @@ -62,7 +62,7 @@
>   */
>  
>  #include <sys/cdefs.h>
> -__KERNEL_RCSID(0, "$NetBSD: kern_proc.c,v 1.244 2020/04/19 20:31:59 thorpej Exp $");
> +__KERNEL_RCSID(0, "$NetBSD: kern_proc.c,v 1.245 2020/04/20 16:32:03 maxv Exp $");
>  
>  #ifdef _KERNEL_OPT
>  #include "opt_kstack.h"
> @@ -554,6 +554,7 @@ proc_sessrele(struct session *ss)
>  {
>  
>  	KASSERT(mutex_owned(proc_lock));
> +	KASSERT(ss->s_count > 0);
>  	/*
>  	 * We keep the pgrp with the same id as the session in order to
>  	 * stop a process being given the same pid.  Since the pgrp holds
> @@ -1181,8 +1182,11 @@ fixjobc(struct proc *p, struct pgrp *pgr
>  		if (entering) {
>  			pgrp->pg_jobc++;
>  			p->p_lflag &= ~PL_ORPHANPG;
> -		} else if (--pgrp->pg_jobc == 0)
> -			orphanpg(pgrp);
> +		} else {
> +			KASSERT(pgrp->pg_jobc > 0);
> +			if (--pgrp->pg_jobc == 0)
> +				orphanpg(pgrp);
> +		}
>  	}
>  
>  	/*
> @@ -1197,8 +1201,11 @@ fixjobc(struct proc *p, struct pgrp *pgr
>  			if (entering) {
>  				child->p_lflag &= ~PL_ORPHANPG;
>  				hispgrp->pg_jobc++;
> -			} else if (--hispgrp->pg_jobc == 0)
> -				orphanpg(hispgrp);
> +			} else {
> +				KASSERT(hispgrp->pg_jobc > 0);
> +				if (--hispgrp->pg_jobc == 0)
> +					orphanpg(hispgrp);
> +			}
>  		}
>  	}
>  }
> 


Attachment: signature.asc
Description: OpenPGP digital signature



Home | Main Index | Thread Index | Old Index