NetBSD-Bugs archive
Re: kern/53096 (netbsd-8 crash on heavy disk I/O)
The following reply was made to PR kern/53096; it has been noted by GNATS.
From: Roy Bixler <rcbixler%nyx.net@localhost>
To: gnats-bugs%NetBSD.org@localhost
Cc:
Subject: Re: kern/53096 (netbsd-8 crash on heavy disk I/O)
Date: Fri, 6 Apr 2018 08:25:20 -0600
On Fri, Apr 06, 2018 at 10:20:01AM +0000, J. Hannken-Illjes wrote:
> The following reply was made to PR kern/53096; it has been noted by GNATS.
>
> From: "J. Hannken-Illjes" <hannken%eis.cs.tu-bs.de@localhost>
> To: gnats-bugs%NetBSD.org@localhost
> Cc:
> Subject: Re: kern/53096 (netbsd-8 crash on heavy disk I/O)
> Date: Fri, 6 Apr 2018 12:15:22 +0200
>
> From a recent crash, it looks like a race condition in procfs_dir().
>
> One proc is running stat or readlink on /proc/XXX/cwd while proc XXX
> changes its working directory. As procfs_dir() takes the cwdi_cdir
> without locking the cwdi, the vnode may be vrele'd before getcwd_common()
> adds a reference, resulting in the panics observed for this PR.
>
> Please apply this patch; it should fix the race.
I applied the patch to netbsd-8 sources pulled yesterday evening, and
I'm running the GDB-enabled kernel now. So far, I haven't seen any
crashes from "rm -r" on a big source tree or from "cvs" checkouts,
but time will tell.
> diff -r eeeea698d964 -r 18ca5463ad55 sys/miscfs/procfs/procfs_vnops.c
> --- sys/miscfs/procfs/procfs_vnops.c
> +++ sys/miscfs/procfs/procfs_vnops.c
> @@ -545,11 +545,10 @@ procfs_symlink(void *v)
> }
>
> /*
> - * Works out the path to (and vnode of) the target process's current
> + * Works out the path to the target process's current
> * working directory or chroot. If the caller is in a chroot and
> * can't "reach" the target's cwd or root (or some other error
> - * occurs), a "/" is returned for the path and a NULL pointer is
> - * returned for the vnode.
> + * occurs), a "/" is returned for the path.
> */
> static void
> procfs_dir(pfstype t, struct lwp *caller, struct proc *target, char **bpp,
> @@ -559,12 +558,12 @@ procfs_dir(pfstype t, struct lwp *caller
> struct vnode *vp, *rvp;
> char *bp;
>
> - cwdi = caller->l_proc->p_cwdi;
> - rw_enter(&cwdi->cwdi_lock, RW_READER);
> -
> - rvp = cwdi->cwdi_rdir;
> - bp = bpp ? *bpp : NULL;
> -
> + /*
> + * Lock target cwdi and take a reference to the vnode
> + * we are interested in to prevent it from disappearing
> + * before getcwd_common() below.
> + */
> + rw_enter(&target->p_cwdi->cwdi_lock, RW_READER);
> switch (t) {
> case PFScwd:
> vp = target->p_cwdi->cwdi_cdir;
> @@ -573,9 +572,17 @@ procfs_dir(pfstype t, struct lwp *caller
> vp = target->p_cwdi->cwdi_rdir;
> break;
> default:
> - rw_exit(&cwdi->cwdi_lock);
> + rw_exit(&target->p_cwdi->cwdi_lock);
> return;
> }
> + vref(vp);
> + rw_exit(&target->p_cwdi->cwdi_lock);
> +
> + cwdi = caller->l_proc->p_cwdi;
> + rw_enter(&cwdi->cwdi_lock, RW_READER);
> +
> + rvp = cwdi->cwdi_rdir;
> + bp = bpp ? *bpp : NULL;
>
> /*
> * XXX: this horrible kludge avoids locking panics when
> @@ -586,6 +593,7 @@ procfs_dir(pfstype t, struct lwp *caller
> *--bp = '/';
> *bpp = bp;
> }
> + vrele(vp);
> rw_exit(&cwdi->cwdi_lock);
> return;
> }
> @@ -594,7 +602,6 @@ procfs_dir(pfstype t, struct lwp *caller
> rvp = rootvnode;
> if (vp == NULL || getcwd_common(vp, rvp, bp ? &bp : NULL, path,
> len / 2, 0, caller) != 0) {
> - vp = NULL;
> if (bpp) {
> bp = *bpp;
> *--bp = '/';
> @@ -604,6 +611,8 @@ procfs_dir(pfstype t, struct lwp *caller
> if (bpp)
> *bpp = bp;
>
> + if (vp != NULL)
> + vrele(vp);
> rw_exit(&cwdi->cwdi_lock);
> }
>
>
--
Roy Bixler <rcbixler%nyx.net@localhost>
"The fundamental principle of science, the definition almost, is this: the
sole test of the validity of any idea is experiment."
-- Richard P. Feynman