Subject: sys_mount/checkdirs race with exit1 for p_cwdi
To: None <tech-kern@netbsd.org>
From: Darrin B. Jewell <dbj@netbsd.org>
List: tech-kern
Date: 10/04/2004 12:58:57
I've recently been debugging a kernel race condition between the
sys_mount() call and exiting processes in exit1().
Specifically, sys_mount() calls checkdirs(), which walks the allproc
list to find any processes that have the mountpoint as their current
working directory. Unfortunately, exit1() calls cwdfree(p->p_cwdi)
before the process is removed from the allproc list, so checkdirs()
can find a process whose p_cwdi has already been freed.
The following patch makes checkdirs() skip any exiting processes,
but I would like some review before committing.
diff -u -p -r1.211 vfs_syscalls.c
--- src/sys/kern/vfs_syscalls.c 13 Sep 2004 20:02:20 -0000 1.211
+++ src/sys/kern/vfs_syscalls.c 4 Oct 2004 16:50:18 -0000
@@ -400,6 +400,8 @@ checkdirs(olddp)
 		panic("mount: lost mount");
 	proclist_lock_read();
 	LIST_FOREACH(p, &allproc, p_list) {
+		if (p->p_flag & P_WEXIT)
+			continue;
 		cwdi = p->p_cwdi;
 		if (cwdi->cwdi_cdir == olddp) {
 			vrele(cwdi->cwdi_cdir);
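
To make the ordering problem easier to discuss, here is a minimal
single-threaded userland model of it. It is only a sketch and not kernel
code: every model_* name and MODEL_WEXIT are hypothetical stand-ins for
the real structures and for P_WEXIT, the list is a plain singly linked
list rather than allproc, and no locking or real concurrency is
modeled. It also assumes the exit path sets the flag before freeing the
cwd info, which is part of what the questions below are asking about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_WEXIT 0x1		/* stand-in for P_WEXIT; value is arbitrary */

struct model_cwdinfo {
	int cdir;		/* stand-in for the cwdi_cdir vnode pointer */
};

struct model_proc {
	int flag;			/* stand-in for p_flag */
	struct model_cwdinfo *cwdi;	/* stand-in for p_cwdi */
	struct model_proc *next;	/* stand-in for the allproc linkage */
};

/*
 * Model of the exit path: flag the process as exiting, then release its
 * cwd info while the process is still linked on the list.
 */
static void
model_exit(struct model_proc *p)
{
	p->flag |= MODEL_WEXIT;
	free(p->cwdi);
	p->cwdi = NULL;		/* the model clears the pointer so misuse is caught */
}

/*
 * Model of the patched walk: move every live process whose cwd is olddp
 * over to newdp, skipping processes that are exiting. Returns the number
 * of processes that were moved.
 */
static int
model_checkdirs(struct model_proc *head, int olddp, int newdp)
{
	struct model_proc *p;
	int moved = 0;

	for (p = head; p != NULL; p = p->next) {
		if (p->flag & MODEL_WEXIT)
			continue;	/* its cwd info may already be gone */
		assert(p->cwdi != NULL);
		if (p->cwdi->cdir == olddp) {
			p->cwdi->cdir = newdp;
			moved++;
		}
	}
	return moved;
}
```

Without the flag test in model_checkdirs(), the walk would dereference
the released cwd info of any process that has gone through model_exit()
but is still on the list.
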
I have the following questions:
 1. Is there a lock on the p->p_flag field that must be taken
    before checking P_WEXIT?
 2. Is it guaranteed that once a process has P_WEXIT set,
    cwdfree() will be called?
 3. If a process lingers for a long time with P_WEXIT set, will
    there be any problems because its cwd may still hold a reference
    to the vnode underneath a mount point?
 4. Should cwdfree() ensure that the cwdi is invalidated
    before it is released back into the pool? Perhaps only
    when DIAGNOSTIC is on?
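
For question 4, what I have in mind is roughly the following, shown as
a userland sketch and not as a proposed patch. All names here are
hypothetical: MODEL_DIAGNOSTIC stands in for the kernel's DIAGNOSTIC
option, MODEL_POISON is an arbitrary fill byte, and model_pool_put()
stands in for whatever returns the structure to its pool.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_DIAGNOSTIC 1	/* stand-in for the DIAGNOSTIC kernel option */
#define MODEL_POISON 0xdf	/* arbitrary fill byte for released objects */

struct model_cwdinfo {
	const void *cdir;	/* stand-in for cwdi_cdir */
	int refcnt;		/* number of holders of this structure */
	int in_pool;		/* set once the object is back in the pool */
};

/* Stand-in for returning the object to its allocation pool. */
static void
model_pool_put(struct model_cwdinfo *cwdi)
{
	cwdi->in_pool = 1;
}

/*
 * Drop one reference. On the last release, scrub the structure (only
 * when MODEL_DIAGNOSTIC is enabled) before handing it back to the pool,
 * so that a stale pointer to it is more likely to fault or be noticed.
 */
static void
model_cwdfree(struct model_cwdinfo *cwdi)
{
	assert(cwdi->refcnt > 0);
	if (--cwdi->refcnt > 0)
		return;
#if MODEL_DIAGNOSTIC
	memset(cwdi, MODEL_POISON, sizeof(*cwdi));
#endif
	model_pool_put(cwdi);
}

/* Returns 1 if every byte of the cdir field holds the poison value. */
static int
model_is_poisoned(const struct model_cwdinfo *cwdi)
{
	const unsigned char *b = (const unsigned char *)&cwdi->cdir;
	size_t i;

	for (i = 0; i < sizeof(cwdi->cdir); i++)
		if (b[i] != MODEL_POISON)
			return 0;
	return 1;
}
```

This would not close the race by itself; it would only make a stale
p_cwdi easier to spot.
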
Anything else?
Thanks,
Darrin