tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: In-kernel process exit hooks?



On Sat, 9 Jan 2016, Mateusz Guzik wrote:

While I don't know all the details, it is clear that the purpose would
be much better served by ktrace and I would argue efforts should be
spent there.

filemon's purpose is somewhat different than that of ktrace. There is a limited set of events that filemon captures, and it only captures successful operations. Its output is quite simple and easy to parse by a human of shell script.

bmake takes filemon output and copies it into some file.

Yes. In "meta-mode" it will use the output from filemon to capture dependency information.

From usability perspective what seems to be needed is few hacks to
ktrace to only log the stuff make is interested in and parsing code for
bmake to translate to what looks like current filemon format for
aformentioned copying.

Perhaps. But filemon is also a run-time loadable module (and it is preferred that it NOT be built-in to any kernels). If I remember correctly, ktrace has hooks all over the system and does not readily lend itself to modularization.

I only had a brief look at ktrace implementation. It uses global locks,
so that would have to be worked on, but it should be trivial to at least
make tracees using different output files not compete with each other.

So, what's so fitting about ktrace design: tracing is tied to the target
process, can react to things like changing credentials and reparenting
(fork + parent exit), but most importantly it does not make assumptions
about things which can change without its control (see below).

Filemon monitors stuff at syscall level, stores the fd and remembers
pids.

The fd is no longer being stored... :) But yes, it does still remember pids, so it can determine if an event belongs to a monitored process tree. Again IIRC, ktrace is somewhat more intrusive, in that it manipulates various process flags which can affect behavior of the target (monitored) process.

Let's see how this leads to trouble. Some of these problems were also
present in FreeBSD implementation (and some still are). A bonus
interesting fact is that implementations have diverged, although are
still fundamentally broken.

http://nxr.netbsd.org/xref/src/sys/dev/filemon/filemon.c#filemon_ioctl
       case FILEMON_SET_PID:
               /* Set the monitored process ID - if allowed. */
               mutex_enter(proc_lock);
               tp = proc_find(*((pid_t *) data));

There are no steps taken to keep tp around nor in a certain state (see
below). proc_find only finds the pointer.

We really don't need further access to the struct proc...

               mutex_exit(proc_lock);
               if (tp == NULL ||
                   tp->p_emul != &emul_netbsd) {

tp could have exited and be freed by now. Now that the comparison was
done it could have p_emul changed.

The mutex_exit() has been moved to after the last use of data which is protected by the mutex.

                       error = ESRCH;
                       break;
               }

               error = kauth_authorize_process(curproc->p_cred,
                   KAUTH_PROCESS_CANSEE, tp,
                   KAUTH_ARG(KAUTH_REQ_PROCESS_CANSEE_ENTRY), NULL, NULL);

Now that the check is pased the process could have executed something
setuid.

Which process could have executed something setuid? How would that matter? (I will confess I haven't checked, but I would be surprised if "executing something setuid" would change the result of kauth() check. But I could be wrong here.)

I'm somewhat surprised this code is here. Should not
kauth_authorize_process assert that tp has stable credentials in some
way, in effect just crashing filemon consumers?

               if (!error) {
                       filemon->fm_pid = tp->p_pid;
               }

Why does the target process need stable credentials? It's not doing anything for filemon.

Pid is stored, but tp is not held. What if the 'traced' process exits?
grep only reveals the following bit in filemon_wrapper_sys_exit:

               if (filemon->fm_pid == curproc->p_pid)
                       filemon_printf(filemon, "# Bye bye\n");

So filemon will continue tracing anything which happened to reuse the
pid.

Yes, this is an outstanding issue. More particularly, the new user of the pid might not be accessible by the monitoring process (ie, it would fail the earlier kauth() check). But holding the pointer to the proc isn't really the answer either - although less likely it is entirely possible that the address could have been reused.

(It's too bad we don't have "generation numbers" on the pid, which would easily detect reuse.)

"Tracing points" are at syscall layer. With only pid remembered this
adds up to serious breakage induced by the need to look the filemon structure
up each time, and based on unreliable data.

static struct filemon *
filemon_pid_check(struct proc * p)
{
       struct filemon *filemon;
       struct proc * lp;

       if (!TAILQ_EMPTY(&filemons_inuse)) {
               while (p) {
                       /*
                        * make sure p cannot exit
                        * until we have moved on to p_pptr
                        */
                       rw_enter(&p->p_reflock, RW_READER);
                       TAILQ_FOREACH(filemon, &filemons_inuse, fm_link) {
                               if (p->p_pid == filemon->fm_pid) {
                                       rw_exit(&p->p_reflock);
                                       return (filemon);
                               }
                       }

But p_pid should be constant for 'struct proc' lifetime, so locking this
early is wasteful.

                       lp = p;
                       p = p->p_pptr;
                       rw_exit(&lp->p_reflock);

I guess the lock is needed to read the pointer to the parent.

Correct.

Nothing was done to keep the parent around, which means it could have
exited and be freed by now. Which in turn makes the second loop
iteration a use-after-free.

Hmmm, this code used to work. Looks like the locking got messed up somewhere. It should do (in pseudo code - I'm too lazy to write it all out!)

	lock(p)
	while (p != NULL) {
		TAILQ_FOREACH(filemon) {
			if (p->p_pid == filemon->fm_pid) {
				release(p);
				return filemon;
			}
		}
		last_p = p;
		p = p->pptr;
		if (p != NULL)
			lock(p);
		release(last_p);
	}

We should lock the new (parent) process before releasing the initial lock.

I will fix this.


The first iteration is safe if the argument is curproc (which it is), as
it clearly cannot disappear.

Turns out this traversal is even more wrong since p_pptr does not have
to be the process you are looking for - ptrace is reparenting tracees.

If the process has been reparented, then its activity will no longer be monitored and reported by filemon.

               }
       }
       return (NULL);
}

+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+


Home | Main Index | Thread Index | Old Index