Re: In-kernel process exit hooks?

To: Mateusz Guzik <mjguzik%gmail.com@localhost>
Subject: Re: In-kernel process exit hooks?
From: Paul Goyette <paul%vps1.whooppee.com@localhost>
Date: Sat, 9 Jan 2016 14:25:02 +0800 (PHT)

On Sat, 9 Jan 2016, Mateusz Guzik wrote:

While I don't know all the details, it is clear that the purpose would
be much better served by ktrace and I would argue efforts should be
spent there.

filemon's purpose is somewhat different than that of ktrace. There is alimited set of events that filemon captures, and it only capturessuccessful operations. Its output is quite simple and easy to parse bya human of shell script.

bmake takes filemon output and copies it into some file.

Yes. In "meta-mode" it will use the output from filemon to capturedependency information.

From usability perspective what seems to be needed is few hacks to

ktrace to only log the stuff make is interested in and parsing code for
bmake to translate to what looks like current filemon format for
aformentioned copying.

Perhaps. But filemon is also a run-time loadable module (and it ispreferred that it NOT be built-in to any kernels). If I remembercorrectly, ktrace has hooks all over the system and does not readilylend itself to modularization.

I only had a brief look at ktrace implementation. It uses global locks,
so that would have to be worked on, but it should be trivial to at least
make tracees using different output files not compete with each other.

So, what's so fitting about ktrace design: tracing is tied to the target
process, can react to things like changing credentials and reparenting
(fork + parent exit), but most importantly it does not make assumptions
about things which can change without its control (see below).

Filemon monitors stuff at syscall level, stores the fd and remembers
pids.

The fd is no longer being stored... :) But yes, it does still rememberpids, so it can determine if an event belongs to a monitored processtree. Again IIRC, ktrace is somewhat more intrusive, in that itmanipulates various process flags which can affect behavior of thetarget (monitored) process.

Let's see how this leads to trouble. Some of these problems were also
present in FreeBSD implementation (and some still are). A bonus
interesting fact is that implementations have diverged, although are
still fundamentally broken.

http://nxr.netbsd.org/xref/src/sys/dev/filemon/filemon.c#filemon_ioctl

       case FILEMON_SET_PID:
               /* Set the monitored process ID - if allowed. */
               mutex_enter(proc_lock);
               tp = proc_find(*((pid_t *) data));


There are no steps taken to keep tp around nor in a certain state (see
below). proc_find only finds the pointer.


We really don't need further access to the struct proc...

               mutex_exit(proc_lock);
               if (tp == NULL ||
                   tp->p_emul != &emul_netbsd) {


tp could have exited and be freed by now. Now that the comparison was
done it could have p_emul changed.

The mutex_exit() has been moved to after the last use of data which isprotected by the mutex.

                       error = ESRCH;
                       break;
               }

               error = kauth_authorize_process(curproc->p_cred,
                   KAUTH_PROCESS_CANSEE, tp,
                   KAUTH_ARG(KAUTH_REQ_PROCESS_CANSEE_ENTRY), NULL, NULL);


Now that the check is pased the process could have executed something
setuid.

Which process could have executed something setuid? How would thatmatter? (I will confess I haven't checked, but I would be surprised if"executing something setuid" would change the result of kauth() check.But I could be wrong here.)

I'm somewhat surprised this code is here. Should not
kauth_authorize_process assert that tp has stable credentials in some
way, in effect just crashing filemon consumers?

               if (!error) {
                       filemon->fm_pid = tp->p_pid;
               }

Why does the target process need stable credentials? It's not doinganything for filemon.

Pid is stored, but tp is not held. What if the 'traced' process exits?
grep only reveals the following bit in filemon_wrapper_sys_exit:

               if (filemon->fm_pid == curproc->p_pid)
                       filemon_printf(filemon, "# Bye bye\n");


So filemon will continue tracing anything which happened to reuse the
pid.

Yes, this is an outstanding issue. More particularly, the new user ofthe pid might not be accessible by the monitoring process (ie, it wouldfail the earlier kauth() check). But holding the pointer to the procisn't really the answer either - although less likely it is entirelypossible that the address could have been reused.

(It's too bad we don't have "generation numbers" on the pid, which wouldeasily detect reuse.)

"Tracing points" are at syscall layer. With only pid remembered this
adds up to serious breakage induced by the need to look the filemon structure
up each time, and based on unreliable data.

static struct filemon *
filemon_pid_check(struct proc * p)
{
       struct filemon *filemon;
       struct proc * lp;

       if (!TAILQ_EMPTY(&filemons_inuse)) {
               while (p) {
                       /*
                        * make sure p cannot exit
                        * until we have moved on to p_pptr
                        */
                       rw_enter(&p->p_reflock, RW_READER);
                       TAILQ_FOREACH(filemon, &filemons_inuse, fm_link) {
                               if (p->p_pid == filemon->fm_pid) {
                                       rw_exit(&p->p_reflock);
                                       return (filemon);
                               }
                       }


But p_pid should be constant for 'struct proc' lifetime, so locking this
early is wasteful.

                       lp = p;
                       p = p->p_pptr;
                       rw_exit(&lp->p_reflock);


I guess the lock is needed to read the pointer to the parent.


Correct.

Nothing was done to keep the parent around, which means it could have
exited and be freed by now. Which in turn makes the second loop
iteration a use-after-free.

Hmmm, this code used to work. Looks like the locking got messed upsomewhere. It should do (in pseudo code - I'm too lazy to write it allout!)


	lock(p)
	while (p != NULL) {
		TAILQ_FOREACH(filemon) {
			if (p->p_pid == filemon->fm_pid) {
				release(p);
				return filemon;
			}
		}
		last_p = p;
		p = p->pptr;
		if (p != NULL)
			lock(p);
		release(last_p);
	}

We should lock the new (parent) process before releasing the initiallock.


I will fix this.


The first iteration is safe if the argument is curproc (which it is), as
it clearly cannot disappear.

Turns out this traversal is even more wrong since p_pptr does not have
to be the process you are looking for - ptrace is reparenting tracees.

If the process has been reparented, then its activity will no longer bemonitored and reported by filemon.

               }
       }
       return (NULL);
}


+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+

Follow-Ups:
- Re: In-kernel process exit hooks?
  - From: Mateusz Guzik

References:
- In-kernel process exit hooks?
  - From: Paul Goyette
- Re: In-kernel process exit hooks?
  - From: David Holland
- Re: In-kernel process exit hooks?
  - From: Mateusz Guzik

Prev by Date: Re: In-kernel process exit hooks?
Next by Date: Re: In-kernel process exit hooks?
Previous by Thread: Re: In-kernel process exit hooks?
Next by Thread: Re: In-kernel process exit hooks?
Indexes:

Home | Main Index | Thread Index | Old Index