Source-Changes archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

CVS commit: src/sys



Module Name:    src
Committed By:   thorpej
Date:           Sun Oct 10 18:07:52 UTC 2021

Modified Files:
        src/sys/kern: kern_event.c kern_exec.c kern_exit.c kern_fork.c
        src/sys/sys: event.h eventvar.h proc.h

Log Message:
Changes to make EVFILT_PROC MP-safe:

Because the locking protocol around processes is somewhat complex
compared to other events that can be posted on kqueues, introduce
new functions for posting NOTE_EXEC, NOTE_EXIT, and NOTE_FORK,
rather than just using the generic knote() function.  These functions
KASSERT() their locking expectations, and deal with other complexities
for each situation.

knote_proc_fork(), in particiular, needs to handle NOTE_TRACK, which
requires allocation of a new knote to attach to the child process.  We
don't want to be allocating memory while holding the parent's p_lock.
Furthermore, we also have to attach the tracking note to the child
process, which means we have to acquire the child's p_lock.

So, to handle all this, we introduce some additional synchronization
infrastructure around the 'knote' structure:

- Add the ability to mark a knote as being in a state of flux.  Knotes
  in this state are guaranteed not to be detached/deleted, thus allowing
  a code path drop other locks after putting a knote in this state.

- Code paths that wish to detach/delete a knote must first check if the
  knote is in-flux.  If so, they must wait for it to quiesce.  Because
  multiple threads of execution may attempt this concurrently, a mechanism
  exists for a single LWP to claim the detach responsibility; all other
  threads simply wait for the knote to disappear before they can make
  further progress.

- When kqueue_scan() encounters an in-flux knote, it simply treats the
  situation just like encountering another thread's queue marker -- wait
  for the flux to settle and continue on.

(The "in-flux knote" idea was inspired by FreeBSD, but this works differently
from their implementation, as the two kqueue implementations have diverged
quite a bit.)

knote_proc_fork() uses this infrastructure to implement NOTE_TRACK like so:

- Attempt to put the original tracking knote into a state of flux; if this
  fails (because the note has a detach pending), we skip all processing
  (the original process has lost interest, and we simply won the race).

- Once the note is in-flux, drop the kq and forking process's locks, and
  allocate 2 knotes: one to post the NOTE_CHILD event, and one to attach
  a new NOTE_TRACK to the child process.  Notably, we do NOT go through
  kqueue_register() to do this, but rather do all of the work directly
  and KASSERT() our assumptions; this allows us to directly control our
  interaction with locks.  All memory allocations here are performed with
  KM_NOSLEEP, in order to prevent holding the original knote in-flux
  indefinitely.

- Because the NOTE_TRACK use case adds knotes to kqueues through a
  sort of back-door mechanism, we must serialize with the closing of
  the destination kqueue's file descriptor, so steal another bit from
  the kq_count field to notify other threads that a kqueue is on its
  way out to prevent new knotes from being enqueued while the close
  path detaches them.

In addition to fixing EVFILT_PROC's reliance on KERNEL_LOCK, this also
fixes a long-standing bug whereby a NOTE_CHILD event could be dropped
if the child process exited before the interested process received the
NOTE_CHILD event (the same knote would be used to deliver the NOTE_EXIT
event, and would clobber the NOTE_CHILD's 'data' field).

Add a bunch of comments to explain what's going on in various critical
sections, and sprinkle additional KASSERT()s to validate assumptions
in several more locations.


To generate a diff of this commit:
cvs rdiff -u -r1.128 -r1.129 src/sys/kern/kern_event.c
cvs rdiff -u -r1.509 -r1.510 src/sys/kern/kern_exec.c
cvs rdiff -u -r1.291 -r1.292 src/sys/kern/kern_exit.c
cvs rdiff -u -r1.226 -r1.227 src/sys/kern/kern_fork.c
cvs rdiff -u -r1.43 -r1.44 src/sys/sys/event.h
cvs rdiff -u -r1.9 -r1.10 src/sys/sys/eventvar.h
cvs rdiff -u -r1.368 -r1.369 src/sys/sys/proc.h

Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.




Home | Main Index | Thread Index | Old Index