tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: closing pad(4) before audio(4) causes panic

> Date: Sun, 02 Feb 2020 14:25:12 +0900
> From: Tetsuya Isaki <>
> According to audio.c comment, vdevgone() calls close().  But it is
> not called actually.

vdevgone calls VOP_REVOKE on the vnode, which _used_ to call
audio_close, but it does essentially nothing for cloning file
descriptors as we have used since sys/dev/audio.c r1.302.

Instead of vdevgone,

- audiodetach must go through sc->sc_files and render them invalid,

- audioread, audiowrite, &c. -- all the audio_fileops -- must be able
  to handle an invalid file gracefully by returning error.

But it's tricky because audioread and audiowrite need to get a handle
on the softc, without holding any locks, while audiodetach may be
racing to destroy the softc.

Here's one way you could do it, using pserialize(9) in
audioread/audiowrite to reliably tell whether the file is still valid,
and psref(9) to get a handle on the softc:

1. Add struct pserialize * and struct psref_target to struct
   audio_softc so we can passive references to it:

	struct audio_softc {
		struct pserialize	*sc_psz;
		struct psref_target	sc_psref;

   (Create/destroy an audio psref class in MODULE_CMD_INIT/FINI, and
   then do pserialize_create/destroy and psref_target_init/destroy in

2. In audioread, audiowrite, &c., get the sc and acquire a psref to it
   in a pserialize read section:

	static int
	audioread(struct file *fp, ...)
		struct audio_file *file = fp->f_audioctx;
		struct audio_softc *sc;
		struct psref sc_ref;
		int error;
		int s;

		/* Block audiodetach while we acquire a reference.  */
		s = pserialize_read_enter();

		/* If audiodetach already ran, tough -- no more audio.  */
		if ((sc = atomic_load_relaxed(&file->sc)) == NULL) {
			return ENXIO;

		/* Acquire a reference.  */
		psref_acquire(&sc_ref, &sc->sc_psref, audio_psref_class);

		/* Now sc won't go away until we drop the reference count.  */

		... use sc->sc_lock, do the read, &c. ...
		... make sure sc_dying prevents long waits ...

		/* Release the reference.  */
		psref_release(&sc_ref, &sc->sc_psref, audio_psref_class);

		return error;

3. In audio_close, remove it from the list _only_ if it's still valid.

	audio_close(sc, file)
		/* Detach races to remove us from the list too.  */
		if (file->sc) {
			KASSERT(sc == file->sc);
			SLIST_REMOVE(&sc->sc_files, file, audio_file, entry);
			file->sc = NULL; /* paranoia */

3. In audiodetach, you need to (a) prevent new users, (b) wait for
   existing users to drain, (c) free it.  So, after setting
   sc->sc_dying, and in the place of the logic that currently does

		struct audio_file *file;
		/* (a) Prevent new users.  */
		while ((file = SLIST_FIRST(&sc->sc_files)) != NULL) {
			SLIST_REMOVE_HEAD(&sc->sc_files, entry);
			atomic_store_relaxed(&file->sc, NULL);

		 * (b) Wait for existing users to drain:
		 *     - pserialize_perform waits for all
		 *       pserialize_read sections on all CPUs; after
		 *       this, no more new psref_acquire can happen.
		 *     - psref_target_destroy waits for all extant
		 *       acquired psrefs to be psref_released.
		psref_target_destroy(&sc->sc_psref, audio_psref_class);

		 * We are now guaranteed that there are no calls to
		 * audio fileops that hold sc, and any new calls with
		 * files that were for sc will fail.  Thus, we now
		 * have exclusive access to the softc.

Side notes after reviewing audio.c:

- You will need to find a way to ensure that audio_close can free all
  of its resources unconditionally: although it can technically return
  an error code, it can't fail to free resources; close is final and
  can never be retried.  The file descriptor will be freed by closef
  even if .fo_close returns an error code.

  If audio_close really really needs to hold the long-term exclusive
  lock (sc_exlock), then you need to provide a way to interrupt anyone
  holding the exclusive lock so that audio_close can proceed without
  blocking indefinitely.  But if there's any way to avoid taking the
  long-term exclusive lock at all, that would be better.

- kmem_alloc/free (or any other sleeping allocation -- for example,
  softint_establish/disestablish) is not allowed while holding a lock.
  It may work accidentally sometimes, but it's wrong for audio to do
  -- it can lead to deadlocks, uninterruptible blocking waits, &c.

- KASSERT(!mutex_owned(lock)) is not generally allowed.  If you're
  about to do mutex_enter(lock) anyway, just skip the kassert --
  mutex_enter will detect a recursive lock and panic itself.

Home | Main Index | Thread Index | Old Index