Re: specfs/spec_vnops.c diagnostic assertion panic



> Date: Sat, 13 Aug 2022 05:59:01 +0700
> From: Robert Elz <kre%munnari.OZ.AU@localhost>
> 
> Why is fsck running on the block device though?   And devpubd too?   Given
> the reference to cdev_close() I'd assumed it was a char (raw) device that
> was being used, which would be what fsck certainly should be using.  But
> I see that 02-wedgenames uses the block /dev/dnN device .. should fix that.)

This part of my hypothesis was not quite right -- it looks like there
is no race between fsck and dkctl per se: as confirmed by ktrace, they
both use the character device as they should.

Revised hypothesis:

- fsck opens /dev/rdkN, and
- dkctl opens /dev/rdkM,

where dkN and dkM are different wedges on the same parent.

02-wedgenames doesn't do `dkctl raid0 listwedges'; rather, it does
`dkctl dkN getwedgeinfo', which opens the character device /dev/rdkN
as it should.  And fsck is still opening a character device as it
should.

Why is this a problem?  When you open or close /dev/rdkN or /dev/rdkM,
that goes through dkopen and dkclose in the kernel.  On first open,
dkopen opens the parent disk's _block device_ via dk_open_parent, and
on last close, dkclose closes it via dklastclose -> dk_close_parent.

However, while dkclose -> dklastclose -> dk_close_parent is in
progress, currently nothing prevents a concurrent dkopen on another
wedge of the same parent from trying to open the parent afresh.
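
To make the window concrete, here is a minimal userland sketch of the
pattern -- hypothetical names (struct parent, wedge_open, nopen), not
the actual dk.c code, and the EBUSY check only models where the real
failure shows up; compile with -lpthread:

#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct parent {
        pthread_mutex_t lock;           /* models dk_rawlock */
        int             nopen;          /* wedges currently open */
        bool            bdev_open;      /* parent block device open? */
};

/* models dkopen -> dk_open_parent on first open */
static int
wedge_open(struct parent *p)
{
        int error = 0;

        pthread_mutex_lock(&p->lock);
        if (p->nopen == 0) {
                if (p->bdev_open)
                        error = EBUSY;  /* parent not yet closed */
                else
                        p->bdev_open = true;
        }
        if (error == 0)
                p->nopen++;
        pthread_mutex_unlock(&p->lock);
        return error;
}

int
main(void)
{
        static struct parent p = {
                .lock = PTHREAD_MUTEX_INITIALIZER,
                .nopen = 1,             /* fsck has /dev/rdkN open */
                .bdev_open = true,
        };

        /* dkclose on rdkN does its bookkeeping under the lock... */
        pthread_mutex_lock(&p.lock);
        p.nopen--;
        pthread_mutex_unlock(&p.lock);

        /*
         * ...but, pre-patch, the parent is closed only after the lock
         * has been dropped, so a dkopen of /dev/rdkM that lands in
         * the window gets EBUSY:
         */
        printf("open of second wedge: error %d (EBUSY = the bug)\n",
            wedge_open(&p));

        p.bdev_open = false;            /* the deferred parent close */
        return 0;
}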

And although dk_open_parent is serialized by the mutex dk_rawlock,
dk_close_parent was _deliberately_ made to run outside dk_rawlock back
in 2010:

https://mail-index.netbsd.org/source-changes/2010/08/04/msg012270.html
https://mail-index.netbsd.org/tech-kern/2010/07/27/msg008612.html

Judging by the stack trace there (mutex_destroy -> [presumably
disk_destroy, elided by tail call optimization] -> raidclose), that
crash was actually evidence of a bug that had already been fixed a
year earlier, but possibly not in the kernel bouyer@ was running:

https://mail-index.netbsd.org/source-changes/2009/07/23/msg223381.html

This change by dyoung@ moved the disk_destroy call from raidclose to
raid_detach, where it more properly belongs.  Note also that, for the
past seven years, there has been _another_ caller of dk_close_parent:
dkwedge_read.

So I think we should move the call to dk_close_parent in dklastclose
back under dk_rawlock -- and also under dk_openlock.  This will also
let us refactor the code so that the mutex_enter/exit bracketing is
much more obvious.
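
For comparison, here is the counterpart to the sketch above (same
hypothetical names), with the parent closed before the lock is
dropped: a concurrent wedge_open then either runs entirely before the
last close and just bumps the count, or entirely after it and reopens
the parent, so there is no window in which the parent is still open
but about to close.

/* models dkclose -> dklastclose with dk_close_parent under the lock */
static void
wedge_close(struct parent *p)
{
        pthread_mutex_lock(&p->lock);
        if (--p->nopen == 0)
                p->bdev_open = false;   /* cf. dk_close_parent */
        pthread_mutex_unlock(&p->lock);
}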

(Side note: I think we should eliminate dk_rawlock and just use
dk_openlock; they appear to be redundant, except in dkwedge_read, and
I'm not sure what logic that takes _only_ dk_openlock and _not_
dk_rawlock would fruitfully run concurrently with dkwedge_read.
Certainly dkwedge_discover can't be called with either one held, and I
don't see why concurrency between dkwedge_read/add/list/delall/detach
is important.)

Can you try _reverting_ specfs_blockopen.patch, and _applying_ the
attached dkopenclose.patch, and see if you can reproduce any crash?

(I believe specfs_blockopen.patch is still necessary for other reasons
but it gets in the way of testing my hypothesis that dk_close_parent
needs to be serialized with dk_open_parent anyway.)

From 5eaacbd7d9d6d6b3ccfea5564aebf9548f5b7564 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Date: Fri, 12 Aug 2022 22:48:25 +0000
Subject: [PATCH] dk(9): Serialize closing parent's dk_rawvp with opening it.

Otherwise, the following events might happen:

- process 123 had /dev/rdkN open, starts close, enters dk_close_parent
- process 456 opens /dev/rdkM (same parent, different wedge), calls
  dk_open_parent

At this point, the block device hasn't yet closed, so dk_open_parent
will fail with EBUSY.  This is incorrect -- the chardev is never
supposed to fail with EBUSY, and dkopen/dkclose carefully manage
state to avoid opening the block device while it's still open.  The
problem is that dkopen in process 456 didn't wait for vn_close
in process 123 to finish before calling VOP_OPEN.

(Note: If it were the _same_ chardev /dev/rdkN in both processes,
then spec_open/close would prevent this.  But since it's a
_different_ chardev, spec_open/close assume that concurrency is OK,
and it's the driver's responsibility to serialize access to the
parent disk which, unbeknownst to spec_open/close, is shared between
dkN and dkM.)

It appears that the vn_close call was previously moved outside
dk_rawlock in 2010 to work around an unrelated bug in raidframe that
had already been fixed in HEAD:

Crash pointing to dk_rawlock and raidclose:
https://mail-index.netbsd.org/tech-kern/2010/07/27/msg008612.html

Change working around that crash:
https://mail-index.netbsd.org/source-changes/2010/08/04/msg012270.html

Change removing raidclose -> mutex_destroy(&dk_rawlock) path:
https://mail-index.netbsd.org/source-changes/2009/07/23/msg223381.html
---
 sys/dev/dkwedge/dk.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/sys/dev/dkwedge/dk.c b/sys/dev/dkwedge/dk.c
index 97430e76da19..5282da129e45 100644
--- a/sys/dev/dkwedge/dk.c
+++ b/sys/dev/dkwedge/dk.c
@@ -1220,13 +1220,13 @@ dklastclose(struct dkwedge_softc *sc)
 		}
 	}
 
-	mutex_exit(&sc->sc_parent->dk_rawlock);
-	mutex_exit(&sc->sc_dk.dk_openlock);
-
 	if (vp) {
 		dk_close_parent(vp, mode);
 	}
 
+	mutex_exit(&sc->sc_parent->dk_rawlock);
+	mutex_exit(&sc->sc_dk.dk_openlock);
+
 	return error;
 }
 

