Subject: Re: raidctl panic
To: Antti Kantee <pooka@cs.hut.fi>
From: Greg Oster <oster@cs.usask.ca>
List: tech-kern
Date: 12/11/2007 09:33:30
Antti Kantee writes:
> On Mon Dec 10 2007 at 13:31:24 -0600, Greg Oster wrote:
> > Manuel Bouyer writes:
> > > On Thu, Dec 06, 2007 at 01:36:16PM -0600, Greg Oster wrote:
> > > > Manuel Bouyer writes:
> > > > >=20
> > > > > Reading the code I understand why we get a panic (rf_ConfigureD=
isk pa
> ss
> > > > > raidPtr->engine_thread to dk_lookup()), the question is what wo=
uld be
> > > > > the fix ...
> > > >=20
> > > > I'd like to know too....  IIRC, the code was changed (some years =
ago)
> > > > to what it is now because curproc->lwp0->cred (or some such) bein=
g pass
> ed
> > > > in wasn't sufficient to match the credentials of the kernel threa=
d. =20
> > > > If it is now the case that l->l_cred =3D=3D curlwp->l_cred =3D=3D=
=20
> > > > some_other_kernel_lwp->l_cred , then we probably don't need to be=
=20
> > > > passing 'engine_thread' into dk_lookup() anymore, and just curlwp=
=20
> > > > will be sufficient...=20
> > >=20
> > > I think it's that dk_lookup() is now more strict than it was. You c=
an't
> > > call it with other credential than curproc(). Maybe it's the check =
in
> > > dk_lookup() which is wrong, I don't know ...
> > > Note that once the raid is setup and use autoconf, all works fine .=
..
> >=20
> > Antti made some changes to nuke the passing of 'l' everywhere,=20
> > including to nuke the lwp parameter from NDINIT().   Has that=20
> > changed/fixed this panic at all?
>=20
> I added the KASSERT to NDINIT for a few weeks to check if there were
> legitimate uses of doing NDINIT with l =21=3D curlwp.  I couldn't think=
 of
> any but it sure wasn't possible to check it by code reading.
>=20
> rf_ConfigureDisk is passing the rf thread lwp as a parameter to
> dk_lookup().  I don't think that is right as you end up doing credentia=
l
> checks with the rf thread credentials instead of the configuring thread=

> credentials.  Granted, the necessary checks for the user credentials ar=
e
> probably already earlier in the callstack.  So, even though it doesn't
> panic anymore, I still think it's wrong as dk_lookup() uses l->l_cred
> in other places and should be passed curlwp (as was noted in the quoted=

> discussion).

So way =22back in the day=22 before dk_lookup() was used, RAIDframe had=20
its own lookup routine.  And the issue way back then was consistency=20
of credentials in the following situations:

 1) Attempting to close a component (e.g. for a reconstruct initiated=20
via raidctl) that had been opened by the kernel (e.g. in autoconfig).

 2) Attempting to close a component (e.g. on raid shutdown at system=20
shutdown time) that had been opened via running raicctl (e.g. after a=20
hot-add and reconstruct).

 3) Attempting to close a component by raidctl (e.g. on a raid=20
unconfigure) where some components were opened by the kernel,
and others by raidctl.

 4) There are other combinations/flavours of this sort of thing.

The curproc->cred (or whatever it was called then) didn't match up=20
properly in all these cases, leading to panics due to mis-matching=20
credentials (e.g. 'root's credentials when running raidctl wern't=20
matching whatever kernel credentials were used to open the component=20
device, and the system would panic).  One solution was to basically=20
use the credentials of the engine_thread in all cases.=20

I'm happy to pass curlwp instead of engine_thread, but the real=20
requirement is to be able to do the opens/closes of the various=20
component devices at all stages without the kernel panicing due to
'invalid credentials'...  And I havn't verified that any of those=20
will be happy yet...=20

Later...

Greg Oster