Subject: Re: raidctl panic
To: Greg Oster <oster@cs.usask.ca>
From: Antti Kantee <pooka@cs.hut.fi>
List: tech-kern
Date: 12/11/2007 13:39:43
On Mon Dec 10 2007 at 13:31:24 -0600, Greg Oster wrote:
> Manuel Bouyer writes:
> > On Thu, Dec 06, 2007 at 01:36:16PM -0600, Greg Oster wrote:
> > > Manuel Bouyer writes:
> > > > Hi,
> > > > running raidctl -c on current with debug options I get:
> > > > panic: kernel diagnostic assertion "(l) == curlwp" failed: file "/dsk/l1/
> > misc
> > > > /bouyer/current/src/sys/dev/dksubr.c", line 640
> > > > Stopped in pid 8.1 (raidctl) at netbsd:breakpoint+0x1:  ret
> > > > breakpoint() at netbsd:breakpoint+0x1
> > > > __kernassert() at netbsd:__kernassert+0x2d
> > > > dk_lookup() at netbsd:dk_lookup+0x198
> > > > rf_ConfigureDisk() at netbsd:rf_ConfigureDisk+0x10a
> > > > rf_ConfigureDisks() at netbsd:rf_ConfigureDisks+0xb9
> > > > rf_Configure() at netbsd:rf_Configure+0x6fc
> > > > raidioctl() at netbsd:raidioctl+0xf05
> > > > VOP_IOCTL() at netbsd:VOP_IOCTL+0x2e
> > > > vn_ioctl() at netbsd:vn_ioctl+0x79
> > > > sys_ioctl() at netbsd:sys_ioctl+0x1bc
> > > > syscall() at netbsd:syscall+0xe5
> > > > 
> > > > Reading the code I understand why we get a panic (rf_ConfigureDisk pass
> > > > raidPtr->engine_thread to dk_lookup()), the question is what would be
> > > > the fix ...
> > > 
> > > I'd like to know too....  IIRC, the code was changed (some years ago)
> > > to what it is now because curproc->lwp0->cred (or some such) being passed
> > > in wasn't sufficient to match the credentials of the kernel thread.  
> > > If it is now the case that l->l_cred == curlwp->l_cred == 
> > > some_other_kernel_lwp->l_cred , then we probably don't need to be 
> > > passing 'engine_thread' into dk_lookup() anymore, and just curlwp 
> > > will be sufficient... 
> > 
> > I think it's that dk_lookup() is now more strict than it was. You can't
> > call it with other credential than curproc(). Maybe it's the check in
> > dk_lookup() which is wrong, I don't know ...
> > Note that once the raid is setup and use autoconf, all works fine ...
> 
> Antti made some changes to nuke the passing of 'l' everywhere, 
> including to nuke the lwp parameter from NDINIT().   Has that 
> changed/fixed this panic at all?

I added the KASSERT to NDINIT for a few weeks to check if there were
legitimate uses of doing NDINIT with l != curlwp.  I couldn't think of
any but it sure wasn't possible to check it by code reading.

rf_ConfigureDisk is passing the rf thread lwp as a parameter to
dk_lookup().  I don't think that is right as you end up doing credential
checks with the rf thread credentials instead of the configuring thread
credentials.  Granted, the necessary checks for the user credentials are
probably already earlier in the callstack.  So, even though it doesn't
panic anymore, I still think it's wrong as dk_lookup() uses l->l_cred
in other places and should be passed curlwp (as was noted in the quoted
discussion).

In other news, dk_lookup() doesn't really need l for anything.

-- 
Antti Kantee <pooka@iki.fi>                     Of course he runs NetBSD
http://www.iki.fi/pooka/                          http://www.NetBSD.org/
    "la qualité la plus indispensable du cuisinier est l'exactitude"