Subject: Re: reboot problems unmounting root
To: Bill Stouder-Studenmund <wrstuden@netbsd.org>
From: Antti Kantee <pooka@cs.hut.fi>
List: tech-kern
Date: 07/05/2007 20:09:12
[this is probably better suited for tech-kern]

On Thu Jul 05 2007 at 09:49:35 -0700, Bill Stouder-Studenmund wrote:
> > The problem is that nullfs passes the VOP_REVOKE operation to the
> > lower vnode.  However, the upper nullfs vnode remains entirely intact.
> > Then when vrele() is called from sys_revoke(), the upper layer vnode tries
> > to use the lock of the now-revoked lower layer vnode and goes kabloom.
> > I think the correct fix is to supply a revoke operation for nullfs &
> > layerfs, but I'm not intimate enough with them to be entirely sure that's
> > the correct fix.  At least the problem goes away using the attached patch.
> 
> The problems you're running into are why we don't really have revoke 
> processing in layerfs.
> 
> Why is the lock exploding? I think that's the real problem. As long as the 
> revoked vnode still has references, it needs to have a working lock.

The problem is that layer_bypass revokes the lower vnode and it gets
recycled.  The lower vnode now has a reference, but is generally a
deadfs vnode.  However, the upper layer isn't revoked, nor does it
think it is revoked/reclaimed.  When it tries to use the now-nuked
lower layer's exported lock, boooom like that.

> Blowing away the upper node is not necessary, and it doesn't lead to 
> correct functioning. The issue is that you can have layer stacks that are 
> more complicated than just one layer above a leaf file system. You can 
> have more than one layer node above the same leaf node. As such, the node 
> underneath you can ALWAYS get revoked out from under you. Given that, we 
> have to be able to handle the lower node getting revoked, and once we do 
> that, we don't need to zap the layer node the revoke goes through.

I see, you're worried about the hamburger effect: the beef is revoked
but the upper bun does not see it.

Here's actually another way to reproduce the same problem (which is not
fixed by the proposed patch):

touch /upper/foo
sleep 10 < /upper/foo &
revoke /lower/foo
*wait for an earth-shattering kaboom*
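
For illustration, both paths (revoke through the upper vnode, and revoke
directly on the lower one) can be modelled in user space.  This is only a
toy sketch: every toy_* name is made up for this example and none of it is
the real kernel code.  It assumes nothing more than what is described
above: the layer vnode uses the lock exported by the lower vnode, and
revoking the lower vnode takes that lock away without the layer noticing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration; these are not the kernel's structures. */
struct toy_lock {
	bool alive;		/* false once the owning vnode is reclaimed */
	int held;		/* number of times the lock has been taken */
};

struct toy_vnode {
	struct toy_lock own_lock;	/* lock storage embedded in this vnode */
	struct toy_lock *lockp;		/* lock actually used when locking */
	struct toy_vnode *lower;	/* NULL for a leaf vnode */
	bool revoked;
};

static void
toy_leaf_init(struct toy_vnode *vp)
{
	vp->own_lock.alive = true;
	vp->own_lock.held = 0;
	vp->lockp = &vp->own_lock;
	vp->lower = NULL;
	vp->revoked = false;
}

/* The layer vnode borrows the lock exported by the vnode below it. */
static void
toy_layer_init(struct toy_vnode *up, struct toy_vnode *low)
{
	up->own_lock.alive = false;	/* unused by the layer */
	up->own_lock.held = 0;
	up->lockp = low->lockp;
	up->lower = low;
	up->revoked = false;
}

/* Revoke is passed straight down; the layer vnode is left untouched. */
static void
toy_revoke(struct toy_vnode *vp)
{
	if (vp->lower != NULL) {
		toy_revoke(vp->lower);
		return;
	}
	vp->revoked = true;
	vp->own_lock.alive = false;	/* reclaim takes the lock with it */
}

/* Returns 0 on success, -1 where a real kernel would go kabloom. */
static int
toy_lock(struct toy_vnode *vp)
{
	if (!vp->lockp->alive)
		return -1;
	vp->lockp->held++;
	return 0;
}

/* Revoke through the upper vnode, then try to lock through it. */
int
toy_revoke_via_upper(void)
{
	struct toy_vnode low, up;

	toy_leaf_init(&low);
	toy_layer_init(&up, &low);
	toy_revoke(&up);
	/* up.revoked is still false, yet its borrowed lock is dead */
	return up.revoked ? 1 : toy_lock(&up);
}

/* Revoke the lower vnode directly, then try to lock through the upper. */
int
toy_revoke_via_lower(void)
{
	struct toy_vnode low, up;

	toy_leaf_init(&low);
	toy_layer_init(&up, &low);
	toy_revoke(&low);
	return up.revoked ? 1 : toy_lock(&up);
}
```

Both functions end up at the same dead lock, which is why a revoke
operation in the layer alone does not cover the second case.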

> So change how revoke happens. Rip the inode off of the vnode, but don't 
> kill the lock.

How do I not kill the lock?  The vnode is reclaimed.  It won't be
re-reclaimed after this.  If it's the lower layer, it doesn't even know
about the upper one, right?  If vnode locks were separate and had separate
reference counts, then maybe, but ...
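
To make the "separate locks with separate reference counts" idea a bit
more concrete, here is a rough user-space sketch.  It is a thought
experiment only, not a patch and not how the kernel works today; all
toy_* names are hypothetical.  It assumes the upper node takes its own
reference on the shared lock when it is created, so the lower node never
needs to know about the upper one.

```c
#include <stdbool.h>
#include <stdlib.h>

/* A lock allocated separately from any node, with its own refcount. */
struct toy_reflock {
	int refcnt;
	int held;
};

struct toy_node {
	struct toy_reflock *lk;	/* NULL once this node dropped its reference */
	bool reclaimed;
};

static struct toy_reflock *
toy_reflock_alloc(void)
{
	struct toy_reflock *lk = calloc(1, sizeof(*lk));

	if (lk != NULL)
		lk->refcnt = 1;
	return lk;
}

static void
toy_reflock_hold(struct toy_reflock *lk)
{
	lk->refcnt++;
}

static void
toy_reflock_rele(struct toy_reflock *lk)
{
	if (--lk->refcnt == 0)
		free(lk);
}

/* Reclaim detaches the node from the lock; other holders keep it alive. */
static void
toy_node_reclaim(struct toy_node *np)
{
	np->reclaimed = true;
	toy_reflock_rele(np->lk);
	np->lk = NULL;
}

/* Returns 0 if the upper node can still lock after the lower is reclaimed. */
int
toy_reflock_demo(void)
{
	struct toy_node low, up;
	bool ok;

	low.lk = toy_reflock_alloc();
	if (low.lk == NULL)
		return -1;
	low.reclaimed = false;

	up.lk = low.lk;			/* upper shares the lower's lock ... */
	toy_reflock_hold(up.lk);	/* ... and holds its own reference */
	up.reclaimed = false;

	toy_node_reclaim(&low);		/* revoke hits the lower node */

	/* The upper node's reference keeps the lock usable. */
	up.lk->held++;
	up.lk->held--;
	ok = (up.lk->refcnt == 1 && up.lk->held == 0);

	toy_node_reclaim(&up);		/* last reference: lock is freed */
	return ok ? 0 : 1;
}
```

Whether something like this would fit the real vnode life cycle is a
separate question; the sketch only shows that the lock would outlive
the reclaim of the lower node.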

-- 
Antti Kantee <pooka@iki.fi>                     Of course he runs NetBSD
http://www.iki.fi/pooka/                          http://www.NetBSD.org/
    "la qualité la plus indispensable du cuisinier est l'exactitude"