tech-kern archive

Re: crash in tmpfs_itimes since vmlocking2 merge



On Wed, Jan 23, 2008 at 01:24:39PM -0800, Bill Stouder-Studenmund wrote:
> On Wed, Jan 23, 2008 at 10:55:13PM +0200, Antti Kantee wrote:
> > On Wed Jan 23 2008 at 12:49:09 -0800, Bill Stouder-Studenmund wrote:
> > > > This is the problem of forcibly unmounting a file system with device
> > > > vnodes on it.  There was a small discussion about splitting specfs
> > > > into two layers: one for the "host" file system (tmpfs in this case)
> > > > and one for the actual device info (specfs).  Then you could rip the
> > > > "host" part out without affecting the device part before the device goes
> > > > inactive and can be safely put to rest.
> > > 
> > > I'd actually say this is a bug in deadfs or tmpfs. Since we know the
> > > vnode's still in use, we should leave one around that doesn't cause the 
> > > kernel to crash.
> > 
> > We don't make device vnodes deadfs so that they would still work.  And I
> 
> I think then we should. :-)
> 
> > don't think teaching every file system to operate without its backing data
> > structures is very feasible.  Hence, split it into two pieces so we don't
> > have to worry about it: the "frontend" (e.g. tmpfs) is deadfs'd while the
> > "backend" (specfs) continues operation until its refcount drops to 0.

An alternative, for device nodes only, is to implement a devfs and make
device nodes meaningless when they appear on other types of file system.
I hear that FreeBSD did that.
 
> Haven't we already done that split w/ specfs now? All we need to do is add 
> a deadfs operation vector (and routines, maybe) for devices, and use it 
> when we deaden a device.
> 
> "deadfs" means your file system is dead. We don't necessarily ned it to 
> mean that your vnode is dead. And for devices which are mounted, chances 
> are we don't want that meaning. :-)
> 
> The one thing I see that might be tricky is to make things work right w/ 
> revoke(2). If you revoke something, you need a dead device vnode that does 
> nothing. If however we unmount /dev, you want a deadfs device vnode that 
> still does i/o. Character devices always want a dead node, but we need 
> some differentiation between "dead" and "orphaned" block devices.
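
To make the distinction quoted above concrete, here is a minimal sketch
(none of this is existing code; spec_deaden() and spec_orphanop_p are
made-up names) of how choosing an operations vector for a deadened
device vnode might look:

#include <sys/param.h>
#include <sys/vnode.h>

extern int (**dead_vnodeop_p)(void *);    /* deadfs: ops do nothing */
extern int (**spec_orphanop_p)(void *);   /* made up: i/o still works */

/*
 * Hypothetical helper: pick the operations vector to install when a
 * device vnode loses its "host" file system.  Character devices and
 * revoke(2)d vnodes get the do-nothing dead vector; a block device
 * that was merely orphaned by a forced unmount keeps doing i/o.
 * Locking of the v_op swap is glossed over here; see below.
 */
static void
spec_deaden(vnode_t *vp, bool revoked)
{
        if (revoked || vp->v_type == VCHR)
                vp->v_op = dead_vnodeop_p;
        else
                vp->v_op = spec_orphanop_p;
}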

The major problem I see is that changing v_op on a live vnode is nuts,
particularly with the file system now being fully multithreaded. It could
probably be solved by adding an additional layer of gates into VOP access.
Runtime cost aside, that would mean understanding a bunch of complex
interactions.
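
For reference, such a gate might look roughly like the sketch below.
It assumes struct vnode grew an invented member, krwlock_t v_opgate,
and the wrapper names are made up; every VOP_*() call would enter the
gate as a reader, and swapping v_op would drain it as a writer, which
is where the runtime cost and the complexity come from:

#include <sys/rwlock.h>
#include <sys/vnode.h>

static inline void
vop_gate_enter(vnode_t *vp)
{
        /* Every VOP_*() wrapper would take the gate as a reader. */
        rw_enter(&vp->v_opgate, RW_READER);
}

static inline void
vop_gate_exit(vnode_t *vp)
{
        rw_exit(&vp->v_opgate);
}

static void
vop_vector_replace(vnode_t *vp, int (**newops)(void *))
{
        /* Drain all in-flight operations, then swap the vector. */
        rw_enter(&vp->v_opgate, RW_WRITER);
        vp->v_op = newops;
        rw_exit(&vp->v_opgate);
}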

I would prefer to see locking pushed down to the inode level, with the file
system itself dealing with revoked vnodes. That would mean maintaining a
zombie in-core inode after a vnode has been forcibly revoked, with nothing
referencing the inode structure other than the revoked vnode itself.  Your
average file system VOP not wanting to do anything fancy would call
genfs_lock()*. The genfs_node would contain an rwlock replacing the current
vnode lock, plus a flag recording whether the vnode has been revoked. If the
vnode is gone, genfs_lock() would return EBADF. The zombie inode would only
go away when the reference count on the vnode drops to zero.

Andrew

*  something like:

int
genfs_lock(vnode_t *vp, krw_t op)
{
        struct genfs_node *gp = VTOG(vp);

        /* Take the inode-level lock that replaces the vnode lock. */
        rw_enter(&gp->g_inolock, op);

        /* The vnode has been revoked: back out and tell the caller. */
        if (gp->g_gone) {
                rw_exit(&gp->g_inolock);
                return EBADF;
        }
        return 0;
}
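
As a usage example (again only a sketch: foo_setattr() is a made-up
VOP, and genfs_unlock() is the hypothetical counterpart that would do
rw_exit(&gp->g_inolock)), a file system operation would then look
roughly like:

int
foo_setattr(void *v)
{
        struct vop_setattr_args /* {
                struct vnode *a_vp;
                struct vattr *a_vap;
                kauth_cred_t a_cred;
        } */ *ap = v;
        vnode_t *vp = ap->a_vp;
        int error;

        /* Take the inode-level lock; fails if the vnode was revoked. */
        error = genfs_lock(vp, RW_WRITER);
        if (error != 0)
                return error;

        /* ... update the in-core inode from ap->a_vap ... */

        genfs_unlock(vp);       /* hypothetical counterpart */
        return 0;
}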


