tech-kern archive


ZFS vs NetBSD vnode recycling



Hi folks,

I spent yesterday debugging the ZFS vnode recycling problem on i386, and I have found the cause of the "locking against myself" panics which I have seen while testing ZFS on i386.

The ZFS mount structure contains an array of mutexes [1] called z_hold_mtx, which are locked by the ZFS_OBJ_HOLD_ENTER function [2]. ZFS_OBJ_HOLD_ENTER takes two arguments: the ZFS mount structure and a znode (ZFS inode) id. The mutex is selected by hashing the object number into the array of 64 mutexes, so two different znode ids can map to the same mutex. The panic happens when zfs_mknode [3] holds the mutex for an id which hashes to the same mutex as another znode id that is already in use. If that old vnode is picked by getcleanvnode->vclean->VOP_RECLAIM->zfs_netbsd_reclaim->zfs_zinactive, zfs_zinactive tries to lock the same mutex that was already locked in zfs_mknode.

I think there are three possible solutions:

1) Defer the call to zfs_zinactive to the system taskq, which can destroy the znode later. I have tested this version and it works, but I'm getting a deadlock. I need to investigate whether it is caused by my change or not.

2) Some sort of vreclaim patch (disable vnode recycling and just call vnalloc in getnewvnode), done right this time. I'm willing to do it, but I'm not sure how it should be done.

3) Do it like FreeBSD does. They do almost the same as option 1, but differently:

static void
zfs_reclaim_complete(void *arg, int pending)
{
        znode_t *zp = arg;
        zfsvfs_t *zfsvfs = zp->z_zfsvfs;

        ZFS_LOG(1, "zp=%p", zp);
        ZFS_OBJ_HOLD_ENTER(zfsvfs, zp->z_id);
        zfs_znode_dmu_fini(zp);
        ZFS_OBJ_HOLD_EXIT(zfsvfs, zp->z_id);
        zfs_znode_free(zp);
}

static int
zfs_freebsd_reclaim(ap)
        struct vop_reclaim_args /* {
                struct vnode *a_vp;
                struct thread *a_td;
        } */ *ap;
{
        vnode_t *vp = ap->a_vp;
        znode_t *zp = VTOZ(vp);
        zfsvfs_t *zfsvfs;

        ASSERT(zp != NULL);

        /*
         * Destroy the vm object and flush associated pages.
         */
        vnode_destroy_vobject(vp);

        mutex_enter(&zp->z_lock);
        ASSERT(zp->z_phys);
        ZTOV(zp) = NULL;

        if (!zp->z_unlinked) {
                int locked;

                zfsvfs = zp->z_zfsvfs;
                mutex_exit(&zp->z_lock);
                locked = MUTEX_HELD(ZFS_OBJ_MUTEX(zfsvfs, zp->z_id)) ? 2 :
                    ZFS_OBJ_HOLD_TRYENTER(zfsvfs, zp->z_id);
                if (locked == 0) {
                        /*
                         * Lock can't be obtained due to deadlock possibility,
                         * so defer znode destruction.
                         */
                        TASK_INIT(&zp->z_task, 0, zfs_reclaim_complete, zp);
                        taskqueue_enqueue(taskqueue_thread, &zp->z_task);
                } else {
                        zfs_znode_dmu_fini(zp);
                        if (locked == 1)
                                ZFS_OBJ_HOLD_EXIT(zfsvfs, zp->z_id);
                        zfs_znode_free(zp);
                }
        } else {
                mutex_exit(&zp->z_lock);
        }
        VI_LOCK(vp);
        vp->v_data = NULL;
        ASSERT(vp->v_holdcnt >= 1);
        VI_UNLOCK(vp);
        return (0);
}

Do you have any suggestions?


[1] http://nxr.aydogan.net/source/xref/src/external/cddl/osnet/dist/uts/common/fs/zfs/sys/zfs_vfsops.h#z_hold_mtx
[2] http://nxr.aydogan.net/source/xref/src/external/cddl/osnet/dist/uts/common/fs/zfs/sys/zfs_znode.h#ZFS_OBJ_HOLD_ENTER
[3] http://nxr.aydogan.net/xref/src/external/cddl/osnet/dist/uts/common/fs/zfs/zfs_znode.c#830

Regards

Adam.


