NetBSD-Bugs archive


Re: kern/41374: getnewvnode deadlock



hi,

> On Sun, May 10, 2009 at 09:44:41AM +0000, YAMAMOTO Takashi wrote:
> 
>> > How about making XLOCK
>> > a flag bit in v_usecount?
>> 
>> it sounds like a good idea.
> 
> On second thought, what you propose looks fine to me. Putting a flag bit into
> v_usecount would have the advantage that no memory barrier would be required
> in vtryget().
> 
>> @@ -1783,8 +1794,8 @@ vclean(vnode_t *vp, int flags)
>>      KASSERT(vp->v_usecount != 0);
>>  
>>      /* If cleaning is already in progress wait until done and return. */
>> -    if (vp->v_iflag & VI_XLOCK) {
>> -            vwait(vp, VI_XLOCK);
>> +    if (vp->v_iflag & VI_CLEANING) {
>> +            vwait(vp, VI_CLEANING);
>>              return;
>>      }
> 
> Can't this allow a concurrent vclean(), e.g. via revoke or forced unmount?

VI_CLEANING is there precisely to detect a concurrent vclean.

anyway, i preferred your suggestion of putting a bit into v_usecount and
have implemented it.  see below.
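
the gist, as a minimal userspace sketch assuming C11 atomics (the
xr_* names are illustrative stand-ins, not the kernel API, which
operates on v_usecount with atomic_cas_uint()): because the flag
shares a word with the count, a single CAS both observes "cleaning
in progress" and takes the reference, so the membar_consumer() plus
VI_XLOCK re-check quoted below becomes unnecessary.

	#include <stdatomic.h>
	#include <stdbool.h>

	#define XR_XLOCK 0x80000000u	/* "being cleaned" flag, top bit */
	#define XR_MASK  0x7fffffffu	/* the reference count proper */

	struct xref {
		atomic_uint xr_count;	/* flag bit + count, like v_usecount */
	};

	/*
	 * Try to take a new reference without the interlock.  One CAS
	 * both checks for cleaning and bumps the count, so no separate
	 * memory barrier or flag re-check is needed.
	 */
	static bool
	xr_tryget(struct xref *xr)
	{
		unsigned int use = atomic_load(&xr->xr_count);

		for (;;) {
			if (use == 0 || (use & XR_XLOCK) != 0) {
				/* unreferenced or being cleaned: caller
				   falls back to the interlocked path */
				return false;
			}
			if (atomic_compare_exchange_weak(&xr->xr_count,
			    &use, use + 1))
				return true;
			/* CAS failed: 'use' was reloaded, retry. */
		}
	}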

>> @@ -464,6 +464,15 @@ cache_lookup_raw(struct vnode *dvp, stru
>>              return ENOENT;
>>      }
>>      if (vtryget(vp)) {
>> +            membar_consumer();
>> +            if ((vp->v_iflag & VI_XLOCK) != 0) {
>> +                    mutex_exit(&ncp->nc_lock);
>> +                    mutex_exit(&cpup->cpu_lock);
>> +                    vrele(vp);
>> +                    COUNT(cpup->cpu_stats, ncs_falsehits);
>> +                    *vpp = NULL;
>> +                    return -1;
>> +            }
>>              mutex_exit(&ncp->nc_lock);
>>              mutex_exit(&cpup->cpu_lock);
> 
> I spent a while today trying to remember why vget() does not do vtryget().
> Of course it's due to VI_FREEING. I wonder if we could change the order of
> events in ufs_inactive() to this, allowing us to remove VI_FREEING (I don't
> know what ZFS will require):
> 
>       UFS_UPDATE(vp, ...);
>       ufs_ihashrem(ip);
>       UFS_VFREE(vp, ...);
> 
> I'm not yet sure if fsck can handle it the other way around.

wapbl might need some tweaks, too, i guess.

YAMAMOTO Takashi

Index: sys/vnode.h
===================================================================
RCS file: /cvsroot/src/sys/sys/vnode.h,v
retrieving revision 1.191.4.2
diff -u -p -r1.191.4.2 vnode.h
--- sys/vnode.h 4 May 2009 08:14:36 -0000       1.191.4.2
+++ sys/vnode.h 12 May 2009 11:02:37 -0000
@@ -250,6 +250,12 @@ typedef struct vnode vnode_t;
 #define        VSIZENOTSET     ((voff_t)-1)
 
 /*
+ * v_usecount
+ */
+#define        VC_XLOCK        0x80000000
+#define        VC_MASK         0x7fffffff
+
+/*
  * Vnode attributes.  A field value of VNOVAL represents a field whose value
  * is unavailable (getattr) or which is not to be changed (setattr).
  */
Index: kern/vfs_subr.c
===================================================================
RCS file: /cvsroot/src/sys/kern/vfs_subr.c,v
retrieving revision 1.336.4.2
diff -u -p -r1.336.4.2 vfs_subr.c
--- kern/vfs_subr.c     4 May 2009 08:13:49 -0000       1.336.4.2
+++ kern/vfs_subr.c     12 May 2009 11:02:37 -0000
@@ -361,8 +361,10 @@ try_nextlist:
         * before doing this.  If the vnode gains another reference while
         * being cleaned out then we lose - retry.
         */
-       atomic_inc_uint(&vp->v_usecount);
+       atomic_add_int(&vp->v_usecount, 1 + VC_XLOCK);
        vclean(vp, DOCLOSE);
+       KASSERT(vp->v_usecount >= 1 + VC_XLOCK);
+       atomic_add_int(&vp->v_usecount, -VC_XLOCK);
        if (vp->v_usecount == 1) {
                /* We're about to dirty it. */
                vp->v_iflag &= ~VI_CLEAN;
@@ -1229,7 +1231,7 @@ vtryget(vnode_t *vp)
                return false;
        }
        for (use = vp->v_usecount;; use = next) {
-               if (use == 0) { 
+               if (use == 0 || __predict_false((use & VC_XLOCK) != 0)) {
                        /* Need interlock held if first reference. */
                        return false;
                }
@@ -1318,9 +1320,10 @@ vtryrele(vnode_t *vp)
        u_int use, next;
 
        for (use = vp->v_usecount;; use = next) {
-               if (use == 1) { 
+               if (use == 1) {
                        return false;
                }
+               KASSERT((use & VC_MASK) > 1);
                next = atomic_cas_uint(&vp->v_usecount, use, use - 1);
                if (__predict_true(next == use)) {
                        return true;
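
for completeness, a hedged sketch of the cleaner side, continuing
the illustrative xr_* example above (again, not the kernel API; the
actual patch does this in getcleanvnode() around vclean()):

	/*
	 * Take our own reference and raise the cleaning flag in one
	 * atomic add, mirroring atomic_add_int(&vp->v_usecount,
	 * 1 + VC_XLOCK) in the patch.  From this instant every
	 * lock-free xr_tryget() fails, so new references can only be
	 * taken under the interlock, where cleaning is re-checked.
	 */
	static void
	xr_clean(struct xref *xr)
	{
		atomic_fetch_add(&xr->xr_count, 1u + XR_XLOCK);

		/* ... the equivalent of vclean(vp, DOCLOSE) runs here ... */

		/* drop the flag; our reference (the +1) remains */
		atomic_fetch_sub(&xr->xr_count, XR_XLOCK);
	}

because the reference and the flag are set in a single atomic add,
there is no window in which the cleaner holds a reference but
lock-free callers can still slip in - the window that forced the
membar_consumer()/VI_XLOCK re-check in the earlier patch.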

