tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: VOP_GETATTR: locking protocol change proposal



On Nov 22, 2011, at 2:17 AM, YAMAMOTO Takashi wrote:

> hi,
> 
>> On Fri, Nov 18, 2011 at 06:31:21AM +0000, YAMAMOTO Takashi wrote:
>>>>> postgresql assumes instant lseek(SEEK_END) to get the size of
>>>>> their heap files.
>>>>> 
>>>>>   http://rhaas.blogspot.com/2011/11/linux-lseek-scalability.html
>>>>> 
>>>>> as fsync etc keeps the vnode lock during i/o, it might cause severe
>>>>> performance regression.
>>>> 
>>>> I wonder if it is worth having a separate VOP for that, which would
>>>> retrieve a subset of vattr without lock held.  There are potentially
>>>> more uses in the tree.
>>> 
>>> while it's good to have VOP_GETATTR take a mask to specify a set of
>>> requested attributes, i don't think it's worth to have a serparete VOP
>>> for this.  (PR/30250 is related)
>>> 
>>> IMO we should make unlocked VOP calls safe (against revoke and umount -f)
>>> and put VOP_GETATTR locking back into filesystem internal.
>> 
>> ad looked into that and concluded it was prohibitively expensive; too
>> much contention on every entry/exit.
> 
> what's "it"?
> 
> i agree naive reference counting on each VOP calls can be too expensive.

If "it" looked like the attached sketch, are two atomic adds per call too
expensive?  Could even make it cheaper if VOP_LOCK() only takes the pre-op
block, VOP_UNLOCK() only takes the post-op block and VOPs with a locked vnode
take none of them making VOP_LOCK, VOP_XXX ... VOP_UNLOCK sequences atomic
with regard to revoke.

--
Juergen Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig 
(Germany)

#define VWANTREVOKE     0x40000000

revoke()
        ...
        oc = atomic_add_32_nv(&vp->v_opcnt, VWANTREVOKE);
        while (oc != VWANTREVOKE) {
                cv_wait(&vp->v_cv, vp->v_interlock);
                oc = vp->v_opcnt;
        }
        ...
        atomic_add_32(&vp->v_opcnt, -VWANTREVOKE);
        cv_broadcast(&vp->v_cv);


vop_xxx()
        ...
        /* pre-op */
        for (;;) {
                oc = atomic_add_32_nv(&vp->v_opcnt, 1);
                if (predict_true((oc & VWANTREVOKE) == 0)
                        break;
                oc = atomic_add_32_nv(&vp->v_opcnt, -1);
                mutex_enter(&vp->v_interlock);
                while (oc & VWANTREVOKE) {
                        cv_wait(&vp->v_cv, vp->v_interlock);
                        oc = vp->v_opcnt;
                }
                mutex_exit(&vp->v_interlock);
        }
        ...
        VCALL(...)
        ...
        /* post-op */
        oc = atomic_add_32_nv(&vp->v_opcnt, -1);
        if (predict_false(oc == VWANTREVOKE)) {
                mutex_enter(&vp->v_interlock);
                cv_broadcast(&vp->v_cv);
                mutex_exit(&vp->v_interlock);
        }



Home | Main Index | Thread Index | Old Index