
Re: serious performance regression in .41



On 23 May 2014, at 20:15, Christos Zoulas <christos%astron.com@localhost> wrote:

> In article <DFC486AF-CA51-4EC2-B132-CD5E06B039DF%eis.cs.tu-bs.de@localhost>,
> J. Hannken-Illjes <hannken%eis.cs.tu-bs.de@localhost> wrote:
>> 
>> Ok, so I tried a kernel with DEBUG+LOCKDEBUG and get the same problems
>> you describe:  every 10 to 20 seconds I get hiccups of 1 to 2 seconds.
>> 
>> Christos, just to make it clear:  Do you have these problems when
>> running a GENERIC or, better, a GENERIC -DIAGNOSTICS kernel?
> 
> The hiccups are much less severe, but there are still performance
> issues. The machine is a lot slower at building than it used to be.
> I can try to get precise numbers over the weekend if you want.

Point taken, even without numbers.

> BTW, I tried your patch and the sync stats are a lot better
> now, but I still get the 1+ second hiccups. I don't know what part
> of the code is causing them yet.
> 
> I still feel that for low power machines the filtering iterator is
> a much better answer, since it avoids a lot of extra work. Why
> lock and unlock the mntvnode lock and vget() something you are not
> going to use?  At least add the ability, via flags, to skip the
> uninteresting vnodes in the iterator.

Ok.

> Finally, before the changes I could run a LOCKDEBUG kernel in
> production.  With the vnode iterator, this is now impossible. This
> is a serious regression in itself, so even if the performance issues
> are resolved in a non-debug kernel, I still consider the change
> problematic.

Ok, consider me convinced.

First, the selector and context should be passed to vfs_vnode_iterator_init,
as they are invariant over the iteration.  We may (ab)use v_mount and
v_data to avoid allocating state, as we do with v_usecount now.

I have no opinion on the return type of vfs_vnode_iterator_next: both
the bool/vpp style and returning a vnode are fine with me.
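
For reference, a minimal sketch of what the init-time selector would
look like; the iterator names come from the patch, but the exact
signatures are illustrative, not a committed interface:

        /*
         * Sketch only: selector and context become per-iteration
         * state, fixed when the iterator is set up.
         */
        struct vnode_iterator;

        void
        vfs_vnode_iterator_init(struct mount *mp,
            bool (*selector)(void *, struct vnode *), void *ctx,
            struct vnode_iterator **vip);

        /* Either return style works; the vnode-returning one shown. */
        struct vnode *
        vfs_vnode_iterator_next(struct vnode_iterator *vi);

        void
        vfs_vnode_iterator_destroy(struct vnode_iterator *vi);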

> Index: sys/kern/vfs_mount.c
> ===================================================================
> RCS file: /cvsroot/src/sys/kern/vfs_mount.c,v
<snip>
>               mutex_enter(vp->v_interlock);
> +             if (f && !(*f)(cl, vp)) {
> +                     mutex_exit(vp->v_interlock);
> +                     vp = TAILQ_NEXT(vp, v_mntvnodes);
> +                     goto again;
> +             }

This cannot work: the selector would run on reclaiming vnodes and on
marker nodes.  Change the while() loop to

        while (ISSET(vp->v_iflag, VI_MARKER) ||
            ISSET(vp->v_iflag, VI_XLOCK) ||
            (f && !(*f)(cl, vp))) {

No need to test for VI_CHANGING, as the selector runs with
v_interlock held.  (The combined loop is sketched below, after the
quoted hunk.)

> +
>               while ((vp->v_iflag & VI_MARKER) != 0) {
>                       mutex_exit(vp->v_interlock);
>                       vp = TAILQ_NEXT(vp, v_mntvnodes);
>                       if (vp == NULL) {
>                               mutex_exit(&mntvnode_lock);
> -                             *vpp = NULL;
> -                             return false;
> +                             return NULL;
>                       }
>                       mutex_enter(vp->v_interlock);
>               }
<snip>
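
Putting the pieces together, the skip loop in vfs_vnode_iterator_next
would then look roughly like this (an untested sketch merging the
quoted hunk with the test proposed above):

        while (ISSET(vp->v_iflag, VI_MARKER) ||
            ISSET(vp->v_iflag, VI_XLOCK) ||
            (f && !(*f)(cl, vp))) {
                /* Skip markers, reclaiming vnodes and unselected ones. */
                mutex_exit(vp->v_interlock);
                vp = TAILQ_NEXT(vp, v_mntvnodes);
                if (vp == NULL) {
                        mutex_exit(&mntvnode_lock);
                        return NULL;
                }
                mutex_enter(vp->v_interlock);
        }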
> Index: sys/nfs/nfs_subs.c
> ===================================================================
> RCS file: /cvsroot/src/sys/nfs/nfs_subs.c,v
<snip>
> +     ctx.mp = mp;
> +     while ((vp = vfs_vnode_iterator_next(marker, nfs_clearcommit_selector,
> +         &ctx))) {
>               vrele(vp);
>       }

Change to

        vp = vfs_vnode_iterator_next(...);
        KASSERT(vp == NULL);

as the filter always returns false?
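
To make the intent concrete, a hypothetical shape for the selector
(the context struct follows the quoted patch; the actual clearing of
the commit state is elided):

        static bool
        nfs_clearcommit_selector(void *cl, struct vnode *vp)
        {
                struct nfs_clearcommit_ctx *ctx = cl;
                struct nfsnode *np = VTONFS(vp);

                KASSERT(mutex_owned(vp->v_interlock));
                if (vp->v_type != VREG || vp->v_mount != ctx->mp ||
                    np == NULL)
                        return false;
                /* ... clear np's commit flags/ranges here ... */
                return false;   /* never hand a vnode to the caller */
        }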

For ffs, accessing the inode from the selector is safe (see
ffs_reclaim).  I didn't check the other uses though.
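
As an example of the ffs case, a sync selector could test the inode
like this (hypothetical sketch; the flag choice is illustrative and
cl is unused here):

        static bool
        ffs_sync_selector(void *cl, struct vnode *vp)
        {
                struct inode *ip = VTOI(vp);

                KASSERT(mutex_owned(vp->v_interlock));
                /*
                 * VTOI(vp) is still attached here: vnodes being
                 * reclaimed carry VI_XLOCK and are skipped by the
                 * iterator before the selector runs.
                 */
                if (ip == NULL || vp->v_type == VNON)
                        return false;
                if ((ip->i_flag & (IN_CHANGE | IN_UPDATE | IN_MODIFIED)) == 0)
                        return false;
                return true;    /* worth a vget() and a sync */
        }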

... and, as always: an Anita run, a kernel version bump, and a man page update.

--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig (Germany)


