Re: serious performance regression in .41
On 23 May 2014, at 20:15, Christos Zoulas <christos%astron.com@localhost> wrote:
> In article <DFC486AF-CA51-4EC2-B132-CD5E06B039DF%eis.cs.tu-bs.de@localhost>,
> J. Hannken-Illjes <hannken%eis.cs.tu-bs.de@localhost> wrote:
>>
>> Ok, so I tried a kernel with DEBUG+LOCKDEBUG and I get the same problems
>> you describe: every 10 to 20 seconds I get hiccups of 1 to 2 seconds.
>>
>> Christos, just to make it clear: do you have these problems when
>> running a GENERIC kernel or, better, GENERIC minus DIAGNOSTIC?
>
> The hiccups are much less severe, but there are still performance
> issues. The machine is a lot slower at building than it used to be.
> I can try to get precise numbers over the weekend if you want.
Point taken, even without numbers.
> BTW, I tried your patch and the sync stats are a lot better
> now, but I still get the 1+ second hiccups. I don't know what part
> of the code is causing them yet.
>
> I still feel that for low power machines the filtering iterator is
> a much better answer, since it avoids a lot of extra work. Why
> lock and unlock the mntvnode lock and vget() something you are not
> going to use? At least add the ability via flags to skip the
> uninteresting vnodes in the iterator.
Ok.
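For illustration, the pattern in question with the current bool/vpp
interface looks roughly like this (interesting() is a made-up stand-in
for whatever the caller actually checks):

	struct vnode *vp;

	while (vfs_vnode_iterator_next(marker, &vp)) {
		if (!interesting(vp)) {		/* hypothetical predicate */
			vrele(vp);		/* vget()/vrele() for nothing */
			continue;
		}
		/* ... do the real work ... */
		vrele(vp);
	}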
> Finally, before the changes I could run a LOCKDEBUG kernel in
> production. With the vnode iterator, this is now impossible. This
> is a serious regression in itself, so even if the performance issues
> are resolved in a non-debug kernel, I still consider the change
> problematic.
Ok, consider me convinced.
First, the selector and its context should be passed to
vfs_vnode_iterator_init, as they are invariant for the whole iteration.
We may (ab)use v_mount and v_data of the marker to avoid allocating
state, like we already do with v_usecount now.
I have no opinion on the return type of vfs_vnode_iterator_next -- both
bool plus *vpp and returning the vnode directly are fine with me.
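Spelled out, the interface I have in mind would look roughly like this
(a sketch only -- the filter typedef is made up here, nothing is
committed):

	/* Hypothetical filter type: return true to have vp visited. */
	typedef bool (*vnode_filter_t)(void *cl, struct vnode *vp);

	void vfs_vnode_iterator_init(struct mount *mp, struct vnode **vpp,
	    vnode_filter_t f, void *cl);

	/* Return the next vnode accepted by f, referenced, or NULL at the end. */
	struct vnode *vfs_vnode_iterator_next(struct vnode *marker);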
> Index: sys/kern/vfs_mount.c
> ===================================================================
> RCS file: /cvsroot/src/sys/kern/vfs_mount.c,v
<snip>
> mutex_enter(vp->v_interlock);
> + if (f && !(*f)(cl, vp)) {
> + mutex_exit(vp->v_interlock);
> + vp = TAILQ_NEXT(vp, v_mntvnodes);
> + goto again;
> + }
This cannot work: the selector would run on reclaiming vnodes and on
marker nodes. Change the while() loop to
while (ISSET(vp->v_iflag, VI_MARKER) ||
ISSET(vp->v_iflag, VI_XLOCK) ||
(f && !(*f)(cl, vp))) {
No need to test for VI_CHANGING, as the selector runs with
v_interlock held.
> +
> while ((vp->v_iflag & VI_MARKER) != 0) {
> mutex_exit(vp->v_interlock);
> vp = TAILQ_NEXT(vp, v_mntvnodes);
> if (vp == NULL) {
> mutex_exit(&mntvnode_lock);
> - *vpp = NULL;
> - return false;
> + return NULL;
> }
> mutex_enter(vp->v_interlock);
> }
<snip>
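With that, the complete skip loop would come out as something like
this (untested sketch; f and cl are the selector and context stashed
at init time):

	mutex_enter(vp->v_interlock);
	while (ISSET(vp->v_iflag, VI_MARKER) ||
	    ISSET(vp->v_iflag, VI_XLOCK) ||
	    (f && !(*f)(cl, vp))) {
		mutex_exit(vp->v_interlock);
		vp = TAILQ_NEXT(vp, v_mntvnodes);
		if (vp == NULL) {
			mutex_exit(&mntvnode_lock);
			return NULL;
		}
		mutex_enter(vp->v_interlock);
	}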
> Index: sys/nfs/nfs_subs.c
> ===================================================================
> RCS file: /cvsroot/src/sys/nfs/nfs_subs.c,v
<snip>
> + ctx.mp = mp;
> + while ((vp = vfs_vnode_iterator_next(marker, nfs_clearcommit_selector,
> + &ctx))) {
> vrele(vp);
> }
Couldn't this be changed to

	vp = vfs_vnode_iterator_next(...);
	KASSERT(vp == NULL);

as the filter always returns false?
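That is, the call site could shrink to something like this (sketch,
keeping the call signature from the patch as quoted above):

	ctx.mp = mp;
	vp = vfs_vnode_iterator_next(marker, nfs_clearcommit_selector, &ctx);
	/* The selector does all the work and never returns true. */
	KASSERT(vp == NULL);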
For ffs, accessing the inode from the selector is safe (see
ffs_reclaim). I didn't check the other uses, though.
... and, as always: Anita run, kernel version bump, and man page update.
--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig (Germany)