tech-kern archive


Re: ioflush kernel thread chewing CPU time



Hi again,

Simon Burge wrote:

> Hi Andy,
> 
> Simon Burge wrote:
> 
> > Andrew Doran wrote:
> > 
> > > I suggest putting in some counters to see what the syncer
> > > is doing. For example:
> > > 
> > > - number VDIR vnodes flushed
> > > - number VREG vnodes flushed
> > > - number VT_VFS vnodes flushed (sync vnodes)
> > > 
> > > If you put an integer switch in the kernel, you can turn the counters on
> > > at runtime using gdb when the problem starts to occur.
> > 
> > I'll try this before trying a gprof kernel.  Actually, maybe both - I'm 
> > not worried about the performance hit of profiling on this box.
> 
> A netbsd-5 gprof kernel just reset the system as soon as it loaded/started.
> I'll dig around with that a bit more when I get a chance.
> 
> I sprinkled some event counters in sched_sync().  Over a 300-second period
> where I was seeing ioflush chewing its usual 20-ish% CPU time:
> 
>  - just before the while loop inside the for loop:    254
>  - at the top of the while loop:                      137
>  - after vget success:                                137
>       type VDIR:                                      0
>       type VREG:                                      12
>       type VBLK:                                      0
>       type VCHR:                                      0
>       tag VFS:                                        125
> 
> > > > Before I start digging, anyone else seen anything like this before? 
> > > 
> > > Nope. But processing a sync vnode involves a trawl through all vnodes
> > > associated with every file system. It sounds like that could be happening
> > > too often, or perhaps for some reason vnodes on the worklist aren't
> > > getting flushed.
> > 
> > That seems like a pretty reasonable assumption - maxvnodes is set to
> > 128k here, and dropping it to 8k sees ioflush go pretty much idle!
> > ps shows that thread now using 1.05 seconds of CPU time over 60
> > seconds.  Bumping maxvnodes back to 128k still shows ioflush idle, but
> > based on past experience I guess it's not going to show a problem for 48
> > or more hours.
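
For anyone wanting to repeat the maxvnodes experiment above:
kern.maxvnodes can be changed on the fly with sysctl(8), e.g.
"sysctl -w kern.maxvnodes=8192".  The snippet below is a rough userland
C equivalent, purely for illustration and not something from my tree:

#include <sys/param.h>
#include <sys/sysctl.h>

#include <stdio.h>

/*
 * Roughly equivalent to "sysctl -w kern.maxvnodes=8192".
 * Illustrative only; error handling kept minimal.
 */
int
main(void)
{
        int oldval, newval = 8192;
        size_t oldlen = sizeof(oldval);

        if (sysctlbyname("kern.maxvnodes", &oldval, &oldlen,
            &newval, sizeof(newval)) == -1) {
                perror("sysctlbyname");
                return 1;
        }
        printf("kern.maxvnodes: %d -> %d\n", oldval, newval);
        return 0;
}
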
> 
> I've also just rebooted a kernel with your recent ffs_sync() change.  I'll
> let you know results in a day or two :-)

With that change, I'm still seeing the same problem.  Right now, ioflush
is using about 21% CPU time according to top.  With the same event
counters as above, over a 300-second window I see:

 - just before the while loop inside the for loop:      273
 - at the top of the while loop:                        633
 - after vget success:                                  633
      type VDIR:                                        0
      type VREG:                                        499
      type VBLK:                                        0
      type VCHR:                                        0
      tag VFS:                                          134

This time I have a firefox running that recently had some activity, so
that might explain why we're seeing more activity in this 300-second
window than we did before.
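
For reference, the counters are wired into sched_sync() along the lines
of the sketch below.  The names and exact placement are illustrative
rather than the literal diff in my tree; the syncer_stats switch is
there so the counting can be turned on from gdb/ddb at runtime:

/*
 * Sketch only: syncer counters, enabled at runtime by setting
 * syncer_stats to 1 from gdb/ddb when the problem shows up.
 */
static int syncer_stats;        /* 0 = counters off */
static uint64_t sc_pre_while;   /* just before the while loop */
static uint64_t sc_while_top;   /* at the top of the while loop */
static uint64_t sc_vget_ok;     /* vget() succeeded */
static uint64_t sc_vreg, sc_vdir, sc_vblk, sc_vchr;  /* v_type breakdown */
static uint64_t sc_tag_vfs;     /* v_tag == VT_VFS (sync vnodes) */

        /* ... in sched_sync()'s for loop, just before the while loop ... */
        if (syncer_stats)
                sc_pre_while++;

        /* ... at the top of the while loop ... */
        if (syncer_stats)
                sc_while_top++;

        /* ... after a successful vget() on vp ... */
        if (syncer_stats) {
                sc_vget_ok++;
                switch (vp->v_type) {
                case VREG: sc_vreg++; break;
                case VDIR: sc_vdir++; break;
                case VBLK: sc_vblk++; break;
                case VCHR: sc_vchr++; break;
                default: break;
                }
                if (vp->v_tag == VT_VFS)
                        sc_tag_vfs++;
        }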

Just to make sure, I'm running stock netbsd-5 with this patch:

Index: ffs_vfsops.c
===================================================================
RCS file: /cvsroot/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.239.2.1
diff -d -p -u -r1.239.2.1 ffs_vfsops.c
--- ffs_vfsops.c        24 Feb 2009 04:13:35 -0000      1.239.2.1
+++ ffs_vfsops.c        23 Mar 2009 11:03:07 -0000
@@ -1738,7 +1738,8 @@ loop:
                if (vp->v_type == VREG && waitfor == MNT_LAZY) {
                        error = UFS_WAPBL_BEGIN(vp->v_mount);
                        if (!error) {
-                               error = ffs_update(vp, NULL, NULL, 0);
+                               error = ffs_update(vp, NULL, NULL,
+                                   UPDATE_CLOSE);
                                UFS_WAPBL_END(vp->v_mount);
                        }
                } else {


Cheers,
Simon.

