Re: ioflush kernel thread chewing CPU time
Simon Burge wrote:
> Andrew Doran wrote:
> > I suggest putting in some counters to see what the syncer
> > is doing. For example:
> > - number of VDIR vnodes flushed
> > - number of VREG vnodes flushed
> > - number of VT_VFS vnodes flushed (sync vnodes)
> > If you put an integer switch in the kernel you can turn the counters on at
> > runtime using gdb, when the problem starts to occur.
> I'll try this before trying a gprof kernel. Actually, maybe both - I'm
> not worried about the performance hit of profiling on this box.
A netbsd-5 gprof kernel just reset the system as soon as it loaded/started.
I'll dig around with that a bit more when I get a chance.
I sprinkled some event counters in sched_sync(). Over a 300-second period
where I was seeing ioflush chewing its usual 20ish% CPU time:
- just before the while loop inside the for loop: 254
- at the top of the while loop: 137
- after vget success: 137
    type VDIR: 0
    type VREG: 12
    type VBLK: 0
    type VCHR: 0
    tag VFS: 125
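
For illustration only, here is a minimal userspace sketch of the "counters behind an integer switch" idea discussed above. It is not the patch that produced these numbers. Every name in it (sk_stats_enable, sk_note_vnode(), the SK_* enums) is hypothetical, and it assumes that sync vnodes carry the VFS tag without being one of the four counted types, which is one reading of the 12 + 125 = 137 split.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's vnode type and tag values. */
enum sk_vtype { SK_VNON, SK_VREG, SK_VDIR, SK_VBLK, SK_VCHR, SK_VTYPE_COUNT };
enum sk_vtag  { SK_VT_NON, SK_VT_UFS, SK_VT_VFS };

struct sk_sync_stats {
	uint64_t vget_ok;                  /* vnodes successfully referenced */
	uint64_t by_type[SK_VTYPE_COUNT];  /* indexed by enum sk_vtype */
	uint64_t tag_vfs;                  /* vnodes tagged as sync vnodes */
};

/* The integer switch: 0 = counters off, nonzero = counters on. */
volatile int sk_stats_enable = 0;

struct sk_sync_stats sk_stats;

/* Call once per vnode the syncer picks up; a no-op while the switch is off. */
void
sk_note_vnode(enum sk_vtype type, enum sk_vtag tag)
{
	if (!sk_stats_enable)
		return;
	sk_stats.vget_ok++;
	if ((unsigned)type < SK_VTYPE_COUNT)
		sk_stats.by_type[type]++;
	if (tag == SK_VT_VFS)
		sk_stats.tag_vfs++;
}

/* Dump the counters in roughly the layout of the list above. */
void
sk_print_stats(void)
{
	printf("after vget success: %llu\n",
	    (unsigned long long)sk_stats.vget_ok);
	printf("    type VDIR: %llu\n",
	    (unsigned long long)sk_stats.by_type[SK_VDIR]);
	printf("    type VREG: %llu\n",
	    (unsigned long long)sk_stats.by_type[SK_VREG]);
	printf("    type VBLK: %llu\n",
	    (unsigned long long)sk_stats.by_type[SK_VBLK]);
	printf("    type VCHR: %llu\n",
	    (unsigned long long)sk_stats.by_type[SK_VCHR]);
	printf("    tag VFS: %llu\n",
	    (unsigned long long)sk_stats.tag_vfs);
}
```

With a variable like this compiled in, the switch could be flipped from gdb with "set var sk_stats_enable = 1" once the problem shows up, and the totals read back later with "print sk_stats". In a real kernel the plain integers would more likely be evcnt(9) counters.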
> > > Before I start digging, anyone else seen anything like this before?
> > Nope. But, processing a sync vnode involves a trawl through all vnodes
> > associated with every file system. It sounds like that could be happening
> > too often, or for some reason perhaps vnodes on the worklist aren't getting
> > flushed.
> That seems like a pretty reasonable assumption - maxvnodes is set to
> 128k here, and dropping it to 8k sees ioflush go pretty much idle!
> ps shows that thread now using 1.05 seconds of CPU time over 60
> seconds. Bumping maxvnodes back to 128k still shows ioflush idle, but
> based on past experience I guess it's not going to show a problem for 48
> or more hours.
I've also just rebooted into a kernel with your recent ffs_sync() change. I'll
let you know results in a day or two :-)