Subject: Re: 3.0.1: softdep + ffsv2 + 'heavy' load = pauses
To: None <netbsd-users@netbsd.org>
From: Mark Cullen <mark.r.cullen@gmail.com>
List: netbsd-users
Date: 07/22/2006 19:53:06
Greg A. Woods wrote:
> At Fri, 21 Jul 2006 15:16:57 +0100,
> Mark Cullen wrote:
>> The solution, so far, appears to be to just turn off softdep. I'm
>> seeing no pauses now with either of the tests, SSH on the dbench test
>> logs in nice and quick.. as if the machine were still idle! Though I
>> am sure turning off softdep has a horrible performance impact, so I
>> would really like to use it.
> 
> Why don't you actually measure the final system throughput with and
> without softdep?  I.e. time the start to completion of the whole test,
> once with softdep and once without.  Don't do _anything_ else on the
> system while running these tests.  Do them in single user mode if
> possible (run "/etc/rc.d/network start" and _only_ whatever else is
> needed to get NFS running for the NFS test).

I shall at some point. Going single user would be a bit of a hassle, so it 
might have to be a multi-user test. Regardless of the results, I am 
forced to run without softdep anyway, so...
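
For reference, the start-to-completion timing suggested above could be sketched roughly as follows. This is only an illustration, not something that was run for this thread: the helper names `time_command` and `percent_change` are made up here, and the command to time (for example a dbench run) is whatever you pass in.

```python
import subprocess
import time


def time_command(argv):
    """Run argv to completion; return (elapsed_seconds, exit_status)."""
    start = time.perf_counter()
    # Discard the test's own output so terminal I/O does not skew the timing.
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


def percent_change(baseline, other):
    """Percent by which `other` is longer (+) or shorter (-) than `baseline`."""
    return (other - baseline) / baseline * 100.0
```

The idea would be to call `time_command` once with softdep enabled on the mount, remount without softdep, call it again with the same command, and feed the two elapsed times to `percent_change`. Nothing else should be running on the machine during either pass.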

> 
>> On a side note, I tried running `dbench` on the home server with just
>> 20 clients and it totally hung the machine when it got to the
>> "executing" stage of the test. I'm not terribly sure why either.
> 
> You need to pay very close attention to how all your system's resources
> are being used during such tests.  Run trials with fewer jobs and see
> how well things are behaving.  You'll probably need to run several
> trials and each time run top, then various systat views, then maybe
> iostat and maybe vmstat.  Watch how each resource is utilized.  Try to
> figure out which resource is being exhausted, if indeed that's the cause
> of the hang.  If you can't identify an obvious thing like RAM or CPU or
> I/O channels then it's probably some less easy to see internal kernel
> resource -- perhaps some table filling up or whatever.  That's where
> kernel debuggers come in very handy.
> 

Well, the system wasn't swapping and still had idle CPU to spare, I can 
say that much! I don't really know what else to look for. I can tell you 
for sure that the system is *far* more responsive now that softdep is 
turned off on all mounts. I'm not able to reach 100 clients with dbench 
without it stalling other things (it doesn't entirely hang the machine), 
but the machine *is* a lot less powerful than the test machine, both in 
terms of RAM and CPU. I'm adding another 128MB sometime next week, so 
I'll see if that helps me run any more clients.
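
One way to keep a record of resource usage across trials, along the lines suggested above, is to leave the monitoring tools logging to files while the test runs. The following is a rough sketch under stated assumptions: `run_with_monitors` and the log file naming are hypothetical, and the monitor commands are supplied by the caller (for instance vmstat or iostat with whatever repeat-interval option the local man pages document).

```python
import subprocess


def run_with_monitors(test_argv, monitor_argvs, log_prefix="trial"):
    """Run test_argv while each monitor command logs to its own file.

    Returns the exit status of the test command.
    """
    monitors = []
    logs = []
    try:
        for i, argv in enumerate(monitor_argvs):
            # One log per monitor: <log_prefix>-monitor0.log, -monitor1.log, ...
            log = open(f"{log_prefix}-monitor{i}.log", "w")
            logs.append(log)
            monitors.append(
                subprocess.Popen(argv, stdout=log, stderr=subprocess.STDOUT)
            )
        # Block until the test itself finishes (or fails).
        return subprocess.run(test_argv).returncode
    finally:
        # Stop the monitors even if the test raised or was interrupted.
        for proc in monitors:
            proc.terminate()
            proc.wait()
        for log in logs:
            log.close()
```

If the machine hangs outright, the logs only help if they reached the disk first, so this is more useful for the "stalls but recovers" case than for a hard hang.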

-- 
Mark Cullen <mark.r.cullen@gmail.com>