Subject: Re: explaining TOP memory output
To: None <netbsd-users@netbsd.org>
From: Mark Cullen <mark.r.cullen@gmail.com>
List: netbsd-users
Date: 07/15/2006 11:31:47
Michael Parson wrote:
> On Fri, Jul 14, 2006 at 09:21:04AM +0200, Johnny Billquist wrote:
>> Mark Cullen wrote:
>>> Michael Parson wrote:
>>>
>>>> The fact that you have a high-ish load, with an idle CPU is what
>>>> concerns me. You have something that is causing your load queue length
>>>> to be artificially high.
>>>
>>> Interesting, because my system always hovers around a load average of
>>> 1.00, with 100% idle CPU. I was just blaming it on the amount of
>>> processes I had running, but maybe it's not this after all?
>>>
>>> NetBSD 3.0.1, but was the same with 3.0.0
>>>
>>> ---
>>> load averages: 0.94, 0.84, 0.89 08:12:07
>>> 99 processes: 1 runnable, 97 sleeping, 1 on processor
>>> CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
>>> Memory: 102M Act, 54M Inact, 1844K Wired, 20M Exec, 67M File, 6132K Free
>>> Swap: 1024M Total, 44M Used, 980M Free
>>> ---
>>>
>>> It's actually quite rare to see it < 1.00. It doesn't seem to really be
>>> causing any problems, but I am not running X. If it's worth looking
>>> into then I can post any info needed?
>> I believe it's related.
>
> It probably is related.
>
>> If you have lots of page faults, you'll have processes wanting to run,
>> but just waiting for pages to become valid. I would guess these
>> processes count as active, and thus are included in the load.
>
> How much physical RAM do you have? Looking at your top output, I'm
> guessing around 196M? Does vmstat suggest that you're getting a lot of
> page faults? Problem with that question is that 'a lot' is relative.
> Sometimes solving the problem can be as simple as throwing more RAM
> at the system, but not always. Depends on what the problem is, or if
> there even is a problem. That just might be how your system runs, given
> what's running on it.
>
256MB in there, but I think a whopping 2MB of it is allocated to the
onboard video (which I don't use, but can't disable).
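If it helps, I think `sysctl hw.physmem` shows what the kernel actually
sees (in bytes, if I remember right):

---
# what the kernel thinks is installed, minus whatever the video grabs
sysctl hw.physmem
---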
Well, is `systat 1 vmstat` good enough? If so, while it's idling I am
seeing:
---
 3 users    Load  1.08  1.05  1.05                  Fri Jul 14 18:16:59

Proc:r  d  s  w    Csw   Trp  Sys  Int  Sof  Flt    PAGING    SWAPPING
       17 86        11 13178  339  152   18          in  out    in  out
---
That's 18 faults per second, is it? Doesn't seem unusually high to me?!
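If plain vmstat is more useful I can post that instead; I think
something like this shows the fault rate over time (the "flt" column)
plus the counters since boot, though I might have the flags slightly
wrong:

---
# page faults per 5-second sample (the "flt" column under "page")
vmstat 5
# cumulative fault-related counters since boot
vmstat -s | grep -i fault
---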
> The load average is one of the most misunderstood and abused metrics of
> unix systems. It quite simply is the length of the run queue averaged
> over a time period (usually 1, 5, and 15 minutes). It is often related
> to, but doesn't always mean, how hard the computer is working. If you
> fire off a couple of compiles at a time, then your load is going to go
> up, and the CPU numbers will reflect this (%sys, %usr, %idle). However,
> you can also have a bunch of zombie processes around that are not really
> using any CPU time, but they still sit in the run queue, causing the
> load average to be high, but the CPU will still be (mostly) idle, since
> when their turn comes up in the queue, they just cycle around w/o doing
> anything.
I know :-)
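(For what it's worth, if I wanted to see what is actually sitting in
the run queue I'd probably try something along these lines -- not sure
the state letters are exactly right on 3.0:)

---
# list processes in runnable (R), disk-wait (D) or zombie (Z) state
ps -ax -o pid,stat,wchan,command | awk 'NR==1 || $2 ~ /^[RDZ]/'
---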
>
> My box (800 MHz PIII with 512M of RAM and IDE disks) at any given time
> generally has 20-30 interactive shells running (mostly due to myself
> and two other users using screen), just reading email, irc, etc, plus
> serving DNS for some 93 domains, and serving up virtual domain websites
> for 3 or 4 sites, including at least one backed by a mysql database,
> but all of them see very light usage, so it tends to have a load
> between .5 and .6. Incoming email is fairly light, and I run
> spamassassin, so if
> I get a flurry of mail, that can cause the load to go up, but I have SA
> configured to only scan two messages at a time, so that limits how high
> the load climbs.
>
Well, this is a 1GHz Celeron, 256MB of RAM, 4 IDE disks in two RAID-1
arrays using RAIDFrame, and two Intel 100Mbit NICs. It's running all
sorts of things: apache, samba, mysql, courier-imap, postfix, named,
etc. (a sort of do-everything box for the home network), but it mostly
sits 100% idle, as I said before :-)
> If my 15 minute load average is >1.00, I know this is out of the
> norm for *my* usual usage patterns, so I start looking into why. Is
> something CPU bound? I/O bound? Memory bound? Spinning and not doing
> anything useful? Top is just one tool, other helpful ones are vmstat,
> iostat, and ps. But like any tool, you need to know how to use them and
> how to read their output.
>
Honestly, I really can't see anything that looks like it might be
causing a constant ~1.00 load average. It's 100% idle with a 1.00 load
average, which I would assume means it's not CPU bound. The disks are
doing next
to nothing (as far as I can see):
---
(root@bone)/root# iostat -c 10 -w 1 raid0 raid1
tty raid1 raid0 CPU
tin tout KB/t t/s MB/s KB/t t/s MB/s us ni sy in id
17 29 7.12 2 0.02 18.41 1 0.03 1 0 1 1 98
0 184 0.00 0 0.00 0.00 0 0.00 0 0 0 0 100
0 62 0.00 0 0.00 21.00 2 0.00 0 0 3 1 96
0 61 0.00 0 0.00 0.00 0 0.00 0 0 0 0 100
40 71 0.00 0 0.00 0.00 0 0.00 0 0 0 0 100
72 74 0.00 0 0.00 3.00 4 0.00 0 0 0 0 100
0 62 0.00 0 0.00 0.00 0 0.00 0 0 0 0 100
0 62 0.00 0 0.00 0.00 0 0.00 0 0 0 0 100
0 62 0.00 0 0.00 13.67 12 0.16 0 0 0 0 100
0 62 4.00 2 0.00 0.00 0 0.00 0 0 0 0 100
---
I don't really know about the memory situation. I never saw a constant
1.00 load average on FreeBSD on the same machine, though it only had a
single RAID-1 array at that point, using vinum rather than RAIDFrame.
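In case it's useful, I could also grab something along these lines to
see where the memory and swap are actually going (guessing at the flags
a bit):

---
# per-device swap usage
swapctl -l
# processes sorted by memory usage rather than pid
ps -axm -o pid,vsz,rss,command | head -20
---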
--
Mark Cullen <mark.r.cullen@gmail.com>