Subject: Re: Bad response...
To: None <current-users@netbsd.org>
From: Julio M. Merino Vidal <jmmv@menta.net>
List: current-users
Date: 08/31/2004 15:23:42
This is not a reply to this particular message, but to the whole thread (I deleted all the
previous messages, so I can't reply to them directly).

I've also seen this, and IIRC, it started around January of this year.  For example, my server
has 80 MB of RAM, and it is quite loaded.  It performed very well with NetBSD 1.6, and it
hardly ever touched the swap.  However, just after updating it to NetBSD 2.0_BETA (when
the branch was first cut), running the same services and the same versions of all packages,
it started swapping.  ATM, I see it has used 10 MB of swap (not so bad, and I can accept
it because of its load).

However, my workstation shows the problem more clearly.  It's an Athlon XP 2600+ with
512 MB of RAM, running a very recent -current.  Building a release while using GNOME or
KDE makes it swap a lot (especially if you go away and leave the machine
unused for several minutes), up to 120 MB or so (it was worse at the beginning of the year).

Although there have been improvements during the year, I can't say I'm really happy with
the current behavior.  I don't think I'm short on RAM for what I do, and I'm afraid that even
if I buy some more (which I will do, if it's true that I'm short on RAM), it will still swap.

We need some improvements... but sorry, I can't make suggestions in this area :-/

Cheers


On Tue, 31 Aug 2004 23:00:26 +1000
Simon Burge <simonb@wasabisystems.com> wrote:

> Johnny Billquist wrote:
> 
> > On Tue, 31 Aug 2004, Michael van Elst wrote:
> > 
> > > [ ... ]
> > >
> > > A limit to the dirty pages in the filecache will help for all these
> > > cases.
> > 
> > Good points, but I know they aren't applicable in my case.
> 
> Just to echo Johnny's point, writes aren't necessarily the problem.
> 
> Here's a simple test case:  I've got mpd playing an mp3, and starting a
> large read results in the "bad response" scenario, most noticeably with
> the song mpd is playing pausing a lot.  I've also got a modified vmstat
> that reports when too much time has passed since its last update - you
> can see that sometimes it doesn't get scheduled when it should be.  This
> is on an Athlon 2000+ with 512MB running 2.0BETA from late May.
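
Simon's modified vmstat isn't included in the thread.  The idea is easy to sketch, though:
sleep for the update interval, then check how long you really slept.  This is only a rough
Python illustration, not his patch; the function names and the 0.5 second slack are my own
assumptions, and the message text just mimics the one in his output.

```python
import time


def late_updates(timestamps, interval=1.0, slack=0.5):
    """Return a message for every gap between consecutive timestamps
    that exceeds interval + slack seconds (slack is an arbitrary choice)."""
    messages = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        delay = cur - prev
        if delay > interval + slack:
            messages.append("delay from last update is %.3f seconds" % delay)
    return messages


def watch(interval=1.0, slack=0.5, iterations=10):
    """Sleep 'interval' seconds per iteration and print a message whenever
    the process was woken up noticeably later than requested."""
    last = time.monotonic()
    for _ in range(iterations):
        time.sleep(interval)
        now = time.monotonic()
        for message in late_updates([last, now], interval, slack):
            print(message)
        last = now
```

Calling watch() while a large read is in progress should print a line whenever the process
was not scheduled on time, which is the symptom being discussed.
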
> 
> Here we have mpd playing, and the system more or less otherwise idle.  I
> say "more or less", since top says a running galeon always uses anywhere
> between 2% and 10% cpu.  A "vmstat 1" is running.  About 10 seconds in
> I start wc on a 110ishMB file, after which is an obviously idle period
> again.
> 
>  procs     memory     page                       disks    faults    cpu
>  r b w     avm   fre  flt    re pi  po  fr    sr wd2 wd3  in   sy   cs us sy id
>  1 1 0 1434448 33800  147     0  0   0   0     0   0   0 272 2592  616 17  2 81
>  0 1 0 1434524 33736   59     0  0   0   0     0   0   0 276 2313  542 13  0 87
>  0 1 0 1434580 33656  170     0  1   0   0     0   0   0 276 2583  610 18  2 80
>  1 1 0 1434580 33656   50     0  0   0   0     0   0   0 274 2606  635  3  1 96
>  1 1 0 1434568 33656  151     0  0   0   0     0   0   0 273 2509  592 15  3 82
>  1 1 0 1434568 33656   54     0  0   0   0     0   0   0 273 2448  598 12  0 88
>  1 1 0 1434652 33584  149     0  0   0   0     0   9   9 362 3103  705 17  5 78
>  0 1 0 1434652 33584   57     0  0   0   0     0   0   0 402 2800  593 20  1 79
>  1 1 0 1434652 33584  155     0  0   0   0     0   0   0 417 3402  682 16  2 82
>  1 1 0 1434716 33520   63     0  0   0   0     0   0   0 334 2920  650 14  2 84
>  2 1 0 1443292 24844 1255     0  1   0   0     0 133 141 549 1789 1371 84  4 12
>  2 1 0 1450392 17756  939     0  0   0   0     0 116 124 506 2197 1381 93  7  0
>  2 1 0 1457188 10972  996     0  0   0   0     0 110 107 488 2733 1479 91  9  0
>  1 1 0 1464420  3804  949     0  0   0   0     0 124 132 523 2674 1565 91  7  2
>  2 1 0 1469492   720 1100     0  0   0   0  1037  65 104 436 2659 1322 91  7  2
>  2 1 0 1469432   720 1060     0  0   0   0  2017 124 119 506 2610 1547 85 12  3
>  2 1 0 1469432   784 1143     4  0   0   0  2020 131 114 508 2867 1609 88 10  2
> delay from last update is 2.028 seconds
> 10 1 0 1468860   680  851 17694  0 300   6 24659  52  54 755 2448 1260 36 63  0
>  3 1 0 1468908   880  256 16210  1 264   6 17513  39  40 434 1136  406 18 82  0
>  4 1 0 1469180  1360  418  5982  0 194   8  8315  71  74 423 1615  635 34 65  1
>  2 1 0 1470084   400  900   205  0   0   0  1643 103 100 479 2798 1436 81  9 11
>  1 1 0 1469276  1168 1087    24  0   0   0  2014 114 112 493 2776 1544 91  8  1
>  3 1 0 1469576   976  799 19265  0  37   1 20960  98  97 460 2313 1233 75 24  1
>  1 1 0 1469612  1292  367 19966  0 212 121 20751  57  55 376 1416  518 39 61  0
>  3 2 0 1470112  1344  324  9121 22 346 121 14998  59  67 494 1180  600 25 74  2
>  2 1 0 1470072  1356  891     0 55   9   3  1434  92 139 497 2589 1447 81 15  4
>  3 1 0 1471064  1460  775    20 10 275   1  1521 123 121 493 2143 1255 69 28  3
>  2 1 0 1472968  1112  590   186 28 616  40  1080  86  78 590 1840  901 37 63  0
>  1 3 1 1473472   792  231   191 27 222  56   469  63  76 387  491  433 14 80  6
>  4 2 0 1472980  1000  224   460 28 233  55   748  35  47 353  229  223  7 90  3
>  4 2 0 1473000   804  173   202 15 218  65   485  42  44 379  682  333 20 79  1
>  5 2 0 1472940   936  316   206 49 230  55   491  76  58 422  768  500 17 81  2
>  5 1 0 1472936   800  177   190 16 236  50   476  47  30 366  933  350 24 76  0
>  7 1 0 1472928  1060  259   284  3 219  72   596  50  38 341 1011  343 21 78  1
>  4 2 0 1473960   296  298   199 25 208  70   477  61  64 386  586  472 18 74  8
> delay from last update is 1.623 seconds
>  9 1 0 1472968  1164  265   345 11 385 167   913  36  36 539  554  321  9 88  2
>  5 1 0 1472968   932  297   412 24 330 231   981  47  48 521  618  468 15 80  5
>  0 3 0 1473828   320  436   175 27 205  75   463  90  87 421  484  538 22 71  8
>  0 2 0 1473324  1500  206   156 73 178  91   425  58  61 384  496  498  0 60 40
>  2 1 0 1473628  1188  236     0 78   0   0     0  40  37 351 2104  774 38  4 58
>  2 1 0 1473900   916  115     0 52   0   0     0  29  21 326 2364  781 27  3 70
>  1 1 0 1473788  1056  188     0 29   0   0     0   9  18 297 2500  693 22  6 72
>  1 2 0 1474020   720  251     0 56   0   0     0  35  18 417 2985  851 10  1 89
>  1 1 0 1474292   464  216     0 64   0   0     0  25  38 466 2970  861 18  4 78
>  1 1 0 1474336   420   63     0 11   0   0     0   5   4 432 2772  663 17  7 76
>  3 2 0 1472940  1392  133   178  6 131 109   426  17  20 356 1491  343 12 49 39
>  0 1 0 1473200  1136  119     0 64   0   0     0  16  47 362 2616  787 23  1 76
>  0 1 0 1473288  1048  152     0  6   0   0     0   3   9 326 2802  672 17  3 80
>  2 1 0 1473392   968  115     0 11   0   0     0   4   7 378 2737  671 16  3 81
>  1 1 0 1473416   948  146     0  5   0   0     0   2   1 319 2574  623 20  1 79
>  1 1 0 1473564   808   73     0 18   0   0     0   7   9 293 2564  685 13  0 87
>  1 1 0 1473640   732  153     0 19   0   0     0   6  11 292 2618  674  9  3 88
>  2 1 0 1473672   692   56     0 10   0   0     0   3   5 285 2532  666  4  4 92
>  1 1 0 1473804   568  148     0 15   0   0     0   6   7 290 2685  710  6  2 92
>  1 2 0 1473844   524   53     0 11   0   0     0   2   6 314 2696  682  9  0 91
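
To pick the bad rows out of a capture like the one above without reading it by eye, something
along these lines might help.  It is a sketch only: the column list is copied from the header
in Simon's output (the second "sy" is renamed cpu_sy to avoid the clash with the faults
column), and "system time higher than user time" is just my guess at a useful filter.

```python
# Column names taken from the vmstat header quoted above; the two "sy"
# columns are renamed so that they can coexist in one dict.
COLUMNS = ["r", "b", "w", "avm", "fre", "flt", "re", "pi", "po", "fr", "sr",
           "wd2", "wd3", "in", "faults_sy", "cs", "us", "cpu_sy", "id"]


def parse_row(line):
    """Parse one vmstat data row into a dict of ints.

    Leading quote markers ("> ") are stripped.  Header lines and the
    "delay from last update" lines return None."""
    fields = line.lstrip("> ").split()
    if len(fields) != len(COLUMNS) or not all(f.isdigit() for f in fields):
        return None
    return dict(zip(COLUMNS, map(int, fields)))


def kernel_bound(rows):
    """Return the rows where system CPU time exceeds user CPU time."""
    return [row for row in rows if row["cpu_sy"] > row["us"]]
```

In the capture above, the rows selected this way are mostly the ones with large values in the
re and sr columns.
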
> 
> This is with
> 	vm.anonmin = 50
> 	vm.anonmax = 90
> 	vm.execmin = 5
> 	vm.execmax = 90
> 	vm.filemin = 5
> 	vm.filemax = 10
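
For anyone who wants to repeat the experiment: these knobs can be changed at run time with
"sysctl -w" as root.  A tiny Python sketch that only prints the commands for the values
listed above (the helper name sysctl_commands is mine, nothing from the thread):

```python
# Values exactly as listed in the quoted message.
SETTINGS = {
    "vm.anonmin": 50,
    "vm.anonmax": 90,
    "vm.execmin": 5,
    "vm.execmax": 90,
    "vm.filemin": 5,
    "vm.filemax": 10,
}


def sysctl_commands(settings):
    """Build one 'sysctl -w name=value' command line per setting."""
    return ["sysctl -w %s=%d" % (name, value) for name, value in settings.items()]


for command in sysctl_commands(SETTINGS):
    print(command)
```
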
> 
> I've tried other combinations (10,80,15,25,10,15  10,18,15,40,5,15
> 20,80,15,40,5,15) without any real success.  Playing with vm.bufcache
> and vm.filemax also doesn't seem to make any noticeable difference.
> 
> Regularly starting largish processes (like spamprobe) also has a
> similar "bad response" effect with mpd often pausing.
> 
> Also, I've noticed that when this box is in this "bad response"
> state I get "keyboard bounce" while in X.  Quite often, a single
> character repeats, so that at normal typing speed "youud see
> someethingg  likke this".
> 
> Simon.
> --
> Simon Burge                            <simonb@wasabisystems.com>
> NetBSD Support and Service:         http://www.wasabisystems.com/
> 


-- 
Julio M. Merino Vidal <jmmv@menta.net>
http://www.livejournal.com/users/jmmv/
The NetBSD Project - http://www.NetBSD.org/