NetBSD-Bugs archive


kern/40027: pagedaemon loops on memory shortage



>Number:         40027
>Category:       kern
>Synopsis:       pagedaemon loops on memory shortage
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Nov 25 20:15:00 +0000 2008
>Originator:     Manuel Bouyer
>Release:        NetBSD 5.99.3
>Organization:
>Environment:
System: NetBSD NetBSD 5.99.3 (XEN3PAE_DOMU) #13: Tue Nov 25 16:22:41 CET 2008  
bouyer@rock:/dsk/l1/misc/bouyer/tmp/i386/obj/dsk/l1/misc/bouyer/current/src/sys/arch/i386/compile/XEN3PAE_DOMU
 i386
Architecture: i386
Machine: i386

This also affects 5.0_BETA as of today.

>Description:

Test box: i386 with 512M RAM and 128 or 256M swap.
If a user program eats all available RAM+swap, the pagedaemon will
eventually enter a loop where it uses 100% of the CPU without making
any progress, even though there is freeable memory (in the case below,
there are inactive pages, and pages dedicated to executable or file
cache, that could be recycled).

There is no disk activity, and CPU is 100% busy in system.
Here's some collected info from ddb, and snapshots of 'top -s1' and
'systat vm -w1' windows.

ddb entered by cnmagic
Stopped in pid 0.24 (system) at netbsd:breakpoint+0x4:  popl    %ebp
db> sh uvm
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
  127451 VM pages: 77010 active, 37610 inactive, 1584 wired, 4 free
  pages  90929 anon, 23202 file, 576 exec
  freemin=256, free-target=341, wired-max=42483
  faults=1568252, traps=1506580, intrs=590339, ctxswitch=1541450
  softint=465451, syscalls=2806949, swapins=110, swapouts=125
  fault counts:
    noram=37088, noanon=6, pgwait=2, pgrele=0
    ok relocks(total)=5158(5161), anget(retrys)=150374(4059), amapcopy=61630
    neighbor anon/obj pg=9019/67148, gets(lock/unlock)=19444/1102
    cases: anon=111886, anoncow=4180, obj=17398, prcopy=2043, przero=1001038
  daemon and swap counts:
    woke=64202, revs=39057, scans=1702660, obscans=575273, anscans=990925
    busy=0, freed=799033, reactivate=2580, deactivate=1648405
    pageouts=163024, pending=215160, nswget=1367
    nswapdev=1, swpgavail=32767
    swpages=32767, swpginuse=32767, swpgonly=32763, paging=0
db> 

ps /l shows that 0.24 is the pagedaemon

top:
load averages:  2.74,  1.89,  0.98                  up 0 days,  0:14   20:22:35
22 processes:  4 runnable, 17 sleeping, 1 on CPU
CPU states:  0.0% user,  0.0% nice,  100% system,  0.0% interrupt,  0.0% idle
Memory: 300M Act, 147M Inact, 6336K Wired, 2304K Exec, 90M File, 20K Free
Swap: 128M Total, 128M Used, 4K Free

  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
  470 root      85    0   752K 1184K RUN        0:05  2.85%  2.83% tar
  490 root      39    0   748K  323M RUN        0:01  0.82%  0.73% hang
    0 root     126    0     0K   15M pgdaemon   2:12  0.00%  0.00% [system]

systat:
    3 users    Load  3.63  2.16  1.10                  Tue Nov 25 20:22:54

Proc:r  d  s  w     Csw    Trp    Sys   Int   Sof    Flt      PAGING   SWAPPING
     3     7  1     607     30     13   161   107     30      in  out   in  out
                                                        ops         4    1    1
  97.9% Sy   0.0% Us   0.0% Ni   0.0% In   2.1% Id    pages         4
|    |    |    |    |    |    |    |    |    |    |
=================================================                         forks
                                                                          fkppw
           memory totals (in kB)             161 Interrupts               fksvm
          real  virtual     free                 vcpu0 xencons            pwait
Active  307940   439008       16             153 vcpu0 clock              relck
All     509788   640856       16                 vcpu0 xenbus             rlkok
                                               6 vcpu0 xbd0               noram
Namei         Sys-cache     Proc-cache           vcpu0 xbd1               ndcpy
    Calls     hits    %     hits     %         2 vcpu0 xennet0            fltcp
       17       17  100                                                   zfod
                                                                          cow
Disks:  xbd0  xbd1   md0                                              256 fmin
 seeks                                                                341 ftarg
 xfers     6                                                              itarg
 bytes   71K                                                         1594 wired
 %busy   0.5                                                            4 pdfre



Eventually one of three things happens: the memory-hungry program is
killed after some time, the box dies completely with "Out of memory
allocating ksiginfo for pid xxx", or it sits there looping in the
pagedaemon.

There are more details in the thread "lookup on memory shortage" on
tech-kern:
http://mail-index.netbsd.org/tech-kern/2008/09/30/msg002948.html

>How-To-Repeat:
        On a system with 512M RAM and 128 or 256M swap (I'm not sure these
values are critical), run a program that will generate dirty pages in the
file cache (e.g. tar cf /root/pkgsrc.tar pkgsrc), and once there are dirty
file cache pages, start a memory-hungry program:
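#include <stdio.h>
#include <stdlib.h>     /* malloc(), exit() */
#include <string.h>     /* memset(), memcmp() */
#include <unistd.h>     /* sleep() */

int main(void)
{
        int jj = 0;
        int alloc = 1024 * 1024 * 10;   /* 10MB per iteration */

        /* Allocate and dirty memory forever; nothing is ever freed. */
        while(1)
        {
                char *p = malloc(alloc);
                if (p == NULL)
                {
                        fprintf(stderr, "failed\n");
                        sleep(30);
                        exit(1);
                }
                /* Touch every page so it is really backed by RAM or swap. */
                memset(p, jj, alloc);
                /* Both halves were filled with the same byte. */
                if (memcmp(p, p+(alloc/2), alloc/2) != 0)
                {
                        fprintf(stderr, "corrupted memory\n");
                        exit(1);
                }
                fprintf (stderr, "%d ", jj++);
        }
}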

>Fix:
        unknown


