tech-kern archive


lockup on memory shortage



Hi,
I'm seeing lockups on -current systems running pkgsrc pbulk builds. It usually
happens several times in a pbulk run. When the system locks up, no processes
are scheduled any more and there is no reply to ping.
Below is the console output, and some information I could collect from ddb
(the "fatal breakpoint trap in supervisor mode" is because I entered the ddb
magic sequence on the console).

Out of memory allocating ksiginfo for pid 232
Out of memory allocating ksiginfo for pid 232
Out of memory allocating ksiginfo for pid 232
Out of memory allocating ksiginfo for pid 232
Out of memory allocating ksiginfo for pid 232
Out of memory allocating ksiginfo for pid 232
Out of memory allocating ksiginfo for pid 232
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip ffffffff8036ef35 cs e030 rflags 202 cr2  7f7ffd10fc00 
cpl 6 rsp ffffa000248ada48
db> tr
breakpoint() at netbsd:breakpoint+0x5
xencons_tty_input() at netbsd:xencons_tty_input+0xc5
xencons_handler() at netbsd:xencons_handler+0x96
evtchn_do_event() at netbsd:evtchn_do_event+0xfa
call_evtchn_do_event() at netbsd:call_evtchn_do_event+0xd
hypervisor_callback() at netbsd:hypervisor_callback+0xa3
pool_cache_get_paddr() at netbsd:pool_cache_get_paddr+0xb1
pmap_enter_ma() at netbsd:pmap_enter_ma+0x196
pmap_enter() at netbsd:pmap_enter+0x5a
uvm_fault_internal() at netbsd:uvm_fault_internal+0x971
trap() at netbsd:trap+0x805
--- trap (number 0) ---
0:
db> ps
 PID           PPID     PGRP        UID S   FLAGS LWPS          COMMAND    WAIT
 2402         25028     2892          0 2  0x4000    1          cc1plus
 25028         1697     2892          0 2  0x4000    1              g++    wait
 1697          2126     2892          0 2  0x4000    1               sh    wait
 2126         22448     2892          0 2  0x4000    1               sh    wait
 22448         9362     2892          0 2  0x4000    1               sh    wait
 9362         22231     2892          0 2  0x4000    1            gmake    wait
 22231          582     2892          0 2  0x4000    1            gmake    wait
 582          15724     2892          0 2  0x4000    1               sh    wait
 15724         5667     2892          0 2  0x4000    1            gmake    wait
 5667          7416     2892          0 2  0x4000    1               sh    wait
 7416          5434     2892          0 2  0x4000    1            gmake    wait
 5434         12811     2892          0 2  0x4000    1               sh    wait
 12811        14503     2892          0 2  0x4000    1            gmake    wait
 14503         4086     2892          0 2  0x4000    1               sh    wait
 4086         26139     2892          0 2  0x4000    1            gmake    wait
 26139         5639     2892          0 2  0x4000    1            gmake    wait
 5639         25941     2892          0 2  0x4000    1               sh    wait
 25941        24586     2892          0 2  0x4000    1             make    wait
 24586         4937     2892          0 2  0x4000    1               sh    wait
 4937          4953     2892          0 2  0x4000    1             make    wait
 4953         17989     2892          0 2  0x4000    1               sh    wait
 3558           337      337         12 2  0x4100    1           pickup
 3535          6335     3535       1000 2  0x4100    1     screen-4.0.3
 6335         27932     6335       1000 2  0x4000    1             tcsh   pause
 27932         2643     2643       1000 2   0x100    1             sshd  select
 2643           245     2643          0 2  0x4100    1             sshd   netio
 13261         9469    13261       1000 2  0x4000    1              top
 9469         12936     9469       1000 2  0x4000    1             tcsh   pause
 17989        11552     2892          0 2  0x4000    1      pbulk-build    wait
 11552         2892     2892          0 2  0x4000    1               sh    wait
 17040        13093    17040          0 2  0x4000    1             tail  kqueue
 2892             1     2892          0 2  0x4000    1               sh    wait
 13093        14587    13093          0 2  0x4000    1             tcsh   pause
 14587        13782    14587          0 2  0x4000    1              ksh   pause
 13782        12936    13782       1000 2  0x4000    1             tcsh   pause
 12936        13758    12936       1000 2   0x101    1     screen-4.0.3  select
 13758          370    13758       1000 2  0x4100    1     screen-4.0.3
 370            367      370       1000 2  0x4000    1             tcsh   pause
 367            282      282       1000 2   0x100    1             sshd  select
 282            245      282          0 2  0x4101    1             sshd   netio
 362              1        1          0 2  0x4000    1            getty
 313              1        1          0 2  0x4000    1            getty
 361              1        1          0 2  0x4000    1            getty
 371              1      371          0 2  0x4000    1            getty
 356              1      356          0 2       0    1             cron
 350              1      350          0 2       0    1            inetd  kqueue
 344            337      337         12 2  0x4100    1             qmgr
 337              1      337          0 2  0x4100    1           master
 245              1      245          0 2       0    1             sshd  select
 221              1      221          0 2       0    1           powerd  kqueue
>232              1      232          0 2       0    1             ntpd
 109              1      109          0 2       0    1          syslogd
 1                0        1          0 2  0x4001    1             init    wait
 0               -1        0          0 2 0x20002   23           system       *
db> show uvmexp
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
  125114 VM pages: 70679 active, 34527 inactive, 1520 wired, 9 free
  pages  84819 anon, 18159 file, 3748 exec
  freemin=256, free-target=341, wired-max=41704
  faults=1230798242, traps=1231391590, intrs=19787899, ctxswitch=69011216
  softint=27506896, syscalls=658831806, swapins=490, swapouts=530
  fault counts:
    noram=18668, noanon=0, pgwait=8, pgrele=0
    ok relocks(total)=176116(176124), anget(retrys)=210791151(137139), amapcopy=126345300
    neighbor anon/obj pg=330698213/1683147083, gets(lock/unlock)=418364813/38984

    cases: anon=153914740, anoncow=51459898, obj=342346991, prcopy=76017815, przero=596820944
  daemon and swap counts:
    woke=32248, revs=13625, scans=3718773, obscans=2574210, anscans=511975
    busy=4666, freed=2786331, reactivate=248373, deactivate=5131980
    pageouts=243529, pending=88830, nswget=136855
    nswapdev=1, swpgavail=65535
    swpages=65535, swpginuse=65535, swpgonly=56406, paging=0


This raises several questions. First, I have trouble parsing the
swap section of the show uvmexp output: what do swpgavail, swpages and
swpginuse mean? Is my swap full?
Second, there are only 9 free pages, but 18159 are allocated to the file
cache. Why couldn't the pagedaemon free pages from the file
cache?
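
As a rough sanity check on the numbers above, here is a minimal Python
sketch that converts the page counts from the show uvmexp output into sizes
and compares them. It assumes, from the field names only, that swpages is
the total number of swap pages and swpginuse the number currently in use;
the helper names (pages_to_mib, summarize) are made up for this
illustration and are not part of any NetBSD tool.

```python
PAGE_SIZE = 4096  # pagesize reported by "show uvmexp"

# Page counts copied by hand from the "show uvmexp" output above.
counters = {
    "total": 125114,
    "free": 9,
    "freemin": 256,
    "file": 18159,
    "swpages": 65535,
    "swpginuse": 65535,
}


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to MiB."""
    return pages * page_size / (1024 * 1024)


def summarize(c):
    """Derive a few comparisons from the raw counters.

    Assumption: swpages is the total swap size in pages and swpginuse
    is the number of those pages in use (inferred from the names).
    """
    return {
        "free_below_freemin": c["free"] < c["freemin"],
        "swap_pages_left": c["swpages"] - c["swpginuse"],
        "file_cache_mib": round(pages_to_mib(c["file"]), 1),
        "swap_mib": round(pages_to_mib(c["swpages"]), 1),
    }


if __name__ == "__main__":
    for name, value in summarize(counters).items():
        print(f"{name}: {value}")
```

Under that assumption, the free page count is below freemin and there are
no unused swap pages left, while the file cache still holds about 70 MiB.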

Is there anything more I can do the next time this happens?


-- 
Manuel Bouyer, LIP6, Universite Paris VI.           
Manuel.Bouyer%lip6.fr@localhost
     NetBSD: 26 years of experience will always make the difference
--

