tech-kern archive
Re: lockup on memory shortage
Hi,
I got a lockup again. I had top running, here's what it displayed before
the box wedged:
|load averages: 2.20, 1.24, 0.98; up 3+14:52:30 03:40:25
|40 processes: 3 runnable, 35 sleeping, 1 zombie, 1 on CPU
|CPU states: 0.0% user, 0.0% nice, 100% system, 0.0% interrupt, 0.0% idle
|Memory: 294M Act, 144M Inact, 12M Wired, 11M Exec, 77M File, 16K Free
|Swap: 256M Total, 256M Used, 4K Free
|
| PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
| 3744 root 43 0 5520K 348K RUN 2:50 1.10% 0.24% cc1
| 1 root 85 0 748K 4K wait 0:49 0.00% 0.00% <init>
| 5951 bouyer 43 0 756K 796K CPU 0:33 0.00% 0.00% top
| 408 bouyer 85 0 764K 412K select 0:12 0.00% 0.00% screen-4.0.3
| 380 bouyer 85 0 824K 656K select 0:08 0.00% 0.00% sshd
| 218 root 85 0 1780K 5672K pause 0:06 0.00% 0.00% ntpd
|11203 root 85 0 748K 4K wait 0:05 0.00% 0.00% <pbulk-build
| 400 bouyer 43 0 756K 12K RUN 0:01 0.00% 0.00% screen-4.0.3
| 339 root 85 0 752K 4K kqueue 0:01 0.00% 0.00% <master>
|24186 root 85 0 748K 4K kqueue 0:01 0.00% 0.00% <tail>
|15925 root 85 0 124K 116K RUN 0:00 0.00% 0.00% sh
|13806 root 85 0 752K 908K piperd 0:00 0.00% 0.00% cron
| 364 root 85 0 756K 172K nanoslp 0:00 0.00% 0.00% getty
| 326 root 85 0 756K 172K nanoslp 0:00 0.00% 0.00% getty
| 395 root 85 0 1156K 4K pause 0:00 0.00% 0.00% <tcsh>
| 375 bouyer 85 0 1084K 4K pause 0:00 0.00% 0.00% <tcsh>
| 194 bouyer 85 0 1020K 4K pause 0:00 0.00% 0.00% <tcsh>
This time I don't understand where the memory has gone, because there are
no big processes running (unless cc1 grew a lot between the last top
display and the box hanging).
I had on console:
|Out of memory allocating ksiginfo for pid 218
|Out of memory allocating ksiginfo for pid 218
|Out of memory allocating ksiginfo for pid 218
|Out of memory allocating ksiginfo for pid 218
|Out of memory allocating ksiginfo for pid 218
And here's some play with ddb:
|db> show uvm
|Current UVM status:
| pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
| 127464 VM pages: 75264 active, 36757 inactive, 3014 wired, 0 free
| pages 91023 anon, 19645 file, 2884 exec
| freemin=256, free-target=341, wired-max=42488
| faults=-522007880, traps=-519495790, intrs=92796461, ctxswitch=525566965
| softint=228469838, syscalls=-1350524482, swapins=2478, swapouts=2516
| fault counts:
| noram=2248, noanon=0, pgwait=68, pgrele=0
| ok relocks(total)=180961(180970), anget(retrys)=735139129(40445), amapcopy=496101704
| neighbor anon/obj pg=702075445/2109263411, gets(lock/unlock)=1486862887/140522
| cases: anon=485256283, anoncow=201489739, obj=1273861484, prcopy=213001397, przero=1523229842
| daemon and swap counts:
| woke=1168338, revs=1167792, scans=32311371, obscans=12587557, anscans=18608493
| busy=35500, freed=12698798, reactivate=240287, deactivate=22692608
| pageouts=31461, pending=79093, nswget=39981
| nswapdev=1, swpgavail=65535
| swpages=65535, swpginuse=65535, swpgonly=59752, paging=0
|
|db> ps /l
| PID LID S FLAGS STRUCT LWP * NAME WAIT
| 15925 1 2 0 d1766080 sh
| 13806 1 3 80 d1766a00 cron piperd
| 3744 1 2 0 d1767c80 cc1
| 29722 1 3 80 cb44ace0 gcc wait
| 21814 1 3 80 d17667a0 sh wait
| 28972 1 3 80 d17662e0 make wait
| 21184 1 3 80 cd57a360 sh wait
| 18671 1 3 80 cb53bc80 make wait
| 4071 1 3 80 d1766c60 sh wait
| 2134 1 3 80 cd57aa80 make wait
| 21839 1 3 80 cb760340 sh wait
| 16547 1 2 0 cb44aa80 pickup
| 5951 1 2 0 d1766540 top
| 11203 1 3 80 cb7600e0 pbulk-build wait
| 27865 1 3 80 cb760800 sh wait
| 24186 1 3 80 d1661d00 tail kqueue
| 17250 1 3 80 cd57ace0 sh wait
| 194 1 3 80 cb7605a0 tcsh pause
| 395 1 3 80 cb760cc0 tcsh pause
| 413 1 3 80 cb6e00c0 ksh pause
| 401 1 3 80 cb6e0320 tcsh pause
| 408 1 2 0 cb6e0580 screen-4.0.3
| 400 1 2 0 cb6e07e0 screen-4.0.3
| 375 1 3 80 cb6e0a40 tcsh pause
| 380 1 2 0 cb6e0ca0 sshd
| 293 1 3 80 cb53b0a0 sshd netio
| 326 1 2 0 cb53b300 getty
| 318 1 2 0 cb53b560 getty
| 364 1 2 0 ca26a020 getty
| 367 1 2 0 ca271c20 getty
| 360 1 2 4 ca272780 cron
| 351 1 2 0 cb4a4080 qmgr
| 347 1 3 80 cb53ba20 inetd kqueue
| 339 1 2 0 cb53b7c0 master
| 246 1 2 0 cb4a42e0 sshd
| 228 1 3 80 ca2722c0 powerd kqueue
| 218 1 2 1000000 cb4a4540 ntpd
| 109 1 2 0 ca26a280 syslogd
| 1 1 3 80 ca2712a0 init wait
|>0 35 5 204 d1767300 (zombie)
| 31 3 204 cb4a47a0 nfsio nfsiod
| 30 3 204 cb4a4a00 nfsio nfsiod
| 29 3 204 cb4a4c60 nfsio nfsiod
| 28 3 204 ca272060 nfsio nfsiod
| 27 3 204 ca272520 physiod physiod
| 26 3 204 ca2719c0 vmem_rehash vmem_rehash
| 25 3 204 ca271760 aiodoned aiodoned
| 24 2 204 ca271500 ioflush
| > 23 7 204 ca271040 pgdaemon
| 22 3 204 ca26a4e0 cryptoret crypto_wait
| 21 2 204 ca2729e0 xenbus
| 20 3 204 ca272c40 xenwatch evtsq
| 10 3 204 ca26a740 pmfevent pmfevent
| 9 3 204 ca26a9a0 cachegc cachegc
| 8 3 204 ca26ac00 vrele vrele
| 7 3 204 ca267000 xcall/0 xcall
| 6 1 204 ca267260 softser/0
| 5 1 204 ca2674c0 softclk/0
| 4 1 204 ca267720 softbio/0
| 3 1 204 ca267980 softnet/0
| 2 1 205 ca267be0 idle/0
| 1 3 204 c044c080 swapper schedpwait
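To make the 'show uvm' numbers above easier to compare with the top display, here is a minimal sketch that converts the page counts to MiB using the reported pagesize=4096. It also reinterprets the negative counters (faults, traps, syscalls) as unsigned values, on the assumption that they are 32-bit counters that wrapped and were printed as signed. The helper names are made up for illustration.

```python
# Sketch: convert "show uvm" page counts to MiB and unwrap negative counters.
# Assumption: the negative values are 32-bit counters printed as signed ints.

PAGE_SIZE = 4096  # from "pagesize=4096" in the show uvm output


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to MiB."""
    return pages * page_size / (1024 * 1024)


def unwrap_u32(value):
    """Reinterpret a signed 32-bit value as an unsigned one."""
    return value & 0xFFFFFFFF


# Page counts copied from the show uvm output.
uvm_pages = {
    "total": 127464,
    "active": 75264,
    "inactive": 36757,
    "wired": 3014,
    "free": 0,
    "anon": 91023,
    "file": 19645,
    "exec": 2884,
    "swpages": 65535,
    "swpginuse": 65535,
    "swpgonly": 59752,
}

for name, pages in uvm_pages.items():
    print(f"{name:10s} {pages:7d} pages {pages_to_mib(pages):8.1f} MiB")

# Counters that show up as negative in the output.
wrapped = {"faults": -522007880, "traps": -519495790, "syscalls": -1350524482}
for name, raw in wrapped.items():
    print(f"{name:10s} {raw:12d} -> {unwrap_u32(raw)}")
```

With these inputs the active count comes out at 294 MiB, which matches the "294M Act" that top showed, and the swap device is 65535 pages, i.e. the 256M that top reports as fully used.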
ddb bt/a isn't of much use:
|db> bt/a ca271040
|trace: pid 0 lid 23 at 0xca797f38
|breakpoint(ffffff00,80,ca797f68,9,1,a,ca797fa8,c03aa6ff,ca77268c,c05bc009) at netbsd:breakpoint+0x4
|xencons_tty_input(ca77268c,c05bc009,1,c03a3b9b,3b9aca00,0,6,0,4,2) at netbsd:xencons_tty_input+0xa6
|xencons_handler(ca77268c,ca79ac0c,0,64,0,4,0,0,c03a1b85,4) at netbsd:xencons_handler+0x5f
|evtchn_do_event(2,ca79ac0c,ca79abc4,0,fatal page fault in supervisor mode
|trap type 6 code 0 eip c038e6d1 cs 9 eflags 10246 cr2 ca798000 ilevel 8
|kernel: supervisor trap page fault, code=0
Here's a ps/a, and the 'show map' for all processes:
| PID COMMAND STRUCT PROC * UAREA * VMSPACE/VM_MAP
| 15925 sh cd578528 cb346da0 cb6f30d4
| 13806 cron cd578bf8 cb28bda0 ca278750
| 3744 cc1 d0733aa8 cb2d5da0 d166f008
| 29722 gcc cc32ca38 cb242da0 d166fea8
| 21814 sh cb539da4 cbc4dda0 cb6f3c34
| 28972 make cb6eb6d8 cbb0eda0 d166f348
| 21184 sh d0733070 cb312da0 cb6f3414
| 18671 make cd578890 cb246da0 d166f0d8
| 4071 sh cd57800c cb4dada0 d166fa98
| 2134 make d0733e10 cbc4ada0 d166fdd8
| 21839 sh cb6eb1bc cd8b2da0 cb6f3684
| 16547 pickup cc32cda0 cb56eda0 cb6f3a94
| 5951 top d17658d0 d0502da0 d166f828
| 11203 pbulk-buil cb6eb008 cd8c2da0 cb6f3754
| 27865 sh cb6eb524 cb792da0 cb6f3b64
| 24186 tail cd578374 d1582da0 cb6f3344
| 17250 sh cd578dac cd8c5da0 cb6f38f4
| 194 tcsh cb6eb370 cb79ada0 cb6f39c4
| 395 tcsh cb6eb88c cb75ada0 cb6f3d04
| 413 ksh cb6eba40 cb752da0 cb6f3dd4
| 401 tcsh cb6ebbf4 cb71eda0 ca278000
| 408 screen-4.0 cb6ebda8 cb715da0 cb6f3ea4
| 400 screen-4.0 cb539004 cb70bda0 ca2780d0
| 375 tcsh cb5391b8 cb6d7da0 ca2781a0
| 380 sshd cb53936c cb6ceda0 ca278270
| 293 sshd cb539520 cb6c6da0 ca278340
| 326 getty cb5396d4 cb6bada0 ca278410
| 318 getty cb539888 cb583da0 ca2784e0
| 364 getty ca273a38 ca8a3da0 ca278d00
| 367 getty ca273bec ca8a6da0 ca278dd0
| 360 cron ca2736d0 cb2cada0 ca278b60
| 351 qmgr ca273000 cb562da0 ca278820
| 347 inetd cb539bf0 cb57bda0 ca278680
| 339 master cb539a3c cb587da0 ca2785b0
| 246 sshd ca2731b4 cb51bda0 ca2788f0
| 228 powerd ca27351c cb2fada0 ca278a90
| 218 ntpd ca273368 cb51eda0 ca2789c0
| 109 syslogd ca273884 ca8a0da0 ca278c30
| 1 init ca273da0 ca8b2da0 ca278ea0
|>0 system c044bec0 cb404da0 c04aa6c0
|
|db> sh map cb6f30d4
|MAP 0xcb6f30d4: [0x0->0xbf800000]
| #ent=18, sz=68423680, ref=1, version=192, flags=0x41
| pmap=0xca27945c(resident=1, wired=0)
|db> sh map ca278750
|MAP 0xca278750: [0x0->0xbf800000]
| #ent=17, sz=70053888, ref=1, version=12, flags=0x41
| pmap=0xca27907c(resident=1, wired=0)
|db> sh map d166f008
|MAP 0xd166f008: [0x0->0xbf800000]
| #ent=983, sz=663093248, ref=1, version=51374, flags=0x41
| pmap=0xd12ff9b4(resident=1, wired=0)
|db> sh map d166fea8
|MAP 0xd166fea8: [0x0->0xbf800000]
| #ent=12, sz=69980160, ref=1, version=69, flags=0x41
| pmap=0xd12ffb28(resident=1, wired=0)
|db> sh map cb6f3c34
|MAP 0xcb6f3c34: [0x0->0xbf800000]
| #ent=19, sz=70107136, ref=1, version=200, flags=0x41
| pmap=0xd12ff8bc(resident=1, wired=0)
|db> sh map d166f348
|MAP 0xd166f348: [0x0->0xbf800000]
| #ent=16, sz=70053888, ref=1, version=64, flags=0x41
| pmap=0xca279364(resident=1, wired=0)
|db> sh map cb6f3414
|MAP 0xcb6f3414: [0x0->0xbf800000]
| #ent=20, sz=70107136, ref=1, version=55, flags=0x41
| pmap=0xca279aa8(resident=1, wired=0)
|db> sh map d166f0d8
|MAP 0xd166f0d8: [0x0->0xbf800000]
| #ent=21, sz=72151040, ref=1, version=71, flags=0x41
| pmap=0xd12ffd18(resident=1, wired=0)
|db> sh map d166fa98
|MAP 0xd166fa98: [0x0->0xbf800000]
| #ent=20, sz=70107136, ref=1, version=43, flags=0x41
| pmap=0xd12ff5d4(resident=1, wired=0)
|db> sh map d166fdd8
|MAP 0xd166fdd8: [0x0->0xbf800000]
| #ent=21, sz=72151040, ref=1, version=44, flags=0x41
| pmap=0xd12ff2ec(resident=1, wired=0)
|db> sh map cb6f3684
|MAP 0xcb6f3684: [0x0->0xbf800000]
| #ent=20, sz=70107136, ref=1, version=44, flags=0x41
| pmap=0xca2790f8(resident=1, wired=0)
|db> sh map cb6f3684
|MAP 0xcb6f3684: [0x0->0xbf800000]
| #ent=20, sz=70107136, ref=1, version=44, flags=0x41
| pmap=0xca2790f8(resident=1, wired=0)
|db> sh map cb6f3a94
|MAP 0xcb6f3a94: [0x0->0xbf800000]
| #ent=26, sz=72003584, ref=1, version=221, flags=0x41
| pmap=0xca279000(resident=1, wired=0)
|db> sh map d166f828
|MAP 0xd166f828: [0x0->0xbf800000]
| #ent=20, sz=70127616, ref=1, version=214, flags=0x41
| pmap=0xd12fff08(resident=1, wired=0)
|db> sh map cb6f3754
|MAP 0xcb6f3754: [0x0->0xbf800000]
| #ent=23, sz=75304960, ref=1, version=305, flags=0x41
| pmap=0xca279174(resident=1, wired=0)
|db> sh map cb6f3b64
|MAP 0xcb6f3b64: [0x0->0xbf800000]
| #ent=20, sz=70107136, ref=1, version=197, flags=0x41
| pmap=0xca2793e0(resident=1, wired=0)
|db> sh map cb6f3344
|MAP 0xcb6f3344: [0x0->0xbf800000]
| #ent=13, sz=69980160, ref=1, version=201, flags=0x41
| pmap=0xd12ffe8c(resident=1, wired=0)
|db> sh map cb6f38f4
|MAP 0xcb6f38f4: [0x0->0xbf800000]
| #ent=20, sz=70107136, ref=1, version=204, flags=0x41
| pmap=0xca27926c(resident=1, wired=0)
|db> sh map cb6f39c4
|MAP 0xcb6f39c4: [0x0->0xbf800000]
| #ent=26, sz=69271552, ref=1, version=270, flags=0x41
| pmap=0xca2792e8(resident=1, wired=0)
|db> sh map cb6f3d04
|MAP 0xcb6f3d04: [0x0->0xbf800000]
| #ent=26, sz=69414912, ref=1, version=430, flags=0x41
| pmap=0xca2794d8(resident=1, wired=0)
|db> sh map cb6f3dd4
|MAP 0xcb6f3dd4: [0x0->0xbf800000]
| #ent=13, sz=69980160, ref=1, version=348, flags=0x41
| pmap=0xca279554(resident=1, wired=0)
|db> sh map ca278000
|MAP 0xca278000: [0x0->0xbf800000]
| #ent=26, sz=69259264, ref=1, version=267, flags=0x41
| pmap=0xca27964c(resident=1, wired=0)
|db> sh map cb6f3ea4
|MAP 0xcb6f3ea4: [0x0->0xbf800000]
| #ent=23, sz=70107136, ref=1, version=190, flags=0x41
| pmap=0xca2795d0(resident=1, wired=0)
|db> sh map ca2780d0
|MAP 0xca2780d0: [0x0->0xbf800000]
| #ent=23, sz=70107136, ref=1, version=237, flags=0x41
| pmap=0xca2796c8(resident=1, wired=0)
|db> sh map ca2781a0
|MAP 0xca2781a0: [0x0->0xbf800000]
| #ent=28, sz=69341184, ref=1, version=314, flags=0x41
| pmap=0xca279744(resident=1, wired=0)
|db> sh map ca278270
|MAP 0xca278270: [0x0->0xbf800000]
| #ent=69, sz=75927552, ref=1, version=192, flags=0x41
| pmap=0xca2797c0(resident=1, wired=0)
|db> sh map ca278340
|MAP 0xca278340: [0x0->0xbf800000]
| #ent=69, sz=75927552, ref=1, version=246, flags=0x41
| pmap=0xca27983c(resident=1, wired=0)
|db> sh map ca278410
|MAP 0xca278410: [0x0->0xbf800000]
| #ent=19, sz=70070272, ref=1, version=207, flags=0x41
| pmap=0xca2798b8(resident=1, wired=0)
|db> sh map ca2784e0
|MAP 0xca2784e0: [0x0->0xbf800000]
| #ent=19, sz=70070272, ref=1, version=210, flags=0x41
| pmap=0xca279934(resident=1, wired=0)
|db> sh map ca278d00
|MAP 0xca278d00: [0x0->0xbf800000]
| #ent=19, sz=70070272, ref=1, version=207, flags=0x41
| pmap=0xca279e0c(resident=1, wired=0)
|db> sh map ca278dd0
|MAP 0xca278dd0: [0x0->0xbf800000]
| #ent=19, sz=70070272, ref=1, version=96, flags=0x41
| pmap=0xca279e88(resident=1, wired=0)
|db> sh map ca278b60
|MAP 0xca278b60: [0x0->0xbf800000]
| #ent=17, sz=70053888, ref=1, version=3675, flags=0x41
| pmap=0xca279d14(resident=106, wired=0)
|db> sh map ca278820
|MAP 0xca278820: [0x0->0xbf800000]
| #ent=26, sz=72003584, ref=1, version=211, flags=0x41
| pmap=0xca279b24(resident=1, wired=0)
|db> sh map ca278680
|MAP 0xca278680: [0x0->0xbf800000]
| #ent=22, sz=70131712, ref=1, version=12, flags=0x41
| pmap=0xca279a2c(resident=1, wired=0)
|db> sh map ca2785b0
|MAP 0xca2785b0: [0x0->0xbf800000]
| #ent=26, sz=72003584, ref=1, version=640, flags=0x41
| pmap=0xca2799b0(resident=1, wired=0)
|db> sh map ca2788f0
|MAP 0xca2788f0: [0x0->0xbf800000]
| #ent=50, sz=73154560, ref=1, version=5440, flags=0x41
| pmap=0xca279ba0(resident=1, wired=0)
|db> sh map ca278a90
|MAP 0xca278a90: [0x0->0xbf800000]
| #ent=19, sz=70111232, ref=1, version=11, flags=0x41
| pmap=0xca279c98(resident=1, wired=0)
|db> sh map ca2789c0
|MAP 0xca2789c0: [0x0->0xbf800000]
| #ent=26, sz=72691712, ref=1, version=108, flags=0x45
| pmap=0xca279c1c(resident=1418, wired=1413)
|db> sh map ca278c30
|MAP 0xca278c30: [0x0->0xbf800000]
| #ent=19, sz=70082560, ref=1, version=172, flags=0x41
| pmap=0xca279d90(resident=1, wired=0)
|db> sh map ca278ea0
|MAP 0xca278ea0: [0x0->0xbf800000]
| #ent=20, sz=70090752, ref=1, version=77, flags=0x41
| pmap=0xca279f04(resident=1, wired=0)
|db> sh map c04aa6c0
|MAP 0xc04aa6c0: [0x0->0xbfdfc000]
| #ent=0, sz=0, ref=1, version=1, flags=0x41
| pmap=0xc04ce780(resident=2174, wired=1629)
The map for cc1 looks large, but I'm not sure it explains where the
memory went. I wonder if there's a memory leak in the kernel that
only shows up with certain usage patterns (the box can build packages for
days with almost no swap in use, so it's not a slow memory leak).
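As a rough cross-check of the cc1 question, the 'sh map' output can be parsed and sorted by sz. The sketch below does that for three of the maps above (cc1, ntpd and the first sh, identified via the ps/a table) and compares the largest sz with the anonymous memory reported by 'show uvm'. Two assumptions are mine: that swpgonly counts pages living only on swap, so anon + swpgonly approximates total anonymous memory, and that sz is the virtual size of the map, which need not be fully backed. The regular expression and the function name are illustrative only.

```python
# Sketch: parse ddb "sh map" output and rank the maps by virtual size.
import re

MAP_RE = re.compile(
    r"MAP (0x[0-9a-f]+):.*?#ent=(\d+), sz=(\d+),.*?resident=(\d+), wired=(\d+)\)",
    re.DOTALL,
)

# Three entries copied from the log: cc1, ntpd, sh (pid 15925).
SAMPLE = """\
|db> sh map d166f008
|MAP 0xd166f008: [0x0->0xbf800000]
| #ent=983, sz=663093248, ref=1, version=51374, flags=0x41
| pmap=0xd12ff9b4(resident=1, wired=0)
|db> sh map ca2789c0
|MAP 0xca2789c0: [0x0->0xbf800000]
| #ent=26, sz=72691712, ref=1, version=108, flags=0x45
| pmap=0xca279c1c(resident=1418, wired=1413)
|db> sh map cb6f30d4
|MAP 0xcb6f30d4: [0x0->0xbf800000]
| #ent=18, sz=68423680, ref=1, version=192, flags=0x41
| pmap=0xca27945c(resident=1, wired=0)
"""


def parse_maps(text):
    """Return one dict per MAP block found in the ddb output."""
    maps = []
    for m in MAP_RE.finditer(text):
        addr, nent, size, resident, wired = m.groups()
        maps.append({
            "addr": addr,
            "entries": int(nent),
            "sz": int(size),
            "resident": int(resident),
            "wired": int(wired),
        })
    return maps


maps = sorted(parse_maps(SAMPLE), key=lambda m: m["sz"], reverse=True)
for m in maps:
    print(f"{m['addr']} ent={m['entries']:4d} sz={m['sz'] / 2**20:7.1f} MiB "
          f"resident={m['resident']} wired={m['wired']}")

# Anonymous memory from show uvm: resident anon pages plus swap-only pages.
PAGE_SIZE = 4096
anon_total_mib = (91023 + 59752) * PAGE_SIZE / 2**20
print(f"anon + swpgonly = {anon_total_mib:.1f} MiB")
```

The cc1 map comes out at about 632 MiB of virtual size, against roughly 589 MiB of anonymous memory in RAM and swap combined. The two figures are of the same order, but since sz is not the same thing as allocated memory this only suggests cc1 could account for most of it; it doesn't rule out a kernel leak.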