Subject: kern/26908: uvm hang in 2.0_BETA
To: None <gnats-bugs@gnats.NetBSD.org>
From: None <nb-pr@gendalia.org>
List: netbsd-bugs
Date: 09/10/2004 20:00:46
>Number:         26908
>Category:       kern
>Synopsis:       uvm hang in 2.0_BETA
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Sep 11 01:18:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator:     Tracy Di Marco White
>Release:        NetBSD 2.0_BETA
>Organization:
>Environment:
	
	
System: NetBSD draal.ait.iastate.edu 2.0_BETA NetBSD 2.0_BETA (DRAAL) #0: Thu Sep 9 16:47:40 CDT 2004 gendalia@draal.ait.iastate.edu:/usr/obj/2.0/kern/i386/DRAAL i386
Architecture: i386
Machine: i386
>Description:
We have a Perl script that reads about 10MB of network traffic data and
builds it into a four-level nested hash (a "4d" hash).  It then walks that
hash to do some basic counting, building a couple of smaller hashes along
the way.  Next it goes through a short list (0-500 entries) of IP
addresses and takes some action on each, and finally it writes its state
info to a file for the next run.
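
For readers without access to the script, the shape of the workload can be
sketched roughly as follows.  This is a hypothetical Python stand-in, not
the Perl script itself: the key names (source, destination, protocol,
port), the record layout, and the function names are all assumptions made
for illustration.

```python
# Hypothetical stand-in for the Perl script described above.
# Key names and record layout are assumptions, not taken from the script.
from collections import defaultdict
import json


def build_flows(records):
    """Build a four-level nested dict: src -> dst -> proto -> port -> bytes."""
    flows = defaultdict(
        lambda: defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    )
    for src, dst, proto, port, nbytes in records:
        flows[src][dst][proto][port] += nbytes
    return flows


def bytes_per_source(flows):
    """Basic counting pass: total bytes sent by each source address."""
    totals = defaultdict(int)
    for src, dsts in flows.items():
        for protos in dsts.values():
            for ports in protos.values():
                totals[src] += sum(ports.values())
    return dict(totals)


def save_state(totals, path):
    """Write the per-source totals to a file for the next run."""
    with open(path, "w") as fh:
        json.dump(totals, fh)


# Tiny made-up sample; the real input is about 10MB of traffic data.
records = [
    ("10.0.0.1", "10.0.0.2", "tcp", 80, 1500),
    ("10.0.0.1", "10.0.0.2", "tcp", 80, 500),
    ("10.0.0.1", "10.0.0.3", "udp", 53, 100),
    ("10.0.0.4", "10.0.0.1", "tcp", 22, 900),
]
flows = build_flows(records)
totals = bytes_per_source(flows)
```

The point of the sketch is only that the process builds one large,
deeply nested in-memory structure on each run.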

This runs every 15 minutes.  After upgrading to the latest 2.0_BETA, it
still hangs the machine, after anywhere from an hour to a day of uptime.
Prior to the upgrade, the machine never stayed up more than a few hours.
The earliest system we've run this script on is 1.6ZE.  We've run it on
three different machines, with the same problem on each.

cpu_Debugger(2,1,c0797bc0,ccf5cccc,c03f2c9c) at netbsd:cpu_Debugger+0x4
comintr(c1b4f400,a,10,30,10) at netbsd:comintr+0x68a
Xintr_legacy4() at netbsd:Xintr_legacy4+0xa4
--- interrupt ---
pmap_kenter_pa(c1f17000,5f63000,3,1f17000,c1f07000) at netbsd:pmap_kenter_pa+0x17
uvm_km_kmemalloc1(c0797bc0,0,39000,0,ffffffff) at netbsd:uvm_km_kmemalloc1+0xf0
malloc(39000,c075d360,1,0,40500) at netbsd:malloc+0xd5
amap_alloc1(e0c0,0,1,8118000,cc019dc0) at netbsd:amap_alloc1+0x7e
amap_copy(cc019dc0,ccf2fab0,1,1,8118000) at netbsd:amap_copy+0xd3
uvmfault_amapcopy(ccf5ced4,6,0,0,cc01c948) at netbsd:uvmfault_amapcopy+0xa4
uvm_fault(cc019dc0,8118000,0,1,ccf80214) at netbsd:uvm_fault+0x10d
trap() at netbsd:trap+0x38d
--- trap (number 6) ---
0x805cde8:

A crash dump and gdb kernel matching the backtrace above are located at
ftp.iastate.edu:/pub/NetBSD/local/netflow/

Other hangs:
db> show uvm
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
  255712 VM pages: 164116 active, 82093 inactive, 831 wired, 25 free
  min  10% (25) anon, 10% (25) file, 5% (12) exec
  max  80% (204) anon, 50% (128) file, 30% (76) exec
  pages  77327 anon, 174721 file, 1438 exec
  freemin=64, free-target=85, inactive-target=82069, wired-max=85237
  faults=6136816, traps=9379125, intrs=3211008, ctxswitch=41779665
  softint=4775534, syscalls=17791457, swapins=182, swapouts=193
  fault counts:
    noram=0, noanon=0, pgwait=0, pgrele=0
    ok relocks(total)=3787(3787), anget(retrys)=7723283(0), amapcopy=20728
    neighbor anon/obj pg=54872/850173, gets(lock/unlock)=218413/3787
    cases: anon=4436383, anoncow=30060, obj=212521, prcopy=5892, przero=1443369
  daemon and swap counts:
    woke=16918127, revs=806, scans=718636, obscans=226514, anscans=0
    busy=0, freed=0, reactivate=488997, deactivate=809547
    pageouts=0, pending=0, nswget=0
    nswapdev=1, nanon=751727, nanonneeded=751727 nfreeanon=680987
    swpages=512063, swpginuse=0, swpgonly=0 paging=0
db> t
cpu_Debugger(2,1,c0787c20,ccec3ccc,a) at netbsd:cpu_Debugger+0x4
comintr(c1b48a00,a,10,30,10) at netbsd:comintr+0x6c1
Xintr_legacy4() at netbsd:Xintr_legacy4+0xa4
--- interrupt ---
pmap_kenter_pa(c1ffb000,1b000,3,1ffb000,c1ec7000) at netbsd:pmap_kenter_pa+0x29
uvm_km_kmemalloc1(c0787c20,0,45000,0,ffffffff) at netbsd:uvm_km_kmemalloc1+0xf0
malloc(45000,c074d240,1,c1ec7000,40100) at netbsd:malloc+0xd5
amap_alloc1(11186,0,1,8118000,cc00fa50) at netbsd:amap_alloc1+0x95
amap_copy(cc00fa50,cce13d18,1,1,8118000) at netbsd:amap_copy+0xd3
uvmfault_amapcopy(ccec3ed4,6,0,0,cc0128c4) at netbsd:uvmfault_amapcopy+0xa4
uvm_fault(cc00fa50,8118000,0,1,cc012d68) at netbsd:uvm_fault+0x10d
trap() at netbsd:trap+0x38d
--- trap (number 6) ---
0x805cde8:
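
The numbers in the two backtraces appear to be consistent with each other,
which the following Python sketch checks.  It rests on assumptions that
are not confirmed here: that the first argument ddb shows for amap_alloc1
is the slot count (one slot per page), that each per-slot array entry is 4
bytes on i386, and that the allocation is rounded up to whole pages.

```python
# Arithmetic check on the ddb output above.  Assumptions (not confirmed
# against the kernel source): amap_alloc1's first argument is a slot count,
# each slot array entry is 4 bytes on i386, and the request is rounded up
# to a whole number of pages.
PAGE_SIZE = 4096        # pagesize from "show uvm"
SLOT_ENTRY_BYTES = 4    # assumed int/pointer width on i386
FREE_PAGES = 25         # "25 free" from "show uvm"
FREEMIN = 64            # "freemin=64" from "show uvm"


def page_round(nbytes):
    """Round a byte count up to the next multiple of the page size."""
    return (nbytes + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def amap_numbers(slots):
    """Derive sizes implied by an amap with the given number of slots."""
    array_bytes = page_round(slots * SLOT_ENTRY_BYTES)
    return {
        "covered_bytes": slots * PAGE_SIZE,       # address range mapped
        "array_bytes": array_bytes,               # one per-slot array
        "array_pages": array_bytes // PAGE_SIZE,  # pages for that array
    }


first = amap_numbers(0xe0c0)     # first backtrace
second = amap_numbers(0x11186)   # second backtrace

for name, info in (("first", first), ("second", second)):
    print(name, hex(info["array_bytes"]), info["array_pages"], "pages,",
          info["covered_bytes"] // (1024 * 1024), "MB covered")
print("free pages:", FREE_PAGES, "freemin:", FREEMIN)
```

Under those assumptions the computed array sizes are 0x39000 and 0x45000,
matching the malloc() arguments in the two traces.  The second request
needs 69 pages, while "show uvm" from the same session reports 25 free
pages against a freemin of 64.  No "show uvm" output was captured for the
first trace.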

>How-To-Repeat:
Run the flow analysis tools every 15 minutes and wait for the hang.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
 source date 20040909