tech-kern: analog hits a UVM bug?

Subject: analog hits a UVM bug?
To: None <tech-kern@netbsd.org>
From: Robby Griffin <rmg@MIT.EDU>
List: tech-kern
Date: 10/11/2000 14:59:18
I'm using analog to generate cumulative statistics for a group of sites
at work, but the machine I'm using nearly always drops into the debugger
during the run. I'm wondering whether this is likely to indicate a kernel
bug or whether it's just a "Don't Do That Then" situation.

I didn't capture the panic message that goes with this trace, but in
the past it's been something along the lines of "page already freed".

 db> trace
 _Debugger(3,0,c0e0efc4,d5322e58,c028a518) at _Debugger+0x4
 _panic(c028a492,c0e0efc4,d5322e70,c028ba0b,c0e0efc4) at _panic+0x55
 _uvm_page_lookup_freelist(c0e0efc4,c0e0efc4,d4faf9f0,0,d5322e84) at _uvm_page_lookup_freelist+0x50
 _uvm_pagefree(c0e0efc4) at _uvm_pagefree+0x1c7
 _uvm_anfree(d4faf9f0) at _uvm_anfree+0x6f
 _amap_wipeout(d5131b48,d5324d44,d5324d44,0,d5322ecc) at _amap_wipeout+0x3b
 _amap_unref(d5324d44,0) at _amap_unref+0x1a
 _uvm_unmap_detach(d5324c90,0,d5102dc4,d531300c,d5324c90) at _uvm_unmap_detach+0x31
 _uvm_unmap(d5102dc4,0,bfbfe000,d5102dc4,d5322f2c) at _uvm_unmap+0x5d
 _uvm_deallocate(d5102dc4,0,bfbfe000) at _uvm_deallocate+0x38
 _exit1(d531300c,0,d5322fa8,c029ffde,d531300c) at _exit1+0x12b
 _sys_exit(d531300c,d5322f88,d5322f80,0,ffffffff) at _sys_exit+0x14
 _syscall() at _syscall+0x20e
 --- syscall (number 1) ---
 0x4009f507:
 db> continue
 syncing disks... 10 8 done

 dump to dev 0,1 not possible
 rebooting...

Machine statistics:

 NetBSD khwarizmi 1.4.2 NetBSD 1.4.2 (GENERIC+MAXUSERS256) #1: Wed Jun 28 03:32:21 EDT 2000     root@khwarizmi:/usr/src/sys/arch/i386/compile/GENERIC+MAXUSERS256 i386

 (It really is just GENERIC with maxusers set to 256)

 The machine has 512M of RAM but only about 240M of swap for whatever
 reason, hence the lack of a dump.

Application notes:

 I'm running a series of analog processes from a Perl script, operating
 on about 2.5G of logs. One run of analog covers all the data, another
 covers all the data from this month, and there are batches of analog
 runs covering individual sites within the data. The crashes do not
 always occur on the same analog process and occasionally the entire run
 is completed without a problem.
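
 Since the crash is not tied to one particular analog process, it may
 help to record which run was in flight when the machine went down. The
 following is only a sketch, in Python rather than the Perl script
 described above (which is not shown here). The name run_batches, the
 log path, and the command lines in the usage comment are hypothetical.

```python
import os
import subprocess


def run_batches(commands, log_path):
    """Run each command in order, logging start and exit to log_path.

    Each record is flushed and fsync'ed before the command starts, so
    the log should show which run was in progress if the machine panics.
    """
    results = []
    with open(log_path, "a") as log:
        for index, argv in enumerate(commands):
            log.write(f"START {index} {' '.join(argv)}\n")
            log.flush()
            os.fsync(log.fileno())  # push the record to disk before running

            status = subprocess.run(argv).returncode

            log.write(f"END {index} status={status}\n")
            log.flush()
            os.fsync(log.fileno())
            results.append(status)
    return results


# Hypothetical usage; the configuration file names are placeholders:
#   run_batches([["analog", "-G", "+gall.cfg"],
#                ["analog", "-G", "+gmonth.cfg"]], "analog-runs.log")
```

 A START line with no matching END line after a reboot would point at
 the run that was active at the time of the panic.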

 For the cumulative analysis of all logs, the analog process size exceeds
 128M. In the past this resulted in an analog coredump due to my shell
 resource limits. I've been setting the limits higher before doing the
 runs lately, and that's when I've been seeing UVM-related crashes. In
 particular, as an unprivileged user, my limits are:

                 default              raised (crashing runs)

 cputime         unlimited
 filesize        unlimited
 datasize        131072 kbytes        384000
 stacksize       2048 kbytes          16384
 coredumpsize    unlimited
 memoryuse       480924 kbytes
 descriptors     64                   200
 memorylocked    160308 kbytes        256000
 maxproc         80                   200
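
 For anyone trying to reproduce this, the raised limits could be applied
 to a child process along the following lines. This is a sketch under
 assumptions, not the procedure actually used above (the limits there
 were set from the shell). It assumes the values in the second column
 are in kbytes like the first column; clamp_to_hard, raise_soft_limit,
 run_with_limits and RAISED_LIMITS are names made up for illustration.

```python
import resource
import subprocess

KB = 1024

# Assumption: second-column values from the table above, taken as kbytes.
RAISED_LIMITS = {
    resource.RLIMIT_DATA: 384000 * KB,   # datasize
    resource.RLIMIT_STACK: 16384 * KB,   # stacksize
    resource.RLIMIT_NOFILE: 200,         # descriptors (a count, not bytes)
}


def clamp_to_hard(requested, hard):
    """Return the soft limit to ask for; never exceeds the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def raise_soft_limit(which, requested):
    """Set the soft limit for `which` to `requested`, capped at the hard limit."""
    _soft, hard = resource.getrlimit(which)
    new_soft = clamp_to_hard(requested, hard)
    resource.setrlimit(which, (new_soft, hard))
    return new_soft


def run_with_limits(argv, limits):
    """Run argv with the given soft limits applied in the child only."""
    def apply_limits():
        for which, requested in limits.items():
            raise_soft_limit(which, requested)

    # preexec_fn runs in the child after fork() and before exec() (POSIX only).
    return subprocess.run(argv, preexec_fn=apply_limits).returncode


# Hypothetical usage:
#   run_with_limits(["analog", "-G", "+gall.cfg"], RAISED_LIMITS)
```

 Because an unprivileged process can only raise a soft limit up to its
 hard limit, the sketch clamps each request instead of failing.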

If this looks like a bug I'll go file a PR; I'm just curious whether this
is expected or known behavior.

       Thanks,
         Robby