Subject: Problem with -current from 4.99.20...
To: VAX porting list <port-vax@netbsd.org>
From: Johnny Billquist <bqt@softjar.se>
List: port-vax
Date: 07/24/2007 14:36:51
I'm trying to figure out how NetBSD broke when 4.99.20 was introduced,
but I haven't gotten very far yet, nor have I had much time.
But I thought I'd pose the question here, and maybe someone else with
more knowledge can look at it and perhaps give some feedback and ideas.
What happens is that the system runs fine for quite a while. My test
scenario involves running build.sh, which will cause the system to crash
after several hours. So it's not quick to reproduce, nor easy to isolate
from that point of view.
The hardware is a VAX 4000/90 with 128 megs of memory.
From my observations, it seems the problems start when the machine runs
out of memory and begins allocating swap space.
I have the machine sitting in ddb after the crash right now, and here
is some relevant info:
-----------------------------------------------------------------
login: panic: Segv in kernel mode: pc 801b00f6 addr 4
Stopped in pid 0.4 (system) at netbsd:trap+0x4fc: movl $1, -64(fp)
db> bt
panic: Segv in kernel mode: pc %x addr %x
Stack traceback :
0x8c01bd34: trap+0x4fc(0x8c01bdfc)
0x8c01bdfc: trap type=0x8 code=0x4 pc=0x801b00f6 psl=0x4
0x8c01bdc8: pmap_deactivate+0x1a(0x84fab100)
0x8c01be4c: cpu_swapout+0x1f(0x84fab100)
0x8c01be6c: uvm_swapout+0x50(0x84fab100)
0x8c01be90: uvm_swapout_threads+0x114(void)
0x8c01bec8: uvm_pageout+0x216(0x87f41820)
0x8c01bf64: cpu_lwp_bootstrap+0x15(0)
db> ps
  PID  PPID  PGRP   UID S   FLAGS LWPS COMMAND WAIT
19923   363   442     0 2  0x4000    1 cc1
  363  6755   442     0 2  0x4000    1 cc      wait
 6755  3067   442     0 2  0x4000    1 nbgmake wait
 3067 21197   442     0 2  0x4000    1 sh      wait
23358   388   388    12 2  0x4100    1 pickup  select
21197 19374   442     0 2  0x4000    1 nbgmake wait
19374 12938   442     0 2  0x4000    1 sh      wait
12938  3459   442     0 2  0x4000    1 nbmake  wait
 3459  8751   442     0 2  0x4000    1 sh      wait
 8751 10418   442     0 2  0x4000    1 nbmake  wait
10418 24288   442     0 2  0x4000    1 sh      wait
24288 17653   442     0 2  0x4000    1 nbmake  wait
17653 26565   442     0 2  0x4000    1 sh      wait
26565 24824   442     0 2  0x4000    1 nbmake  wait
24824  1895   442     0 2  0x4000    1 sh      wait
 1895   442   442     0 2  0x4000    1 nbmake  wait
   66   426    66     0 2  0x4000    1 tail    kqread
  442   426   442     0 2  0x4000    1 sh      wait
  426   404   426     0 2  0x4000    1 tcsh    pause
  404   400   404  2026 2  0x4000    1 tcsh    pause
  400   412   412  2026 2   0x100    1 sshd    select
  412   268   412     0 2  0x4101    1 sshd    netio
  413     1   413     0 2  0x4000    1 getty   ttyin
  409   388   388    12 2  0x4100    1 qmgr    select
  402     1   402     0 2       0    1 cron    nanoslp
  408     1   408     0 2       0    1 inetd   kqread
  388     1   388     0 2  0x4100    1 master  select
  268     1   268     0 2       0    1 sshd    select
  277     1   277     0 2       0    1 rwhod   select
  278     1   278     0 2       0    1 ntpd    pause
  169     1   169     0 2       0    1 ypbind  select
  168     1   168     0 2       0    1 rpcbind select
  102     1   102     0 2       0    1 syslogd kqread
    1     0     1     0 2  0x4001    1 init    wait
>   0    -1     0     0 2 0x20002   13 system  *
db>
--------------------------------------------------
I find it a bit interesting that this seems to happen more or less
directly after a fork, that it somehow gets into uvm_pageout (I haven't
figured out how it gets there), and that process 0 is the one active.
Is some kind of kernel thread created here? Or maybe activated for the
first time?
Anyone who knows more about these innards?
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt@softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol