Subject: port-vax/12520: "panic: kernel stack invalid" with current on VAX
To: None <gnats-bugs@gnats.netbsd.org>
From: None <Thilo.Manske@HEH.Uni-Oldenburg.DE>
List: netbsd-bugs
Date: 04/01/2001 13:49:56
>Number: 12520
>Category: port-vax
>Synopsis: "panic: kernel stack invalid" is easy to trigger with current kernels
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-vax-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Apr 01 04:50:00 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator: Thilo Manske
>Release: current since ~January(?), sorry, I don't remember.
>Organization:
This is Thilo's Unix signature! Have fun with it.
>Environment:
System:
Kernels: GENERIC or custom Kernels
System 1: VT1300 (~VS 3100m30 without disc controller), 8MB RAM diskless
System 2: VS 4000/VLC with 16MB RAM; it happens both when running diskless
and when running completely "net-less".
Compilers and userland (still) from the 1.5 release.
Architecture: vax
Machine: vax
Boot message from System 1 (with GENERIC kernel made from current sources):
VAXstation 3100/m{30,40}
cpu: KA41/42
cpu: Enabling primary cache, secondary cache
total memory = 8076 KB
avail memory = 4816 KB
using 126 buffers containing 504 KB of memory
mainbus0 (root)
vsbus0 at mainbus0
vsbus0: interrupt mask 8
le0 at vsbus0 csr 0x200e0000 vec 120 ipl 14 maskbit 5 buf 0x33d000-0x34cfff
le0: address 08:00:2b:16:d8:1c
le0: 32 receive buffers, 8 transmit buffers
dz0 at vsbus0 csr 0x200a0000 vec 304 ipl 14 maskbit 6
dz0: 4 lines
lkkbd0 at dz0
wskbd0 at lkkbd0
lkms0 at dz0
wsmouse0 at lkms0
smg0 at vsbus0 csr 0x200f0000 vec 104 ipl 14 maskbit 3
wsdisplay0 at smg0
wsdisplay0: screen 0-7 added (128x57, vt100 emulation)
boot device: le0
root on le0
Boot message from System 2 (with GENERIC kernel made from current sources):
MicroVAX 3100/m{30,40}
cpu: KA48
cpu: turning on floating point chip
total memory = 15996 KB
avail memory = 12056 KB
using 225 buffers containing 900 KB of memory
mainbus0 (root)
vsbus0 at mainbus0
vsbus0: 32K entry DMA SGMAP at PA 0x400000 (VA 0x80400000)
vsbus0: interrupt mask 0
le0 at vsbus0 csr 0x200e0000 vec 770 ipl 15 maskbit 1 buf 0x0-0xffff
le0: address 08:00:2b:32:0b:c6
le0: 32 receive buffers, 8 transmit buffers
dz0 at vsbus0 csr 0x200a0000 vec 124 ipl 15 maskbit 4
dz0: 4 lines
lkkbd0 at dz0
wskbd0 at lkkbd0
lkms0 at dz0
wsmouse0 at lkms0
asc0 at vsbus0 csr 0x200c0080 vec 774 ipl 15 maskbit 0
asc0: NCR53C94, 25MHz, SCSI ID 6
scsibus0 at asc0: 8 targets, 8 luns per target
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 3 lun 0: <DEC, RZ23L (C) DEC, 2528> SCSI1 0/direct fixed
sd0(asc0:3:0): max sync rate 3.96MB/s
sd0: 116 MB, 1523 cyl, 4 head, 39 sec, 512 bytes/sect x 237588 sectors
boot device: le0
root on le0
>Description:
For about three months (maybe even longer), every current kernel I have built on and for
System 1 (VT1300) has died during boot with something like:
[...]
Creating runtime link editor directory cache.
Clearing /tmp.
Starting timed.
panic: kernel stack invalid
Stopped in pid 1 (init) at _trap+0x138: tstl 64(r8)
db> trace/t
Process 1
PCB contents:
KSP = 0x85710e98
ESP = 0x8570f064
SSP = 0x80271200
USP = 0x7ffffc70
R[00] = 0x85723000 R[06] = 0x8038f000
R[01] = 0x0002b918 R[07] = 0x00000000
R[02] = 0x804021c0 R[08] = 0x8038f000
R[03] = 0x004fb000 R[09] = 0x80271200
R[04] = 0x00000000 R[10] = 0x8038f000
R[05] = 0x00000000 R[11] = 0x00000000
AP = 0x85710ecc
FP = 0x85710ea0
PC = 0x80023674
PSL = 0xdf0008
Trap frame pointer: 0x85710fb4
Stack traceback :
0x85710ea0: bpendtsleep+0x0(0x8038f000,0x120,0x8001b57d,0x0,0x0)
0x85710ed4: _sys_wait4+0x2b2(0x8038f000,0x85710f60,0xc)
0x85710f1c: _syscall+0xf5(0x7ffffc70)
Because compiling a kernel takes a week now (I don't know why; it used to
take a day with 1.5ALPHA*/BETA), I didn't investigate this further.
(Sorry that I can't give a precise date for when this started to happen;
I accidentally deleted my last working current kernel :-( .)
This week I got System 2 (VS4000/VLC), which stays up a little longer
but eventually panics as well when I run "make depend" in a kernel compilation directory
(or something like that):
depending the kern library objects
depending the compat library objects
panic: kernel stack invalid
Stopped in pid 190 (cron) at _trap+0x138: tstl 64(r8)
db> trace/t
Process 190
PCB contents:
KSP = 0x86063e80
ESP = 0x86062064
SSP = 0x8028fe00
USP = 0x7ffffcac
R[00] = 0x86054000 R[06] = 0x80f72008
R[01] = 0x000302a0 R[07] = 0x00000001
R[02] = 0x804a7de4 R[08] = 0x80f72008
R[03] = 0x00a6b000 R[09] = 0x8028fe00
R[04] = 0x00000000 R[10] = 0x8013558c
R[05] = 0x00000000 R[11] = 0x00001771
AP = 0x86063eb4
FP = 0x86063e88
PC = 0x80023674
PSL = 0xdf0008
Trap frame pointer: 0x86063fb4
Stack traceback :
0x86063e88: bpendtsleep+0x0(0x8013558c,0x120,0x8002581b,0x1771,0x86063f28)
0x86063ebc: _sys_nanosleep+0xac(0x80f72008,0x86063f60,0x84)
0x86063f20: _syscall+0xf5(0x7ffffcac)
(It usually happens in cron, syslogd or rwhod.)
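As a rough cross-check of the two traces above, the saved pointers can be compared offline. The following is only a sketch in Python, not something ddb or the kernel provides; the names `DUMPS`, `offsets` and `same_pc` are made up for illustration, the values are copied by hand from the two dumps, and nothing is assumed about the VAX stack layout beyond plain subtraction.

```python
# Register values copied by hand from the two "trace/t" dumps in this report.
# TRAPFRAME is the "Trap frame pointer" line of each dump.
DUMPS = {
    "System 1, pid 1 (init)": {
        "KSP": 0x85710e98, "ESP": 0x8570f064,
        "AP": 0x85710ecc, "FP": 0x85710ea0,
        "PC": 0x80023674, "TRAPFRAME": 0x85710fb4,
    },
    "System 2, pid 190 (cron)": {
        "KSP": 0x86063e80, "ESP": 0x86062064,
        "AP": 0x86063eb4, "FP": 0x86063e88,
        "PC": 0x80023674, "TRAPFRAME": 0x86063fb4,
    },
}


def offsets(dump):
    """Return byte distances between the saved pointers of one dump."""
    return {
        "FP-KSP": dump["FP"] - dump["KSP"],
        "AP-FP": dump["AP"] - dump["FP"],
        "TRAPFRAME-KSP": dump["TRAPFRAME"] - dump["KSP"],
        "KSP-ESP": dump["KSP"] - dump["ESP"],
    }


def same_pc(dumps):
    """True if every dump recorded the same saved PC."""
    return len({d["PC"] for d in dumps.values()}) == 1


if __name__ == "__main__":
    for name, dump in DUMPS.items():
        print(name)
        for label, value in offsets(dump).items():
            print(f"  {label:14s} = {value:#x}")
    print("same saved PC in all dumps:", same_pc(DUMPS))
```

Both dumps have the same saved PC (0x80023674), FP sits 8 bytes above KSP in each, and AP is 0x2c bytes above FP in each. That at least suggests the two processes were suspended at the same place, which fits the bpendtsleep entry at the top of both tracebacks.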
To test whether it's a network problem, I installed the system on disk and made
sure that there was not a single access to the network devices (I even removed the
transceiver) - it happened again.
I thought it had to do with paging/swapping, since it seems to happen when a
(partly) paged-out process is woken up, so I repeated the test without
configuring a swap device - it made no difference (well, it died a little bit
earlier, I guess).
BTW: With only 8MB instead of 16MB RAM, System 2 dies as early as System 1.
>How-To-Repeat:
Build a current kernel, boot it on a VT1300 (or VS 3100m30), a VS 4000/VLC, or
possibly any VAX without much memory, and try to build a kernel.
It's very easy for me to reproduce, so if you need more information that I could
gather from the debugger, please mail me.
>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted: