Subject: port-sparc64/11509: Ultra/1 dies under load
To: gnats-bugs@gnats.netbsd.org
From: bouyer@antioche.lip6.fr
List: netbsd-bugs
Date: 11/16/2000 08:50:19
>Number: 11509
>Category: port-sparc64
>Synopsis: Ultra/1 dies under load
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-sparc64-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Nov 16 08:50:00 PST 2000
>Closed-Date:
>Last-Modified:
>Originator:
>Release: 1.5_BETA2 snapshot
>Organization:
LIP6, Universite Paris VI.
>Environment:
Sun Ultra 1 SBus (UltraSPARC 143MHz), No Keyboard
OpenBoot 3.11, 128 MB memory installed, Serial #8793333.
Ethernet address 8:0:20:86:2c:f5, Host ID: 80862cf5.
Rebooting with command: boot
Boot device: disk1 File and args:
NetBSD IEEE 1275 Bootblock
..>> NetBSD/sparc64 OpenFirmware Boot, Revision 1.2
>Description:
I started recompiling a kernel with 'make -j24'. simple_lock
and UVM debug messages started appearing on the console, and after
a while the machine panicked. Here is what was printed on the console,
along with some data I gathered from ddb:
login: uvn_detach: vn 0x63f35e0 has pages left after flush - relkill mode
simple_lock: lock held
lock: 0x149a280, currently at: /home5/src/src/sys/uvm/uvm_pdaemon.c:867
last locked: /home5/src/src/sys/uvm/uvm_vnode.c:697
last unlocked: /home5/src/src/sys/uvm/uvm_vnode.c:702
uvn_detach: vn 0x63f35e0 has pages left after flush - relkill mode
simple_lock: lock held
lock: 0x149a280, currently at: /home5/src/src/sys/uvm/uvm_pdaemon.c:867
last locked: /home5/src/src/sys/uvm/uvm_vnode.c:697
last unlocked: /home5/src/src/sys/uvm/uvm_vnode.c:702
uvn_detach: vn 0x63f35e0 has pages left after flush - relkill mode
simple_lock: lock held
lock: 0x149a280, currently at: /home5/src/src/sys/uvm/uvm_pdaemon.c:867
last locked: /home5/src/src/sys/uvm/uvm_vnode.c:697
last unlocked: /home5/src/src/sys/uvm/uvm_vnode.c:1221
uvn_detach: vn 0x63f35e0 has pages left after flush - relkill mode
simple_lock: lock held
lock: 0x149a280, currently at: /home5/src/src/sys/uvm/uvm_pdaemon.c:867
last locked: /home5/src/src/sys/uvm/uvm_vnode.c:697
last unlocked: /home5/src/src/sys/uvm/uvm_vnode.c:702
data fault: pc=126a194 addr=20000000000 sfsr=%qb
kernel trap 30: data access exception
Stopped in top at pmap_count_res+0xe4: stwa %g0, [%o1 + %g0] 71
db>
db> tr
db>
So no backtrace is available here.
db> ps
PID PPID PGRP UID S FLAGS COMMAND WAIT
4016 3999 3998 331 2 0x4006 as
4015 3995 3994 331 2 0x4006 as
4014 3927 3926 331 2 0x4006 as
4013 4012 4011 331 2 0x4006 cpp
4012 4011 4011 331 3 0x4086 cc wait
4011 3746 4011 331 3 0x4082 sh wait
4006 4003 4002 331 2 0x4006 cc1
4003 4002 4002 331 3 0x4082 cc wait
4002 3746 4002 331 3 0x4082 sh wait
3999 3998 3998 331 3 0x4086 cc wait
3998 3746 3998 331 3 0x4082 sh wait
3997 3988 3986 331 2 0x4006 cc1
3995 3994 3994 331 3 0x4086 cc wait
3994 3746 3994 331 3 0x4082 sh wait
3992 3980 3979 331 2 0x4006 cc1
3991 3982 3981 331 2 0x4006 cc1
3988 3986 3986 331 3 0x4082 cc wait
3986 3746 3986 331 3 0x4082 sh wait
3982 3981 3981 331 3 0x4082 cc wait
3981 3746 3981 331 3 0x4082 sh wait
3980 3979 3979 331 3 0x4082 cc wait
3979 3746 3979 331 3 0x4082 sh wait
3978 3973 3971 331 2 0x4006 cc1
3976 3966 3964 331 2 0x4006 cc1
3975 3969 3967 331 2 0x4006 cc1
3973 3971 3971 331 3 0x4082 cc wait
3971 3746 3971 331 3 0x4082 sh wait
3969 3967 3967 331 3 0x4082 cc wait
3967 3746 3967 331 3 0x4082 sh wait
3966 3964 3964 331 3 0x4082 cc wait
3965 3959 3957 331 2 0x4006 cc1
3964 3746 3964 331 3 0x4082 sh wait
3963 3954 3952 331 2 0x4006 cc1
3959 3957 3957 331 3 0x4082 cc wait
3957 3746 3957 331 3 0x4082 sh wait
3954 3952 3952 331 3 0x4082 cc wait
3953 3940 3938 331 2 0x4006 cc1
3952 3746 3952 331 3 0x4082 sh wait
3940 3938 3938 331 3 0x4082 cc wait
3938 3746 3938 331 3 0x4082 sh wait
3933 3924 3922 331 2 0x4006 cc1
3927 3926 3926 331 3 0x4086 cc wait
3926 3746 3926 331 3 0x4082 sh wait
3925 3913 3910 331 2 0x4006 cc1
3924 3922 3922 331 3 0x4082 cc wait
3922 3746 3922 331 3 0x4082 sh wait
3921 3905 3901 331 2 0x4006 cc1
3920 3907 3903 331 2 0x4006 cc1
3917 3894 3893 331 2 0x4006 cc1
3913 3910 3910 331 3 0x4082 cc wait
3910 3746 3910 331 3 0x4082 sh wait
3907 3903 3903 331 3 0x4082 cc wait
3905 3901 3901 331 3 0x4082 cc wait
3904 3875 3873 331 2 0x4006 cc1
3903 3746 3903 331 3 0x4082 sh wait
3902 3881 3879 331 2 0x4006 cc1
3901 3746 3901 331 3 0x4082 sh wait
3894 3893 3893 331 3 0x4082 cc wait
3893 3746 3893 331 3 0x4082 sh wait
3881 3879 3879 331 3 0x4082 cc wait
3879 3746 3879 331 3 0x4082 sh wait
3875 3873 3873 331 3 0x4082 cc wait
3873 3746 3873 331 3 0x4082 sh wait
3833 3775 3773 331 2 0x4006 cc1
3829 3768 3765 331 2 0x4006 cc1
3828 3760 3757 331 2 0x4006 cc1
3775 3773 3773 331 3 0x4082 cc wait
3773 3746 3773 331 3 0x4082 sh wait
3768 3765 3765 331 3 0x4082 cc wait
3765 3746 3765 331 3 0x4082 sh wait
3760 3757 3757 331 3 0x4082 cc wait
3757 3746 3757 331 3 0x4082 sh wait
>How-To-Repeat:
Compile a kernel with 'make -j24'.
>Fix:
Not yet known.
>Release-Note:
>Audit-Trail:
>Unformatted:
>> (mrg@powerofseven.eterna.com.au, Fri Aug 18 18:15:28 EST 2000)
devopen: getdisklabel sez no disk label
loadfile: reading header
elf64_exec: Booting /sbus@1f,0/espdma@e,8400000/esp@e,8800000/sd@1,0:a/netbsd
3028648@0x1000000+219984@0x1400000+433440@0x1435b50
symbols @ 0xfff0e300 74+242976+125598 start=0x1000000
chain: calling OF_chain(800000, f0f0, 1000000, fffb5a80, 18)
[ preserving 369424 bytes of netbsd ELF symbol table ]
pmap_bootstrap: could not claim physical dseg extention at 138a0000 size 360000
consinit()
setting up stdin
stdin instance = fffe60f0
setting up stdout
stdout instance = fffe64b8
stdout package = f005a538
console is unknown
Copyright (c) 1996, 1997, 1998, 1999, 2000
The NetBSD Foundation, Inc. All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
NetBSD 1.5_BETA2 (GENERIC) #11: Wed Nov 15 10:29:50 PST 2000
eeh@nonplus.one-o.com:/home5/src/src/sys/arch/sparc64/compile/GENERIC
total memory = 128 MB
avail memory = 109 MB
>3751 254 3751 331 7 0x4006 top
3746 949 3746 331 2 0x4486 make
949 917 949 331 3 0x4082 tcsh pause
917 178 178 0 3 0x180 sshd select
254 253 254 331 3 0x4082 tcsh pause
253 178 178 0 3 0x184 sshd select
252 1 252 0 3 0x4082 getty ttyin
186 1 186 0 3 0x80 cron nanosle
183 1 183 0 3 0x80 inetd select
178 1 178 0 3 0x80 sshd select
104 1 104 0 2 0x84 syslogd
4 0 0 0 2 0x20204 ioflush
3 0 0 0 3 0x20204 reaper reaper
2 0 0 0 3 0x20204 pagedaemon daemon_
1 0 1 0 3 0x4080 init wait
0 -1 0 0 3 0x20204 swapper schedul
db>