Subject: Re: Qube2 "crash" every few days during the daily script
To: None <remi_zara@mac.com>
From: Izumi Tsutsui <tsutsui@ceres.dti.ne.jp>
List: port-cobalt
Date: 05/31/2007 19:38:53
remi_zara@mac.com wrote:
> > I've just created a kernel with the fixed trace command and the
> > 20070420 source tree, so could you try this one?
> > http://www.ceres.dti.ne.jp/~tsutsui/netbsd/netbsd-cobalt-tracefix-20070420.gz
>
> It's up and running. Now I just have to wait for it to crash....
Well, (un)fortunately, a similar panic also happened on my RaQ2.
It got a TLB miss in a ksyms function during the daily cron job,
with a background job copying files from NFS to the local wd0:
---
:
colt-% tar clf - . | ( cd /usr/local/tmp ; tar xf - )
trap: TLB miss (load or instr. fetch) in kernel mode
status=0x3, cause=0x8008, epc=0x801d3fdc, vaddr=0x7fb7e63a
pid=1307 cmd=netstat usp=0x7fffd4b0 ksp=0xcc741c28
Stopped in pid 1307.1 (netstat) at netbsd:ksyms_getname+0x318: \
lb v1,0(v0)
db> tr
ksyms_getname+318 (c0159a80,8039c835,c0112ffa,c0159a80) ra 801d4084 sz 16
ksyms_getname+3c0 (c0159a80,8039c835,c0112ffa,c0159a80) ra 0 sz 16
User-level: pid 1307.1
db> ps/w
PID COMMAND EMUL PRI UTIME STIME WAIT-MSG WAIT-CHANNEL
1270 awk netbsd 14 0.4 0.4 pipe 0x8fc69ab0
>1307 netstat netbsd 60 0.0 0.9
1810 showq netbsd 24 0.3 0.7 select netbsd:selwait
1709 postdrop netbsd 24 0.2 0.7 netio 0x8ed4fa08
894 sendmail netbsd 8 0.3 0.7 pipe 0x8fc69b38
864 tee netbsd 8 0.3 0.6 pipe 0x8fc699a0
1099 sh netbsd 13 0.7 0.3 wait 0x8efcf238
1241 sh netbsd 10 0.5 0.5 wait 0x8efcf068
1629 cron netbsd 8 0.6 0.3 pipe 0x8fc69918
1626 tar netbsd 82 154.4 358.4
1105 tar netbsd 24 68.9 195.9 pipe 0x8fc69d58
805 pickup netbsd 24 0.6 0.6 select netbsd:selwait
829 tcsh netbsd 27 1.2 0.9 pause 0x8fdf3000
755 login netbsd 10 0.8 0.6 wait 0x8fd19d08
693 cron netbsd 8 1.3 0.3 nanoslp 0x8f45bc40
781 inetd netbsd 24 0.1 0.8 kqread 0x8f8a7cb8
443 qmgr netbsd 24 0.5 0.9 select netbsd:selwait
826 master netbsd 24 2.4 1.5 select netbsd:selwait
479 sshd netbsd 24 6.7 0.5 select netbsd:selwait
441 ntpd netbsd 8 13.7 6.7 pause 0x8f45b940
305 amd netbsd 24 23.4 10.8 select netbsd:selwait
276 nfsio netbsd 32 21.0 21.0 nfsidl netbsd:nfs_asyncdaemon+0x38
268 nfsio netbsd 32 43.9 43.9 nfsidl netbsd:nfs_asyncdaemon+0x28
270 nfsio netbsd 32 242.2 242.2 nfsidl netbsd:nfs_asyncdaemon+0x18
272 nfsio netbsd 32 305.9 305.9 nfsidl netbsd:nfs_asyncdaemon+0x8
266 mount_mfs netbsd 32 0.0 0.4 mfsidl 0x8f85f1f8
247 ypbind netbsd 24 12.5 11.3 select netbsd:selwait
235 rpcbind netbsd 24 0.6 0.5 poll netbsd:selwait
188 syslogd netbsd 24 0.7 0.7
30 physiod netbsd 16 0.2 0.2 physiod 0xcbeb7048
8 aiodoned netbsd 4 64.5 64.5 aiodoned 0xcbeb7010
7 ioflush netbsd 40 21.4 21.4 syncer netbsd:fpu_id+0x1a138
6 pagedaemon netbsd 4 55.3 55.3 pgdaemon netbsd:uvm+0x14
5 cryptoret netbsd 36 0.2 0.2 crypto_wait netbsd:M_CRYPTO_DATA+0x34
4 atabus1 netbsd 16 0.0 0.4 atath 0xc00f5bb0
3 atabus0 netbsd 16 0.2 0.2 atath 0xc00f5a40
2 scsibus0 netbsd 16 0.2 0.2 sccomp 0xc005b6bc
1 init netbsd 15 0.1 0.4 wait 0x8fdfbd00
0 swapper netbsd 4 0.5 0.5 schedule netbsd:uvm+0x48
db>
---
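As a reading aid for the trap line in the log above, here is a minimal
Python sketch that decodes the register values. It is not part of the
original report. It assumes the standard 32-bit MIPS layout (ExcCode in
bits 2..6 of the cause register, pending interrupts in bits 8..15, kernel
segments starting at 0x80000000); the helper names are made up for
illustration.

```python
# Hypothetical helpers for decoding the trap line printed by ddb.
# Assumption: standard 32-bit MIPS (R4000-style) register and address layout.

EXC_CODES = {
    0: "Int (interrupt)",
    1: "Mod (TLB modification)",
    2: "TLBL (TLB miss, load or instruction fetch)",
    3: "TLBS (TLB miss, store)",
    4: "AdEL (address error, load or fetch)",
    5: "AdES (address error, store)",
}


def exc_code(cause):
    """Return the ExcCode field (bits 2..6) of the cause register."""
    return (cause >> 2) & 0x1F


def pending_interrupts(cause):
    """Return the IP field (bits 8..15) of the cause register."""
    return (cause >> 8) & 0xFF


def segment(addr):
    """Name the 32-bit MIPS address segment that addr falls into."""
    if addr < 0x80000000:
        return "kuseg"          # user address range, TLB mapped
    if addr < 0xA0000000:
        return "kseg0"          # kernel, unmapped, cached
    if addr < 0xC0000000:
        return "kseg1"          # kernel, unmapped, uncached
    return "kseg2"              # kernel, TLB mapped


if __name__ == "__main__":
    cause, epc, vaddr, ksp = 0x8008, 0x801D3FDC, 0x7FB7E63A, 0xCC741C28
    print("exception:", EXC_CODES.get(exc_code(cause), "other"))
    print("pending interrupt bits: 0x%02x" % pending_interrupts(cause))
    for name, value in (("epc", epc), ("vaddr", vaddr), ("ksp", ksp)):
        print("%-5s 0x%08x -> %s" % (name, value, segment(value)))
```

Read this way, the faulting instruction is in kseg0 kernel text while the
faulting address lies in the user range (kuseg), which would be consistent
with the lb in ksyms_getname loading through a bad pointer. That is only an
interpretation of the numbers, not a diagnosis.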
I'm not sure whether this is a cobalt-specific or an MI ksyms(4) problem,
but this kernel doesn't contain symbols for local (static)
functions, so I'll try to reproduce it on a debug kernel
(with src/sys/arch/mips/conf/Makefile.mips rev 1.46).
---
Izumi Tsutsui