Subject: Re: Machine hangs
To: Andrey Petrov <petrov@netbsd.org>
From: Sten Spans <sten@blinkenlights.nl>
List: port-sparc64
Date: 01/13/2004 20:57:48
On Mon, 12 Jan 2004, Andrey Petrov wrote:

> On Sat, Jan 10, 2004 at 04:26:39PM +0000, Steve Doyle wrote:
> >
> > Evening,
> >
> > 	I'm having a problem with the machine that I'd really appreciate
> > some assistance with. From time to time, normally after a couple of
> > weeks uptime, something seems to be killing the box stone dead without
> > any warning, I've been through the logfiles after the last crash and
> > so far it's turned up nothing that would shed some light on the cause.
> >
>
> Can you enter ddb? <Stop-A> from keyboard or 'break' from serial console.
> If you can, please post output from 't' and 'ps' commands.

I have an Ultra 5 (270 MHz, 512 MB, Symbios SCSI) which seems to
experience this problem as well, and it's pretty easy to trigger: the
machine hasn't completed a ./build.sh distribution yet. This is with a
recent -current kernel and softupdates enabled.

The symptoms are that nbmake suddenly hangs forever and the load keeps
going up (as shown by ctrl-t) until even entering ddb stops working.
What follows is a paste from a serial console session (the machine
has no configured network interface).

cleandir ===> gnu/usr.bin/gcc3/gcc
rm -f gccspec.o  gccspec.ln
rm -f a.out [Ee]rrs mklog core *.core .gdbinit gcc
rm -f gcc.cat1
rm -f .depend gccspec.d /usr/src/gnu/usr.bin/gcc3/gcc/tags
rm -f    gcc.info
cleandir ===> gnu/usr.bin/gcc3/include
rm -f a.out [Ee]rrs mklog core *.core .gdbinit
cleandir ===> gnu/usr.bin/gcc3/protoize
load: 1.66  cmd: nbmake 9396 [running] 0.00u 16.10s 53% 0k
load: 1.77  cmd: nbmake 9396 [running] 0.00u 23.96s 68% 0k
load: 1.96  cmd: nbmake 9396 [running] 0.00u 29.76s 76% 0k
load: 2.19  cmd: nbmake 9396 [running] 0.00u 48.34s 90% 0k
load: 2.25  cmd: nbmake 9396 [running] 0.00u 53.51s 92% 0k
load: 3.02  cmd: nbmake 9396 [running] 0.00u 86.20s 98% 0k
kdb breakpoint at 12f8384
Stopped in pid 9396.1 (nbmake) at       netbsd:cpu_Debugger+0x4:       nop
db> t
sab_intr(2501700, 0, e0017ed0, 0, 12cbdf0, 0) at netbsd:sab_intr+0xa4
?(c67c840, 0, ffffffffffffffff, 1, c67bb80, 1000000) at 0x1008f94
ctx_alloc(c67c780, c67c780, 0, ffffffffffffffff, 0, da2b700) at netbsd:ctx_alloc+0x78
pmap_activate_pmap(c67c780, ffffffffffffffff, 1009d7c, b6e2000, 0, 80080d) at netbsd:pmap_activate_pmap+0x14
uvmspace_exec(c67bce0, 0, ffffffffffffffff, da2bab0, 0, 1364230) at netbsd:uvmspace_exec+0x44
sys_execve(0, da2bdd0, 0, 0, 267cb00, da36010) at netbsd:sys_execve+0x59c
syscall(da2bed0, 3b, 4043f45c, da2bdd0, 0, 4043f460) at netbsd:syscall+0xcc
?(232ef0, ffffffffffffce80, ffffffffffffd848, 0, 0, 0) at 0x1008cb8
db> ps
PID           PPID     PGRP        UID S   FLAGS LWPS          COMMAND    WAIT
>9396          5609      284          0 2 0x100012    1           nbmake
 5609          4259      284          0 2  0x4002    1           nbmake  ppwait
 4259          5479      284          0 2  0x4002    1               sh    wait
 5479         12977      284          0 2  0x4002    1           nbmake  piperd
 12977        12511      284          0 2  0x4002    1               sh    wait
 12511         6090      284          0 2  0x4002    1           nbmake    wait
 6090          8749      284          0 2  0x4002    1               sh    wait
 8749          5213      284          0 2  0x4002    1           nbmake    wait
 5213          7953      284          0 2  0x4002    1               sh    wait
 7953          8016      284          0 2  0x4002    1           nbmake    wait
 8016          2263      284          0 2  0x4002    1               sh    wait
 2263          2742      284          0 2  0x4002    1           nbmake    wait
 2742          2346      284          0 2  0x4002    1               sh    wait
 2346          2571      284          0 2  0x4002    1           nbmake    wait
 2571          2613      284          0 2  0x4002    1               sh    wait
 2613           284      284          0 2  0x4002    1           nbmake    wait
 284            282      284          0 2  0x4002    1               sh    wait
 282              1      282          0 2  0x4002    1              csh   pause
 271              1      271          0 2       0    1             cron
 254              1      254          0 2       0    1            inetd   pause
 123              1      123          0 2       0    1          syslogd
 9                0        0          0 2 0x20200    1         aiodoned aiodone
 8                0        0          0 2 0x20200    1          ioflush
 7                0        0          0 2 0x20200    1       pagedaemon pgdaemo
 6                0        0          0 2 0x20200    1       lfs_writer lfswrit
 5                0        0          0 2 0x20200    1        atapibus0  sccomp
 4                0        0          0 2 0x20200    1         scsibus0  sccomp
 3                0        0          0 2 0x20200    1          atabus1   atath
 2                0        0          0 2 0x20200    1          atabus0   atath
 1                0        1          0 2  0x4000    1             init    wait
 0               -1        0          0 2 0x20200    1          swapper
db> continue
load: 3.10  cmd: nbmake 9396 [running] 0.00u 89.62s 98% 0k
load: 3.10  cmd: nbmake 9396 [running] 0.00u 94.36s 98% 0k
load: 6.84  cmd: nbmake 9396 [running] 0.00u 107.68s 99% 0k
load: 6.84  cmd: nbmake 9396 [running] 0.00u 109.32s 99% 0k
kdb breakpoint at 12f8384
Stopped in pid 9396.1 (nbmake) at       netbsd:cpu_Debugger+0x4:        nop
db> t
sab_intr(2501700, 0, e0017ed0, 0, 12cbdf0, 0) at netbsd:sab_intr+0xa4
?(c67c840, 0, ffffffffffffffff, 1, c67bb80, 1000000) at 0x1008f94
ctx_alloc(c67c780, c67c780, 0, ffffffffffffffff, 0, da2b700) at netbsd:ctx_alloc+0x78
pmap_activate_pmap(c67c780, ffffffffffffffff, 1009d7c, b6e2000, 0, 80080d) at netbsd:pmap_activate_pmap+0x14
uvmspace_exec(c67bce0, 0, ffffffffffffffff, da2bab0, 0, 1364230) at netbsd:uvmspace_exec+0x44
sys_execve(0, da2bdd0, 0, 0, 267cb00, da36010) at netbsd:sys_execve+0x59c
syscall(da2bed0, 3b, 4043f45c, da2bdd0, 0, 4043f460) at netbsd:syscall+0xcc
?(232ef0, ffffffffffffce80, ffffffffffffd848, 0, 0, 0) at 0x1008cb8
db> continue
load: 8.87  cmd: nbmake 9396 [running] 0.00u 115.62s 99% 0k
load: 11.33  cmd: nbmake 9396 [running] 0.00u 133.18s 99% 0k
load: 12.03  cmd: nbmake 9396 [running] 0.00u 134.78s 99% 0k

etc.
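
For anyone who wants to tabulate the ctrl-t status lines above, here is
a minimal sketch in Python. It was not part of the console session. The
names parse_status and LINE_RE are hypothetical, and the reading of the
"u" and "s" fields as user and system CPU seconds is an assumption
based on the line layout.

```python
import re

# Status lines copied verbatim from the serial console paste above.
SAMPLES = """\
load: 1.66  cmd: nbmake 9396 [running] 0.00u 16.10s 53% 0k
load: 1.77  cmd: nbmake 9396 [running] 0.00u 23.96s 68% 0k
load: 1.96  cmd: nbmake 9396 [running] 0.00u 29.76s 76% 0k
load: 2.19  cmd: nbmake 9396 [running] 0.00u 48.34s 90% 0k
load: 2.25  cmd: nbmake 9396 [running] 0.00u 53.51s 92% 0k
load: 3.02  cmd: nbmake 9396 [running] 0.00u 86.20s 98% 0k
load: 3.10  cmd: nbmake 9396 [running] 0.00u 89.62s 98% 0k
load: 3.10  cmd: nbmake 9396 [running] 0.00u 94.36s 98% 0k
load: 6.84  cmd: nbmake 9396 [running] 0.00u 107.68s 99% 0k
load: 6.84  cmd: nbmake 9396 [running] 0.00u 109.32s 99% 0k
load: 8.87  cmd: nbmake 9396 [running] 0.00u 115.62s 99% 0k
load: 11.33  cmd: nbmake 9396 [running] 0.00u 133.18s 99% 0k
load: 12.03  cmd: nbmake 9396 [running] 0.00u 134.78s 99% 0k
"""

# Hypothetical pattern for one status line; field meanings are assumed.
LINE_RE = re.compile(
    r"load:\s+(?P<load>[\d.]+)\s+cmd:\s+(?P<cmd>\S+)\s+(?P<pid>\d+)\s+"
    r"\[(?P<state>[^\]]+)\]\s+(?P<user>[\d.]+)u\s+(?P<sys>[\d.]+)s\s+"
    r"(?P<cpu>\d+)%"
)


def parse_status(text):
    """Return one dict per status line found in text, in input order."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not a status line
        rows.append({
            "load": float(m.group("load")),
            "cmd": m.group("cmd"),
            "pid": int(m.group("pid")),
            "state": m.group("state"),
            "user": float(m.group("user")),
            "sys": float(m.group("sys")),
            "cpu": int(m.group("cpu")),
        })
    return rows


if __name__ == "__main__":
    for row in parse_status(SAMPLES):
        print(f'{row["load"]:6.2f} {row["user"]:6.2f}u {row["sys"]:7.2f}s')
```

Read this way, the user time stays at 0.00 in every sample while the
system time only grows, which would be consistent with pid 9396
spinning in the kernel; both traces stop with ctx_alloc on the stack.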

-- 
Sten Spans

"There is a crack in everything, that's how the light gets in."
Leonard Cohen - Anthem