Subject: Ultra 1 hangs almost solidly after some time
To: None <port-sparc64@netbsd.org>
From: Erik Bertelsen <bertelsen.erik@gmail.com>
List: port-sparc64
Date: 10/12/2006 22:06:04
I have an elderly Ultra 1 167MHz that I try to update to
NetBSD-current from time to time.

For any kernel built after 25 Sep 2006 I have had the problem that
the machine hangs some time after boot. When it hangs, it still answers
ping, and when I press Ctrl/T on the serial console it responds with
something like:

     load: 5.99  cmd: sh 18239 [runnable] 0.01u 0.01s 0% 1184k

It appears that it never starts new processes, and maybe does not even
switch to other processes, when this happens. Neither Ctrl/C nor Ctrl/Z
gives a shell prompt, and the machine does not accept new network
connections.

If I have e.g. top running before the "crash", it continues
updating the screen and shows no significant CPU usage by any process.

For a while I suspected hardware problems and pulled most of the RAM,
but the problem remained. I then swapped some of the pulled RAM with the
remaining RAM and also tried it in other slots, but the problem
persists.

With only 64MB of RAM, it will always hang as described before completing a

      sh build.sh tools

I finally tried to reboot with a kernel built on 18 Sep 2006 and with
this kernel the machine can now complete a full

     sh build.sh build

Updating to current sources as of today and building a fresh kernel
from them caused the problem to reappear.

(the above was a few days ago)

Today I again updated my sources and built a new kernel, now with
options DEBUG and DIAGNOSTIC. With this kernel, I got a crash as early
as the build of make, which is the first step of build.sh:

checking how to run the C preprocessor... cc -E
checking for regex.h... yes
checking for poll.h... yes
checking for regfree in -lregex... no
checking for library containing regfree... none required
checking for setenv... yes
checking for strdup... yes
checking for strerror... yes
checking for strftime... yes
checking for vsnprintf... yes
configure: creating ./config.status
config.status: creating buildmake.sh
cc  -O -D_PATH_DEFSHELLDIR="/bin" -D_BASENAME_DEFSHELL="sh" -DHAVE_SETENV=1 -DHAVE_STRDUP=1 -DHAVE_STRERROR=1 -DHAVE_STRFTIME=1 -DHAVE_VSNPRINTF=1  -c /home/NetBSD/src/tools/make/../../usr.bin/make/arch.c
trap type 0x34: cpu 0, pc=10d3160 npc=10d3164 pstate=820006<PRIV,IE>
kernel trap 34: mem address not aligned
Stopped in pid 0.1 (swapper) at netbsd:uvm_scheduler+0x80:      ldx     [%g5 + 0x50], %g1
db> t
db> bt
db>

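For context on the trap itself: ldx loads a 64-bit value, so the effective address %g5 + 0x50 has to be a multiple of 8, and trap type 0x34 is what the CPU raises when it is not. The sketch below only illustrates that check. The %g5 values in it are made up for the example (the real register contents are not shown in the output above), and is_aligned is a hypothetical helper name.

```sh
#!/bin/sh
# Illustration only: the alignment rule behind "mem address not aligned".
# ldx reads 8 bytes, so its effective address must be a multiple of 8.

# is_aligned ADDR SIZE: print "yes" if ADDR is a multiple of SIZE, else "no".
is_aligned() {
    if [ $(( $1 % $2 )) -eq 0 ]; then
        echo yes
    else
        echo no
    fi
}

# Hypothetical %g5 values; the actual value at the time of the trap
# is not known from the ddb output.
for g5 in 0x1c00000 0x1c00004; do
    ea=$(( $g5 + 0x50 ))    # effective address used by: ldx [%g5 + 0x50], %g1
    printf '%%g5=%s  ea=0x%x  8-byte aligned: %s\n' \
        "$g5" "$ea" "$(is_aligned "$ea" 8)"
done
```

If that reading is right, %g5 presumably held a garbage or misaligned pointer when uvm_scheduler dereferenced it, so the register contents at the db> prompt would be the interesting thing to capture.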


Does anyone have an explanation, or any ideas about what I can do to help
identify the problem better?

Regards
- Erik Bertelsen