Subject: Re: Kernel Boot error
To: None <fubar22@gmail.com>
From: Johan A.van Zanten <johan@giantfoo.org>
List: port-sparc
Date: 12/29/2006 05:23:24
Isaac Wagner-Muns <fubar22@gmail.com> wrote:
> Hello, I'm running a sun sparcstation 20 and am not able to boot any  
> kernel i compile. even when i try to compile the GENERIC kernel, when  
> i try to boot the computer, it says something like:
> 
> 
> cpu0: booting secondary processors:	cpu1
> Thu Dec. 28 23:29:18 GMT 2006
> [1]	Bad system call			rcorder -s nosta...
> Thu Dec. 28 23:29:18 GMT 2006
> Dec 28 23:29:19	init:	kernel security level changed from 0 to 1
> Dec 28 23:29:19	init:	can't exec getty '/usr/libexec/getty' for port / 
> dev/consoy
> 
> the last line then just keeps repeating. when i try to boot off the  
> origionally installed kernel, the system functions fine. What's  
> wrong? Any help would be great.


I ran into some nearly-fatal errors trying to upgrade my only multi-cpu
sparc from 2.0.2 to 3.1_STABLE.  My other three uniprocessor sun4m
machines all upgraded just fine.

 The problem was as follows:

I installed the 3.1 kernel on a dual 50 MHz SPARC-20 that has been
reliably running an MP 2.0.2 kernel for many, many months.  It's a busy
machine -- MX, DNS, KDC.

 After i rebooted, many boot-time programs started throwing SEGVs.  I
rebooted again, and it continued, though it was different programs.
Thinking it might be a library issue, i upgraded the userland to 3.1, but
the problems continued.

When i removed the second CPU from the system, the problems went away.  I
put a different 50 MHz CPU in, and the problems returned.

At this point, my best guess is that there is a serious problem in the MP
NetBSD 3.1 code on sparc, at least for the 50 MHz SuperSPARC CPUs i use.

 Unfortunately, the machine in question is production, so my best choice
was to bring it up uniprocessor and leave it running, rather than
troubleshoot MP any further.

 I have another machine, a SPARC-10, that i'm planning to bring up MP and
see if i can reproduce the problem.

 What's strange to me is that programs (like ifconfig or /bin/sh) SEGV,
but the kernel doesn't panic.

 I tried my own kernel and the GENERIC.MP.

 Sorry this isn't much help.  The only suggestion that comes to mind is to
go down to one CPU and see if the problems go away.

(uniprocessor) dmesg output below.

 -johan


Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.

NetBSD 3.1_STABLE (MANGOLASSI.MP) #0: Sun Nov  5 17:37:17 CST 2006
	johan@pangu:/tew/003/src/NetBSD/NetBSD-3/src/sys/arch/sparc/compile/MANGOLASSI.MP
total memory = 319 MB
avail memory = 308 MB
bootpath: /iommu@f,e0000000/sbus@f,e0001000/espdma@f,400000/esp@f,800000/sd@1,0
mainbus0 (root): SUNW,SPARCstation-20: hostid 7271a62a
cpu0 at mainbus0: TMS390Z50 v0 or TMS390Z55 @ 50 MHz, on-chip FPU
cpu0: physical 20K instruction (64 b/l), 16K data (32 b/l): cache enabled
obio0 at mainbus0
clock0 at obio0 slot 0 offset 0x200000: mk48t08
timer0 at obio0 slot 0 offset 0x300000: delay constant 23
zs0 at obio0 slot 0 offset 0x100000 level 12 softpri 6
zstty0 at zs0 channel 0 (console i/o)
zstty1 at zs0 channel 1
zs1 at obio0 slot 0 offset 0x0 level 12 softpri 6
kbd0 at zs1 channel 0: baud rate 1200
ms0 at zs1 channel 1: baud rate 1200
fdc0 at obio0 slot 0 offset 0x700000 level 11: no drives attached
auxreg0 at obio0 slot 0 offset 0x800000
power0 at obio0 slot 0 offset 0xa01000 level 2
iommu0 at mainbus0 ioaddr 0xe0000000: version 0x3/0x1, page-size 4096, range 64MB
sbus0 at iommu0: clock = 25 MHz
dma0 at sbus0 slot 15 offset 0x400000: DMA rev 2
esp0 at dma0 slot 15 offset 0x800000 level 4: ESP200, 40MHz, SCSI ID 7
scsibus0 at esp0: 8 targets, 8 luns per target
ledma0 at sbus0 slot 15 offset 0x400010: DMA rev 2
le0 at ledma0 slot 15 offset 0xc00000 level 6: address 08:00:20:71:a6:2a
le0: 8 receive buffers, 2 transmit buffers
SUNW,bpp at sbus0 slot 15 offset 0x4800000 level 2 (ipl 3) not configured
SUNW,DBRIe at sbus0 slot 14 offset 0x10000 level 9 not configured
hme0 at sbus0 slot 3 offset 0x8c00000 level 4 (ipl 7): Sun Happy Meal Ethernet (SUNW,hme)
hme0: Ethernet address 08:00:20:71:a6:2a
nsphy0 at hme0 phy 1: DP83840 10/100 media interface, rev. 1
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
eccmemctl0 at mainbus0 ioaddr 0x0: version 0x0/0x2
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 1 lun 0: <SEAGATE, ST32550W SUN2.1G, 0418> disk fixed
sd0: 2048 MB, 3511 cyl, 11 head, 108 sec, 512 bytes/sect x 4194995 sectors
sd0: sync (100.00ns offset 15), 8-bit (10.000MB/s) transfers, tagged queueing
sd1 at scsibus0 target 3 lun 0: <SEAGATE, ST32430W SUN2.1G, 0508> disk fixed
sd1: 2049 MB, 3992 cyl, 9 head, 116 sec, 512 bytes/sect x 4197405 sectors
sd1: sync (100.00ns offset 15), 8-bit (10.000MB/s) transfers, tagged queueing
root on sd0a dumps on sd0b
root file system type: ffs
cpu0: booting secondary processors: