Subject: Re: SMP status
To: Martin Husemann <martin@duskware.de>
From: Chris Ross <cross+netbsd@distal.com>
List: port-sparc64
Date: 08/26/2007 03:03:01
On Aug 25, 2007, at 15:47, Martin Husemann wrote:
> Hi folks,
>
> I think a -current GENERIC.MP kernel should be able to boot into
> single user shell on all supported machines. Beware, it will crash
> soon if you do serious stuff ;-)

   Okay.  My first attempt at getting to single user failed.  Note
that I'm net-booting with an NFS root.  This is probably a lot to ask
of it, but still.  :-)

   On first try, I see:

Using BOOTP protocol: ip address: 206.138.151.49, netmask: 255.255.255.0, gateway: 206.138.151.1
root addr=206.138.151.36 path=/export/sparc64/nfsroot
=0x857090
Loading netbsd: 6492640+356432+321192 [489456+314634]=0x96a2b8
sparc64_init(0xf005eaf0, 0xfffb1e28, 0x20, 0xf005eaf0, 0xf005eaf0)
sparc64_init: bmagic=44444230, bi=0x196a2c0
xtlb[0]: Tag: 1000000 Data: e00000001f800076
xtlb[1]: Tag: 1400000 Data: e00000001f400076
xtlb[2]: Tag: 1800000 Data: e00000001f000076
console is /sbus@1f,0/zs@f,1100000:a
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
     2006, 2007
     The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
     The Regents of the University of California.  All rights reserved.

NetBSD 4.99.29 (GENERIC.MP) #1: Sun Aug 26 00:05:49 EDT 2007
         cross@skaro.distal.com:/data/obj/NetBSD.sparc64/data/NetBSD/src/sys/arch/sparc64/compile/GENERIC.MP
total memory = 512 MB
avail memory = 489 MB
bootpath: /sbus@1f,0/SUNW,hme@e,8c00000
mainbus0 (root): SUNW,Ultra-2: hostid 80a18bf8
cpu0 at mainbus0: SUNW,UltraSPARC @ 199.996 MHz, UPA id 0
cpu0: 32K instruction (32 b/l), 16K data (32 b/l), 1024K external (64 b/l)
cpu1 at mainbus0: SUNW,UltraSPARC @ 199.996 MHz, UPA id 1
cpu1: 32K instruction (32 b/l), 16K data (32 b/l), 1024K external (64 b/l)
timer0 at mainbus0 addr 0xfffc3c00 irq vectors 7f0 and 7f1
sbus0 at mainbus0 addr 0xfffc8000: clock = 25 MHz
DVMA map: ff800000 to ffffe000
IOTSB: 858000 to 85a000
audiocs0 at sbus0 slot 13 offset 0xc000000 vector 24 ipl 8: CS4231A
audio0 at audiocs0: full duplex
auxio0 at sbus0 slot 15 offset 0x1900000
flashprom at sbus0 slot 15 offset 0x0 not configured
fdc0 at sbus0 slot 15 offset 0x1400000 vector 29 ipl 11: no drives attached
clock0 at sbus0 slot 15 offset 0x1200000: mk48t59
zs0 at sbus0 slot 15 offset 0x1100000 vector 28 ipl 12 softpri 6
zstty0 at zs0 channel 0 (console i/o)
zstty1 at zs0 channel 1
zs1 at sbus0 slot 15 offset 0x1000000 vector 28 ipl 12 softpri 6
zstty2 at zs1 channel 0
kbd0 at zstty2
zstty3 at zs1 channel 1
ms0 at zstty3
wsmouse0 at ms0 mux 0
sc at sbus0 slot 15 offset 0x1300000 not configured
SUNW,pll at sbus0 slot 15 offset 0x1304000 not configured
esp0 at sbus0 slot 14 offset 0x8800000 vector 20 ipl 3: FAS366/HME, 40MHz, SCSI ID 7
scsibus0 at esp0: 16 targets, 8 luns per target
hme0 at sbus0 slot 14 offset 0x8c00000 vector 21 ipl 6: Sun Happy Meal Ethernet (SUNW,hme)
hme0: Ethernet address 08:00:20:a1:8b:f8
nsphy0 at hme0 phy 1: DP83840 10/100 media interface, rev. 1
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
bpp0 at sbus0 slot 14 offset 0xc800000 vector 22 ipl 2: DMA rev unknown (0x20000000)
nf at sbus0 slot 3 offset 0x7ff0 vector 4 ipl 6 not configured
pcons at mainbus0 not configured
wskbd0 at kbd0 mux 1
scsibus0: waiting 2 seconds for devices to settle...
sd0 at scsibus0 target 1 lun 0: <SEAGATE, ST39173W SUN9.0G, 7863> disk fixed
sd0: 8637 MB, 4926 cyl, 27 head, 133 sec, 512 bytes/sect x 17689267 sectors
sd0: sync (100.00ns offset 15), 16-bit (20.000MB/s) transfers, tagged queueing
kbd0: reset failed
Kernelized RAIDframe activated
root on hme0
nfs_boot: trying DHCP/BOOTP
nfs_boot: DHCP next-server: 206.138.151.36
nfs_boot: my_domain=distal.com
nfs_boot: my_addr=206.138.151.49
nfs_boot: my_mask=255.255.255.0
nfs_boot: gateway=206.138.151.1
root on 206.138.151.36:/export/sparc64/nfsroot
root file system type: nfs
/etc/rc.conf is not configured.  Multiuser boot aborted.
Enter pathname of shell or RETURN for /bin/sh: cpu1: data fault: pc=13f2da0 addr=0
kernel trap 30: data access exception
cpu0: kdb breakpoint at 1008ee4
cpu0 paused.
Stopped at      netbsd:pmap_activate:   ldx             [%o0 + 0x148], %g1
db{1}>


   Note that I didn't do anything when it prompted me; I wasn't
watching it.  I came back and found this.  When I type reboot at the
ddb prompt, I see:

db{1}> reboot
syncing disks... cpu1: kdb breakpoint at 13f95f4
cpu1: kdb breakpoint at 13f95f4

...and it just hangs there.  A manual power-cycle is needed.  Is this  
a known problem?

   Anyway, after loading the kernel again from the NFS root, the same
failure occurs.  It seems to take about 8 seconds from the point
where it prompts me for a shell until it crashes, saying:

cpu1: data fault: pc=13f2da0 addr=0
kernel trap 30: data access exception
cpu0: kdb breakpoint at 1008ee4
cpu0 paused.
Stopped at      netbsd:pmap_activate:   ldx             [%o0 + 0x148], %g1
db{1}>

   As you can see from the dmesg above, this is a pair of 200 MHz US-I's.

   Let me know if there's anything else I can try, or any other
information I can provide, for debugging.  Thanks!

                                                             - Chris