Subject: ddb and kernel not auto-selecting root device
To: None <port-sparc64@netbsd.org>
From: john heasley <heas@shrubbery.net>
List: port-sparc64
Date: 10/09/2001 10:57:35
i have a sparcengine which fails to auto-select the root device at
boot time.

NetBSD 1.5Y (sky) #0: Tue Oct  9 02:25:25 UTC 2001
    root@sky:/home/src/sys/arch/sparc64/compile/sky
total memory = 128 MB
avail memory = 109 MB
using 832 buffers containing 6656 KB of memory
bootpath: /pci@1f,0/pci@1,0/scsi@1,0/disk@0,0
mainbus0 (root): SUNW,UltraSPARC-IIi-Engine
cpu0 at mainbus0: SUNW,UltraSPARC-IIi @ 440.127 MHz, version 0 FPU
cpu0: physical 4K instruction (32 b/l), 4K data (32 b/l), 2048K external (64 b/l) 
psycho0 at mainbus0 addr 0xfffc0000
SUNW,sabre: impl 0, version 0: ign 7c0 bus range 0 to 128; PCI bus 0
intr_establish: intr reused 7c0
DVMA map: c0000000 to e0000000
pci0 at psycho0
pci0: i/o space, memory space enabled
ppb0 at pci0 dev 1 function 1: Sun Microsystems Simba PCI bridge (rev. 0x13)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
ebus0 at pci1 dev 1 function 0
ebus0: Sun Microsystems PCIO Ebus2, revision 0x01
auxio0 at ebus0 addr 726000-726003, 728000-728003, 72a000-72a003, 72c000-72c003, 72f000-72f003
power at ebus0 addr 724000-724003 ipl 37 not configured
SUNW,pll at ebus0 addr 504000-504002 not configured
se at ebus0 addr 400000-40007f ipl 43 not configured
com0 at ebus0 addr 3803f8-3803ff ipl 41: ns16550a, working fifo
kbd0 at com0
com1 at ebus0 addr 3602f8-3602ff ipl 42: ns16550a, working fifo
ms0 at com1
lpt0 at ebus0 addr 340278-340287, 30015c-30015d, 700000-70000f ipl 34
fdthree at ebus0 addr 3203f0-3203f7, 706000-70600f, 720000-720003 ipl 39 not configured
clock0 at ebus0 addr 0-1fff: mk48t59: hostid 80c62701
flashprom at ebus0 addr 0-fffff not configured
beeper at ebus0 addr 722000-722003 not configured
SUNW,rasctrl at ebus0 addr 600000-600003 ipl 40 ipl 37 not configured
hme0 at pci1 dev 1 function 1: Sun Happy Meal Ethernet, rev. 1
hme0: interrupting at ivec 3021
hme0: Ethernet address 08:00:20:c6:27:01
nsphy0 at hme0 phy 1: DP83840 10/100 media interface, rev. 1
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ppb1 at pci0 dev 1 function 0: Sun Microsystems Simba PCI bridge (rev. 0x13)
pci2 at ppb1 bus 128
pci2: i/o space, memory space enabled
siop0 at pci2 dev 1 function 0: Symbios Logic 53c875 (ultra-wide scsi)
siop0: using on-board RAM
siop0: interrupting at ivec 20
scsibus0 at siop0: 16 targets, 8 luns per target
siop1 at pci2 dev 1 function 1: Symbios Logic 53c875 (ultra-wide scsi)
siop1: using on-board RAM
intr_establish: intr reused 7e0
siop1: interrupting at ivec 20
scsibus1 at siop1: 16 targets, 8 luns per target
Sun Microsystems PCIO Ebus2 (miscellaneous bridge, revision 0x01) at pci2 dev 2 function 0 not configured
hme1 at pci2 dev 2 function 1: Sun Happy Meal Ethernet, rev. 1
hme1: interrupting at ivec 3001
hme1: Ethernet address 08:00:20:c6:27:01
nsphy1 at hme1 phy 1: DP83840 10/100 media interface, rev. 1
nsphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pcons0 at mainbus0
No counter-timer -- using %tick at 440MHz as system clock.
Using %tick -- intr in 4401275 cycles...done.
scsibus0: waiting 2 seconds for devices to settle...
siop0: alloc newcdb at PHY addr 0xc0034000
sd0 at scsibus0 target 0 lun 0: <IBM, DGHS09Y, 03E0> SCSI3 0/direct fixed
sd0: 8748 MB, 8152 cyl, 10 head, 219 sec, 512 bytes/sect x 17916240 sectors
sd0: sync (50.0ns offset 16), 8-bit (20.000MB/s) transfers, tagged queueing
sd1 at scsibus0 target 1 lun 0: <IBM, DGHS09Y, 03E0> SCSI3 0/direct fixed
sd1: 8748 MB, 8152 cyl, 10 head, 219 sec, 512 bytes/sect x 17916240 sectors
sd1: sync (50.0ns offset 16), 8-bit (20.000MB/s) transfers, tagged queueing
scsibus1: waiting 2 seconds for devices to settle...
siop1: alloc newcdb at PHY addr 0xc0036000
root device: sd0a
dump device (default sd0b): 
file system (default generic): 

the last time i encountered this, the cause was a missing SUNW/fas entry
for bus_compatible().

i'm a ddb novice and tend to muddle when forced to use it, so i'm not sure
whether this is pilot error, a change in the way the pci-based boxes
determine bus compatibility/root device eligibility, or a ddb error.  but a
breakpoint set at bus_compatible only triggers during the cpu configuration.

kdb breakpoint at 11d70a4
1 tt=1ff tstate=0 tpc=0x0 tnpc=0x0
Stopped in pid 0 (swapper) at   bus_compatible+0x8:     call            bus_class
db> c
: SUNW,UltraSPARC-IIi @ 440.127 MHz, version 0 FPU
cpu0: physical 4K instruction (32 b/l), 4K data (32 b/l), 2048K external (64 b/l) 
psycho0 at mainbus0 addr 0xfffc0000kdb breakpoint at 11d7120

while stopped in bus_compatible, trying to examine bpname (bus_compatible's
first argument) returns a 'symbol not found' error, which i do not understand.

can anyone offer any pointers to help me track down the problem with
determining the root device?  i saw messages in the archive saying that this
was broken, but none saying that it had been fixed.