Subject: more ESP bugs and OFW ramblings....
To: NetBSD/sparc Discussion List <port-sparc@NetBSD.ORG>
From: Greg A. Woods <woods@weird.com>
List: port-sparc
Date: 10/18/2002 20:45:29
So I finally got a second carrier for another disk in the SS5 that's to
become my new router and I thought I'd try an experiment with inserting
it into the live system.  Well that just didn't work very well at all,
despite the fact that I didn't even get a chance to probe it or
anything.  Suddenly the existing disk became very slow and in some cases
inaccessible, but there were no error messages on the console.  I was
able to log in via rlogin, and on the console, but couldn't access the
binaries for some commands (including scsictl, shutdown, and halt), nor
could I even "ls -l" some files (including some /dev files, but not all).

Unfortunately there currently seems to be no way to reset the internal
SCSI bus in a sparcstation other than by power-cycling it.  Dropping
down to the OpenBoot firmware and running "probe-scsi" didn't help.
However it did reveal that the new disk had become the only visible one:

	telnet> send brk
	Stopped at      cpu_Debugger+0x4:       jmpl            [%o7 + 0x8], %g0
	db> machine prom
	Type  'go' to resume
	Type  help  for more information
	ok probe-scsi
	Target 1 
	  Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
	                    Copyright (c) 1996 Seagate
	                    All rights reserved 0000
	ok

The original target ID#3 Conner seems to have become invisible.

DDB's "reboot" command panicked because it couldn't write to the disk.

This kind of screws up my plan to show that an older sparcstation can be
made into a very reliable little server with hot-swappable mirrored
drives.  I'm hoping I can help fix and enhance the esp driver so that
this goal is eventually achievable.  I.e. I'm hoping that with the right
driver hooks the bus really can be reset properly on a live system.


Even an OFW "reset" command didn't seem to reset the bus.  Afterwards
"probe-scsi" still showed only the new disk, the SEAGATE:

	Resetting ... 
	SPARCstation 5, No Keyboard
	ROM Rev. 2.15 Pilot, 32 MB memory installed, Serial #3540954.
	Ethernet address 8:0:20:21:99:db, Host ID: 803607da.
	
	
	
	Rebooting with command:                                               
	Boot device: /iommu/sbus/espdma@5,8400000/esp@5,8800000/sd@3,0  File and args: 
	Type  help  for more information
	ok probe-scsi
	Target 1 
	  Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
	                    Copyright (c) 1996 Seagate
	                    All rights reserved 0000
	ok reset
	Resetting ... 
	SPARCstation 5, No Keyboard
	ROM Rev. 2.15 Pilot, 32 MB memory installed, Serial #3540954.
	Ethernet address 8:0:20:21:99:db, Host ID: 803607da.
	
	
	
	                                                                      
	Type  help  for more information
	ok probe-scsi
	Target 1 
	  Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
	                    Copyright (c) 1996 Seagate
	                    All rights reserved 0000
	ok 

Unfortunately I didn't think soon enough to try "cd /iommu/sbus/dma/esp"
and then execute that node's "reset" word (and I'm not really sure I'd
know exactly how to do this for real).

Only after I power-cycled the box did the new disk probe properly (and
of course then I got bitten by the stupid fact that target ID#3 is the
bottom slot and had been sd0, while the new disk, as target ID#1, became
sd0 but was unbootable....)

	ok probe-scsi
	Target 1 
	  Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
	                    Copyright (c) 1996 Seagate
	                    All rights reserved 0000
	Target 3 
	  Unit 0   Disk     CONNER  CFP1080E SUN1.0546496BDB
	ok 

I flipped those drives over so that the GENERIC kernel would still work! :-)

	ok probe-scsi
	Target 1 
	  Unit 0   Disk     CONNER  CFP1080E SUN1.0546496BDB
	Target 3 
	  Unit 0   Disk     SEAGATE ST31200W SUN1.05946200427549
	                    Copyright (c) 1996 Seagate
	                    All rights reserved 0000
	ok 
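
The sd0 renumbering is what I'd expect if the kernel simply hands out sd
unit numbers in probe order, lowest target ID first, with nothing wired
down to a particular target.  A minimal sketch of that assumption in
Python (assign_sd_units is a made-up name for illustration only, not
anything from the real autoconfiguration code):

```python
def assign_sd_units(targets_present):
    """Map sd unit names to SCSI target IDs, assuming units are handed
    out in ascending target order (an assumption, not the real code)."""
    return {f"sd{unit}": target
            for unit, target in enumerate(sorted(targets_present))}

# Original setup: only the Conner, at target 3.
print(assign_sd_units([3]))       # {'sd0': 3}

# After adding the Seagate at target 1, it takes over sd0.
print(assign_sd_units([3, 1]))    # {'sd0': 1, 'sd1': 3}
```

Under that assumption, whichever disk sits at the lowest target ID is
always sd0, which is why swapping the two drives put the bootable one
back at sd0.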


I'm also still a little confused by the default device aliases in the
2.x PROM.  "boot disk" now fails and I have to say "boot disk1".  This
part does make sense since of course "disk" should be the same as
"disk3", right?

	ok boot disk  
	Boot device: /iommu/sbus/espdma@5,8400000/esp@5,8800000/sd@3,0  File and args: 
	Bad magic number in disk label
	Can't open disk label package
	
	Can't open boot device
	
	ok boot disk1
	Boot device: /iommu/sbus/espdma@5,8400000/esp@5,8800000/sd@1,0  File and args: 
	>> NetBSD/sparc Secondary Boot, Revision 1.9


However this part doesn't make sense -- there are no aliases!

	telnet> send brk
	Stopped at      cpu_Debugger+0x4:       jmpl            [%o7 + 0x8], %g0
	db> machine prom
	Type  'go' to resume
	ok devalias
	ok devalias disk
	disk ?
	ok .version
	Release 2.15 Pilot Version 0 created 93/12/21 16:00:45
	ok 

Using the phantom aliases with "cd" doesn't seem to work:

	ok cd disk3
	ok pwd
	/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd
	ok cd /
	ok cd disk 
	ok pwd
	/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd
	ok cd /
	ok cd disk1
	ok pwd
	/iommu@0,10000000/sbus@0,10001000/espdma@5,8400000/esp@5,8800000/sd
	ok cd /
	ok device-end

Last time I got this deep into OFW stuff it was with a 3.x PROM on a
SunFire VF150, and there the devalias did show what I expected (note
that on that machine I'd modified the alias for "disk"):

	ok devalias
	disk                     /pci@1f,0/ide@d/disk@2,0
	rtc                      /pci@1f,0/isa@7/rtc@0,70
	usb                      /pci@1f,0/usb@a
	flash                    /pci@1f,0/isa@7/flashprom@1f,0
	lom                      /pci@1f,0/isa@7/SUNW,lomh@0,8010
	i2c-nvram                /pci@1f,0/pmu@3/i2c@0,0/i2c-nvram@0,aa
	net1                     /pci@1f,0/ethernet@5
	dload1                   /pci@1f,0/ethernet@5:,
	dload                    /pci@1f,0/ethernet@c:,
	net0                     /pci@1f,0/ethernet@c
	net                      /pci@1f,0/ethernet@c
	cdrom                    /pci@1f,0/ide@d/cdrom@3,0:f
	disk3                    /pci@1f,0/ide@d/disk@3,0
	disk2                    /pci@1f,0/ide@d/disk@2,0
	disk1                    /pci@1f,0/ide@d/disk@1,0
	disk0                    /pci@1f,0/ide@d/disk@0,0
	ide                      /pci@1f,0/ide@d
	floppy                   /pci@1f,0/isa@7/dma/floppy
	ttyb                     /pci@1f,0/isa@7/serial@0,2e8
	ttya                     /pci@1f,0/isa@7/serial@0,3f8
	ok 

It also works as expected on an Axil 325 (ss20 clone) I have here, and
it claims to have an only slightly newer OFW version:

	login: [halt sent]
	Stopped at      cpu_Debugger+0x4:       jmpl            [%o7 + 0x8], %g0
	db> machine prom
	Type  'go' to resume
	ok .version
	Release 2.19 Version 106 created 95/04/11 16:57:02
	ok devalias
	ttyb           /obio/zs@0,100000:b
	ttya           /obio/zs@0,100000:a
	keyboard!      /obio/zs@0,0:forcemode
	keyboard       /obio/zs@0,0
	floppy         /obio/SUNW,fdtwo
	scsi           /iommu/sbus/espdma@f,400000/esp@f,800000
	net-aui        /iommu/sbus/ledma@f,400010:aui/le@f,c00000
	net-tpe        /iommu/sbus/ledma@f,400010:tpe/le@f,c00000
	net            /iommu/sbus/ledma@f,400010/le@f,c00000
	disk           /iommu/sbus/espdma@f,400000/esp@f,800000/sd@3,0
	cdrom          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@6,0:d
	tape           /iommu/sbus/espdma@f,400000/esp@f,800000/st@4,0
	tape0          /iommu/sbus/espdma@f,400000/esp@f,800000/st@4,0
	tape1          /iommu/sbus/espdma@f,400000/esp@f,800000/st@5,0
	disk3          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@0,0
	disk2          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@2,0
	disk1          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@1,0
	disk0          /iommu/sbus/espdma@f,400000/esp@f,800000/sd@3,0
	ok go
	db> cont

The machine I'm currently typing on is a real Sun SS20, also with ROM
release 2.15, and "devalias" works on it just fine too.

Does anyone know why I don't see something equivalent on my "new" SS5?

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>