Subject: Re: isp driver issues
To: NetBSD-current Discussion List <current-users@NetBSD.ORG>
From: Andreas Wrede <andreas@planix.com>
List: current-users
Date: 11/12/2004 22:21:30


On 12-Nov-04, at 1:46 AM, Thomas T. Thai wrote:

> Andreas Wrede wrote:
>
>> If I detach both fibre channels, the machine boots fine:
>
> ...
>
>> I then can connect the XServe RAID controllers and run
>>
>> # scsictl /dev/scsibus0 scan any any
>> sd2 at scsibus0 target 0 lun 0: <APPLE, Xserve RAID, 1.21> disk fixed
>> sd2: 1035 GB, 132522 cyl, 128 head, 128 sec, 512 bytes/sect x 2171240448 sectors
>
> ...
>
>> sd2 and sd3 now work as expected - disklabel, newfs, mount and   
>> read/write without further error messages from the controller.
>>
> The above procedure was confirmed on my system as well.

I am making some progress.  First, testing with a very recent -current
kernel showed no difference, so I continued the investigation with a
2.0_RC5 kernel.

Rearranging the controllers so that there is no IRQ conflict between
the esiop(4) controller and the first isp(4) allows the boot to
proceed. The 'probe(esiop0:0:0:0): command timeout...' message no longer
appears and drives attached to esiop(4) are probed correctly. I have no
idea why the two controllers (isp(4) and esiop(4)) interact so badly at
probe time.

On both isp(4) controllers however, no devices are found during  
probing, even though the XServe RAID box is attached, up and running.  
Once the boot completes, 'scsictl /dev/scsibus[12] scan any any' will  
find the devices without a problem.
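A minimal sketch of that manual rescan as a small shell function, assuming the bus device nodes are passed as arguments. The function name rescan_buses is made up for illustration; the scsictl invocation is the same one run by hand above.

```sh
#!/bin/sh
# Rescan the given SCSI buses for devices that were missed during
# autoconfiguration. rescan_buses is a hypothetical helper name.
rescan_buses() {
    for bus in "$@"; do
        # Same command as run by hand: all targets, all LUNs.
        scsictl "$bus" scan any any || echo "rescan of $bus failed" >&2
    done
}
```

On this machine it would be called as `rescan_buses /dev/scsibus1 /dev/scsibus2` once the boot has completed.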

So I built a kernel with 'options     SCSI_DELAY=10', as delaying the
probe appeared to work. I found that the SCSI probes proceed with
virtually no delay; it looks like the 10 second delay is either being
ignored or I misunderstand something:


[Fri Nov 12 21:57:57 2004]: scsibus0: waiting 10 seconds for devices to settle...
[Fri Nov 12 21:57:57 2004]: scsibus1: waiting 10 seconds for devices to settle...
[Fri Nov 12 21:57:57 2004]: scsibus2: waiting 10 seconds for devices to settle...
[Fri Nov 12 21:57:57 2004]: atapibus0 at atabus0: 2 targets
[Fri Nov 12 21:57:58 2004]: cd0 at atapibus0 drive 0: <CD-224E, , 1.5A> cdrom removable
[Fri Nov 12 21:57:58 2004]: cd0: 32-bit data port
[Fri Nov 12 21:57:58 2004]: cd0: drive supports PIO mode 4, DMA mode 2
[Fri Nov 12 21:57:58 2004]: cd0(piixide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
[Fri Nov 12 21:57:58 2004]: isp0: Interrupting Mailbox Command (0x69) Timeout
[Fri Nov 12 21:58:02 2004]: isp0: Mailbox Command 'GET FW STATE' failed (TIMEOUT)
[Fri Nov 12 21:58:02 2004]: isp1: Interrupting Mailbox Command (0x69) Timeout
[Fri Nov 12 21:58:02 2004]: isp1: Mailbox Command 'GET FW STATE' failed (TIMEOUT)
[Fri Nov 12 21:58:02 2004]: esiop0: alloc newcdb at PHY addr 0x183e000
[Fri Nov 12 21:58:07 2004]: sd0 at scsibus0 target 0 lun 0: <HP, 9.10GB A 80-F309, > disk fixed
[Fri Nov 12 21:58:07 2004]: sd0: 8678 MB, 9827 cyl, 5 head, 361 sec, 512 bytes/sect x 17773524 sectors
[Fri Nov 12 21:58:07 2004]: sd0: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
[Fri Nov 12 21:58:07 2004]: sd1 at scsibus0 target 1 lun 0: <HP, 9.10GB A 80-F309, > disk fixed
[Fri Nov 12 21:58:07 2004]: sd1: 8678 MB, 9827 cyl, 5 head, 361 sec, 512 bytes/sect x 17773524 sectors
[Fri Nov 12 21:58:07 2004]: sd1: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
[Fri Nov 12 21:58:07 2004]: isp0: Interrupting Mailbox Command (0x69) Timeout
[Fri Nov 12 21:58:07 2004]: isp0: Mailbox Command 'GET FW STATE' failed (TIMEOUT)
[Fri Nov 12 21:58:07 2004]: isp1: Interrupting Mailbox Command (0x69) Timeout
[Fri Nov 12 21:58:08 2004]: isp1: Mailbox Command 'GET FW STATE' failed (TIMEOUT)
[Fri Nov 12 21:58:08 2004]: Searching for RAID components...
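
To see more easily which messages arrive before the settle delay has elapsed, the timestamped console log can be annotated with offsets. This is only a sketch: the function name settle_offsets is made up, and it assumes the "[Day Mon DD HH:MM:SS YYYY]: " prefix shown above and a log that does not cross midnight. Wrapped continuation lines without a timestamp are skipped.

```sh
#!/bin/sh
# Prefix each timestamped log line with the seconds elapsed since the
# first "waiting N seconds for devices" message.
# settle_offsets is a hypothetical helper; it reads the named files,
# or standard input when no file is given.
settle_offsets() {
    awk '
    match($0, /[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/) {
        # Convert HH:MM:SS to seconds since midnight.
        split(substr($0, RSTART, RLENGTH), t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (!started && /waiting [0-9]+ seconds for devices/) {
            start = now
            started = 1
        }
        if (started) {
            # Drop the "[...]: " prefix and print the offset instead.
            msg = substr($0, index($0, "]: ") + 3)
            printf "+%ds %s\n", now - start, msg
        }
    }' "$@"
}
```

It would be run as `settle_offsets console.log`, where console.log is a placeholder for wherever the serial console output was captured.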


So I suspect that the probe on both isp(4) controllers would succeed
if either the timeout were increased or the probe were delayed for
longer after the fibre channel is brought up.


Again, for completeness, the full boot log:

NetBSD 2.0_RC5 (PLANIX.MP) #3: Fri Nov 12 21:53:18 EST 2004
    root@willy.wrede.pvt:/u1/netbsd-2.0/obj/sys/arch/i386/compile.i386/PLANIX.MP
total memory = 255 MB
avail memory = 242 MB
BIOS32 rev. 0 found at 0xfd83c
mainbus0 (root)
known mode 1 PCI chipset (71928086)
mainbus0: Intel MP Specification (Version 1.4) (HP       LPr         )
cpu0 at mainbus0: apid 1 (boot processor)
cpu0: Intel Pentium III (686-class), 698.85 MHz, id 0x681
cpu0: features 387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features 387fbff<PGE,MCA,CMOV,PAT,PSE36,PN,MMX>
cpu0: features 387fbff<FXSR,SSE>
cpu0: I-cache 16 KB 32B/line 4-way, D-cache 16 KB 32B/line 4-way
cpu0: L2 cache 256 KB 32B/line 8-way
cpu0: ITLB 32 4 KB entries 4-way, 2 4 MB entries fully associative
cpu0: DTLB 64 4 KB entries 4-way, 8 4 MB entries 4-way
cpu0: serial number 0000-0681-0003-9C42-8463-DD55
cpu0: calibrating local timer
cpu0: apic clock running at 99 MHz
cpu0: 8 page colors
cpu1 at mainbus0: apid 0 (application processor)
cpu1: starting
cpu1: Intel Pentium III (686-class), 698.81 MHz, id 0x681
cpu1: features 387fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu1: features 387fbff<PGE,MCA,CMOV,PAT,PSE36,PN,MMX>
cpu1: features 387fbff<FXSR,SSE>
cpu1: I-cache 16 KB 32B/line 4-way, D-cache 16 KB 32B/line 4-way
cpu1: L2 cache 256 KB 32B/line 8-way
cpu1: ITLB 32 4 KB entries 4-way, 2 4 MB entries fully associative
cpu1: DTLB 64 4 KB entries 4-way, 8 4 MB entries 4-way
cpu1: serial number 0000-0681-0000-B7E3-AF6D-2265
mpbios: bus 0 is type PCI
mpbios: bus 1 is type PCI
mpbios: bus 2 is type ISA
ioapic0 at mainbus0 apid 2 (I/O APIC)
ioapic0: pa 0xfec00000, version 11, 24 pins
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0
pchb0: Intel 82443BX Host Bridge/Controller (AGP disabled) (rev. 0x03)
pchb0: fixing Idle/Pipeline DRAM Leadoff Timing
pcib0 at pci0 dev 4 function 0
pcib0: Intel 82371AB PCI-to-ISA Bridge (PIIX4) (rev. 0x02)
piixide0 at pci0 dev 4 function 1
piixide0: Intel 82371AB IDE controller (PIIX4) (rev. 0x01)
piixide0: bus-master DMA support present
piixide0: primary channel wired to compatibility mode
piixide0: primary channel interrupting at ioapic0 pin 14 (irq 14)
atabus0 at piixide0 channel 0
piixide0: secondary channel wired to compatibility mode
piixide0: secondary channel ignored (disabled)
uhci0 at pci0 dev 4 function 2: Intel 82371AB USB Host Controller (PIIX4) (rev. 0x01)
uhci0: interrupting at ioapic0 pin 19 (irq 11)
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Intel 82371AB Power Management Controller (PIIX4) (miscellaneous bridge, revision 0x02) at pci0 dev 4 function 3 not configured
ppb0 at pci0 dev 7 function 0: Digital Equipment DECchip 21152 PCI-PCI Bridge (rev. 0x03)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
skc0 at pci1 dev 3 function 0: ioapic0 pin 19 (irq 11)
skc0: bad VPD resource id: expected 82 got ff
skc0: (null) rev. (0x1)
sk0 at skc0 port A: Ethernet address 00:0f:3d:87:e2:79
makphy0 at sk0 phy 0: Marvell 88E1011 Gigabit PHY, rev. 3
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
esiop0 at pci1 dev 4 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
esiop0: using on-board RAM
esiop0: interrupting at ioapic0 pin 18 (irq 15)
esiop0: alloc new tag DSA table at PHY addr 0x16ea000
scsibus0 at esiop0: 16 targets, 8 luns per target
isp0 at pci0 dev 8 function 0: QLogic FC-AL and Fabric HBA
isp0: interrupting at ioapic0 pin 16 (irq 10)
scsibus1 at isp0: 256 targets, 8 luns per target
isp1 at pci0 dev 9 function 0: QLogic FC-AL and Fabric HBA
isp1: interrupting at ioapic0 pin 17 (irq 5)
scsibus2 at isp1: 256 targets, 8 luns per target
vga1 at pci0 dev 13 function 0: Cirrus Logic CL-GD5446 (rev. 0x45)
wsdisplay0 at vga1 kbdmux 1
wsmux1: connecting to wsdisplay0
isa0 at pcib0
lpt0 at isa0 port 0x378-0x37b irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com0: console
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbdprobe: reset error 5
pmsprobe: reset error 5
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
isapnp0: no ISA Plug 'n Play devices found
ioapic0: enabling
fd0 at fdc0 drive 0: 1.44MB, 80 cyl, 2 head, 18 sec
raidattach: Asked for 8 units
Kernelized RAIDframe activated
IPsec: Initialized Security Association Processing.
scsibus0: waiting 10 seconds for devices to settle...
scsibus1: waiting 10 seconds for devices to settle...
scsibus2: waiting 10 seconds for devices to settle...
atapibus0 at atabus0: 2 targets
cd0 at atapibus0 drive 0: <CD-224E, , 1.5A> cdrom removable
cd0: 32-bit data port
cd0: drive supports PIO mode 4, DMA mode 2
cd0(piixide0:0:0): using PIO mode 4, DMA mode 2 (using DMA data transfers)
isp0: Interrupting Mailbox Command (0x69) Timeout
isp0: Mailbox Command 'GET FW STATE' failed (TIMEOUT)
isp1: Interrupting Mailbox Command (0x69) Timeout
isp1: Mailbox Command 'GET FW STATE' failed (TIMEOUT)
esiop0: alloc newcdb at PHY addr 0x183e000
sd0 at scsibus0 target 0 lun 0: <HP, 9.10GB A 80-F309, > disk fixed
sd0: 8678 MB, 9827 cyl, 5 head, 361 sec, 512 bytes/sect x 17773524 sectors
sd0: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
sd1 at scsibus0 target 1 lun 0: <HP, 9.10GB A 80-F309, > disk fixed
sd1: 8678 MB, 9827 cyl, 5 head, 361 sec, 512 bytes/sect x 17773524 sectors
sd1: sync (25.00ns offset 31), 16-bit (80.000MB/s) transfers, tagged queueing
isp0: Interrupting Mailbox Command (0x69) Timeout
isp0: Mailbox Command 'GET FW STATE' failed (TIMEOUT)
isp1: Interrupting Mailbox Command (0x69) Timeout
isp1: Mailbox Command 'GET FW STATE' failed (TIMEOUT)
Searching for RAID components...
warning: double match for boot device (sd0, sd1)
boot device: sd0
root on sd0a dumps on sd0b
mountroot: trying msdos...
mountroot: trying cd9660...
mountroot: trying nfs...
mountroot: trying lfs...
mountroot: trying ext2fs...
mountroot: trying ffs...
root file system type: ffs
cpu1: CPU 0 running
isp0: Mbox Command Async (0x4000) with no waiters
isp1: Mbox Command Async (0x4000) with no waiters
init: copying out path `/sbin/init' 11
Fri Nov 12 21:58:11 EST 2004
swapctl: adding /dev/sd0b as swap device at priority 0
Checking for botched superblock upgrades: done.
Starting file system checks:
/dev/rsd0a: file system is clean; not checking
Can't open /dev/rsd2a: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd2a: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
Can't open /dev/rsd3a: Device not configured
CAN'T CHECK FILE SYSTEM.
/dev/rsd3a: UNEXPECTED INCONSISTENCY; RUN fsck_ffs MANUALLY.
/dev/rsd0e: file system is clean; not checking
/dev/rsd0f: file system is clean; not checking
THE FOLLOWING FILE SYSTEMS HAD AN UNEXPECTED INCONSISTENCY:
         ffs: /dev/rsd2a (/u5), ffs: /dev/rsd3a (/b1)
Automatic file system check failed; help!
Nov 12 21:58:12 init: /bin/sh on /etc/rc terminated abnormally, going to single user mode
Enter pathname of shell or RETURN for /bin/sh: /bin/ksh
# scsictl /dev/scsibus1 scan any any
sd2 at scsibus1 target 0 lun 0: <APPLE, Xserve RAID, 1.21> disk fixed
sd2: 1035 GB, 132522 cyl, 128 head, 128 sec, 512 bytes/sect x 2171240448 sectors
# scsictl /dev/scsibus2 scan any any
sd3 at scsibus2 target 0 lun 0: <APPLE, Xserve RAID, 1.21> disk fixed
sd3: 1035 GB, 132522 cyl, 128 head, 128 sec, 512 bytes/sect x 2171240448 sectors
#^D Fri Nov 12 21:59:36 EST 2004

-- 
	aew
