Subject: kern/29369: ahc driver cannot handle HP LTO 2-SCSI drive
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <Frank.Kardel@acrys.com>
List: netbsd-bugs
Date: 02/14/2005 13:13:00
>Number:         29369
>Category:       kern
>Synopsis:       ahc driver cannot handle HP LTO 2-SCSI drive - SCB timeout / BDR -> hang/unusable -> REBOOT needed
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Feb 14 13:13:00 +0000 2005
>Originator:     kardel@acrys.com
>Release:        NetBSD 2.0G
>Organization:
	Acrys Consult GmbH & Co. KG
>Environment:
System: NetBSD Orcus 2.0G NetBSD 2.0G (ORCUS32) #0: Sun Sep 12 20:28:21 CEST 2004 kardel@Orcus:/usr/src/sys/arch/i386/compile/ORCUS32 i386
Architecture: i386
Machine: i386
>Description:
	The Overland Tapelibrary uses an HP LTO-2 SCSI drive. Testing the drive functionality
	with bacula btape facility leads reliably to a
	SCB timeout in the ahc driver resulting in a card state dump
	and a hanging st device (see also PR 21335 - same symptoms but probably different cause there).
	Last messages from the btape test program where:

=== Append files test ===

This test is essential to Bacula.

I'm going to write one record  in file 0,
                   two records in file 1,
             and three records in file 2

btape: btape.c:435 Rewound /dev/enrst1
btape: btape.c:1505 Wrote one record of 64412 bytes.
btape: btape.c:1507 Wrote block to device.
btape: btape.c:465 Wrote 1 EOF to /dev/enrst1
btape: btape.c:1505 Wrote one record of 64412 bytes.
btape: btape.c:1507 Wrote block to device.
btape: btape.c:1505 Wrote one record of 64412 bytes.
btape: btape.c:1507 Wrote block to device.
btape: btape.c:465 Wrote 1 EOF to /dev/enrst1
btape: btape.c:1505 Wrote one record of 64412 bytes.
btape: btape.c:1507 Wrote block to device.
btape: btape.c:1505 Wrote one record of 64412 bytes.
btape: btape.c:1507 Wrote block to device.
btape: btape.c:1505 Wrote one record of 64412 bytes.
btape: btape.c:1507 Wrote block to device.
btape: btape.c:465 Wrote 1 EOF to /dev/enrst1

Message from syslogd@Orcus at Mon Feb 14 12:43:37 2005 ...
Orcus /netbsd: <<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>

	The dmesg.boot output is:

NetBSD 2.0G (ORCUS32) #0: Sun Sep 12 20:28:21 CEST 2004
	kardel@Orcus:/usr/src/sys/arch/i386/compile/ORCUS32
total memory = 2047 MB
avail memory = 1996 MB
BIOS32 rev. 0 found at 0xf0010
mainbus0 (root)
cpu0 at mainbus0: (uniprocessor)
cpu0: AMD Unknown K7 (Athlon) (686-class), 2004.61 MHz, id 0xf58
cpu0: features 78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR>
cpu0: features 78bfbff<PGE,MCA,CMOV,PAT,PSE36,MPC,MMX>
cpu0: features 78bfbff<FXSR,SSE,SSE2>
cpu0: "AMD Opteron(tm) Processor 146"
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
ppb0 at pci0 dev 6 function 0: Advanced Micro Devices AMD8111 I/O Hub (rev. 0x07)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
ohci0 at pci1 dev 0 function 0: Advanced Micro Devices AMD8111 USB Host Controller (rev. 0x0b)
ohci0: interrupting at irq 9
ohci0: OHCI version 1.0, legacy support
usb0 at ohci0: USB revision 1.0
uhub0 at usb0
uhub0: Advanced Micro OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 3 ports with 3 removable, self powered
ohci1 at pci1 dev 0 function 1: Advanced Micro Devices AMD8111 USB Host Controller (rev. 0x0b)
ohci1: interrupting at irq 9
ohci1: OHCI version 1.0, legacy support
usb1 at ohci1: USB revision 1.0
uhub1 at usb1
uhub1: Advanced Micro OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
ahc1 at pci1 dev 3 function 0: Adaptec 29160 Ultra160 SCSI adapter
ahc1: interrupting at irq 5
ahc1: aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsibus0 at ahc1: 16 targets, 8 luns per target
ahc2 at pci1 dev 6 function 0: Adaptec 29160 Ultra160 SCSI adapter
ahc2: interrupting at irq 10
ahc2: aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsibus1 at ahc2: 16 targets, 8 luns per target
ex0 at pci1 dev 7 function 0: 3Com 3c905C-TX 10/100 Ethernet with mngmt (rev. 0x78)
ex0: interrupting at irq 11
ex0: MAC address 00:0a:5e:06:2c:62
exphy0 at ex0 phy 24: 3Com internal media interface
exphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pdcide0 at pci1 dev 9 function 0
pdcide0: Promise Ultra133/ATA Bus Master IDE Accelerator (rev. 0x02)
pdcide0: bus-master DMA support present
pdcide0: primary channel configured to native-PCI mode
pdcide0: using irq 5 for native-PCI interrupt
atabus0 at pdcide0 channel 0
pdcide0: secondary channel configured to native-PCI mode
atabus1 at pdcide0 channel 1
pdcide1 at pci1 dev 10 function 0
pdcide1: Promise Ultra133/ATA Bus Master IDE Accelerator (rev. 0x02)
pdcide1: bus-master DMA support present
pdcide1: primary channel configured to native-PCI mode
pdcide1: using irq 10 for native-PCI interrupt
atabus2 at pdcide1 channel 0
pdcide1: secondary channel configured to native-PCI mode
atabus3 at pdcide1 channel 1
vga1 at pci1 dev 11 function 0: ATI Technologies Rage XL (rev. 0x27)
wsdisplay0 at vga1 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
satalink0 at pci1 dev 12 function 0
satalink0: Silicon Image SATALink 3114 (rev. 0x02)
satalink0: 33MHz PCI bus
satalink0: bus-master DMA support present
satalink0: using irq 10 for native-PCI interrupt
atabus4 at satalink0 channel 0
atabus5 at satalink0 channel 1
atabus6 at satalink0 channel 2
atabus7 at satalink0 channel 3
bge0 at pci1 dev 13 function 0: Broadcom BCM5705 Gigabit Ethernet
bge0: interrupting at irq 9
bge0: ASIC BCM5705 A3 (0x3003), Ethernet address 00:e0:81:60:3b:11
brgphy0 at bge0 phy 1: BCM5705 1000BASE-T media interface, rev. 2
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
bge1 at pci1 dev 14 function 0: Broadcom BCM5705 Gigabit Ethernet
bge1: interrupting at irq 5
bge1: ASIC BCM5705 A3 (0x3003), Ethernet address 00:e0:81:60:3b:12
brgphy1 at bge1 phy 1: BCM5705 1000BASE-T media interface, rev. 2
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
pcib0 at pci0 dev 7 function 0
pcib0: Advanced Micro Devices AMD8111 LPC Controller (rev. 0x05)
viaide0 at pci0 dev 7 function 1
viaide0: Advanced Micro Devices AMD8111 IDE Controller (rev. 0x03)
viaide0: bus-master DMA support present
viaide0: primary channel configured to compatibility mode
viaide0: primary channel interrupting at irq 14
atabus8 at viaide0 channel 0
viaide0: secondary channel configured to compatibility mode
viaide0: secondary channel interrupting at irq 15
atabus9 at viaide0 channel 1
Advanced Micro Devices AMD8111 SMBus Controller (SMBus serial bus, revision 0x02) at pci0 dev 7 function 2 not configured
Advanced Micro Devices AMD8111 ACPI Controller (miscellaneous bridge, revision 0x05) at pci0 dev 7 function 3 not configured
pchb0 at pci0 dev 24 function 0
pchb0: Advanced Micro Devices AMD64 HyperTransport configuration (rev. 0x00)
pchb1 at pci0 dev 24 function 1
pchb1: Advanced Micro Devices AMD64 Address Map configuration (rev. 0x00)
pchb2 at pci0 dev 24 function 2
pchb2: Advanced Micro Devices AMD64 DRAM configuration (rev. 0x00)
pchb3 at pci0 dev 24 function 3
pchb3: Advanced Micro Devices AMD64 Miscellaneous configuration (rev. 0x00)
isa0 at pcib0
lpt0 at isa0 port 0x378-0x37b irq 7
com0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, working fifo
com1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, working fifo
pckbc0 at isa0 port 0x60-0x64
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pms0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pms0 mux 0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
sysbeep0 at pcppi0
isapnp0 at isa0 port 0x279: ISA Plug 'n Play device support
npx0 at isa0 port 0xf0-0xff: using exception 16
fdc0 at isa0 port 0x3f0-0x3f7 irq 6 drq 2
isapnp0: no ISA Plug 'n Play devices found
Kernelized RAIDframe activated
IPsec: Initialized Security Association Processing.
scsibus0: waiting 2 seconds for devices to settle...
scsibus1: waiting 2 seconds for devices to settle...
st0 at scsibus0 target 5 lun 0: <HP, C5713A, H910> tape removable
st0: density code 38, variable blocks, write-enabled
ch0 at scsibus0 target 5 lun 1: <HP, C5713A, H910> changer removable
ch0: 6 slots, 1 drive, 0 pickers, 0 portals
st0: sync (50.00ns offset 32), 16-bit (40.000MB/s) transfers
ch0: sync (50.00ns offset 32), 16-bit (40.000MB/s) transfers
wd0 at atabus0 drive 0: <HDS722525VLAT80>
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 232 GB, 484521 cyl, 16 head, 63 sec, 512 bytes/sect x 488397168 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(pdcide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
wd1 at atabus1 drive 0: <HDS722525VLAT80>
wd1: drive supports 16-sector PIO transfers, LBA48 addressing
wd1: 232 GB, 484521 cyl, 16 head, 63 sec, 512 bytes/sect x 488397168 sectors
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd1(pdcide0:1:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
wd2 at atabus2 drive 0: <HDS722525VLAT80>
wd2: drive supports 16-sector PIO transfers, LBA48 addressing
wd2: 186 GB, 387621 cyl, 16 head, 63 sec, 512 bytes/sect x 390721968 sectors
wd2: 32-bit data port
wd2: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd2(pdcide1:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
wd3 at atabus3 drive 0: <HDS722525VLAT80>
wd3: drive supports 16-sector PIO transfers, LBA48 addressing
wd3: 186 GB, 387621 cyl, 16 head, 63 sec, 512 bytes/sect x 390721968 sectors
wd3: 32-bit data port
wd3: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd3(pdcide1:1:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
wd4 at atabus8 drive 0: <HDS722525VLAT80>
wd4: drive supports 16-sector PIO transfers, LBA48 addressing
wd4: 232 GB, 484521 cyl, 16 head, 63 sec, 512 bytes/sect x 488397168 sectors
wd4: 32-bit data port
wd4: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd4(viaide0:0:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
wd5 at atabus9 drive 0: <HDS722525VLAT80>
wd5: drive supports 16-sector PIO transfers, LBA48 addressing
wd5: 232 GB, 484521 cyl, 16 head, 63 sec, 512 bytes/sect x 488397168 sectors
wd5: 32-bit data port
wd5: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd5(viaide0:1:0): using PIO mode 4, Ultra-DMA mode 5 (Ultra/100) (using DMA data transfers)
st1 at scsibus1 target 4 lun 0: <HP, Ultrium 2-SCSI, F5AH> tape removable
st1: density code 66, variable blocks, write-enabled
st1: sync (12.50ns offset 127), 16-bit (160.000MB/s) transfers
ch1 at scsibus1 target 6 lun 0: <OVERLAND, LXB, 0106> changer removable
ch1: 16 slots, 1 drive, 1 picker, 0 portals
ch1: sync (100.00ns offset 15), 16-bit (20.000MB/s) transfers
boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs

	Crash output is:

ahc2:SCB 0xe - timed out
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
ahc2: Dumping Card State while idle, at SEQADDR 0x9
Card was paused
ACCUM = 0x4, SINDEX = 0x47, DINDEX = 0x25, ARG_2 = 0x3
HCNT = 0x0 SCBPTR = 0x0
SCSIPHASE[0x0] SCSISIGI[0x0] ERROR[0x0] SCSIBUSL[0x0] 
LASTPHASE[0x1] SCSISEQ[0x12] SBLKCTL[0xa] SCSIRATE[0x0] 
SEQCTL[0x10] SEQ_FLAGS[0xc0] SSTAT0[0x0] SSTAT1[0x8] 
SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x8] SIMODE1[0xa4] 
SXFRCTL0[0x80] DFCNTRL[0x0] DFSTATUS[0x89] 
STACK: 0x0 0x16b 0x180 0x3
SCB count = 16
Kernel NEXTQSCB = 15
Card NEXTQSCB = 15
QINFIFO entries: 
Waiting Queue entries: 
Disconnected Queue entries: 0:14 
QOUTFIFO entries: 
Sequencer Free SCB List: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
Sequencer SCB Info: 
  0 SCB_CONTROL[0x44] 
SCB_SCSIID[0x47] SCB_LUN[0x0] SCB_TAG[0xe] 
  1 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
  2 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
  3 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
  4 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
  5 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
  6 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
  7 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
  8 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
  9 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 10 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 11 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 12 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 13 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 14 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 15 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 16 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 17 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 18 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 19 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 20 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 21 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 22 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 23 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 24 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 25 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 26 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 27 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 28 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 29 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 30 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
 31 SCB_CONTROL[0x0] 
SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff] 
Pending list: 
 14 SCB_CONTROL[0x40] 
SCB_SCSIID[0x47] SCB_LUN[0x0] 
Kernel Free SCB list: 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
Untagged Q(4): 14 

>>>>>>>>>>>>>>>>>
ahc2:Queuing a BDR SCB
ahc2:Bus Device Reset Message Sent
st1(ahc2:0:4:0): ahc2: no longer in timeout, status = 0
ahc2: Bus Device Reset on A:4. 1 SCBs aborted

	Tape device st1 hangs from here on.

	This renders the tape drive currently unusable.

>How-To-Repeat:
	Buy a brand new tape library.
	Try to use thr HP LTO 2-SCSI drive
	Discover the something goes amiss...

>Fix:
	Return Tape library to merchant ?
	Find quirks entries for HP LTO-2 drives ?
	Timeouts used for writing FM's don't seem to be too unresonable...
	Any debug option one could set ?