Subject: 1.5.3 fxp scsi fun
To: None <port-i386@netbsd.org>
From: David Brownlee <abs@formula1.com>
List: port-i386
Date: 07/13/2002 21:51:40
	We have two NetBSD 1.5.3 Athlon webservers which seem to be having
	SCSI/Ethernet problems: occasionally the machines just die.
	One has an siop and the other an ahc; both use fxp. I believe both
	are on VIA chipsets. Does this ring a bell for anyone?

siop/fxp machine:

    cpu0: AMD K7 (Athlon) (686-class), 1602.14 MHz
    total memory = 1023 MB
    avail memory = 869 MB
    using 8192 buffers containing 128 MB of memory
    BIOS32 rev. 0 found at 0xfb5b0
    mainbus0 (root)
    pci0 at mainbus0 bus 0: configuration mode 1
    pci0: i/o space, memory space enabled
    pchb0 at pci0 dev 0 function 0
    pchb0: Advanced Micro Devices product 0x700e (rev. 0x13)
    ppb0 at pci0 dev 1 function 0: Advanced Micro Devices product 0x700f (rev. 0x00)
    pci1 at ppb0 bus 1
    pci1: i/o space, memory space enabled
    pcib0 at pci0 dev 7 function 0
    pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
    [...]
    siop0 at pci0 dev 15 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
    siop0: using on-board RAM
    siop0: interrupting at irq 11
    scsibus0 at siop0: 16 targets, 8 luns per target
    fxp0 at pci0 dev 17 function 0: i82550 Ethernet, rev 12
    fxp0: interrupting at irq 10
    fxp0: Ethernet address 00:02:b3:9c:3e:ad
    inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
    inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

serial console errors for siop/fxp when dying:

    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0: WARNING: SCB timed out!
    fxp0 at line 2098: dmasync timeout
    fxp0: WARNING: SCB timed out!
    fxp0 at line 1617: dmasync timeout
    sd0(siop0:1:0): command timeout
    sd0(siop0:1:0): command timeout
    sd0(siop0:1:0): command timeout
    sd0(siop0:1:0): command timeout

ahc/fxp machine:

    cpu0: AMD K7 (Athlon) (686-class), 1602.14 MHz
    total memory = 1023 MB
    avail memory = 869 MB
    using 8192 buffers containing 128 MB of memory
    BIOS32 rev. 0 found at 0xfb5b0
    mainbus0 (root)
    pci0 at mainbus0 bus 0: configuration mode 1
    pci0: i/o space, memory space enabled
    pchb0 at pci0 dev 0 function 0
    pchb0: Advanced Micro Devices product 0x700e (rev. 0x13)
    ppb0 at pci0 dev 1 function 0: Advanced Micro Devices product 0x700f (rev. 0x00)
    pci1 at ppb0 bus 1
    pci1: i/o space, memory space enabled
    pcib0 at pci0 dev 7 function 0
    pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
    [...]
    ahc0 at pci0 dev 15 function 0
    ahc0: interrupting at irq 11
    ahc0: aic7892 Wide Channel A, SCSI Id=7, 16/255 SCBs
    scsibus0 at ahc0 channel 0: 16 targets, 8 luns per target
    fxp0 at pci0 dev 17 function 0: i82550 Ethernet, rev 12
    fxp0: interrupting at irq 10
    fxp0: Ethernet address 00:02:b3:9c:8c:34
    inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
    inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

serial console errors for ahc/fxp when dying:

    sd1(ahc0:1:0): SCB 17 - timed out while idle, SEQADDR == 0x155
    SCSIRATE == 0x0
    sd1(ahc0:1:0): SCB 17: Immediate reset.  Flags = 0x4040
    sd1(ahc0:1:0): no longer in timeout, status = 0
    ahc0: Issued Channel A Bus Reset. 8 SCBs aborted
    ahc0: target 0 using 16bit transfers
    ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
    ahc0: target 0 using 16bit transfers
    ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
    ahc0: target 1 using 16bit transfers
    ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
    ahc0: target 1 using 16bit transfers
    ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
    ahc0: Data Parity Error Detected during address or write data phase
    sd0(ahc0:0:0): SCB 19 - timed out in Data-out phase, SEQADDR == 0x5d
    SCSIRATE == 0x93
    sd0(ahc0:0:0): BDR message in message buffer
    sd0(ahc0:0:0): no longer in timeout, status = 0
    sd0(ahc0:0:0): Unexpected busfree in Message-out phase
    SEQADDR == 0x165
    sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
    [...repeated many times...]
    sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
    sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
    ahc0:A:0: unknown scsi bus phase e6.  Attempting to continue
    ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
    SAVED_TCL == 0x0, ARG_1 == 0x19, SEQ_FLAGS == 0x0
    [...last two lines repeated many times...]
    ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
    SCSIRATE == 0x0
    sd0(ahc0:0:0): SCB 19: Immediate reset.  Flags = 0x4050
    sd0(ahc0:0:0): no longer in timeout, status = 2
    ahc0: Issued Channel A Bus Reset. 9 SCBs aborted
    ahc0: target 0 using 16bit transfers
    ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): queue full
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    ahc0: target 1 using 16bit transfers
    ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
    sd0(ahc0:0:0): queue full
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): queue full
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): queue full
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): SCB 18 - timed out in Command phase, SEQADDR == 0x165
    SCSIRATE == 0x93
    sd0(ahc0:0:0): BDR message in message buffer
    sd0(ahc0:0:0): SCB 19 - timed out in Command phase, SEQADDR == 0x165
    SCSIRATE == 0x93
    sd0(ahc0:0:0): no longer in timeout, status = 0
    ahc0: Issued Channel A Bus Reset. 12 SCBs aborted
    ahc0: target 0 using 16bit transfers
    ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
    ahc0: target 1 using 16bit transfers
    ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    sd1(ahc0:1:0): queue full
    ahc0: Interrupted for status of 0???
    sd1(ahc0:1:0): queue full
    sd0(ahc0:0:0): queue full
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    sd1(ahc0:1:0): queue full
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    sd1(ahc0:1:0): queue full
    ahc0: Interrupted for status of 0???
    sd1(ahc0:1:0): queue full
    sd0(ahc0:0:0): queue full
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    sd1(ahc0:1:0): queue full
    ahc0: Interrupted for status of 0???
    sd1(ahc0:1:0): queue full
    sd0(ahc0:0:0): queue full
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    ahc0: Interrupted for status of 0???
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    ahc0: WARNING no command for scb 22 (cmdcmplt)
    QOUTPOS = 40
    sd0(ahc0:0:0): queue full
    ahc0: Interrupted for status of 0???
    panic: biodone already
    Begin traceback...
    biodone(c9e56d5c,0,0,8,c194c480) at biodone+0x2d
    scsipi_done(c195c40c,c1456000,c195c40c) at scsipi_done+0x146
    ahc_done(c1456000,c193f460) at ahc_done+0x2e7
    ahc_search_qinfifo(c1456000,0,0,0,ff) at ahc_search_qinfifo+0xfe
    ahc_freeze_devq(c1456000,c194c480,71,0,c1456000) at ahc_freeze_devq+0x2a
    ahc_handle_seqint(c1456000,71) at ahc_handle_seqint+0x514
    ahc_intr(c1456000) at ahc_intr+0x118
    Xintr11() at Xintr11+0x78
    --- interrupt ---
    0x805b730:
    End traceback...
    syncing disks... sd0(ahc0:0:0): SCB 18 - timed out in Message-in phase, SEQADDR == 0xdd
    SCSIRATE == 0x93
    sd0(ahc0:0:0): BDR message in message buffer
    ahc0:A:0: unknown scsi bus phase b6.  Attempting to continue
    sd0(ahc0:0:0): SCB 19 - timed out while idle, SEQADDR == 0x38
    SCSIRATE == 0x0
    sd0(ahc0:0:0): no longer in timeout, status = 0
    ahc0: Issued Channel A Bus Reset. 6 SCBs aborted
    sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd0(ahc0:0:0): BDR message in message buffer
    sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd0(ahc0:0:0): no longer in timeout, status = 0
    ahc0: Issued Channel A Bus Reset. 10 SCBs aborted
    sd1(ahc0:1:0): SCB 17 - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd1(ahc0:1:0): Other SCB Timeout
    sd0(ahc0:0:0): SCB 1b - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd0(ahc0:0:0): BDR message in message buffer
    sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd0(ahc0:0:0): no longer in timeout, status = 0
    ahc0: Issued Channel A Bus Reset. 11 SCBs aborted
    sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd0(ahc0:0:0): BDR message in message buffer
    sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd0(ahc0:0:0): no longer in timeout, status = 0
    ahc0: Issued Channel A Bus Reset. 8 SCBs aborted
    sd1(ahc0:1:0): SCB 17 - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd1(ahc0:1:0): Other SCB Timeout
    sd0(ahc0:0:0): SCB 1b - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd0(ahc0:0:0): BDR message in message buffer
    sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
    SCSIRATE == 0x0
    sd0: dk_busy < 0
    panic: disk_unbusy
    Begin traceback...
    disk_unbusy(c1950a2c,0,5,c9efbc3c,ed036ac8) at disk_unbusy+0x31
    sddone(c195c128) at sddone+0x4f
    scsipi_done(c195c128,c1456000,c195c128) at scsipi_done+0xfd
    ahc_done(c1456000,c193f410) at ahc_done+0x2e7
    ahc_search_qinfifo(c1456000,ffffffff,41,ffffffff,ff) at ahc_search_qinfifo+0xfe
    ahc_abort_scbs(c1456000,ffffffff,41,ffffffff,ff) at ahc_abort_scbs+0x60
    ahc_reset_channel(c1456000,41,1,7fffffff,c193f3e8) at ahc_reset_channel+0x2c5
    ahc_timeout(c193f3e8) at ahc_timeout+0x284
    softclock(c1955d4c,0,ffffffff,c0198328,ecd49b88) at softclock+0x121
    Xsoftclock() at Xsoftclock+0xf
    --- interrupt ---
    param.c(ef06a45c,ecd49b88,10,40e,308c9d) at  0x3144b80
    End traceback...

    dumping to dev 4,1 offset 23855
    dump device bad


-- 
	    David/absolute		abs@formula1.com