Subject: 1.5.3 fxp scsi fun
To: None <port-i386@netbsd.org>
From: David Brownlee <abs@formula1.com>
List: port-i386
Date: 07/13/2002 21:51:40
We have two 1.5.3 Athlon webservers which seem to be having some
SCSI/Ethernet issues: occasionally the machines just die.
One has an siop and the other an ahc; both have an fxp. I believe both
boards use VIA chipsets. Does this ring a bell for anyone?
siop/fxp machine:
cpu0: AMD K7 (Athlon) (686-class), 1602.14 MHz
total memory = 1023 MB
avail memory = 869 MB
using 8192 buffers containing 128 MB of memory
BIOS32 rev. 0 found at 0xfb5b0
mainbus0 (root)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled
pchb0 at pci0 dev 0 function 0
pchb0: Advanced Micro Devices product 0x700e (rev. 0x13)
ppb0 at pci0 dev 1 function 0: Advanced Micro Devices product 0x700f (rev. 0x00)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
pcib0 at pci0 dev 7 function 0
pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
[...]
siop0 at pci0 dev 15 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
siop0: using on-board RAM
siop0: interrupting at irq 11
scsibus0 at siop0: 16 targets, 8 luns per target
fxp0 at pci0 dev 17 function 0: i82550 Ethernet, rev 12
fxp0: interrupting at irq 10
fxp0: Ethernet address 00:02:b3:9c:3e:ad
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
serial console errors for siop/fxp when dying:
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0: WARNING: SCB timed out!
fxp0 at line 2098: dmasync timeout
fxp0: WARNING: SCB timed out!
fxp0 at line 1617: dmasync timeout
sd0(siop0:1:0): command timeout
sd0(siop0:1:0): command timeout
sd0(siop0:1:0): command timeout
sd0(siop0:1:0): command timeout
ahc/fxp machine:
cpu0: AMD K7 (Athlon) (686-class), 1602.14 MHz
total memory = 1023 MB
avail memory = 869 MB
using 8192 buffers containing 128 MB of memory
BIOS32 rev. 0 found at 0xfb5b0
mainbus0 (root)
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled
pchb0 at pci0 dev 0 function 0
pchb0: Advanced Micro Devices product 0x700e (rev. 0x13)
ppb0 at pci0 dev 1 function 0: Advanced Micro Devices product 0x700f (rev. 0x00)
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled
pcib0 at pci0 dev 7 function 0
pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
[...]
ahc0 at pci0 dev 15 function 0
ahc0: interrupting at irq 11
ahc0: aic7892 Wide Channel A, SCSI Id=7, 16/255 SCBs
scsibus0 at ahc0 channel 0: 16 targets, 8 luns per target
fxp0 at pci0 dev 17 function 0: i82550 Ethernet, rev 12
fxp0: interrupting at irq 10
fxp0: Ethernet address 00:02:b3:9c:8c:34
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
serial console errors for ahc/fxp when dying:
sd1(ahc0:1:0): SCB 17 - timed out while idle, SEQADDR == 0x155
SCSIRATE == 0x0
sd1(ahc0:1:0): SCB 17: Immediate reset. Flags = 0x4040
sd1(ahc0:1:0): no longer in timeout, status = 0
ahc0: Issued Channel A Bus Reset. 8 SCBs aborted
ahc0: target 0 using 16bit transfers
ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
ahc0: target 0 using 16bit transfers
ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
ahc0: target 1 using 16bit transfers
ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
ahc0: target 1 using 16bit transfers
ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
ahc0: Data Parity Error Detected during address or write data phase
sd0(ahc0:0:0): SCB 19 - timed out in Data-out phase, SEQADDR == 0x5d
SCSIRATE == 0x93
sd0(ahc0:0:0): BDR message in message buffer
sd0(ahc0:0:0): no longer in timeout, status = 0
sd0(ahc0:0:0): Unexpected busfree in Message-out phase
SEQADDR == 0x165
sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
[...repeated many times...]
sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
ahc0:A:0: unknown scsi bus phase e6. Attempting to continue
ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
SAVED_TCL == 0x0, ARG_1 == 0x19, SEQ_FLAGS == 0x0
[...last two lines repeated many times...]
ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
SCSIRATE == 0x0
sd0(ahc0:0:0): SCB 19: Immediate reset. Flags = 0x4050
sd0(ahc0:0:0): no longer in timeout, status = 2
ahc0: Issued Channel A Bus Reset. 9 SCBs aborted
ahc0: target 0 using 16bit transfers
ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): queue full
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
ahc0: target 1 using 16bit transfers
ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
sd0(ahc0:0:0): queue full
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): queue full
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): queue full
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): SCB 18 - timed out in Command phase, SEQADDR == 0x165
SCSIRATE == 0x93
sd0(ahc0:0:0): BDR message in message buffer
sd0(ahc0:0:0): SCB 19 - timed out in Command phase, SEQADDR == 0x165
SCSIRATE == 0x93
sd0(ahc0:0:0): no longer in timeout, status = 0
ahc0: Issued Channel A Bus Reset. 12 SCBs aborted
ahc0: target 0 using 16bit transfers
ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
ahc0: target 1 using 16bit transfers
ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
sd1(ahc0:1:0): queue full
ahc0: Interrupted for status of 0???
sd1(ahc0:1:0): queue full
sd0(ahc0:0:0): queue full
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
sd1(ahc0:1:0): queue full
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
sd1(ahc0:1:0): queue full
ahc0: Interrupted for status of 0???
sd1(ahc0:1:0): queue full
sd0(ahc0:0:0): queue full
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
sd1(ahc0:1:0): queue full
ahc0: Interrupted for status of 0???
sd1(ahc0:1:0): queue full
sd0(ahc0:0:0): queue full
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
ahc0: Interrupted for status of 0???
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
ahc0: WARNING no command for scb 22 (cmdcmplt)
QOUTPOS = 40
sd0(ahc0:0:0): queue full
ahc0: Interrupted for status of 0???
panic: biodone already
Begin traceback...
biodone(c9e56d5c,0,0,8,c194c480) at biodone+0x2d
scsipi_done(c195c40c,c1456000,c195c40c) at scsipi_done+0x146
ahc_done(c1456000,c193f460) at ahc_done+0x2e7
ahc_search_qinfifo(c1456000,0,0,0,ff) at ahc_search_qinfifo+0xfe
ahc_freeze_devq(c1456000,c194c480,71,0,c1456000) at ahc_freeze_devq+0x2a
ahc_handle_seqint(c1456000,71) at ahc_handle_seqint+0x514
ahc_intr(c1456000) at ahc_intr+0x118
Xintr11() at Xintr11+0x78
--- interrupt ---
0x805b730:
End traceback...
syncing disks... sd0(ahc0:0:0): SCB 18 - timed out in Message-in phase, SEQADDR == 0xdd
SCSIRATE == 0x93
sd0(ahc0:0:0): BDR message in message buffer
ahc0:A:0: unknown scsi bus phase b6. Attempting to continue
sd0(ahc0:0:0): SCB 19 - timed out while idle, SEQADDR == 0x38
SCSIRATE == 0x0
sd0(ahc0:0:0): no longer in timeout, status = 0
ahc0: Issued Channel A Bus Reset. 6 SCBs aborted
sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd0(ahc0:0:0): BDR message in message buffer
sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd0(ahc0:0:0): no longer in timeout, status = 0
ahc0: Issued Channel A Bus Reset. 10 SCBs aborted
sd1(ahc0:1:0): SCB 17 - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd1(ahc0:1:0): Other SCB Timeout
sd0(ahc0:0:0): SCB 1b - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd0(ahc0:0:0): BDR message in message buffer
sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd0(ahc0:0:0): no longer in timeout, status = 0
ahc0: Issued Channel A Bus Reset. 11 SCBs aborted
sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd0(ahc0:0:0): BDR message in message buffer
sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd0(ahc0:0:0): no longer in timeout, status = 0
ahc0: Issued Channel A Bus Reset. 8 SCBs aborted
sd1(ahc0:1:0): SCB 17 - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd1(ahc0:1:0): Other SCB Timeout
sd0(ahc0:0:0): SCB 1b - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd0(ahc0:0:0): BDR message in message buffer
sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
SCSIRATE == 0x0
sd0: dk_busy < 0
panic: disk_unbusy
Begin traceback...
disk_unbusy(c1950a2c,0,5,c9efbc3c,ed036ac8) at disk_unbusy+0x31
sddone(c195c128) at sddone+0x4f
scsipi_done(c195c128,c1456000,c195c128) at scsipi_done+0xfd
ahc_done(c1456000,c193f410) at ahc_done+0x2e7
ahc_search_qinfifo(c1456000,ffffffff,41,ffffffff,ff) at ahc_search_qinfifo+0xfe
ahc_abort_scbs(c1456000,ffffffff,41,ffffffff,ff) at ahc_abort_scbs+0x60
ahc_reset_channel(c1456000,41,1,7fffffff,c193f3e8) at ahc_reset_channel+0x2c5
ahc_timeout(c193f3e8) at ahc_timeout+0x284
softclock(c1955d4c,0,ffffffff,c0198328,ecd49b88) at softclock+0x121
Xsoftclock() at Xsoftclock+0xf
--- interrupt ---
param.c(ef06a45c,ecd49b88,10,40e,308c9d) at 0x3144b80
End traceback...
dumping to dev 4,1 offset 23855
dump device bad
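When comparing captures like the two above, it can help to tally how many console lines each device produced, to see which driver is making the most noise. The following is only a rough sketch in Python, not something that was run on these machines. The function name count_by_device and the regular expression are illustrative assumptions, and the sample lines are copied from the siop/fxp capture plus one continuation line from the ahc capture.

```python
import re
from collections import Counter

# Assumption: a device name is lowercase letters followed by a unit number
# (fxp0, ahc0, sd1) at the start of the line, then ':', '(' or whitespace.
DEVICE_RE = re.compile(r"^([a-z]+\d+)[(:\s]")

def count_by_device(lines):
    """Return a Counter of console lines keyed by the leading device name."""
    counts = Counter()
    for line in lines:
        match = DEVICE_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

# Sample lines taken from the captures in this message.
sample = [
    "fxp0: WARNING: SCB timed out!",
    "fxp0: WARNING: SCB timed out!",
    "fxp0 at line 2098: dmasync timeout",
    "sd0(siop0:1:0): command timeout",
    "SCSIRATE == 0x0",  # continuation line, no device prefix, so not counted
]

summary = count_by_device(sample)
for device, n in summary.most_common():
    print(device, n)
```

Lines without a device prefix (continuation lines such as SCSIRATE, panic messages, traceback frames) are skipped, so the counts reflect only the per-device messages.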
--
David/absolute abs@formula1.com