Subject: Re: 1.5.3 fxp scsi fun
To: David Brownlee <abs@formula1.com>
From: Wojciech Puchar <wojtek@chylonia.3miasto.net>
List: port-i386
Date: 07/13/2002 23:26:27
>
>       We have two 1.5.3 athlon webservers which seem to be having some
>       scsi/ethernet issues. Occasionally the machines just die.
>       One has an siop, and the other an ahc. I believe they are both VIA
>       chipsets. Does this ring a bell for anyone?

I'm not sure about the newer VIA designs, but when I had a VIA-based K6/300
system, turning off "PCI write buffer" solved the problem, with no visible
performance degradation.

It happened under both Linux and NetBSD. I had a Symbios Logic SCSI
controller and S3 Trio graphics. I saw the same effect with an Adaptec SCSI
controller, and with other graphics cards too.
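
If you want to check whether the fxp and SCSI errors show up together (which
would point at the shared PCI/chipset side rather than at one driver), a
rough Python sketch for tallying messages per device from a saved serial
console log could look like the one below. The helper name tally_devices and
the regular expression are my own assumptions about the line format, not
anything provided by NetBSD.

```python
import re
from collections import Counter

# Assumed line shape: optional "> " quoting and indentation, then a device
# name such as "fxp0" or "sd0" followed by ":", "(" or " at".
DEVICE_RE = re.compile(r"^[>\s]*([a-z]+\d+)(?=[:(]|\s+at\b)")


def tally_devices(lines):
    """Count console messages per device name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = DEVICE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines in the style of the console output quoted in this message.
sample = [
    ">     fxp0: WARNING: SCB timed out!",
    ">     fxp0 at line 2098: dmasync timeout",
    ">     sd0(siop0:1:0): command timeout",
    ">     panic: biodone already",
]

for device, count in sorted(tally_devices(sample).items()):
    print(device, count)
```

Lines that do not start with a device name (panic messages, traceback
frames, register dumps) are simply skipped, so the counts only cover
driver-reported messages.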

>
> siop/fxp machine:
>
>     cpu0: AMD K7 (Athlon) (686-class), 1602.14 MHz
>     total memory = 1023 MB
>     avail memory = 869 MB
>     using 8192 buffers containing 128 MB of memory
>     BIOS32 rev. 0 found at 0xfb5b0
>     mainbus0 (root)
>     pci0 at mainbus0 bus 0: configuration mode 1
>     pci0: i/o space, memory space enabled
>     pchb0 at pci0 dev 0 function 0
>     pchb0: Advanced Micro Devices product 0x700e (rev. 0x13)
>     ppb0 at pci0 dev 1 function 0: Advanced Micro Devices product 0x700f (rev. 0x00)
>     pci1 at ppb0 bus 1
>     pci1: i/o space, memory space enabled
>     pcib0 at pci0 dev 7 function 0
>     pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
>     [...]
>     siop0 at pci0 dev 15 function 0: Symbios Logic 53c895 (ultra2-wide scsi)
>     siop0: using on-board RAM
>     siop0: interrupting at irq 11
>     scsibus0 at siop0: 16 targets, 8 luns per target
>     fxp0 at pci0 dev 17 function 0: i82550 Ethernet, rev 12
>     fxp0: interrupting at irq 10
>     fxp0: Ethernet address 00:02:b3:9c:3e:ad
>     inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
>     inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>
> serial console errors for siop/fxp when dying:
>
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0: WARNING: SCB timed out!
>     fxp0 at line 2098: dmasync timeout
>     fxp0: WARNING: SCB timed out!
>     fxp0 at line 1617: dmasync timeout
>     sd0(siop0:1:0): command timeout
>     sd0(siop0:1:0): command timeout
>     sd0(siop0:1:0): command timeout
>     sd0(siop0:1:0): command timeout
>
> ahc/fxp machine:
>     cpu0: AMD K7 (Athlon) (686-class), 1602.14 MHz
>     total memory = 1023 MB
>     avail memory = 869 MB
>     using 8192 buffers containing 128 MB of memory
>     BIOS32 rev. 0 found at 0xfb5b0
>     mainbus0 (root)
>     pci0 at mainbus0 bus 0: configuration mode 1
>     pci0: i/o space, memory space enabled
>     pchb0 at pci0 dev 0 function 0
>     pchb0: Advanced Micro Devices product 0x700e (rev. 0x13)
>     ppb0 at pci0 dev 1 function 0: Advanced Micro Devices product 0x700f (rev. 0x00)
>     pci1 at ppb0 bus 1
>     pci1: i/o space, memory space enabled
>     pcib0 at pci0 dev 7 function 0
>     pcib0: VIA Technologies VT82C686A (Apollo KX133) PCI-ISA Bridge (rev. 0x40)
>     [...]
>     ahc0 at pci0 dev 15 function 0
>     ahc0: interrupting at irq 11
>     ahc0: aic7892 Wide Channel A, SCSI Id=7, 16/255 SCBs
>     scsibus0 at ahc0 channel 0: 16 targets, 8 luns per target
>     fxp0 at pci0 dev 17 function 0: i82550 Ethernet, rev 12
>     fxp0: interrupting at irq 10
>     fxp0: Ethernet address 00:02:b3:9c:8c:34
>     inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
>     inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
>
> serial console errors for ahc/fxp when dying:
>
>     sd1(ahc0:1:0): SCB 17 - timed out while idle, SEQADDR == 0x155
>     SCSIRATE == 0x0
>     sd1(ahc0:1:0): SCB 17: Immediate reset.  Flags = 0x4040
>     sd1(ahc0:1:0): no longer in timeout, status = 0
>     ahc0: Issued Channel A Bus Reset. 8 SCBs aborted
>     ahc0: target 0 using 16bit transfers
>     ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
>     ahc0: target 0 using 16bit transfers
>     ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
>     ahc0: target 1 using 16bit transfers
>     ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
>     ahc0: target 1 using 16bit transfers
>     ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
>     ahc0: Data Parity Error Detected during address or write data phase
>     sd0(ahc0:0:0): SCB 19 - timed out in Data-out phase, SEQADDR == 0x5d
>     SCSIRATE == 0x93
>     sd0(ahc0:0:0): BDR message in message buffer
>     sd0(ahc0:0:0): no longer in timeout, status = 0
>     sd0(ahc0:0:0): Unexpected busfree in Message-out phase
>     SEQADDR == 0x165
>     sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
>     [...repeated many times...]
>     sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
>     sd0(ahc0:0:0): parity error detected in Data-in phase. SEQADDR(0x166) SCSIRATE(0x93)
>     ahc0:A:0: unknown scsi bus phase e6.  Attempting to continue
>     ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
>     SAVED_TCL == 0x0, ARG_1 == 0x19, SEQ_FLAGS == 0x0
>     [...last two lines repeated many times...]
>     ahc0:A:0: no active SCB for reconnecting target - issuing BUS DEVICE RESET
>     SCSIRATE == 0x0
>     sd0(ahc0:0:0): SCB 19: Immediate reset.  Flags = 0x4050
>     sd0(ahc0:0:0): no longer in timeout, status = 2
>     ahc0: Issued Channel A Bus Reset. 9 SCBs aborted
>     ahc0: target 0 using 16bit transfers
>     ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): queue full
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     ahc0: target 1 using 16bit transfers
>     ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
>     sd0(ahc0:0:0): queue full
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): queue full
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): queue full
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): SCB 18 - timed out in Command phase, SEQADDR == 0x165
>     SCSIRATE == 0x93
>     sd0(ahc0:0:0): BDR message in message buffer
>     sd0(ahc0:0:0): SCB 19 - timed out in Command phase, SEQADDR == 0x165
>     SCSIRATE == 0x93
>     sd0(ahc0:0:0): no longer in timeout, status = 0
>     ahc0: Issued Channel A Bus Reset. 12 SCBs aborted
>     ahc0: target 0 using 16bit transfers
>     ahc0: target 0 synchronous at 40.0MHz, offset = 0x7f
>     ahc0: target 1 using 16bit transfers
>     ahc0: target 1 synchronous at 40.0MHz, offset = 0x7f
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd1(ahc0:1:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd1(ahc0:1:0): queue full
>     sd0(ahc0:0:0): queue full
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     sd1(ahc0:1:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     sd1(ahc0:1:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd1(ahc0:1:0): queue full
>     sd0(ahc0:0:0): queue full
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     sd1(ahc0:1:0): queue full
>     ahc0: Interrupted for status of 0???
>     sd1(ahc0:1:0): queue full
>     sd0(ahc0:0:0): queue full
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     ahc0: Interrupted for status of 0???
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     ahc0: WARNING no command for scb 22 (cmdcmplt)
>     QOUTPOS = 40
>     sd0(ahc0:0:0): queue full
>     ahc0: Interrupted for status of 0???
>     panic: biodone already
>     Begin traceback...
>     biodone(c9e56d5c,0,0,8,c194c480) at biodone+0x2d
>     scsipi_done(c195c40c,c1456000,c195c40c) at scsipi_done+0x146
>     ahc_done(c1456000,c193f460) at ahc_done+0x2e7
>     ahc_search_qinfifo(c1456000,0,0,0,ff) at ahc_search_qinfifo+0xfe
>     ahc_freeze_devq(c1456000,c194c480,71,0,c1456000) at ahc_freeze_devq+0x2a
>     ahc_handle_seqint(c1456000,71) at ahc_handle_seqint+0x514
>     ahc_intr(c1456000) at ahc_intr+0x118
>     Xintr11() at Xintr11+0x78
>     --- interrupt ---
>     0x805b730:
>     End traceback...
>     syncing disks... sd0(ahc0:0:0): SCB 18 - timed out in Message-in phase, SEQADDR == 0xdd
>     SCSIRATE == 0x93
>     sd0(ahc0:0:0): BDR message in message buffer
>     ahc0:A:0: unknown scsi bus phase b6.  Attempting to continue
>     sd0(ahc0:0:0): SCB 19 - timed out while idle, SEQADDR == 0x38
>     SCSIRATE == 0x0
>     sd0(ahc0:0:0): no longer in timeout, status = 0
>     ahc0: Issued Channel A Bus Reset. 6 SCBs aborted
>     sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd0(ahc0:0:0): BDR message in message buffer
>     sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd0(ahc0:0:0): no longer in timeout, status = 0
>     ahc0: Issued Channel A Bus Reset. 10 SCBs aborted
>     sd1(ahc0:1:0): SCB 17 - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd1(ahc0:1:0): Other SCB Timeout
>     sd0(ahc0:0:0): SCB 1b - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd0(ahc0:0:0): BDR message in message buffer
>     sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd0(ahc0:0:0): no longer in timeout, status = 0
>     ahc0: Issued Channel A Bus Reset. 11 SCBs aborted
>     sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd0(ahc0:0:0): BDR message in message buffer
>     sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd0(ahc0:0:0): no longer in timeout, status = 0
>     ahc0: Issued Channel A Bus Reset. 8 SCBs aborted
>     sd1(ahc0:1:0): SCB 17 - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd1(ahc0:1:0): Other SCB Timeout
>     sd0(ahc0:0:0): SCB 1b - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd0(ahc0:0:0): BDR message in message buffer
>     sd0(ahc0:0:0): SCB 19 - timed out in Message-out phase, SEQADDR == 0x165
>     SCSIRATE == 0x0
>     sd0: dk_busy < 0
>     panic: disk_unbusy
>     Begin traceback...
>     disk_unbusy(c1950a2c,0,5,c9efbc3c,ed036ac8) at disk_unbusy+0x31
>     sddone(c195c128) at sddone+0x4f
>     scsipi_done(c195c128,c1456000,c195c128) at scsipi_done+0xfd
>     ahc_done(c1456000,c193f410) at ahc_done+0x2e7
>     ahc_search_qinfifo(c1456000,ffffffff,41,ffffffff,ff) at ahc_search_qinfifo+0xfe
>     ahc_abort_scbs(c1456000,ffffffff,41,ffffffff,ff) at ahc_abort_scbs+0x60
>     ahc_reset_channel(c1456000,41,1,7fffffff,c193f3e8) at ahc_reset_channel+0x2c5
>     ahc_timeout(c193f3e8) at ahc_timeout+0x284
>     softclock(c1955d4c,0,ffffffff,c0198328,ecd49b88) at softclock+0x121
>     Xsoftclock() at Xsoftclock+0xf
>     --- interrupt ---
>     param.c(ef06a45c,ecd49b88,10,40e,308c9d) at  0x3144b80
>     End traceback...
>
>     dumping to dev 4,1 offset 23855
>     dump device bad
>
>
> --
>           David/absolute          abs@formula1.com
>

--------------------------------------------------------------------
A characteristic of software development is exponential growth in
hardware requirements, quadratic growth in the number of bugs, and linear
growth in the number of gimmicks, with less than linear growth in
functionality.