Subject: Re: SCSI bus parity error
To: None <port-pmax@netbsd.org>
From: Toru Nishimura <nisimura@itc.aist-nara.ac.jp>
List: port-pmax
Date: 06/24/2000 15:58:52
>> During a stress test of tape drive access, my MAXINE panicked twice, complaining
>> "SCSI bus parity error".  Is it a hardware problem, or a possible software
>> driver implementation flaw?  The tape drive is a TKZ15, a rebadged EXB-85xx,
>> and was repeatedly transferring 2GB+ worth of data.
>
> I got a similar timeout report on tape access when I was developing
> the pcscp driver, which also uses the MI ncr53c9x.
> (It was not a parity error, though.)
>
> The problem on pcscp was caused by the handling of transfer pad operations.
> When that situation occurs, NCRDMA_SETUP() is called
> with *dmasize == 0.

Sounds interesting.  Besides the MAXINE, my DEC3000/300 has been observed to
produce obscure messages with a 4mm DAT drive:

--
scsibus0: waiting 2 seconds for devices to settle...
probe(asc0:0:0): max sync rate 5.00MB/s
sd0 at scsibus0 target 0 lun 0: <DEC, RZ28B    (C) DEC, 0003> SCSI2 0/direct fixed
sd0: 2007 MB, 3045 cyl, 16 head, 84 sec, 512 bytes/sect x 4110480 sectors
probe(asc0:1:0): max sync rate 5.00MB/s
st0 at scsibus0 target 1 lun 0: <HP, C1533A, 9608> SCSI2 1/sequential removable
st0: density code 36, variable blocks, write-enabled
probe(asc0:4:0): max sync rate 5.00MB/s
--

When the drive is given 'mt -f /dev/rst0 rewind' with media loaded,
the kernel immediately complains with a bunch of messages like these:

--
st0(asc0:1:0): asc0: timed out [ecb 0xfffffe000002d228 (flags 0x103, dleft 0, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 7), resid 0, msg(q 0,o 0) >
st0(asc0:1:0): ncr53c9x_abort: not NEXUS
st0(asc0:1:0): st0(asc0:1:0): max sync rate 5.00MB/s
asc0: timed out [ecb 0xfffffe000002d228 (flags 0x43, dleft 0, stat 0)], <state 4, nexus 0xfffffe000002d228, phase(l 17, c 7, p 7), resid 0, msg(q 0,o 40) > AGAIN
st0(asc0:1:0): st0(asc0:1:0): max sync rate 5.00MB/s
asc0: timed out [ecb 0xfffffe000002d228 (flags 0x3, dleft 0, stat 0)], <state 4, nexus 0xfffffe000002d228, phase(l 17, c 7, p 7), resid 0, msg(q 0,o 40) >
sd0(asc0:0:0): max sync rate 5.00MB/s
st0(asc0:1:0): asc0: timed out [ecb 0xfffffe000002d228 (flags 0x103, dleft 0, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 7), resid 0, msg(q 0,o 0) >
st0(asc0:1:0): ncr53c9x_abort: not NEXUS
sd0(asc0:0:0): max sync rate 5.00MB/s
st0(asc0:1:0): st0(asc0:1:0): max sync rate 5.00MB/s
asc0: timed out [ecb 0xfffffe000002d228 (flags 0x43, dleft 0, stat 0)], <state 4, nexus 0xfffffe000002d228, phase(l 17, c 7, p 7), resid 0, msg(q 0,o 40) > AGAIN
st0: error 5 in st_load (op 1)
sd0(asc0:0:0): max sync rate 5.00MB/s
st0(asc0:1:0): st0(asc0:1:0): max sync rate 5.00MB/s
asc0: timed out [ecb 0xfffffe000002d228 (flags 0x3, dleft 0, stat 0)], <state 4, nexus 0xfffffe000002d228, phase(l 17, c 7, p 7), resid 0, msg(q 0,o 40) >
st0(asc0:1:0): asc0: timed out [ecb 0xfffffe000002d228 (flags 0x1, dleft 0, stat 0)], <state 2, nexus 0xfffffe000002d228, phase(l 12, c 100, p 6), resid 0, msg(q 0,o 20) >
st0(asc0:1:0): asc0: timed out [ecb 0xfffffe000002d228 (flags 0x103, dleft 0, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 7), resid 0, msg(q 0,o 0) >
st0(asc0:1:0): ncr53c9x_abort: not NEXUS
sd0(asc0:0:0): max sync rate 5.00MB/s
st0(asc0:1:0): max sync rate 5.00MB/s
st0(asc0:1:0): asc0: timed out [ecb 0xfffffe000002d228 (flags 0x43, dleft 0, stat 0)], <state 4, nexus 0xfffffe000002d228, phase(l 12, c 2, p 2), resid 0, msg(q 0,o 40) > AGAIN
st0(asc0:1:0): asc0: timed out [ecb 0xfffffe000002d2b8 (flags 0x3, dleft 0, stat 0)], <state 4, nexus 0xfffffe000002d2b8, phase(l 16, c 6, p 6), resid 0, msg(q 0,o 40) DMA active>st0(asc0:1:0): max sync rate 5.00MB/s
st0: error 5 trying to rewind
sd0(asc0:0:0): max sync rate 5.00MB/s
sd3(asc0:6:0): max sync rate 5.00MB/s
sd2(asc0:5:0): max sync rate 5.00MB/s
sd1(asc0:4:0): max sync rate 5.00MB/s
--
I'm now trying to solve the 4MAX+ xasc problem, and have yet to test the same
4mm DAT tape drive with the MAXINE.  Besides PR#8645 and PR#10031,
something is wrong in the SCSI driver around tape drives, more specifically
with "odd access" other than blocked filesystem access.
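
For anyone comparing such console logs, the timeout reports above can be
tallied by driver state and phase values with a small script.  This is only
a rough sketch: the names TIMEOUT_RE, summarize and SAMPLE are made up for
illustration, the pattern is derived solely from the message layout shown in
this mail, and it assumes each report sits on one unwrapped line.  It does
not try to interpret what the individual values mean.

```python
import re
from collections import Counter

# Pattern built from the "timed out" message layout seen in the log above.
# Assumption: one complete report per line (no terminal line wrapping).
TIMEOUT_RE = re.compile(
    r"timed out \[ecb (?P<ecb>0x[0-9a-f]+) \(flags (?P<flags>0x[0-9a-f]+), "
    r"dleft (?P<dleft>[0-9a-f]+), stat (?P<stat>[0-9a-f]+)\)\], "
    r"<state (?P<state>\d+), nexus (?P<nexus>0x[0-9a-f]+), "
    r"phase\(l (?P<l>[0-9a-f]+), c (?P<c>[0-9a-f]+), p (?P<p>[0-9a-f]+)\)"
)


def summarize(log_text):
    """Count timeout reports per (state, l, c, p) combination."""
    counts = Counter()
    for m in TIMEOUT_RE.finditer(log_text):
        counts[(int(m["state"]), m["l"], m["c"], m["p"])] += 1
    return counts


# Three reports copied from the log above, used here as sample input.
SAMPLE = """\
st0(asc0:1:0): asc0: timed out [ecb 0xfffffe000002d228 (flags 0x103, dleft 0, stat 0)], <state 1, nexus 0x0, phase(l 10, c 100, p 7), resid 0, msg(q 0,o 0) >
asc0: timed out [ecb 0xfffffe000002d228 (flags 0x43, dleft 0, stat 0)], <state 4, nexus 0xfffffe000002d228, phase(l 17, c 7, p 7), resid 0, msg(q 0,o 40) > AGAIN
asc0: timed out [ecb 0xfffffe000002d228 (flags 0x3, dleft 0, stat 0)], <state 4, nexus 0xfffffe000002d228, phase(l 17, c 7, p 7), resid 0, msg(q 0,o 40) >
"""

if __name__ == "__main__":
    for key, n in sorted(summarize(SAMPLE).items()):
        print(key, n)
```

Feeding a whole dmesg capture to summarize() in the same way should show
quickly whether the timeouts cluster in one state/phase combination.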

Tohru Nishimura