Subject: Re: status of ncr 53c810 scsi adapter?
To: Mark H. Levine <yba@polytronics.com>
From: Cliff Romash <romash@BBN.COM>
List: port-alpha
Date: 02/19/1998 17:21:24
Mark's description below matches a problem I have recently seen here at GTE.
We have 8 systems with EB164 motherboards and Intraserver ITI-3140U
controllers with NCR 53c875 chips. All but one are using Seagate ST34501W
drives. Until two weeks ago, I had never seen a problem. But two weeks
ago, we received two new systems, and within 3 hours we were seeing a
boot-time problem initializing the disks. I have since seen the same problem
on one of our old systems, but can offer no clues as to what might be
happening.


After several power off/on cycles, we have always been able to get the
machine to boot. 


Any clues as to how to make this problem go away would be appreciated.


I'm including the console log from one of the failed boots at the end of
this message.


Cliff Romash





At 01:09 PM 2/19/98 -0500, Mark H. Levine wrote:
>
>   Charles Lepple writes:
>   > What is the current status of the code for the 53c810? I keep seeing
>   > references to how it is buggy, and that it occasionally doesn't work.
>   >
>   > Essentially, I am trying to get an estimate of the percentage of uptime
>   > I would have with a UDB with said SCSI adapter :-)
>
>   The driver's reliability is bimodal. On a machine/disk combo on which
>   it works, it is nearly 100% reliable. On a machine/disk combo on which
>   it does not work, it will fail to work for more than a few seconds to
>   a few minutes. You will not operate the thing for several weeks only to
>   have a failure -- you'll know very fast if you have trouble.
>
>   Perry
>
>Hmm, that has not precisely been my experience with the NCR driver, although
>it did seem to mostly work on the UDB box.  On newer boxes, it exhibited
>the behavior of working until asked to do something like a large transfer,
>say tar/untar of the sources or toolchain sources, and then it would fail,
>in a way that indicates it has little or no error recovery code.  Typically
>the driver would log that it had received a scsi error, then start timing
>out and logging the timeouts, then hang forever instead of trying to
>restart the controller and continue, requiring a reboot.  We did not see
>that behavior at all on the 20164/66 boxes, but did see it on 21164 eb164s
>and pc164s.
>
>There was some loose talk here about giving this driver some priority because
>of its support in SRM and being built-in to eb164 systems... is there actually
>anyone doing development?  If not, is there anyone who can give pointers to
>documentation on how to correct NCR scripts?  I've recently heard of a new
>failure mode with Intraserver controllers and the pc164 that keeps the machine
>from bootstrapping once the kernel device driver has control, with latest
>versions of pc164s and Seagate drives, so I have some interest in looking at
>same.


8192 byte page size, 1 processor.
real mem = 536870912 (2490368 reserved for PROM, 534380544 used by NetBSD)
avail mem = 463233024
using 6523 buffers containing 53436416 bytes of memory
mainbus0 (root)
cpu0 at mainbus0: ID 0 (primary), 21164A (pass 2)
cia0 at mainbus0
pci0 at cia0 bus 0
ncr0 at pci0 dev 5 function 0: NCR 53c875 Wide SCSI
ncr0: interrupting at eb164 irq 2
        Delay (GEN=11): 236 msec
        Delay (GEN=11): 194 msec
        Delay (GEN=11): 194 msec
        NCR clock is 46871KHz, 46871KHz
        initial value of SCNTL3 = 05, final = 35
ncr0: restart (scsi reset).
scsibus0 at ncr0: 16 targets
sd0 at scsibus0 targ 0 lun 0: <SEAGATE, ST34501W, 0018> SCSI2 0/direct fixed
sd0: sd0(ncr0:0:0): WIDE SCSI (16 bit) enabled
sd0(ncr0:0:0): 20.0 MB/s (100 ns, offset 15)
ncr0: aborting job ...
ncr0:0: ERROR (10:0) (1-21-0) (f/3d) @ (d8c:1900001c).
        script cmd = 89030000
        reg:     da 10 80 3d 47 0f 00 0f 03 01 80 21 00 01 01 09.
ncr0: have to clear fifos.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
        reg:     da 00 00 35 47 00 00 0f 71 00 00 21 80 01 00 0a.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
        reg:     da 00 00 35 47 00 00 0f 71 00 00 21 80 01 00 0a.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
        reg:     da 00 00 35 47 00 00 0f 71 00 00 21 80 01 00 0a.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
        reg:     da 00 00 35 47 00 00 0f 71 00 00 21 80 01 00 0a.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
        reg:     da 00 00 35 47 00 00 0f 71 00 00 21 80 01 00 0a.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
        reg:     da 00 00 35 47 00 00 0f 71 00 00 21 80 01 00 0a.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
        reg:     da 00 00 35 47 00 00 0f 71 00 00 21 80 01 00 0a.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
        reg:     da 00 00 35 47 00 00 0f 71 00 00 21 80 01 00 0a.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
        reg:     da 00 00 35 47 00 00 0f 71 00 00 21 80 01 00 0a.
ncr0: restart (fatal error).
sd0(ncr0:0:0): COMMAND FAILED (9 ff) @0xfffffe004a6d3400.
sd0: could not mode sense (4/5); using fictitious geometry
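
For anyone comparing failed boots across machines, it may help to boil a log like the one above down to its distinct error signatures. In this log the first ERROR line differs from the nine that follow it, and all nine of those are identical. The following is only a rough sketch in Python: the names `ERROR_RE`, `summarize_errors` and `SAMPLE` are made up for illustration, the pattern is based solely on the ERROR lines shown in this message, and it does not try to interpret what the individual fields mean.

```python
import re
from collections import Counter

# Matches lines shaped like the ones in the console log, for example:
#   ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
# Everything between "ERROR " and the final "." is kept as an opaque signature.
ERROR_RE = re.compile(r"^ncr\d+:\d+: ERROR (.+)\.$")


def summarize_errors(log_text):
    """Count how often each distinct ERROR signature appears in a log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# A short excerpt copied from the console log, used only as sample input.
SAMPLE = """\
ncr0: aborting job ...
ncr0:0: ERROR (10:0) (1-21-0) (f/3d) @ (d8c:1900001c).
        script cmd = 89030000
ncr0: restart (fatal error).
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
        script cmd = 878b0000
ncr0: restart (fatal error).
ncr0: aborting job ...
ncr0:0: ERROR (90:0) (0-21-27) (0/35) @ (418:430000b0).
"""

if __name__ == "__main__":
    # Print each signature with its count, in order of first appearance.
    for signature, count in summarize_errors(SAMPLE).items():
        print(count, signature)
```

Run against saved logs from several failed boots, this would show whether every machine settles into the same repeating signature or whether they differ.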