Subject: Re: NCR PCI SCSI bugs/fixes? Help!
To: Michael L. VanLoon -- Iowa State University <michaelv@iastate.edu>
From: Wayne Berke <berke@panix.com>
List: port-i386
Date: 01/24/1995 10:03:08
In message <199501212129.PAA07637@bigbrother.tele.iastate.edu>,
"Michael L. VanLoon -- Iowa State University" writes:
> 
> [... description of problem with NCR 53C810 controller ...]
>

I'm not sure if my problem is related, but if it is, it may just be a stroke
of good luck.

My problem is with a P/90 box whose NCR 53C810 card controls a 730M
Quantum Lightning disk and a double-speed SANYO CD-ROM drive.  I
installed NetBSD and MS-DOS onto it a couple of weeks ago.
I bought the InfoMagic disk to load the various source and binary
distributions.  However, I found that whenever I tried to transfer
large files from the CD, it would generate an NCR exception that effectively
froze the machine until I hit the reset button.  This happened whether I
was transferring files to disk or simply piping them through gunzip and tar tv.
If I transferred a large enough file, it would always happen.  The
console message was:

	ncr0 targ 1?: ERROR (81:50:a7) (e0/18) @ (36f900:48000000)
	ncr0 targ 1?: ERROR (80:4:a7) (e0/18) @ (36f900:48000000)

These values, with the exception of the 36f900, do not vary.  These messages
are then followed by a series of:

	ncr0: timeout ccb=f86d1800 (skip)
	ncr0: timeout ccb=f86d1600 (skip)
	ncr0: timeout ccb=f86d1800 (skip)
	ncr0: timeout ccb=f86d1600 (skip)
	...

I admit I'm no device driver guru and although I've located the part of
ncr.c that generates these messages, I don't have sufficient knowledge
of the SCSI protocol or how this board works to understand them.

However, the good news is that I can completely fix the problem by booting
MS-DOS before booting NetBSD.  After doing this, I was able to copy the
entire X11R6 source directory from CD to my disk without problem.  If I then
power off the machine and boot UNIX first, I get the NCR exceptions again.
In fact, I've narrowed the fix down to loading the device driver
DOSCAM.SYS that NCR provides for DOS.  It would seem that this driver
is doing some kind of initialization that is needed by the card but that
is not done in the current NetBSD (and perhaps FreeBSD?) driver.

If this is true, it may be related to problems with the disk interface as
well.  I have also seen some disk effects on my machine (zero-filled files
after being untarred disk to disk; redoing the untar from the same area
was successful).  However, until this gets fixed I will always boot DOS
before rebooting into NetBSD and hope that that will protect me from disk
problems as well.

Since this error is so easily reproducible on my machine, I'd gladly volunteer
to run whatever tests the gurus can think of.  The failure itself is reliable;
what I can't predict is exactly when it will occur.  If I try to
read at least a megabyte from the CD, it will always fail.  Also, if anyone
shares a DOS partition and has disk problems with this card, I could send
them a copy of DOSCAM.SYS to see if it clears things up.
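
For anyone who wants to try reproducing this, a minimal sketch of such a
read test in C might look like the following.  The function name
read_bytes and the 64K buffer size are assumptions for illustration and
do not come from ncr.c or any other NetBSD code; the function simply
reads a file sequentially until a byte limit is reached, which is the
access pattern that triggers the failure described above.

```c
#include <assert.h>
#include <stdio.h>

#define CHUNK 65536L    /* arbitrary buffer size (an assumption) */

/*
 * Read up to `limit` bytes from `path` in CHUNK-sized pieces.
 * Returns the number of bytes actually read, or -1 if the file
 * cannot be opened.  A short count means end of file or a read error.
 */
static long read_bytes(const char *path, long limit)
{
    static char buf[CHUNK];
    long total = 0;
    FILE *fp;

    assert(limit >= 0);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    while (total < limit) {
        long want = limit - total;
        size_t got;

        if (want > CHUNK)
            want = CHUNK;
        got = fread(buf, 1, (size_t)want, fp);
        if (got == 0)
            break;              /* end of file or read error */
        total += (long)got;
    }
    fclose(fp);
    return total;
}
```

Calling read_bytes() on any large file on the mounted CD with a limit of
1048576 (one megabyte) or more should be enough to exercise the transfer
path; the mount point and file name depend on the local setup.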

Before we all start swapping out our cards for Buslogic, maybe there's a
simple solution here.  It's got to be easier to fix initialization code
than the timing-dependent transfer code.

======================================================================
Wayne Berke
berke@panix.com