Subject: Re: panic
To: Jason Thorpe <thorpej@nas.nasa.gov>
From: Greg Wohletz <greg@duke.CS.UNLV.EDU>
List: current-users
Date: 11/24/1996 15:54:17
>On Sun, 24 Nov 1996 14:13:44 -0800 
> Greg Wohletz <greg@duke.CS.UNLV.EDU> wrote:
>
> > Oh, one more thing: we have 3 other servers using the same BusLogic
> > controllers, but not using the ccd driver; these have been working hard for
> > several months without any hangs.  Seems like there is some strange
> > interaction between the ccd and certain SCSI disk drivers.
>
>The only thing that the ccd does is fire off transfers to the underlying
>component devices.  If the ccd is interleaved (striped), you may have
>multiple outstanding transactions on a given controller.

I never suspected the ccd; I just thought it was an interesting data point
that could possibly be useful in locating a bug in the drivers.

>However, the ccd does not know or care what kind of device the underlying
>component is.  If you're getting hangs when using the ccd on SCSI disks
>on BusLogic controllers, I'd suspect a buglet in either the sd or bha
>drivers (maybe something in an exceptional condition handler?).  Maybe
>a race condition in the bha driver?  Maybe a missing splx()?
>
>Or maybe buggy firmware on the specific BusLogic controller you had problems
>with.  Or maybe a disk that didn't behave well with a BusLogic controller.

This was my first guess; that is how I came to find out that the NCR
controller did not have this problem.

I don't think it is a bad interaction between disk and controller, because I
was unable to reproduce the hangs unless the ccd was in use (and I tried a
variety of hardware combinations).  I figured that the ccd code queues
requests in some slightly different way that triggers a bug in the SCSI
or BusLogic driver, but I haven't really looked into it.
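
To make the striping point concrete, here is a toy model of how one
request could fan out into several transfers that all land on the same
controller.  This is only a sketch under simple assumptions, not the
actual ccd code: the function names, the fixed interleave, and the
"bha0" label are made up for illustration, and it assumes every
transfer is issued before any of them completes.

```python
from collections import Counter


def split_request(start_blk, nblks, interleave, ncomponents):
    """Split one request into per-component transfers under simple striping.

    Returns a list of (component, component_block, length) tuples.
    Hypothetical model only; the real driver's bookkeeping differs.
    """
    transfers = []
    blk = start_blk
    remaining = nblks
    while remaining > 0:
        stripe = blk // interleave       # index of the interleave-sized chunk
        offset = blk % interleave        # offset inside that chunk
        comp = stripe % ncomponents      # component that holds the chunk
        comp_blk = (stripe // ncomponents) * interleave + offset
        length = min(interleave - offset, remaining)
        transfers.append((comp, comp_blk, length))
        blk += length
        remaining -= length
    return transfers


def outstanding_per_controller(transfers, controller_of):
    """Count transfers per controller, assuming none has completed yet."""
    return Counter(controller_of[comp] for comp, _, _ in transfers)


# Four components, all assumed to hang off a single controller.
controller_of = {0: "bha0", 1: "bha0", 2: "bha0", 3: "bha0"}
transfers = split_request(start_blk=0, nblks=64, interleave=16, ncomponents=4)
print(outstanding_per_controller(transfers, controller_of))
```

With a 16-block interleave, a 64-block request becomes four transfers,
one per component, so the one controller sees four commands in flight
where a non-ccd setup would typically see a single one.  That is the
kind of difference in load I had in mind.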

						--Greg