Subject: Re: Problems with ccd (960413)
To: Charles M. Hannum <mycroft@mit.edu>
From: Justin T. Gibbs <gibbs@freefall.freebsd.org>
List: current-users
Date: 05/15/1996 07:36:39
>
>"Justin T. Gibbs" <gibbs@freefall.freebsd.org> writes:
>
>> 
>> >It appears all that was done was to make the start routines not use
>> >SCSI_NOSLEEP when called from the strategy routine.  It is still set
>> >when called from an interrupt (i.e. on command completion), so if you
>> >have enough I/O pending to overflow the SCBs already allocated, you
>> >will still lose.  This is made more noticeable by ccd because it can
>> >have the effect of queueing many requests at once to the same
>> >controller.
>> 
>> So the difference becomes one of always setting SCSI_NOSLEEP or only
>> setting it at interrupt time when it is valid.  At interrupt time, the only
>> case (as I recall - haven't looked at the code lately) that we start a scsi
>> command is for a retry of a failed command, which implies that the number
>> of openings is >= 1.
>
>That's distinctly untrue.  If there are no SCBs available when a
>command is queued, and one can't be allocated without sleeping, then
>the driver, under FreeBSD as well, will return TRY_AGAIN_LATER.

Only if SCSI_NOSLEEP is set.  In FreeBSD, I maintain that it is never set
when we hit this code and need to sleep, but in NetBSD it always is.  I
never said it wouldn't have this behavior if SCSI_NOSLEEP was set and all
resources were exhausted.  There wasn't one "distinctly untrue" statement in
what I said.
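
To make the distinction concrete, here is a minimal sketch of the decision
being argued about.  It is a toy model, not code from either tree: the
MODEL_* names and model_get_scb() are made up for illustration, and the
real flag and return values live in the SCSI headers.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real flag and return codes. */
#define MODEL_NOSLEEP 0x01

enum model_result {
    MODEL_STARTED,          /* an SCB was free, command goes out */
    MODEL_TRY_AGAIN_LATER,  /* no SCB and the caller may not sleep */
    MODEL_WOULD_SLEEP       /* no SCB, caller sleeps until one is freed */
};

/* Decide what happens when a command is queued with free_scbs available. */
static enum model_result model_get_scb(int free_scbs, int flags)
{
    assert(free_scbs >= 0);
    if (free_scbs > 0)
        return MODEL_STARTED;
    if (flags & MODEL_NOSLEEP)
        return MODEL_TRY_AGAIN_LATER;  /* only reachable with NOSLEEP set */
    return MODEL_WOULD_SLEEP;
}
```

In this model the TRY_AGAIN_LATER path is only taken when the no-sleep flag
is set and nothing is free, which is the case I claim FreeBSD does not hit.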

>This
>command will be retried for a few seconds, and if it can't be
>executed, the SCSI layer will eventually give up.  This won't cause a
>`not queued' message, but it will still do the wrong thing, and for
>more or less the same reason.

Sure.

>> At least for a FreeBSD system running ccd to 6 disks
>> of a 2940 (16 SCBs), the failure that was reported for NetBSD never occurs
>> and I know that there are more than 16 active transactions in this system.
>
>You could be getting lucky by allocating the SCBs earlier for some
>other reason, but it's still broken.

It doesn't matter when you allocate the SCBs (memory allocation).  You will
still have (using 8 tags) 48 SCSI transactions trying to use those 16 SCBs
and they must be sleeping *somewhere* for this to work.  They certainly
aren't sleeping in the upper level SCSI code because the number of total
openings is greater than the number of SCBs, so they must be sleeping in
my driver.
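
The arithmetic, as a small sketch (waiting_transactions() is a made-up
helper for illustration, not anything in the driver, and it assumes every
disk has all of its tagged openings in use at once):

```c
#include <assert.h>

/* How many transactions must be waiting for an SCB when every disk has
 * all of its tagged openings outstanding. */
static unsigned waiting_transactions(unsigned disks, unsigned tags_per_disk,
                                     unsigned scbs)
{
    unsigned outstanding = disks * tags_per_disk;

    assert(scbs > 0);
    return outstanding > scbs ? outstanding - scbs : 0;
}
```

With 6 disks, 8 tags each and 16 SCBs this gives 48 - 16 = 32 transactions
that have to be waiting somewhere below the upper-level SCSI code.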

>> This is not a problem because only requests generated outside of an
>> interrupt context will cause you to rise above your (previous) threshold.
>
>That's not true, either.  If you increase `openings' in an interrupt
>context (as is done in ahc_done()) and then wake up a higher-level
>driver (through scsi_done()), the higher-level driver may immediately
>attempt to queue more commands than there are currently SCBs available
>for (from the interrupt context), and this lossage mode will ensue.

wakeup schedules the sleepers to run, but they don't run until we're out of
the interrupt context.  A process that was asleep can go to sleep again on
those resources, so you don't need SCSI_NOSLEEP.  If it doesn't do the
wakeup, free_xs will start exactly one transaction for each transaction
freed, which again means that you only consume the resource you just
freed and never go above the "previous openings" level that was set, even
if you bumped the opening count during that interrupt context.  You will
not malloc from an interrupt context and you shouldn't ever need to sleep.
Now you may say that having free_xs only start one transaction is a bug,
but seeing as we may be in an interrupt context and don't want to unduly
block other interrupts, I don't think it is.
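
A toy model of that argument, for what it's worth.  None of this is the
real driver: struct ctlr_model and model_done() are invented names, and the
model assumes a completion frees one SCB and starts at most one queued
transaction, as described above.

```c
#include <assert.h>

struct ctlr_model {
    int scbs_allocated;  /* SCBs that exist (allocated outside interrupts) */
    int scbs_in_use;     /* SCBs currently attached to a command */
    int openings;        /* count advertised upward; may exceed SCBs */
    int queued;          /* transactions waiting to be started */
};

/* Completion path at interrupt time.  Returns 1 if a queued transaction
 * was started, 0 otherwise. */
static int model_done(struct ctlr_model *c)
{
    assert(c->scbs_in_use > 0);
    c->scbs_in_use--;   /* the finished command releases its SCB */
    c->openings++;      /* bumping openings here is allowed in this model */

    if (c->queued > 0) {
        c->queued--;
        c->scbs_in_use++;  /* reuses the SCB just freed: no malloc, no sleep */
        return 1;
    }
    return 0;
}
```

However many times model_done() runs, scbs_in_use never rises above the
level it had before the interrupt, even though openings keeps growing.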

>For this hack to work, `openings' must *only* be increased from
>ahc_scsi_cmd(), when SCSI_NOSLEEP is not set.  Therefore there is a
>bug in the ahc driver; it violates an invariant of the SCSI system.

I don't think so.

>> >As I've said before, the best solution is to eliminate the need for
>> >SCSI_NOSLEEP.  This requires more restructuring of the SCSI code,
>> >however.
>> 
>> Hey, that was what I said! B-) But you called my plan for doing it a hack.
>> The plan was to basically restructure the SCSI system so that you only
>> sleep in one place.  So, a process might sleep at the time it queued an I/O
>> request waiting for its scsi_xfer struct to be allocated/reserved, but that
>> once it got that structure, the controller resources would already be
>> attached to it and no further sleeping would be needed.  This also ensures
>> that all mallocing is done outside of an interrupt context.
>
>That was *my* suggestion.  You proposed the previously mentioned hack
>as an alternative.

I recall things a little differently, but if the above is *your*
suggestion, it has the same "problem" you think exists in the current
system.  You still need to adjust the openings at some point, and you
don't want to pre-allocate the memory space (unless you have a busmastering
ISA controller where you need to ensure the structures are below 16 MB),
so, your opening count could still be larger than the number of allocated
SCBs when a scsi_done occurs and you perform a wakeup.  Of course, this
isn't a problem there, just as it isn't a problem in the current system.

--
Justin T. Gibbs
===========================================
  FreeBSD: Turning PCs into workstations
===========================================