tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: SCSI changes - PR58452 for review



> Date: Sat, 3 Aug 2024 09:59:21 +1000
> From: Nathanial Sloss <nathanialsloss%yahoo.com.au@localhost>
> 
> I am looking to get the following patch reviewed.
> 
> It is also documented in PR/58452.
> 
> It does the following:
> 
> Abort current transfer on scsi bus reset.  This one seems logical as the state 
> of the transfer is unknown after a bus reset.

Sounds plausible.  Is it dependent on the other changes?  Can it be
split into a separate commit?

> Uses retry path for medium errors.  Bluescsi-v2 would through medium errors 
> that would succeed if the transfer was retried.

Sounds plausible.  Is it dependent on the other changes?  Can it too
be split into a separate commit?

> Biggest change is for ncr5380 which is to stop scsipi_done from being 
> reentrant in 5380sbc_done,  This helps with aborting transfers something which 
> is common with dse(4). which previously if nested to deep in recursive calls 
> to scsipi_done would cause a panic.

Why do we end up in recursion here?  Why does the driver need to
decide when to break this recursion?  If drivers sometimes _skip_
scsipi_run_queue, won't they sometimes wind up with queued xfers that
never get issued to the hardware?

It seems to me that either

(a) scsipi_run_queue -> scsipi_adapter_request -> driver adapt_request
    ought to arrange for scsipi_done to be invoked asynchronously, and
    it's a bug for it to call scsipi_done synchronously; or

(b) if it's really important for driver adapt_request to synchronously
    call scsipi_done, then that should _never_ synchronously re-enter
    scsipi_run_queue but instead should _always_ defer to an
    asynchronous trigger like a softint or something; or

(c) maybe, if it's really important for _both_ synchronous calls to
    happen, then when scsipi_done/scsipi_run_queue detects it is being
    recursively called, the inner call should detect this and do
    nothing so it can unwind to the outer call to continue processing
    the queue.

> Finally to downgrade the Aborting 0/0/0 messages to a debug only option as 
> they are harmless and no longer a warning that requires user intervention or 
> concern.

Seems fine -- maybe we should have a dtrace probe here, but I guess
that might require someone to get dtrace running on m68k!  (Wanna
volunteer?)


Home | Main Index | Thread Index | Old Index