Subject: Detaching live sd devices
To: None <tech-kern@netbsd.org>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 07/22/2005 18:02:24

I am working on a SCSI HBA that can detach SCSI disks. Well, it can detach
the SCSI bus, but sd's are the most common. ;-)

Problem is that if there is a mounted file system when the detach happens,
things blow up. As in panic().

Specifically, it seems that the problem is we call bufq_free() before we
have drained off all pending i/o. When the pending i/o finishes,
scsipi_done() will end up calling sdstart(). I don't have the back trace
in front of me (a coworker first saw the problem). As best I can tell,
scsipi_done() calls scsipi_complete(), which (for XS_CTL_ASYNC) will call
scsipi_put_xs(), which will call (*psw_start)(). That's sdstart(), which
then looks to see if there's work to do by calling
BUFQ_PEEK(&sd->buf_queue) to get a bp. However, when we did the
bufq_free(), we wiped that info out (we set the "get" routine pointer to
NULL).
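
To make the ordering problem concrete, here is a toy userland model of the
call chain described above. It is only a sketch: every toy_* name is
hypothetical, the queue holds at most one buffer, and none of this is the
real bufq or scsipi code. Where the kernel would jump through a NULL "get"
pointer, the model returns TOY_START_WOULD_PANIC instead.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the real NetBSD definitions. */
struct toy_buf { int blkno; };

struct toy_bufq {
    /* Strategy hook, modelled on the "get" routine pointer in the post. */
    struct toy_buf *(*get)(struct toy_bufq *, int remove);
    struct toy_buf *head;               /* at most one queued buffer */
};

static struct toy_buf *toy_fcfs_get(struct toy_bufq *q, int remove)
{
    assert(q != NULL);
    struct toy_buf *bp = q->head;
    if (remove)
        q->head = NULL;
    return bp;
}

static void toy_bufq_alloc(struct toy_bufq *q)
{
    q->get = toy_fcfs_get;
    q->head = NULL;
}

/* Like the bufq_free() described above: the hook is wiped out. */
static void toy_bufq_free(struct toy_bufq *q)
{
    q->get = NULL;
    q->head = NULL;
}

enum toy_start_result {
    TOY_START_IDLE,         /* queue valid, nothing to do */
    TOY_START_ISSUED,       /* queue valid, one buffer dequeued */
    TOY_START_WOULD_PANIC   /* queue already freed: NULL hook */
};

/* Stand-in for sdstart(): peek at the queue, then dequeue. */
static enum toy_start_result toy_sdstart(struct toy_bufq *q)
{
    if (q->get == NULL)
        return TOY_START_WOULD_PANIC;   /* the kernel would call NULL here */
    if (q->get(q, 0) == NULL)           /* peek */
        return TOY_START_IDLE;
    (void)q->get(q, 1);                 /* dequeue */
    return TOY_START_ISSUED;
}

/* Stand-in for the completion path: done -> complete -> put_xs -> start. */
static enum toy_start_result toy_scsipi_done(struct toy_bufq *q)
{
    return toy_sdstart(q);
}
```

A completion that arrives after toy_bufq_free() reports
TOY_START_WOULD_PANIC, which is the ordering that bites on detach.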

So we either need to call bufq_free() later, or we need some other way to
keep sdstart() from looking at the queue.
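
The "call bufq_free() later" option could look roughly like the sketch
below. It is a single-threaded toy, not kernel code: the toy_* names, the
pending counter and the detaching flag are all assumptions made for
illustration, and a real driver would need proper locking and would
probably sleep in detach rather than hand the free to the completion path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state; not the real sd softc or periph. */
struct toy_periph {
    int  pending;      /* transfers issued to the HBA, not yet completed */
    bool detaching;    /* set once detach has been requested */
    bool queue_valid;  /* false once the (toy) buffer queue is freed */
};

/* Stand-in for bufq_free(); only legal once nothing is in flight. */
static void toy_queue_free(struct toy_periph *p)
{
    assert(p->pending == 0);
    p->queue_valid = false;
}

/* Detach marks the device dying and frees the queue only if idle. */
static void toy_detach(struct toy_periph *p)
{
    p->detaching = true;
    if (p->pending == 0)
        toy_queue_free(p);
}

/*
 * Completion path. Returns true if the start routine would be invoked.
 * While detaching, the start routine is skipped, and the last completion
 * performs the deferred free.
 */
static bool toy_done(struct toy_periph *p)
{
    assert(p->pending > 0);
    p->pending--;
    if (p->detaching) {
        if (p->pending == 0)
            toy_queue_free(p);
        return false;
    }
    return true;
}
```

With two transfers in flight at detach time, the queue stays valid until
the second completion arrives, and the start routine never runs in between.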

Thoughts? I have a few, but I'd appreciate input.

My main thought at the moment is to change the scsipi_periphsw struct so
that we no longer have a start routine. That will keep scsipi_put_xs()
from calling it (it checks for non-NULL).
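
As a rough illustration of that idea, the toy below swaps in a switch
whose start pointer is NULL at detach time. Only the psw_start field name
comes from the description above; the toy_* types and functions are
hypothetical, and using a second, "detached" switch is just one assumed way
to end up with no start routine.

```c
#include <assert.h>
#include <stddef.h>

struct toy_periph;

/* Toy switch; only the psw_start name is borrowed from the post. */
struct toy_periphsw {
    void (*psw_start)(struct toy_periph *);
};

struct toy_periph {
    const struct toy_periphsw *sw;
    int starts;                     /* how often the start routine ran */
};

/* Stand-in for sdstart(); just counts invocations. */
static void toy_sdstart(struct toy_periph *p)
{
    p->starts++;
}

static const struct toy_periphsw toy_sd_switch = { toy_sdstart };
static const struct toy_periphsw toy_sd_detached_switch = { NULL };

/* Stand-in for the tail of scsipi_put_xs(): call start only if non-NULL. */
static void toy_put_xs(struct toy_periph *p)
{
    assert(p->sw != NULL);
    if (p->sw->psw_start != NULL)
        p->sw->psw_start(p);
}

/* On detach, point the periph at a switch with no start routine. */
static void toy_detach(struct toy_periph *p)
{
    p->sw = &toy_sd_detached_switch;
}
```

Completions that arrive after toy_detach() still run through toy_put_xs(),
but the NULL check means the start routine is never entered.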

Take care,

Bill
