Re: Locking in disk(9) api

To: matthew green <mrg%eterna.com.au@localhost>
Subject: Re: Locking in disk(9) api
From: Adam Hamsik <haaaad%gmail.com@localhost>
Date: Tue, 29 Dec 2009 08:38:59 +0100

Hi,
On Dec,Tuesday 29 2009, at 1:49 AM, matthew green wrote:

> 
>   Yesterday tron@ submitted PR kern/42532 it's about panic which
>   happen on device-mapper devices during concurrent access to them.
>   After some investigation and help from others I have found that
>   problem is that disk_busy and disk_unbusy requires users to manage
>   locking for them. This problem wasn't found yet mostly because we
>   don't have many D_MPSAFE enabled disk drivers and because dm is
>   very specific in locking. 
> 
> the problem wasn't found in existing D_MPSAFE drivers because they
> hold a mutex around these calls, and the rest of the drivers hold
> the kernel lock.  the fss(4) driver is also "very specific in
> locking."

I looked at fss driver and I think that it is working by luck only [1]. 
disk_unbusy iw for example called without any lock held. In [2] 
disk_busy/disk_unbusy is called without any lock held, too.

[1] http://nxr.netbsd.org/xref/src/sys/dev/fss.c#1221
[2] http://nxr.netbsd.org/xref/src/sys/dev/fss.c#1059
> 
>   After some discussion I think that disk(9) api should be changed
>   to manage locking by itself and to not require explicit locking
>   from users. There are some options which can be used in this case,]
> 
>   1) Use diskp->dk_openlock to lock access to 
> iostat_busy/iostat_unbusy/iostat_isbusy
>   2) Use atomic operations in iostat_busy/iostat_unbusy/iostat_isbusy
>      to manage io_stat reference counter
>   3) Add mutex to struct io_stat and use it to guard io_stat entries
>      from concurrent manipulation. There are 4 functions which will
>      need to acquire this lock 
> iostat_busy/iostat_unbusy/iostat_isbusy/iostat_seek
>      and iostat_free
> 
> 
> you forgot to list (0) - leave things as they are, ie, continue to
> require callers to manage locking.  (also, update disk(9) to meantion
> an IPL_VM mutex instead of being at splbio().)
> 
> 
> it isn't clear to me that making every one pay the penalty of taking
> an additional lock twice for each IO is a good one to allow dm(4) to
> avoid managing locking itself.
> 
> currently most of the disk drivers are not marked D_MPSAFE and thus
> they're implicitly serialised.  ccd and fss both are marked D_MPSAFE
> and use internal locking logic to (a) ensure that disk_*busy() are
> called safely and (b) other internal logic happens safely.

For not D_MPSAFE locking and mutex should be a fast operation no ?
> 
> since (b) is really required for disk devices anyway (and indeed,
> dm has other locks it manages internally), i don't see that making
> everyone pay the dual penalty (lock/unlock) just to avoid having
> dm(4) perform this locking itself is actually a benefit.

Because e.g. for dm there is no lock which I can hold for it. In my patch I'm 
using dmv->diskp->dk_openlock which is hack I think, but it will work.
> 
> after thinking about this during the afternoon, i'm more convinced
> that the current method is OK.  dm(4) just needs a little fix, and
> i'm aware that adam has already sent a test patch for tron.

If you are ok with my patch attached below I will commit it .

-       /* FIXME: have to be called with IPL_BIO*/
+       /* FIXME: There is a problem with disk(9) api because disk_busy and
+          disk_unbusy are not protected against concurrent usage and needs
+          explicit locking from caller.*/
+       mutex_enter(&dmv->diskp->dk_openlock);
       disk_busy(dmv->diskp);
+       mutex_exit(&dmv->diskp->dk_openlock);

       /* Select active table */
       tbl = dm_table_get_entry(&dmv->table_head, DM_TABLE_ACTIVE);
@@ -459,8 +565,10 @@ dmstrategy(struct buf *bp)
       if (issued_len < buf_len)
               nestiobuf_done(bp, buf_len - issued_len, EINVAL);

-       /* FIXME have to be called with SPL_BIO*/
+       /* FIXME disk_unbusy need to manage locking by itself. */
+       mutex_enter(&dmv->diskp->dk_openlock);
       disk_unbusy(dmv->diskp, buf_len, bp != NULL ? bp->b_flags & B_READ : 0);
+       mutex_exit(&dmv->diskp->dk_openlock);

Regards

Adam.

Follow-Ups:
- Re: Locking in disk(9) api
  - From: Michael van Elst

References:
- re: Locking in disk(9) api
  - From: matthew green

Prev by Date: Securelevel stuff
Next by Date: Re: Locking in disk(9) api
Previous by Thread: re: Locking in disk(9) api
Next by Thread: Re: Locking in disk(9) api
Indexes:

Home | Main Index | Thread Index | Old Index