re: Locking in disk(9) api

To: Adam Hamsik <haaaad%gmail.com@localhost>
Subject: re: Locking in disk(9) api
From: matthew green <mrg%eterna.com.au@localhost>
Date: Tue, 29 Dec 2009 11:49:29 +1100

   
   Yesterday tron@ submitted PR kern/42532 it's about panic which
   happen on device-mapper devices during concurrent access to them.
   After some investigation and help from others I have found that
   problem is that disk_busy and disk_unbusy requires users to manage
   locking for them. This problem wasn't found yet mostly because we
   don't have many D_MPSAFE enabled disk drivers and because dm is
   very specific in locking. 

the problem wasn't found in existing D_MPSAFE drivers because they
hold a mutex around these calls, and the rest of the drivers hold
the kernel lock.  the fss(4) driver is also "very specific in
locking."
   
   After some discussion I think that disk(9) api should be changed
   to manage locking by itself and to not require explicit locking
   from users. There are some options which can be used in this case,]
   
   1) Use diskp->dk_openlock to lock access to 
iostat_busy/iostat_unbusy/iostat_isbusy
   2) Use atomic operations in iostat_busy/iostat_unbusy/iostat_isbusy
      to manage io_stat reference counter
   3) Add mutex to struct io_stat and use it to guard io_stat entries
      from concurrent manipulation. There are 4 functions which will
      need to acquire this lock 
iostat_busy/iostat_unbusy/iostat_isbusy/iostat_seek
      and iostat_free


you forgot to list (0) - leave things as they are, ie, continue to
require callers to manage locking.  (also, update disk(9) to meantion
an IPL_VM mutex instead of being at splbio().)


it isn't clear to me that making every one pay the penalty of taking
an additional lock twice for each IO is a good one to allow dm(4) to
avoid managing locking itself.

currently most of the disk drivers are not marked D_MPSAFE and thus
they're implicitly serialised.  ccd and fss both are marked D_MPSAFE
and use internal locking logic to (a) ensure that disk_*busy() are
called safely and (b) other internal logic happens safely.

since (b) is really required for disk devices anyway (and indeed,
dm has other locks it manages internally), i don't see that making
everyone pay the dual penalty (lock/unlock) just to avoid having
dm(4) perform this locking itself is actually a benefit.


after thinking about this during the afternoon, i'm more convinced
that the current method is OK.  dm(4) just needs a little fix, and
i'm aware that adam has already sent a test patch for tron.


.mrg.

Follow-Ups:
- Re: Locking in disk(9) api
  - From: Matthias Scheler
- Re: Locking in disk(9) api
  - From: Adam Hamsik

References:
- Locking in disk(9) api
  - From: Adam Hamsik

Prev by Date: Locking in disk(9) api
Next by Date: Checking for inserted floppy
Previous by Thread: Locking in disk(9) api
Next by Thread: Re: Locking in disk(9) api
Indexes:

Home | Main Index | Thread Index | Old Index