NetBSD-Bugs archive


Re: kern/52301: lfs deadlock between lfs_fsync and lfs_create



The following reply was made to PR kern/52301; it has been noted by GNATS.

From: David Holland <dholland-bugs%netbsd.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: Konrad Schroder <perseant%netbsd.org@localhost>
Subject: Re: kern/52301: lfs deadlock between lfs_fsync and lfs_create
Date: Mon, 14 Jan 2019 00:50:43 +0000

 On Thu, Jun 15, 2017 at 06:15:00AM +0000, dholland%NetBSD.org@localhost wrote:
  > 1. process A calls lfs_fsync
  >    - which calls lfs_vflush
  >    - which takes the seglock and calls lfs_segwrite
  > 2. meanwhile process B calls lfs_create
  >    - which starts a dirop
  >    - and then down inside lfs_valloc blocks waiting for the seglock.
  > 3. now process A in lfs_segwrite calls lfs_writer_enter
  >    - which blocks waiting for dirops to clear
  >    - but they can't.
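 
 The three quoted steps amount to a wait-for cycle between A and B. As a
 rough illustration only, here is a toy Python model of that cycle. It is
 not LFS code: "active dirop" is treated as if it were a resource held by
 B, which is a simplification, and deadlock_cycle is a hypothetical helper
 written for this sketch.

```python
# Toy model of the quoted scenario; the names are labels, not kernel objects.

def deadlock_cycle(holds, wants, start):
    """Follow 'wants -> holder' edges from start.

    Returns the list of processes forming a cycle, or None if the
    chain ends at a process that is not waiting on anything.
    """
    seen = []
    proc = start
    while proc is not None and proc not in seen:
        seen.append(proc)
        resource = wants.get(proc)    # what this process is blocked on
        proc = holds.get(resource)    # who currently holds that resource
    if proc is None:
        return None
    return seen[seen.index(proc):]


# Step 1: A (lfs_fsync -> lfs_vflush) holds the seglock.
# Step 2: B (lfs_create) has started a dirop and waits for the seglock.
# Step 3: A (lfs_writer_enter) waits for dirops to clear.
holds = {"seglock": "A", "active dirop": "B"}
wants = {"A": "active dirop", "B": "seglock"}

cycle = deadlock_cycle(holds, wants, "A")
print(cycle)
```

 If B were not waiting on the seglock, the chain would end at B and the
 helper would return None, which corresponds to lfs_create being able to
 finish its dirop.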
 
 So I was just looking into fixing this by having lfs_writer_enter
 temporarily release the seglock, which would allow lfs_create to
 finish. However, this led me to discover something I'd either not
 realized or forgotten: lfs_seglock isn't just a lock, it's closely
 tied to the segment writing mechanism. To wit, taking lfs_seglock
 also begins writing out a segment, and releasing it means waiting
 until that write is done. You can't release it temporarily without
 compromising core design invariants.
 
 But.
 
 Why, then, is lfs_valloc taking the seglock? It seems like this means
 that every call to lfs_valloc will write out its own (partial) segment
 containing just that allocation; this may technically be correct in
 the sense of not corrupting the volume, but it's certainly not
 desirable.
 
 (Even if this segment collects other pending writes, which it might --
 I didn't see the logic for that, but I didn't look very carefully -- it
 still puts every newly created file in a different segment, which will
 be unfortunate for things like untar.)
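 
 To make the concern concrete, here is a toy Python sketch. It assumes
 the reading above is right, namely that each acquire/release of the
 seglock produces one partial segment. ToySegmentLock and the two create
 helpers are hypothetical names for this illustration and do not reflect
 the actual lfs_seglock implementation.

```python
class ToySegmentLock:
    """Toy stand-in for the seglock: acquiring begins a partial segment,
    releasing finishes writing it. Assumption for illustration only."""

    def __init__(self):
        self.current = None   # blocks gathered for the segment in progress
        self.written = []     # partial segments already written out

    def acquire(self):
        assert self.current is None, "toy model does not handle nesting"
        self.current = []

    def add(self, block):
        self.current.append(block)

    def release(self):
        # Releasing completes the write of whatever was gathered.
        self.written.append(self.current)
        self.current = None


def create_files_one_lock_each(lock, names):
    # Models an allocation path that takes the lock once per new file.
    for name in names:
        lock.acquire()
        lock.add("inode for " + name)
        lock.release()


def create_files_batched(lock, names):
    # Models allocations that are gathered and written out together.
    lock.acquire()
    for name in names:
        lock.add("inode for " + name)
    lock.release()


names = ["file%d" % i for i in range(5)]

per_call = ToySegmentLock()
create_files_one_lock_each(per_call, names)

batched = ToySegmentLock()
create_files_batched(batched, names)

print(len(per_call.written), len(batched.written))  # prints: 5 1
```

 Under that assumption, five creates produce five partial segments in the
 per-call case and one in the batched case, which is the untar worry in
 miniature.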
 
 This behavior seems to have been introduced for locking reasons a long
 time ago and for the moment I'm not clear on what those were or
 whether they're still relevant.
 
 Am I wrong somewhere?
 
 Cc: perseant@ in case he's awake and remembers anything helpful...
 
 -- 
 David A. Holland
 dholland%netbsd.org@localhost
 

