NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/52301: lfs deadlock between lfs_fsync and lfs_create

The following reply was made to PR kern/52301; it has been noted by GNATS.

From: Konrad Schroder <>
To: David Holland <>,
Cc: Konrad Schroder <>
Subject: Re: kern/52301: lfs deadlock between lfs_fsync and lfs_create
Date: Fri, 18 Jan 2019 08:56:53 -0800

 My recollection is that lfs_valloc takes the segment lock because 
 otherwise the Ifile free list could become corrupted on disk if a 
 segment was being written at the same time.  The segment lock doesn't 
 cause a segment to be written when it is released, per se, so there's no 
 harm in using it to prevent this.
 The handling of dirops needs to be thought out very carefully.  The easy 
 way would be to prohibit dirops from beginning when a segment write is 
 in progress.  This would indicate that the locking order in your point 
 #2 below is wrong.  It might end up starving dirops, however.
 A better approach might be to note, when taking the segment lock, 
 whether dirops are ready to be written, and only prohibit new dirops in 
 that case...but if the segment lock was taken when no dirops are ready, 
 the segment writer wouldn't write any VDIROP vnodes and wouldn't wait 
 for dirops to complete before doing anything.
 Take care,
 On 1/13/19 4:50 PM, David Holland wrote:
 > On Thu, Jun 15, 2017 at 06:15:00AM +0000, wrote:
 >   > 1. process A calls lfs_fsync
 >   >    - which calls lfs_vflush
 >   >    - which takes the seglock and calls lfs_segwrite
 >   > 2. meanwhile process B calls lfs_create
 >   >    - which starts a dirop
 >   >    - and then down inside lfs_valloc blocks waiting for the seglock.
 >   > 3. now process A in lfs_segwrite calls lfs_writer_enter
 >   >    - which blocks waiting for dirops to clear
 >   >    - but they can't.
 > So I was just looking into fixing this by having lfs_writer_enter
 > temporarily release the seglock, which would allow lfs_create to
 > finish. However, this led me to discover something I'd either not
 > realized or forgotten: lfs_seglock isn't just a lock, it's closely
 > tied to the segment writing mechanism; to wit, taking lfs_seglock is
 > not just a locking action but also beginning to write out a segment,
 > and releasing it means waiting until it's done; you can't release it
 > temporarily without compromising core design invariants.
 > But.
 > Why, then, is lfs_valloc taking the seglock? It seems like this means
 > that every call to lfs_valloc will write out its own (partial) segment
 > containing just that allocation; this may technically be correct in
 > the sense of not corrupting the volume, but it's certainly not
 > desirable.
 > (Even if this segment collects other pending writes, which it might --
 > didn't see the logic for that but didn't look very carefully -- it
 > still puts every newly created file in a different segment which will
 > be unfortunate for things like untar.)
 > This behavior seems to have been introduced for locking reasons a long
 > time ago and for the moment I'm not clear on what those were or
 > whether they're still relevant.
 > Am I wrong somewhere?
 > Cc: perseant@ in case he's awake and remembers anything helpful...

Home | Main Index | Thread Index | Old Index