tech-kern archive


Re: vrele vs. syncer deadlock



> On 11 Dec 2016, at 22:33, Nick Hudson <skrll%netbsd.org@localhost> wrote:
> 
> On 12/11/16 21:05, J. Hannken-Illjes wrote:
>>> On 11 Dec 2016, at 21:01, David Holland <dholland-tech%netbsd.org@localhost> wrote:
>>> 
>>> On a low-memory machine Nick ran into the following deadlock:
>>> 
>>>  (a) rename -> vrele on child -> inactive -> truncate -> getblk ->
>>>      no memory in buffer pool -> wait for syncer
>>>  (b) syncer waiting for locked parent vnode from the rename
<snip>
>> Where is the syncer waiting for the parent?
> 
> db>  bt/a  ffffffff8ff28060
> pid  0.37  at  0x9800000410960000
> 0x9800000410961bb0:  kernel_text+dc  (0,0,0,0)  ra  ffffffff803ad484  sz  0
> 0x9800000410961bb0:  mi_switch+1c4  (0,0,0,0)  ra  ffffffff803a9ef8  sz  96
> 0x9800000410961c10:  sleepq_block+b0  (0,0,0,0)  ra  ffffffff803b8edc  sz  64
> 0x9800000410961c50:  turnstile_block+2e4  (0,0,0,0)  ra  ffffffff803a487c  sz  96
> 0x9800000410961cb0:  rw_enter+17c  (0,0,0,0)  ra  ffffffff8044862c  sz  112
> 0x9800000410961d20:  genfs_lock+8c  (0,0,0,0)  ra  ffffffff8043fd60  sz  48
> 0x9800000410961d50:  VOP_LOCK+30  (ffffffff8049d4c8,2,0,0)  ra  ffffffff80436c8c  sz  48
> 0x9800000410961d80:  vn_lock+94  (ffffffff8049d4c8,2,0,0)  ra  ffffffff803367d8  sz  64
> 0x9800000410961dc0:  ffs_sync+c8  (ffffffff8049d4c8,2,0,0)  ra  ffffffff80428f4c  sz  112
> 0x9800000410961e30:  sched_sync+1c4  (ffffffff8049d4c8,2,0,0)  ra  ffffffff80228dd0  sz  112
> 0x9800000410961ea0:  mips64r2_lwp_trampoline+18  (ffffffff8049d4c8,2,0,0)  ra  0  sz  32
> 
> 
> 
>> Which file system?
> 
> ffs

Looks like a bug I introduced myself.  When ffs_sync() is called from
the syncer (MNT_LAZY set) it only writes back modified inodes; fsync
is handled by the individual synclist entries.

Some time ago I unconditionally removed LK_NOWAIT from this vn_lock()
call, so the syncer now blocks on a locked vnode.  I suppose we need
this patch:

RCS file: /cvsroot/src/sys/ufs/ffs/ffs_vfsops.c,v
retrieving revision 1.341
diff -p -u -2 -r1.341 ffs_vfsops.c
--- ffs_vfsops.c        20 Oct 2016 19:31:32 -0000      1.341
+++ ffs_vfsops.c        12 Dec 2016 09:45:17 -0000
@@ -1918,5 +1918,6 @@ ffs_sync(struct mount *mp, int waitfor, 
        while ((vp = vfs_vnode_iterator_next(marker, ffs_sync_selector, &ctx)))
        {
-               error = vn_lock(vp, LK_EXCLUSIVE);
+               error = vn_lock(vp, LK_EXCLUSIVE |
+                   (waitfor == MNT_LAZY ? LK_NOWAIT : 0));
                if (error) {
                        vrele(vp);
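
The idea behind the patch, skipping a busy vnode during a lazy sync
instead of sleeping on its lock, can be sketched in userland.  This is
only an analogy under stated assumptions: struct node, enum sync_mode,
sync_pass() and demo_lazy_pass() are hypothetical names, a pthread
mutex stands in for the vnode lock, and pthread_mutex_trylock() plays
the role of LK_NOWAIT.  It is not the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the waitfor modes; not the kernel's values. */
enum sync_mode { SYNC_WAIT, SYNC_LAZY };

/* Hypothetical stand-in for a vnode: a lock plus a "needs write-back" flag. */
struct node {
	pthread_mutex_t lock;
	bool dirty;
};

/*
 * Write back every dirty node.  In SYNC_LAZY mode a node whose lock is
 * busy is skipped (the LK_NOWAIT analogue); in SYNC_WAIT mode we block
 * until the lock is available.  Returns the number of nodes skipped.
 */
static int
sync_pass(struct node *nodes, size_t n, enum sync_mode mode)
{
	int skipped = 0;

	for (size_t i = 0; i < n; i++) {
		int error;

		if (mode == SYNC_LAZY)
			error = pthread_mutex_trylock(&nodes[i].lock);
		else
			error = pthread_mutex_lock(&nodes[i].lock);
		if (error != 0) {
			/* EBUSY from trylock: leave it for a later pass. */
			skipped++;
			continue;
		}
		nodes[i].dirty = false;	/* "write back" the node */
		pthread_mutex_unlock(&nodes[i].lock);
	}
	return skipped;
}

/*
 * Hold the lock of node 1 (the "parent locked by the rename") and run
 * a lazy pass over all three nodes.  Returns the number skipped.
 */
static int
demo_lazy_pass(void)
{
	struct node nodes[3];
	int skipped;

	for (size_t i = 0; i < 3; i++) {
		pthread_mutex_init(&nodes[i].lock, NULL);
		nodes[i].dirty = true;
	}

	pthread_mutex_lock(&nodes[1].lock);
	skipped = sync_pass(nodes, 3, SYNC_LAZY);

	/* The busy node stays dirty; the other two were written back. */
	assert(nodes[1].dirty);
	assert(!nodes[0].dirty && !nodes[2].dirty);

	pthread_mutex_unlock(&nodes[1].lock);
	for (size_t i = 0; i < 3; i++)
		pthread_mutex_destroy(&nodes[i].lock);
	return skipped;
}
```

Calling demo_lazy_pass() returns without blocking even though one lock
is held for the whole pass.  With SYNC_WAIT the same pass would sleep
on that lock, which is the position the syncer is in in the backtrace
above.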

Is it reproducible so you can test it?

--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig (Germany)


