Re: locking against myself in unionfs code

To: tech-kern%netbsd.org@localhost
Subject: Re: locking against myself in unionfs code
From: David Holland <dholland-tech%netbsd.org@localhost>
Date: Tue, 2 Sep 2025 21:32:44 +0000

On Tue, Sep 02, 2025 at 07:11:13PM +0200, Edgar Fu? wrote:
 > I got a "loking against myself" panic when writing to a mount_union'ed fs:

Not surprising, unfortunately...

 > [trace based on link()]

I think I have found the problem. First, some context:

 > relookup() at netbsd:relookup+0x42
 > union_link() at netbsd:union_link+0x125
 > VOP_LINK() at netbsd:VOP_LINK+0x46
 > do_sys_linkat() at netbsd:do_sys_linkat+0x20a

So what's going on here is that we've already looked up the target of
the link, and called into VOP_LINK with the directory and name info
for the new name. And for some onionfs reason it thinks it needs to
redo that lookup.

This means, however, that the new name doesn't exist at the point we
call into VOP_LINK, and, furthermore (absent serious fs corruption)
the name isn't "." or ".." so ISDOTDOT can't be set.

 > vn_lock() at netbsd:vn_lock+0x22
 > union_lookup1() at netbsd:union_lookup1+0x68
 > union_lookup() at netbsd:union_lookup+0xa4

There are two calls to union_lookup1(), one for the upper layer and
one for the lower layer, and we could be on either.

There are also three different calls to vn_lock() in union_lookup1(),
but two of them require ISDOTDOT so it can't be those. Rather, what's
happened is that it looked the new name up, found something, tried to
lock it, and found that we already have it locked.

At this point there _should_ be only one thing we have locked (the
directory we're looking in), and even though because it's onionfs
there are up to three vnodes for it and we might have any or all of
them locked, there shouldn't be any way a name pointing to any of them
could have been inserted since we were in do_sys_linkat().

However, union_link does the following:

        /*
         * Needs to be copied before we can link it.
         */
        vn_lock(ap->a_vp, LK_EXCLUSIVE | LK_RETRY);
        if (droplock)
                VOP_UNLOCK(dun->un_uppervp);
        error = union_copyup(un, 1, cnp->cn_cred, curlwp);

that is, it locks the target vnode (the one we're linking) in order to
copy it up.

It then blithely leaves it locked while calling relookup. So now
there's another vnode locked, and that's almost certainly the one we
stepped on: another process did the same link() operation, so in the
relookup the new name already exists and points to the same target
vnode, so trying to lock it a second time panics.

It looks like this code can also deadlock: trying to lock
dun->uppervp, which is the upper-layer vnode for the directory we're
looking in, while holding a lock for the target vnode, is an order
inversion in the common case where the link operation is entirely
within the same directory.

I don't see any obvious reason that this relock/relookup logic
following the union_copyup call (in union_lock) needs to be done while
holding a lock on the target vnode, and there are a number of reasons
it's wrong/unsafe, so I think the unlock call on line 1299 should be
moved up.

I'll do that, because even if there's some consideration I'm missing
it's unlikely to make things any worse.

-- 
David A. Holland
dholland%netbsd.org@localhost

Follow-Ups:
- Re: locking against myself in unionfs code
  - From: Edgar Fuß

References:
- locking against myself in unionfs code
  - From: Edgar Fuß

Prev by Date: locking against myself in unionfs code
Next by Date: Re: locking against myself in unionfs code
Previous by Thread: locking against myself in unionfs code
Next by Thread: Re: locking against myself in unionfs code
Indexes:

Home | Main Index | Thread Index | Old Index