tech-kern archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
fstrans_start(vp->v_mount) and vgone
Last year hannken@ changed fstrans(9) to use a hash table mapping
struct mount pointers to fstrans info, in order to fix a crash found
by syzkaller arising from a race between fstrans_start(vp->v_mount)
and vgone(vp):
https://syzkaller.appspot.com/bug?extid=54dc9ac0804a97b59bc6
https://mail-index.netbsd.org/source-changes/2022/07/08/msg139622.html
However, while this may avoid a _crash_ from use-after-free of
vp->v_mount->mnt_..., I'm not sure this is enough to prevent a _race_.
Suppose we have the following sequence of events:
lwp0 lwp1
---- ----
mp = vp->v_mount;
vgone(vp);
-> vp->v_mount = deadfs
unmount(mp)
-> free(mp)
mount a new file system
-> malloc() returns the same mp
fstrans_start(mp);
Now lwp0 has started a transaction on some totally unrelated mount
point -- and may take confused actions under the misapprehension that
vp is a vnode that unrelated mount point.
I think in order to fix this properly, we really need an operation
that atomically (a) loads vp->v_mount, and (b) starts an fstrans on
it.
It's not necessary to use a hash table indirection for this -- we can
just use pserialize and ensure that struct mount objects aren't
actually freed until all pserialize read sections are completed.
Something like this:
struct mount *
fstrans_start_vnode(struct vnode *vp)
{
struct mount *mp;
struct fstrans_lwp_info *fli;
int s;
s = pserialize_read_enter();
mp = atomic_load_relaxed(&vp->v_mount);
fli = fstrans_get_lwp_info(mp, /*alloc*/false);
if (fli == NULL) {
pserialize_read_exit(s);
/* XXX preallocate fli and retry with alloc=true */
}
if (fli->fli_trans_cnt > 0) {
fli->fli_trans_cnt++;
return mp;
}
fmi = fli->fli_mountinfo;
if (__predict_true(grant_lock(fmi, ...))) {
...
}
...
}
And fstrans_mount_dtor can do pserialize_perform (or just xc_barrier)
to wait for all these pserialize read sections to complete before
freeing mp.
Then, any code that currently does
mp = vp->v_mount;
fstrans_start(mp);
or equivalent, can be rewritten to do:
mp = fstrans_start_vnode(vp);
Perhaps we can even forbid direct access to vp->v_mount to catch any
mistakes of this class -- and if there's a lot of access to it that
doesn't match this pattern, we could add a vnode_mount(vp) accessor to
assert that the caller holds an fstrans.
Thoughts?
Home |
Main Index |
Thread Index |
Old Index