Subject: Re: kern/30831
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: Bill Stouder-Studenmund <wrstuden@netbsd.org>
List: netbsd-bugs
Date: 04/02/2007 21:05:03
The following reply was made to PR kern/30831; it has been noted by GNATS.

From: Bill Stouder-Studenmund <wrstuden@netbsd.org>
To: Antti Kantee <pooka@cs.hut.fi>
Cc: gnats-bugs@NetBSD.org, deadbug@gmail.com, wrstuden@netbsd.org,
	chs@netbsd.org
Subject: Re: kern/30831
Date: Mon, 2 Apr 2007 14:02:53 -0700

 --dDRMvlgZJXvWKvBx
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline
 Content-Transfer-Encoding: quoted-printable
 
 On Mon, Apr 02, 2007 at 11:07:13PM +0300, Antti Kantee wrote:
 > Ok, here's my theory.
 >=20
 > On Mon Apr 02 2007 at 15:10:07 +0000, Patrick Welche wrote:
 > >  (gdb) frame 8
 > >  #8  0xc0216ec2 in smbfs_sync (mp=3D0xc14ca000, waitfor=3D3, cred=3D0xc=
 ad40ee0,=20
 > >      l=3D0xcad4f7a0) at ../../../../fs/smbfs/smbfs_vfsops.c:460
 > >  460                     if ((vp->v_type =3D=3D VNON || (np->n_flag & N=
 MODIFIED) =3D=3D 0) &&
 > >  (gdb) print *vp
 > >  $1 =3D {v_uobj =3D {vmobjlock =3D {lock_data =3D 1 '\001', lock_pad =
 =3D "\000\000",=20
 > >        lock_file =3D 0xc056aa8c "../../../../fs/smbfs/smbfs_vfsops.c",=
 =20
 > >        unlock_file =3D 0xc0577f7c "../../../../miscfs/genfs/genfs_vnops=
 .c",=20
 > >        lock_line =3D 457, unlock_line =3D 1081, list =3D {tqe_next =3D =
 0x0,=20
 > >          tqe_prev =3D 0xc05a29d0}, lock_holder =3D 0}, pgops =3D 0xc059=
 e9ec, memq =3D {
 > >        tqh_first =3D 0x0, tqh_last =3D 0xcf110988}, uo_npages =3D 0, uo=
 _refs =3D 0},=20
 > >    v_size =3D 0, v_flag =3D 256, v_numoutput =3D 0, v_writecount =3D 0,=
  v_holdcnt =3D 0,=20
 > v_flag =3D VXLOCK
 >=20
 > >    v_mount =3D 0xc14ca000, v_op =3D 0xc118a400, v_freelist =3D {
 > >      tqe_next =3D 0xcd5a9ae4, tqe_prev =3D 0xdeadb}, v_mntvnodes =3D {
 >=20
 > note tqe_prev =3D 0xdeadb
 >=20
 > >    v_tag =3D VT_SMBFS, v_lock =3D {lk_interlock =3D {lock_data =3D 0 '\=
 0',=20
 >=20
 > v_tag =3D VT_SMBFS
 >=20
 > >      lk_lock_file =3D 0xc0577f7c "../../../../miscfs/genfs/genfs_vnops.=
 c",=20
 > >      lk_unlock_file =3D 0xc0577f7c "../../../../miscfs/genfs/genfs_vnop=
 s.c",=20
 > >      lk_lock_line =3D 298, lk_unlock_line =3D 314}, v_vnlock =3D 0xcf11=
 09f0,=20
 > >    v_data =3D 0x0, v_klist =3D {slh_first =3D 0x0}}
 >=20
 > v_data =3D NULL
 >=20
 > So, this vnode is clearly being recycled.  Now, I can't figure out
 > anything that would protect a vnode between vfs_sync (just takes the
 > interlock) and VOP_RECLAIM (called unlocked).  And it seems like
 > VOP_RECLAIM in smbfs can block *after* setting v_data to NULL and
 > otherwise nuking out the vnode:
 >=20
 >         vp->v_data =3D NULL;
 >         smbfs_hash_unlock(smp);
 >         if (np->n_name)
 >                 smbfs_name_free(np->n_name);
 >         pool_put(&smbfs_node_pool, np);
 >         if (dvp) {
 >                 vrele(dvp);
 >                 /*
 >                  * Indicate that we released something; see comment
 >                  * in smbfs_unmount().
 >                  */
 >                 smp->sm_didrele =3D 1;
 >         }
 >         return 0;
 >=20
 > vrele() calls VOP_INACTIVE which calls all kinds of nasty stuff.  So,
 > my theory is that the vnode is being reclaimed, sleeping on the final
 > leg, and in comes sync(), which goes through all the vnodes and meets
 > the one which has been half-reclaimed now.  Other file systems get
 > lucky because they don't sleep in reclaim.  No biglock is going to
 > be yummy.
 >=20
 > Bill, Chuck: This sounds a bit scary, but does it make any sense at all
 > or did I miss something obvious?
 
 No, I think you got it right.
 
 Yes, w/o big lock, things may be painful.
 
 I think we need to work on how we handle vnode recycling. I think vnode=20
 death is a more-painful issue than the code assumes. Well, more painful=20
 than when the code we have was written.
 
 I think the immediate thing is to have smbfs_sync() cope with np =3D=3D NUL=
 L=20
 vnodes. They don't need syncing. Or unhook the vnode from the mount point=
 =20
 sooner, but I don't think we can do that as the file system doesn't manage=
 =20
 that list. :-)
 
 So v_data =3D=3D NULL would probably be the test for now.
 
 Take care,
 
 Bill
 
 --dDRMvlgZJXvWKvBx
 Content-Type: application/pgp-signature
 Content-Disposition: inline
 
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.3 (NetBSD)
 
 iD8DBQFGEW99Wz+3JHUci9cRAunuAJ4hCfO20QVPgvw3RUB6cfcnhmoOhACeJhYW
 ddL6IaA7PZljS4uAuliEmEM=
 =A96P
 -----END PGP SIGNATURE-----
 
 --dDRMvlgZJXvWKvBx--