Subject: Re: VOP locking issue
To: Antti Kantee <pooka@cs.hut.fi>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: tech-kern
Date: 11/14/2006 00:01:07
--ew6BAiZeqk4r7MaW
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Tue, Nov 14, 2006 at 12:28:58AM +0200, Antti Kantee wrote:
> > Thanks. But what does lock the vnode in my case then ? Or would VOP_UNLOCK()
> > not complain if it's called on an unlocked vnode ? Note that the above
> > code works most of the time; I've got only one report of failure.
>
> The complement of the theory: checkalias() in bdevvp() does not match
> and you get a VT_NON/specfs vnode, for which locking operations are nops
> and VOP_UNLOCK() doesn't do lockmgr at all.
>
> If checkalias() does match, someone else using that vnode might have the
> lock and you are trying to release it on that someone's behalf, hence
> you trigger "if (WEHOLDIT(lkp, pid, lid, cpu_num) == 0)" in lockmgr.
> I think the other possibility would be to have a panic for releasing an
> unlocked vnode if timing is different.
OK, thanks for the explainations. As you're familiar with this could you
tell me if the attached patch looks good ? Especially I'm not sure
about the vrele(), I don't know if bdevvp() will vref() the vnode (I took
this from dkwedge/dk.c).
BTW, my code never calls vput() or vrele() so I don't know what
decrements v_usecount for normal exit (vn_close() maybe?)
>
> Getting vnode locking wrong, especially for corner cases and
> error-branches, seems all too common. This IMHO speaks more about vnode
> locking than the people using it.
I won't say it's not hard to get it :)
BTW, these files:
dev/ata/ata_raid_adaptec.c
dev/ata/ata_raid_promise.c
dev/ata/ld_ataraid.c
dev/raidframe/rf_netbsdkintf.c
seems to not do proper vn_lock() before VOP_OPEN().
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--ew6BAiZeqk4r7MaW
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff
Index: xbdback.c
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/xen/xbdback.c,v
retrieving revision 1.21
diff -u -1 -0 -r1.21 xbdback.c
--- xbdback.c 15 May 2006 09:22:34 -0000 1.21
+++ xbdback.c 13 Nov 2006 22:54:34 -0000
@@ -540,20 +540,29 @@
goto end;
}
error = bdevvp(vbd->dev, &vbd->vp);
if (error) {
printf("xbdback VBD grow domain %d: can't open "
"device 0x%x (error %d)\n", xbdi->domid,
req->extent.device, error);
req->status = BLKIF_BE_STATUS_EXTENT_NOT_FOUND;
goto end;
}
+ error = vn_lock(vbd->vp, LK_EXCLUSIVE | LK_RETRY);
+ if (error) {
+ printf("xbdback VBD grow domain %d: can't lock "
+ "device 0x%x (error %d)\n", xbdi->domid,
+ req->extent.device, error);
+ vrele(vbd->vp);
+ req->status = BLKIF_BE_STATUS_EXTENT_NOT_FOUND;
+ goto end;
+ }
error = VOP_OPEN(vbd->vp, FREAD, NOCRED, 0);
if (error) {
printf("xbdback VBD grow domain %d: can't open2 "
"device 0x%x (error %d)\n", xbdi->domid,
req->extent.device, error);
vput(vbd->vp);
req->status = BLKIF_BE_STATUS_EXTENT_NOT_FOUND;
goto end;
}
VOP_UNLOCK(vbd->vp, 0);
Index: xbdback_xenbus.c
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/xen/xbdback_xenbus.c,v
retrieving revision 1.2
diff -u -1 -0 -r1.2 xbdback_xenbus.c
--- xbdback_xenbus.c 15 Oct 2006 21:34:48 -0000 1.2
+++ xbdback_xenbus.c 13 Nov 2006 22:54:34 -0000
@@ -586,20 +586,27 @@
printf("xbdback %s: no bdevsw for device 0x%x\n",
xbusd->xbusd_path, xbdi->xbdi_dev);
return;
}
err = bdevvp(xbdi->xbdi_dev, &xbdi->xbdi_vp);
if (err) {
printf("xbdback %s: can't open device 0x%x: %d\n",
xbusd->xbusd_path, xbdi->xbdi_dev, err);
return;
}
+ err = vn_lock(vbd->vp, LK_EXCLUSIVE | LK_RETRY);
+ if (err) {
+ printf("xbdback %s: can't vn_lock device 0x%x: %d\n",
+ xbusd->xbusd_path, xbdi->xbdi_dev, err);
+ vrele(vbd->vp);
+ return;
+ }
err = VOP_OPEN(xbdi->xbdi_vp, FREAD, NOCRED, 0);
if (err) {
printf("xbdback %s: can't VOP_OPEN device 0x%x: %d\n",
xbusd->xbusd_path, xbdi->xbdi_dev, err);
vput(xbdi->xbdi_vp);
return;
}
VOP_UNLOCK(xbdi->xbdi_vp, 0);
err = VOP_IOCTL(xbdi->xbdi_vp, DIOCGPART, &dpart, FREAD, 0, NULL);
if (err) {
--ew6BAiZeqk4r7MaW--