On Dec 31, 2013, at 11:34 AM, J. Hannken-Illjes
<hannken%eis.cs.tu-bs.de@localhost> wrote:
> On Dec 30, 2013, at 10:58 PM, David Holland
> <dholland-tech%netbsd.org@localhost> wrote:
>
>> On Mon, Dec 30, 2013 at 11:35:48AM +0100, J. Hannken-Illjes wrote:
>>> The layered file systems hashlists currently have to work on locked
>>> vnodes as the vnode operations returning vnodes return them locked.
>>>
>>> This leads to some very dirty hacks -- see layer_node_find() from
>>> sys/miscfs/genfs/layer_subr.c for example.
>>>
>>> The best solution is to change these vnode operations to return
>>> unlocked (and referenced) vnodes.
>>>
>>> The attached diff implements this change for operations create, mknod,
>>> mkdir and symlink. It passes the atf test suite and some file system
>>> stress tests I use here.
>>>
>>> Comments or objections anyone?
>>
>> My first thought is that I really don't think it's a good idea to
>> change important semantic properties (particularly locking) of calls
>> from what's effectively a public API without changing the signatures
>> too so old code won't compile.
>
> Ok. Will add an API version argument to vnode_if.src so
>
> vop_mkdir {
> + VERSION 1
> IN LOCKED=YES WILLPUT struct vnode *dvp;
>
> will rename vop_mkdir_desc to vop_mkdir1_desc and then change all
> file systems to use the new desc.
>
>> Also, since what you're doing is basically unlocking before return in
>> every implementation, and then relocking after return at every call
>> site, it's possible to do one op at a time; it seems like that would
>> be a good idea in case unexpected problems crop up.
>
> This is not what I'm doing.
>
> Many file systems use one internal function to create a new inode and
> where possible I change this function to return the node unlocked.
>
> Relocking is done only for nfs server, vn_open() and temporarily for
> layered file systems. The operations from vfs_syscalls.c all change
> vput() to vrele().
>
>> Unlocking and relocking should not itself be a problem since AFAIK
>> there isn't any reason for those locks to be held in the first place;
>> however, I think I'd better go review all the call sites just in case.
>>
>> Plus I think it would be advisable to make this change first, so that
>> in case we do find a site where it matters we can lock the returned
>> object before anyone else can see it:
>>
>> #
>> -#% mkdir dvp L U U
>> +#% mkdir dvp L L L
>
> Where should it matter? VOP_MKDIR is only used from:
>
> external/cddl/osnet/dist/uts/common/fs/zfs/zfs_replay.c:zfs_replay_create()
> sys/kern/vfs_syscalls.c:do_sys_mkdirat()
> sys/nfs/nfs_serv.c:nfsrv_mkdir()
> sys/fs/union/union_vnops.c:union_mkdir()
> sys/fs/union/union_subr.c:union_mkshadow()
> sys/rump/librump/rumpvfs/rumpvnode_if.c:RUMP_VOP_MKDIR()
>
> and I don't see any problems here.
Put an updated diff to http://www.netbsd.org/~hannken/vnode-pass2a-1.diff
This diff adds API version, diffs to vnode_if.{sh,src} attached.
Further comments or objections anyone?
--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig
(Germany)
Attachment:
vnode-pass2a-new.diff
Description: Binary data