On Jan 7, 2014, at 5:29 AM, David Holland <dholland-tech%netbsd.org@localhost>
wrote:
> On Tue, Dec 31, 2013 at 11:34:27AM +0100, J. Hannken-Illjes wrote:
>>>> The layered file systems hashlists currently have to work on locked
>>>> vnodes as the vnode operations returning vnodes return them locked.
>>>>
>>>> This leads to some very dirty hacks -- see layer_node_find() from
>>>> sys/miscfs/genfs/layer_subr.c for example.
>>>>
>>>> The best solution is to change these vnode operations to return
>>>> unlocked (and referenced) vnodes.
>>>>
>>>> The attached diff implements this change for operations create, mknod,
>>>> mkdir and symlink. It passes the atf test suite and some file system
>>>> stress tests I use here.
>>>>
>>>> Comments or objections anyone?
>>>
>>> My first thought is that I really don't think it's a good idea to
>>> change important semantic properties (particularly locking) of calls
>>> from what's effectively a public API without changing the signatures
>>> too so old code won't compile.
>>
>> Ok. Will add an API version argument to vnode_if.src so
>>
>> vop_mkdir {
>> + VERSION 1
>> IN LOCKED=YES WILLPUT struct vnode *dvp;
>>
>> will rename vop_mkdir_desc to vop_mkdir1_desc and then change all
>> file systems to use the new desc.
>
> While on the whole I suppose this is probably a good thing, it feels
> inelegant. Also, the ops table is typically a long way from the code
> that actually needs to be changed.
>
> Wouldn't it be better to attach the version number to the name of the
> args structure? Then it will fail to compile in the VOP function that
> needs attention. That is,
>
> int
> ufs_mkdir(void *v)
> {
> struct vop_mkdir_args /* {
> struct vnode *a_dvp;
> struct vnode **a_vpp;
> struct componentname *a_cnp;
> struct vattr *a_vap;
> } */ *ap = v;
> :
> :
> }
>
> would need to be changed to e.g.
>
> int
> ufs_mkdir(void *v)
> {
> struct vop_mkdir_args_v2 /* {
> struct vnode *a_dvp;
> struct vnode **a_vpp;
> struct componentname *a_cnp;
> struct vattr *a_vap;
> } */ *ap = v;
> :
> :
> }
>
> (also, while this is minor I think I'd rather have vop_mkdir_args_v2
> and/or vop_mkdir_desc_v2 rather than vop_mkdir2_args and
> vop_mkdir2_desc. The version doesn't need to be part of the operation
> name.)
Looks very good -- changed vnode_if.sh to create xxx_v2_args. Diff attached.
>>> Also, since what you're doing is basically unlocking before return in
>>> every implementation, and then relocking after return at every call
>>> site, it's possible to do one op at a time; it seems like that would
>>> be a good idea in case unexpected problems crop up.
>>
>> This is not what I'm doing.
>
> It still seems like a good idea to do one op at a time, or at least do
> one op first.
This is missing any arguments for the second time and you ignored my
arguments after "This is not what I'm doing.".
>>> Unlocking and relocking should not itself be a problem since AFAIK
>>> there isn't any reason for those locks to be held in the first place;
>>> however, I think I'd better go review all the call sites just in case.
>>>
>>> Plus I think it would be advisable to make this change first, so that
>>> in case we do find a site where it matters we can lock the returned
>>> object before anyone else can see it:
>>>
>>> #
>>> -#% mkdir dvp L U U
>>> +#% mkdir dvp L L L
>>
>> Where should it matter? VOP_MKDIR is only used from:
>>
>> external/cddl/osnet/dist/uts/common/fs/zfs/zfs_replay.c:zfs_replay_create()
>> sys/kern/vfs_syscalls.c:do_sys_mkdirat()
>> sys/nfs/nfs_serv.c:nfsrv_mkdir()
>> sys/fs/union/union_vnops.c:union_mkdir()
>> sys/fs/union/union_subr.c:union_mkshadow()
>> sys/rump/librump/rumpvfs/rumpvnode_if.c:RUMP_VOP_MKDIR()
>>
>> and I don't see any problems here.
>
> I have been putting off answering this because I ought to examine all
> these, and I haven't managed to do so yet but figured I ought to
> answer the rest.
An updated diff at http://www.netbsd.org/~hannken/vnode-pass2a-2.diff
Diffs to vnode_if.{sh,src} attached.
--
J. Hannken-Illjes - hannken%eis.cs.tu-bs.de@localhost - TU Braunschweig
(Germany)
Attachment:
vnode-pass2a-new.diff
Description: Binary data