NetBSD-Bugs archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: kern/42348: lookup protocol use in do_sys_rename() is wrong



On Mon Nov 23 2009 at 08:20:04 +0000, David Holland wrote:
> The following reply was made to PR kern/42348; it has been noted by GNATS.
> 
> From: David Holland <dholland-bugs%netbsd.org@localhost>
> To: gnats-bugs%NetBSD.org@localhost
> Cc: 
> Subject: Re: kern/42348: lookup protocol use in do_sys_rename() is wrong
> Date: Mon, 23 Nov 2009 08:18:27 +0000
> 
>  On Thu, Nov 19, 2009 at 06:05:00PM +0000, pooka%iki.fi@localhost wrote:
>   > do_sys_rename() induces the final VOP_LOOKUP() for a file server
>   > twice.  This causes confusion in file system drivers, since they
>   > cannot e.g. allocate a memory slot without it being overridden
>   > by the next lookup (alternative is some weird heuristics where the
>   > driver attempts to detect the pattern).
>   > 
>   > What rename should arguably do is VOP_ABORTOP() right after the first
>   > one.  However, this is difficult since then the "real" lookup gets
>   > done without SAVESTART.  Also, performing the first lookup
>   > without SAVESTART is difficult, since then HASBUF may not get
>   > set correctly, i.e. the pathname buffer is lost.
>   > >How-To-Repeat:
>  
>  i've looked at this twice now (and also we talked about related stuff
>  the other day) and I still don't understand either what the problem is
>  or how the solution is supposed to fix it.
>  
>  If your FS wants to retain state from its VOP_LOOKUP for later use,
>  the most sensible way to do it is to store it in the directory vnode.
>  E.g. if you wanted to cache the offset of an unused directory entry,
>  you'd set it on every trip through lookup that saw one, and clear it
>  upon using it.
>  
>  This method doesn't even require VOP_ABORTOP for maintenance, although
>  I can see that if the state is large or expensive it might be
>  desirable to destroy it right away if it isn't going to be used,
>  instead of hanging around indefinitely until the next op on the same
>  directory.
>  
>  If on the other hand you want to retain data that is specifically
>  per-lookup, so the two lookups in rename should be logically separate,
>  then storing it in the vnode is fundamentally unworkable. The two
>  lookups in rename overlap and would break this; but also, so would any
>  two (or more) lookups happening concurrently in two (or more)
>  processes. Such data needs to be either returned through namei and
>  send back to the FS later, or stashed in curlwp, or something else.

This is not a discussion of the directions VOP_ABORTOP should or should
not take in the future.  It merely addresses the current insufficient
use in rename.  It's not really all that complicated: use abortop if
you do not call the vnodeop corresponding to your create/delete lookup.
Not using it correctly prevents relying on currently valid invariants.

>  Meanwhile, if you VOP_ABORTOP the results of one of the namei
>  operations right away, as in the suggested patch:
>  
>   :         NDINIT(&fromnd, DELETE, LOCKPARENT | SAVESTART | TRYEMULROOT,
>   :             seg, from);
>   :         if ((error = namei(&fromnd)) != 0)
>   :                 return (error);
>   :  +      VOP_ABORTOP(fromnd.ni_dvp, &fromnd.ni_cnd);
>  
>  it'll result in free twice later on, when the rename operation on
>  fromnd.ni_dvp either succeeds or ends up calling VOP_ABORTOP again.

free twice what exactly?


Home | Main Index | Thread Index | Old Index