Subject: Re: RFC: client-side NFS locking
To: None <efnbl06@bn2.maus.net>
From: YAMAMOTO Takashi <yamt@mwd.biglobe.ne.jp>
List: tech-kern
Date: 09/16/2006 20:30:55
hi,
> I desperately need NFS client-side locking on our (math.uni-bonn.de)
> mail servers. I was so happy they're running NetBSD now, but still no
> NFS locking in 3.0 or current. After a plea to ws@ to port what's in
> FreeBSD failed with ENOTIME, I spent most of a weekend reading a thick
> book with a red cover featuring a funny creature carrying a fork, phoned
> ws umpteen times the following weeks and wrote something you can find at
> http://www.math.uni-bonn.de/people/ef/nfslock.tar.gz. The tarball contains
> a file full of patches and several new files. I effectively split
> lockd_lock.c into lock_server.c and lock_common.c, added some lines to them
> and a new lock_client.c---plus a kernel part of course. It's all quite
> different from what BSDI/FreeBSD does.
thanks for taking a look at this, and sorry for the very late reply.
> I'd prefer to do it event-driven, but how do you call
> RPCs event-driven?
no way with our rpc library, afaik.
> In vnode.h, why doesn't VOPARG_OFFSET use offsetof()?
i don't think there is a fundamental reason. i guess it's merely historical.
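for illustration, a minimal sketch comparing a hand-rolled offset macro with offsetof(). the struct and the OLD_OFFSETOF name are made up for this example and are not copied from vnode.h; only offsetof() is standard C.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical argument structure, only for illustration */
struct example_args {
	void *a_desc;
	void *a_vp;
	int a_flags;
};

/*
 * hand-rolled offset computation in the traditional style.
 * formally this relies on a null pointer trick; offsetof() is
 * the portable way to say the same thing.
 */
#define OLD_OFFSETOF(type, member) ((size_t)&((type *)0)->member)

/* offset of a_vp computed the old way */
size_t
old_style_offset(void)
{
	return OLD_OFFSETOF(struct example_args, a_vp);
}

/* offset of a_vp computed with the standard macro */
size_t
std_offset(void)
{
	return offsetof(struct example_args, a_vp);
}
```

both functions should return the same value on any sane compiler, which is why the difference looks historical rather than functional.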
> In vnodeops(9), under VOP_ADVLOCK, I don't understand the wording after
> "The argument".
fixed.
> In kern_descrip.c, the code silently adds the current file position
> without changing SEEK_CUR to SEEK_SET. Moreover, this behaviour seems
> to be undocumented.
documented.
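to make the point concrete, a small sketch of the normalization the question seems to expect: add the current offset and also switch l_whence to SEEK_SET. normalize_whence() is a hypothetical helper for illustration, not the code in kern_descrip.c.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * hypothetical helper: turn a SEEK_CUR-relative lock request into an
 * absolute one.  cur_offset is the current file position, supplied
 * by the caller.  returns 0 on success, -1 for a whence value this
 * sketch does not handle (SEEK_END would need the file size).
 */
int
normalize_whence(struct flock *fl, off_t cur_offset)
{
	assert(fl != NULL);

	switch (fl->l_whence) {
	case SEEK_SET:
		return 0;		/* already absolute */
	case SEEK_CUR:
		fl->l_start += cur_offset;
		fl->l_whence = SEEK_SET;	/* keep whence and start consistent */
		return 0;
	default:
		return -1;
	}
}
```

with cur_offset 100 and l_start 10, the request ends up as SEEK_SET with l_start 110.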
> In the same file, in sys_flock(), lf.pid seems to be uninitialized.
> No clue whether that is problematic.
it shouldn't be a problem because l_pid member is output only.
ie. it's only used for F_GETLK.
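a userland sketch of the same idea, assuming plain POSIX fcntl(): the caller never fills in l_pid, and only reads it back after F_GETLK reports a conflicting lock. lock_holder() is a hypothetical name for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * hypothetical helper: return the pid of a process holding a lock
 * that would conflict with a whole-file write lock, 0 if there is
 * none, or -1 on error.
 */
pid_t
lock_holder(int fd)
{
	struct flock fl;

	assert(fd >= 0);
	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "to end of file" */
	/* l_pid is not an input; F_GETLK fills it in on conflict */

	if (fcntl(fd, F_GETLK, &fl) == -1)
		return -1;
	if (fl.l_type == F_UNLCK)
		return 0;	/* nobody holds a conflicting lock */
	return fl.l_pid;
}
```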
> In rpc.lockd/lockd_lock.c, a host was never unmonitored.
you are right. it lacks unmonitoring.
> In rpc.lockd/lock_proc.c, getclient(), the comment talks, err, writes
> about -udp- where the code uses -tp-.
i'm not sure what you mean here.
> I've read different opinions whether fcntl(..., F_SETLK, ...) should
> return EACCES or EAGAIN when it can't get the lock.
our local filesystems return EAGAIN. SUSv3 seems to allow both.
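since both values are allowed, portable callers usually check for either one. a minimal sketch, assuming plain POSIX fcntl(); try_write_lock() is a hypothetical name used only here.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * hypothetical helper: try to take a whole-file write lock without
 * blocking.  returns 1 if the lock was acquired, 0 if someone else
 * holds a conflicting lock, -1 on any other error.
 */
int
try_write_lock(int fd)
{
	struct flock fl;

	assert(fd >= 0);
	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;		/* 0 means "to end of file" */

	if (fcntl(fd, F_SETLK, &fl) == 0)
		return 1;
	/* treat both permitted errno values as "lock is busy" */
	if (errno == EACCES || errno == EAGAIN)
		return 0;
	return -1;
}
```

a caller written this way behaves the same whether the filesystem (local or nfs) picks EAGAIN or EACCES.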
> As I wrote above, I never used RPCs before, but I thought the whole point
> about the async versions like LOCK_MSG was that they returned immediately,
> i.e. as soon as the arguments have arrived. However, lock_proc.c calls the
> handling routine, sends the reply RPC (LOCK_MSG or alike) and only then
LOCK_RES, you mean?
> returns. I would argue that as long as my call to LOCK_MSG hasn't returned,
> I haven't made the call so there can be no LOCK_RES referring to the call
> I didn't yet issue. Moreover, I had to thread the client because of this.
do you mean that the current behaviour like the following:
LOCK_MSG request ->
<- LOCK_RES request
LOCK_RES reply ->
<- LOCK_MSG reply
should be:
LOCK_MSG request ->
<- LOCK_MSG reply
<- LOCK_RES request
LOCK_RES reply ->
?
i tend to agree. but it's better to deal with such servers anyway.
with our rpc library interface, i think it requires some kind of
threading. (well, instability of our pthread and thread-safeness of
our libraries might be problems, tho...)
> The NLM server always seems to fhopen() RDWR, meaning one can't lock
> files root can't write to.
a good point. i'm not sure how often it can be a problem actually, tho.
> Moreover, the server seems to be only able to handle one single lock per
> file, probably because keeping track of different processes locking
> different parts of the file isn't much fun. I'm not sure I want to
> rewrite this.
yes, it's a big problem.
> Maybe it would be easier if the kernel exposed an interface
> at the lf_xxx() level including some sort of callback if a lock becomes
> available?
it sounds reasonable.
to implement nlm server properly, we need the ability to specify
a remote lock owner.
(an alternative would be moving lockd into kernel. :)
> There is a typo in rpc(3) reading "rpc_reg structure" instead of
> "rpc_req" (that's in bold so / doesn't find it).
fixed. (it was svc_reg/req, not rpc_.)
> In rpc/rpc.h, why does clnt_call() cast the pointer to char * and
> clnt_freeargs() doesn't?
do you mean clnt_freeres? i'm not sure.
i always feel these prototypes should have been void *.
YAMAMOTO Takashi