Source-Changes-D archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: CVS commit: src/sys/kern



On 2022-10-01 3:39 pm, Robert Elz wrote:
Date:        Sat, 1 Oct 2022 13:00:04 -0400
[stuff deleted]

Even when it is called, the code is:

	fp->f_flag = flags & FMASK;

where FMASK is (from fcntl.h)

#define FMASK           (FREAD|FWRITE|FCNTLFLAGS|FEXEC)

and

#define FCNTLFLAGS (FAPPEND|FASYNC|FFSYNC|FNONBLOCK|FDSYNC|FRSYNC|FALTIO|\
                         FDIRECT|FNOSIGPIPE)

which all looks exactly as it should be to me - and note that O_CLOEXEC
(which has no Fxxxx equivalent name - it doesn't need one) is not there.

So, fp->f_flag isn't set at all (is probably 0), and even if it were
O_CLOEXEC would not be there, and should not be.

Thanks for pointing that out, I always forget the O_CLOEXEC is special
in that regard. I wish it was not, but it is difficult to fix.


For the vnode actually being opened, none of this matters, as the open
lasts as long (actually not as long) as the open() sys call - when that
returns, the device being opened has been closed again, so what the
file that refers to it looks like (or would have looked like) really
doesn't matter at all, it never becomes visible.   That's my guess as
to why the open_setfp() call is missing in that case.

But what got forgotten (or deliberately was not done) was anything to
affect the modes of the clone which is opened.

  | What does it mean when the open specifies O_CLOEXEC
  | and ff->ff_exclose is false? Can that happen? Is that desirable?

It is what is currently happening whenever we open a cloning device.
(Or that is what it looks like to me).   Desirable, no idea, I didn't
define the semantics of cloning device opens, nor which of the open
flags should apply to the clone that is created.

| I am fine with the locking to stay where it is. I guess it is probably
  | not needed after dup/clone, since the underlying vnode is shared...

Assuming you mean dup(2) (and dup2()) and clone(2) then no - those have
no way to pass the relevant locking flags, if you have a fd and want to
apply a lock, fcntl() is what does that (and the fcntl operations that
duplicate fds do not also apply locks).

The only issue is O_EXLOCK and O_SHLOCK (and O_CLOEXEC) for cloned devices.

I suspect it makes sense for O_CLOEXEC to be applied in that case, it
makes little sense for open("/dev/ptmx", O_RDWR|O_CLOEXEC) to succeed,
returning an open fd which does not have cloexec set (which is the issue,
along with O_NONBLOCK) which started all of this discussion.

The locking flags I am less sure about.   I don't see how they can fail
to succeed if applied, as the vnode for the device has just been created, nothing else can possibly have any kind of lock on it. Whether there's
any benefit in applying the lock so that the node is locked for later,
I don't know - but it certainly should do no harm to do that.

It seems clear to me that what we need is (something like)

Index: vfs_syscalls.c
===================================================================
RCS file: /cvsroot/src/sys/kern/vfs_syscalls.c,v
retrieving revision 1.555
diff -u -r1.555 vfs_syscalls.c
--- vfs_syscalls.c	12 Feb 2022 15:51:29 -0000	1.555
+++ vfs_syscalls.c	1 Oct 2022 19:27:15 -0000
@@ -1763,6 +1763,9 @@
 		error = fd_dupopen(dupfd, dupfd_move, flags, &indx);
 		if (error)
 			return error;
+		error = open_setfp(l, fp, XXXvp, indx, flags);
+		if (error)
+			return error;
 		*fd = indx;
 	} else {
 		error = open_setfp(l, fp, vp, indx, flags);


where XXXvp needs to be extracted from somewhere (it isn't vp, as vp==NULL)
except that what follows in the else case is ...

                if (error)
                        return error;
                VOP_UNLOCK(vp);
                *fd = indx;


That VOP_UNLOCK(vp) is what is bothering me, It tells me that open_setfp() is expecting to be called with vp locked - but in the first case (the cloning case) there is no VOP_UNLOCK() call (and what's more, fd_dupopen() cannot do it, as it has no vp arg). That means, I believe, that when vn_open() returns in the normal case, vp is returned locked, but in the cloning case
the vnode that was created for the clone is not locked.

I'm not sure what is the right way to find the vnode, or how it should
properly be locked so open_setfp() can do its thing.   If I knew all of
that I would have made an attempt at fixing this already.   We need
someone who really understands what is happening here, and the right
way to handle it all (which very likely is nothing like I just suggested).

The question is how to find the vnode? Perhaps it is easiest to fail the
open call if O_EXLOCK or O_SHLOCK are specified in a cloning open? At least
we will not silently ignore them?

Best,

christos


Home | Main Index | Thread Index | Old Index