tech-userlevel archive

Re: eventfd(2) and timerfd(2) APIs

> On Sep 18, 2021, at 3:35 PM, Robert Elz <kre%munnari.OZ.AU@localhost> wrote:
>    Date:        Sat, 18 Sep 2021 13:21:27 -0700
>    From:        Jason Thorpe <>
>    Message-ID:  <>
>  | >  unless the
>  | >  .Nm
>  | >  object was created with
>  | >  .Dv TFD_NONBLOCK .
>  |
>  | I'm using those names, because those are the names used in the Linux API.
> It wasn't the names I was concerned about.
>  | If you look at the code (it's on the thorpej-futex branch),
>  | you will see that they are aliases for O_NONBLOCK and O_CLOEXEC.
> That was kind of obvious anyway from the man page:
>  The following flags define the behavior of the resulting object:
>  .Bl -tag -width "EFD_SEMAPHORE"
>  Sets the
>  flag; see
>  .Xr open 2
>  for more information.
> So:
>  | I will clarify this in the man page.
> probably isn't really necessary.   I was more concerned with the
> "unless the object was created with" - implying that if those flags
> are changed later, that would be irrelevant, as it is the state at
> create time that matters.   That would be unfortunate indeed, but:

I’ve changed the man pages to state “set for non-blocking I/O”.

>  | Actually, I didn't plumb fcntl through because just about nothing
> might explain part of that (though you can't avoid the ability to
> alter O_CLOEXEC that way, as that's a much higher level operation).

>  | else plumbs it through either, but I'll go ahead and do so.
> Please do.   What other things don't permit fcntl() to work?   We
> should fix any of those.

Well, I’ll fix these 2 anyway.
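As a rough illustration of why plumbing fcntl(2) through matters: once F_SETFL works, a descriptor created blocking can be switched to non-blocking later, and it is the current state (not the creation flags) that decides whether read(2) blocks. The sketch below assumes fcntl(F_SETFL) is supported on eventfd descriptors; set_nonblock() and read_empty_eventfd_errno() are hypothetical helper names, not part of any API.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/eventfd.h>

/* Hypothetical helper: set or clear O_NONBLOCK on an open descriptor. */
static int
set_nonblock(int fd, int on)
{
	int fl;

	assert(fd >= 0);
	fl = fcntl(fd, F_GETFL);
	if (fl == -1)
		return -1;
	fl = on ? (fl | O_NONBLOCK) : (fl & ~O_NONBLOCK);
	return fcntl(fd, F_SETFL, fl);
}

/*
 * Hypothetical helper: create an eventfd WITHOUT EFD_NONBLOCK, switch
 * it to non-blocking afterwards, then read while the counter is zero.
 * Returns the errno from that read, or 0 if the read succeeded.
 */
static int
read_empty_eventfd_errno(void)
{
	uint64_t val;
	ssize_t n;
	int fd, error;

	fd = eventfd(0, 0);		/* created blocking */
	if (fd == -1)
		return errno;
	if (set_nonblock(fd, 1) == -1) {
		error = errno;
		close(fd);
		return error;
	}
	n = read(fd, &val, sizeof(val));
	error = (n == -1) ? errno : 0;
	close(fd);
	return error;
}
```

If the flag change takes effect, the read should fail with EAGAIN rather than block, even though the object was not created with EFD_NONBLOCK.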

>  | The behavior of timerfd with respect to read is documented in my man page:
> Yes, I saw that.
>  | Writes to a timerfd return an error.  I will clarify this in the man page.
> That would be useful.   You might want to also indicate how these
> descriptors are destroyed (I assume just close(2) but who knows).

Yes, they’re file descriptors, so close(2) gets rid of them.  Does this really need to be stated explicitly?

>  | > Finally, what does fstat() return about these fds?
> The one I should have asked about, but forgot, was (st_mode & _S_IFMT)
> I.e.: what kind of object are these things pretending to be?

static int
timerfd_fop_stat(file_t * const fp, struct stat * const st)
{
        struct timerfd * const tfd = fp->f_timerfd;

        memset(st, 0, sizeof(*st));
        st->st_size = (off_t)timerfd_fire_count(tfd);
        st->st_atimespec = tfd->tfd_atime;
        st->st_mtimespec = tfd->tfd_mtime;

        st->st_blksize = sizeof(uint64_t);
        st->st_mode = S_IFIFO | S_IRUSR | S_IWUSR;
        st->st_blocks = 1;
        st->st_birthtimespec = st->st_ctimespec = tfd->tfd_btime;
        st->st_uid = kauth_cred_geteuid(fp->f_cred);
        st->st_gid = kauth_cred_getegid(fp->f_cred);
        return 0;
}

eventfd is similar.
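From userland, a process that has inherited or received an unknown descriptor would typically inspect the type bits with fstat(2). A minimal sketch, with fd_is_fifo() as a hypothetical helper name: given the st_mode set in the stat routine above, a timerfd would answer "FIFO" here, exactly as a pipe does, so this check alone cannot tell the two apart.

```c
#include <assert.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether fstat(2) describes fd as a FIFO.
 * Returns 1 if (st_mode & S_IFMT) == S_IFIFO, 0 for any other type,
 * and -1 if fstat(2) itself fails (for example, on a closed fd).
 */
static int
fd_is_fifo(int fd)
{
	struct stat st;

	if (fstat(fd, &st) == -1)
		return -1;
	return S_ISFIFO(st.st_mode) ? 1 : 0;
}
```

A caller that needs to distinguish a timerfd from an ordinary pipe would have to rely on something beyond the type bits, which is the gap the question is pointing at.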

> Since they're fd's, they can be inherited, open, by other processes
> (and since the man page hints at it, probably sent through an AF_UNIX
> socket), but particularly in the former case, the receiving process
> needs to know (or at least be able to find out) what it is that is on
> this fd it has received.
>  | Of course, we don't document what these are for other kinds of descriptors,
> for many there's no need, as everything is exactly what stat(2) claims
> it will be.   For any where that is not true, or is insufficient, we
> should be documenting it.

There are, of course, not enough _S_IFMT bits to describe the possible combinations.

> If this was just a linux compat hack, so linux binaries could run,
> then most of this wouldn't matter - the application would do whatever
> linux allows it to do, and nothing actually built on NetBSD would
> ever care.
> But if these are to be full NetBSD interfaces, they need to be
> both complete (and sane) and properly documented.   That means
> which of the f*() interfaces (fstat, fchmod, fchown, ...) work,

Actually, fchmod(), fchown(), etc. only work on DTYPE_VNODE descriptors.  You’ll get EBADF if you try it on anything else (look for any place that calls fd_getvnode()).

-- thorpej
