tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: makesyscalls (moving forward)



On Mon, Jun 15, 2020 at 10:56:04PM +0000, David Holland wrote:
 > A less glib example: line 3186 of vfs_syscalls.c, in stat or more
 > precisely sys___stat50, has a handwritten call to copyout() to
 > transfer the stat results back to userspace.

To amplify:

Currently syscalls.master says:

439  STD  RUMP  { int|sys|50|stat(const char *path, struct stat *ub); }

Currently that generates the following objects related to handling
system calls:

   ------

struct sys___stat50_args {
        syscallarg(const char *) path;
        syscallarg(struct stat *) ub;
};

struct sysent sysent[] = {
        ...
{
        .sy_narg = sizeof(struct sys___stat50_args) / sizeof(register_t),
        .sy_argsize = sizeof(struct sys___stat50_args),
        .sy_flags = SYCALL_ARG_PTR,
        .sy_call = (sy_call_t *)sys___stat50
},              /* 439 = __stat50 */
        ...

   ------

and everything behond that is handwritten. The current structure of
the handwritten code is:

   sys___stat50 -> do_sys_statat -> namei and vn_stat

where sys___lstat50, sys_fstatat (recall that POSIX misnamed it, it's
what you'd expect to be called "statat") and several compat entry
points via do_sys_stat share do_sys_statat.

sys___stat50, sys___lstat50, sys_fstatat, and the various compat entry
points handle the copyout() for the stat buffer. The copyin() for the
path happens in do_sys_statat.

In a world where all this is generated, the entry point in
vfs_syscalls.c is do_sys_statat, and it receives only kernel
pointers. The system call table points to an autogenerated
sys___stat50 that looks something like this:

int
sys___stat50(struct lwp *l,
	const struct sys___stat50_args *uap,
	/*
		syscallarg(const char *) path;
		syscallarg(struct stat *) ub;
        */
	register_t *retval)
{
	struct pathbuf *pb;
	struct stat sb;
	int error;

	error = pathbuf_copyin(SCARG(uap, path), &pb);
	if (error) {
		return error;
	}

	error = do_sys_statat(l, AT_FDCWD, pb, FOLLOW, &sb);
	if (error) {
		pathbuf_destroy(pb);
		return error;
	}

	error = copyout(&sb, SCARG(uap, ub), sizeof(sb));

	pathbuf_destroy(pb);
	return error;
}

sys___lstat50 would be the same except that it passes NOFOLLOW, and so
would the various compat entry points except that they'd also contain
a call to translate the output stat structure before copying it out.

One disadvantage is that there's now one copy of the pathbuf_copyin
call for each entry point instead of a single shared one; but (a) I'm
not convinced this is important and (b) it could be avoided in the
code generator if we really wanted to.

However, there are a number of advantages:
   - userspace pointers are not exposed to or handled by handwritten
     code and the chances of mishandling them (which is a security
     issue) decreases sharply;
   - the code generator is far less likely to produce wrong or missing
     error path code;
   - once this is all in place, nobody needs to think about this code
     again.

The current syscalls.master can't express everything needed to do
this; I would expect the declaration to look something like

   err statat(fd fd, pathname path, struct stat *OUT ub, atflags flags);
   err stat(pathname path, struct stat *OUT ub) =
      statat(AT_FDCWD, path, ub, 0);

then for the table entry, something like

   439 nb50 rump stat

not to mention

   38 compat43 modular stat
   188 compat12 modular stat
   278 compat30 nb13 modular stat
   387 compat50 nb30 modular rump stat

and in the compat_ultrix table,

   38 ultrix modular stat

as there have been quite a few versions of stat over the years.

Note that it distinguishes "err" and the particular type of flags from
"int", knows what a pathname is, etc., all of which is necessary or at
least helpful for generating the code one wants in various
circumstances.

There are probably some wrinkles with the compat declarations that I'm
not on top of yet, but it'll become clear later. (This has to be done
by incrementally replacing handwritten with generated code, or it'll
never work properly; the system and the interactions of all the compat
stuff is too complicated.)

-- 
David A. Holland
dholland%netbsd.org@localhost


Home | Main Index | Thread Index | Old Index