Subject: New read & write syscalls
To: None <tech-kern@netbsd.org>
From: Bill Studenmund <wrstuden@nas.nasa.gov>
List: tech-kern
Date: 06/30/1999 11:16:34
In another part of the data migration system I'm working on, we need to be
able to get non-standard read & write behavior. For this system,
"standard" is to not permit access to the migrated part of a file. There's
obviously a problem when the restore daemon comes along and wants to
restore the missing part of a file - it has to be able to write what
processes normally can't.

We want multiple processes to be able to do the restoration simultaneously
(when a file is being restored from many tapes for instance), so just
registering a "magic process" isn't the path we want to go down.

What we want to do (well, have done and want to merge into NetBSD) is:

1) Add the idea of "alternate" read & write semantics. For our work, that
means ignoring the residency checks. For a compressing layer, it would mean
accessing the compressed data rather than the uncompressed data. As the
difference is just a bit set in the ioflags passed to the READ/WRITE VOPs,
fs's which don't have this concept simply ignore it.

An fs could, at its discretion, restrict these semantics to the
superuser.
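
As a rough illustration of that kind of restriction, an fs's read/write
path could check the bit before going any further. This is only a sketch:
EX_IO_ALTSEM and ex_check_altsem() are made-up names for illustration, and
the bit value is arbitrary, not the one used in the actual diffs.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical ioflag bit; the real name and value are not given here. */
#define EX_IO_ALTSEM	0x1000

/*
 * Hypothetical helper: return 0 if the request may proceed, or EPERM if
 * alternate semantics were requested by someone other than the superuser.
 */
static int
ex_check_altsem(int ioflag, uid_t euid)
{
	if ((ioflag & EX_IO_ALTSEM) != 0 && euid != 0)
		return EPERM;
	return 0;
}
```

An fs with no such policy would simply skip the check.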

2) Add new system calls to permit userland to assert this "alternate"
semantics bit. Right now we have read, pread, readv, and preadv (and write
equivalents). Rather than add 8 more calls to complete the permutations,
we want to add 2 new calls:

readwrite(int fd, void *buf, size_t nbyte, int flags, off_t offset)

and

readwritev(int fd, const struct iovec *iovp, int iovcnt, int flags,
		off_t offset)

The only difference between the two calls is that one takes a pointer to a
buffer and a byte count, while the other takes iovec's.

The call's behavior (read/write, positional/not, alternate/normal
semantics) is encoded in the flags:

RDWR_FILEOFFSET		0x01		Use the offset in the file
					descriptor, not offset
RDWR_ALT_SEM		0x02		Use alternate i/o semantics
RDWR_OTHER_1		0x04		fs-specific flag 1
RDWR_OTHER_2		0x08		fs-specific flag 2
RDWR_WRITE		0x10		Do a write

I have no clue what fs's would need OTHER_1 & OTHER_2 for, but since
I'm adding functionality it'd be silly to build in limits. :-)
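
To make the encoding concrete, here is a small sketch of how the flags
word might be picked apart. The RDWR_ values are the ones from the table
above. Everything prefixed ex_, the RDWR_ALLFLAGS mask, and the choice to
reject unknown bits are assumptions for illustration, not part of the
proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as listed in the table above. */
#define RDWR_FILEOFFSET	0x01
#define RDWR_ALT_SEM	0x02
#define RDWR_OTHER_1	0x04
#define RDWR_OTHER_2	0x08
#define RDWR_WRITE	0x10
#define RDWR_ALLFLAGS	0x1f	/* hypothetical: mask of all defined bits */

/* Hypothetical decoded form of a readwrite()/readwritev() request. */
struct ex_rdwr_op {
	bool is_write;		/* write if set, read otherwise */
	bool use_fd_offset;	/* use the descriptor's offset, ignore 'offset' */
	bool alt_sem;		/* alternate i/o semantics requested */
};

/* Returns 0 on success, -1 if undefined bits are set (an assumption). */
static int
ex_rdwr_decode(int flags, struct ex_rdwr_op *out)
{
	assert(out != NULL);
	if ((flags & ~RDWR_ALLFLAGS) != 0)
		return -1;
	out->is_write = (flags & RDWR_WRITE) != 0;
	out->use_fd_offset = (flags & RDWR_FILEOFFSET) != 0;
	out->alt_sem = (flags & RDWR_ALT_SEM) != 0;
	return 0;
}
```

Read this way, a plain read() would correspond to flags of RDWR_FILEOFFSET,
and a pwrite() to flags of RDWR_WRITE with the offset argument supplied.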

These would then be mapped to IO_ flags passed to the VOP.
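
A minimal sketch of what that mapping could look like follows. The EX_IO_
names and values are placeholders invented for this example; the real IO_
bits are not spelled out in this mail. It also assumes that RDWR_WRITE and
RDWR_FILEOFFSET pick the VOP and the offset source rather than becoming
ioflags themselves.

```c
/* Flag values as listed in the table above. */
#define RDWR_FILEOFFSET	0x01
#define RDWR_ALT_SEM	0x02
#define RDWR_OTHER_1	0x04
#define RDWR_OTHER_2	0x08
#define RDWR_WRITE	0x10

/* Hypothetical ioflag bits, for illustration only. */
#define EX_IO_ALTSEM	0x1000
#define EX_IO_OTHER1	0x2000
#define EX_IO_OTHER2	0x4000

/* Translate the userland RDWR_ bits into (hypothetical) ioflags. */
static int
ex_rdwr_to_ioflag(int flags)
{
	int ioflag = 0;

	if (flags & RDWR_ALT_SEM)
		ioflag |= EX_IO_ALTSEM;
	if (flags & RDWR_OTHER_1)
		ioflag |= EX_IO_OTHER1;
	if (flags & RDWR_OTHER_2)
		ioflag |= EX_IO_OTHER2;
	/* RDWR_WRITE and RDWR_FILEOFFSET are assumed to be handled by the
	 * caller (which VOP to call, which offset to use), not passed down. */
	return ioflag;
}
```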

The code to implement these is pretty small. The unified diffs (not
including syscall defs) weigh in at about 11k.

Thoughts?

Take care,

Bill