tech-kern: Re: fsync performance hit on 1.6.1

Subject: Re: fsync performance hit on 1.6.1
To: Greywolf <greywolf@starwolf.com>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 07/10/2003 03:06:00
[ On Wednesday, July 9, 2003 at 14:06:10 (-0700), Greywolf wrote: ]
> Subject: Re: fsync performance hit on 1.6.1
>
> The difference, though, is that ftok() is implemented in userspace, forcing
> whatever uses IPC to do its own groveling, while namei() is never seen from
> userland -- the kernel always ALWAYS handles namei().

And this difference means what, exactly?  Have you never heard of any
systems which implement filesystems in userland?  A little bird tells me
you can even do such a thing in modern *BSD systems.

>  Using IPC shouldn't require kernel-like knowledge to address any more
> than one is required to know kernel-like things in order to call,
> e.g. open(2) or chdir(2).

You absolutely do not need any "kernel-like" knowledge to understand or
use the SysV IPC mechanisms.  If you think you do then what you're
probably missing is generic knowledge of interprocess communications
techniques, as well as perhaps a general understanding of how various
naming conventions are implemented.

Would you say that using the DNS requires "kernel-like" knowledge too?
How about other kinds of naming schemes?

> With the IPC stuff, you have to jump through quite a few hoops to locate
> the identifier.  You must ALWAYS do this.

"one" != "quite a few"

I guess maybe you don't like to open your files before you read them?
Perhaps you'd rather not have to bother managing your open file
descriptors either?

>  You cannot hope to even generate
> the identifier on the fly.

Why not?  If you know its name then you can find its resource ID with
trivial ease (in very much the same way you would know a file's name
since they are after all exactly the same things, and indeed you can
even use normal filesystem tools to examine a list of filenames which
might have been associated with IPC resources to select from amongst
them).

Or do you mean you were so totally overwhelmed by the flexibility of the
API that you forgot to make appropriate use of one of its key features in
the first place?

>  At this point, error detection and handling
> come into play, and there's really no fallback -- if what the system
> gives you doesn't work for some strange reason, you're hosed (of course,
> so's the system, most likely).

So, what do you do when you open a file that doesn't exist?  What if
your filesystem wasn't even mounted?

The system is never hosed because of bugs in applications using SysV IPC
-- at least not so long as it's being managed by anyone with a clue, and
provided of course there are no latent bugs in the implementation.

I'm sorry to have to say this but it seems as if your complaints are
full of hot air and ignorance.

> Nonetheless, that the implementation as it stands deals with nonhuman(oid)-
> readable identifiers which must be obtained through secondary means is just
> ridiculous.

Hmmm... sounds like you're equating inode numbers, or maybe file
descriptors, or some similar kind of resource handle, to filenames.  Do
you find it too confusing to have the ability to have a separate (and
potentially user-replaceable) name-to-key mapping sitting on top of the
kernel's key-to-handle mapping?

>  The least that could be done is that a shm/sem/msg could
> be requested by a particular name; the system could then perform whatever
> magic it deems necessary to keep track of it.

Please try to follow the bouncing ball in this simple example that's
been simplified even further by removing error handling for your reading
pleasure:

	ptr = shmat(shmget(ftok("/some/path/name", id), ...), NULL, 0);
	strcpy(ptr, "some string to share");

What, can't you handle having to do two operations and keep track of an
intermediate key value to get a handle to some resource?  Are you sure
you don't open your files before you try to read from them?  How about
your database records?  Systems programming in C is not at the same
level as shell scripting or AWK programming.
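
For comparison, here is roughly what the same sequence looks like with
the error handling left in.  This is only a sketch: the helper name
shm_attach_by_name(), the proj_id and size parameters, and the 0600
mode are assumptions chosen for illustration, not anything prescribed
by the API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Attach to (creating if necessary) a shared memory segment named by
 * an existing file.  Returns the attached address, or NULL on failure
 * with errno set by whichever call failed.  If shmid_out is not NULL
 * the segment ID is stored there so the caller can remove it later.
 * (Hypothetical helper; proj_id, size and mode 0600 are examples.)
 */
void *
shm_attach_by_name(const char *path, int proj_id, size_t size,
    int *shmid_out)
{
	key_t key;
	int shmid;
	void *ptr;

	assert(path != NULL);

	/* name -> key; fails if the file does not exist */
	key = ftok(path, proj_id);
	if (key == (key_t)-1)
		return NULL;

	/* key -> ID; creates the segment if it is not there yet */
	shmid = shmget(key, size, IPC_CREAT | 0600);
	if (shmid == -1)
		return NULL;

	/* ID -> address in this process */
	ptr = shmat(shmid, NULL, 0);
	if (ptr == (void *)-1)
		return NULL;

	if (shmid_out != NULL)
		*shmid_out = shmid;
	return ptr;
}
```

Two processes that call this with the same path and proj_id get the
same segment; a missing file shows up as a NULL return in much the same
way a failed open(2) shows up as -1.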

The limitations of this scheme are no more onerous than those imposed by
use of hard links to share files between directories (i.e. you can't
unlink and re-create a file used to name a resource ID without possibly
destroying the association).

> At least, to narrow the comparison gap momentarily (and conveniently for
> my POV (at least I admit it!)), the programmer has some choice over what
> s/he wants their files to be named for easy access.  IPC does not afford
> such luxury,

Well, in fact SysV IPC does have every such luxury, and even more
flexibility on top of that for those who don't need such human-oriented
names (this latter ability can be important to embedded systems that
need highly reliable and fault-tolerant IPC mechanisms).

I've never had any problem giving logical, meaningful, human-readable,
reliable, and completely normal, filenames to my SysV IPC resources.

> and while it's not necessary to do so for anything currently
> running, once the process terminates or one wants to pass something manually
> (i.e. typed in), it would be nice to be able to pass a reliable name in.

Given what you say above I don't think you could possibly have ever done
any serious coding with SysV IPC....  I.e. you don't seem to have a clue
what you're talking about w.r.t. SysV IPC.  You certainly haven't made
any serious attempt to understand its basic concepts very well.

SysV IPC might seem a little complicated to anyone familiar only with
V7's basic files, pipes, and signals; and indeed true shared memory,
message queues, and semaphores can be overwhelming to anyone unfamiliar
with using such general IPC techniques, but once you've got a grasp of
these basic IPC concepts then the SysV IPC APIs are quite simple and
elegant (especially when compared to those provided by other proprietary
systems designed at around the same time, or even later), and
despite their simplicity they are still very powerful.  The most complex
task I've put SysV IPC to was a distributed message passing system that
allowed processes to communicate either locally or over RS-232 links
with complete transparency (complete transparency to everything but the
communications latency, of course).

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>