tech-kern: Re: fsync performance hit on 1.6.1

Subject: Re: fsync performance hit on 1.6.1
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Greg A. Woods <woods@weird.com>
List: tech-kern
Date: 07/09/2003 03:01:30
[ On Tuesday, July 8, 2003 at 23:09:19 (-0400), der Mouse wrote: ]
> Subject: Re: fsync performance hit on 1.6.1
>
> A new namespace for each resource.

Think of each type of IPC entity as existing in a separate filesystem.

> A new _flat_ namespace for each resource.
> 
> A new flat namespace _with human-meaningless names_ for each resource.

Have you never heard of inode numbers?   :-)

Perhaps you're not aware of ftok(3) and its use to map normal pathnames
into IPC identifiers?
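
A minimal sketch of that mapping, assuming a System V style system:
ftok() derives a key from an existing pathname (typically from the
file's device and inode numbers plus a project byte), and msgget() turns
that key into a queue identifier.  The helper name queue_for_path() and
any project id passed to it are hypothetical, chosen only for
illustration.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * Hypothetical helper: map an existing pathname to a SysV message queue,
 * creating the queue if it does not exist yet.
 * Returns the queue identifier, or -1 on error (errno is set).
 */
int queue_for_path(const char *path, int proj_id)
{
    /* ftok() fails if the path cannot be stat()ed. */
    key_t key = ftok(path, proj_id);
    if (key == (key_t)-1)
        return -1;

    /* Same path and project id give the same key, hence the same queue. */
    return msgget(key, IPC_CREAT | 0600);
}
```

Two unrelated processes that agree on a pathname and project id, say
queue_for_path("/tmp", 'W'), end up with the same queue without ever
exchanging the numeric identifier.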

> The worst of both the persistent and transient worlds: no cleanup on
> process exit - but loss upon rebooting.

I think you've misinterpreted what I believe was a design goal of SysV
IPC.  The IPC entities are not supposed to go away when the last process
exits -- they are supposed to persist so that a process can come along
later and re-attach to them.

Asking SysV IPC entities to disappear on last exit of all processes
which have interacted with them would be like asking all files to
disappear on last close!

There's nothing fundamentally wrong with them disappearing on reboot
though -- in that sense they are no different from a memory filesystem
(or pipes).

(It might have been nice to have some way to explicitly ask to "unlink"
an IPC entity such that it would be cleaned up on last exit, but the
lack of this feature has never caused me any great problem.)

> > it is unfortunate that you can't use poll() on message queues (sysV
> > or posix).
> 
> Unfortunate?  I would call it fatal.

poll() came along to the systems in question quite a bit later than
message queues.  At the time, the only things on which you might want to
do simultaneous non-blocking reads were TTYs, and the best you could do
with anything was a manually polled loop of non-blocking reads.  When
that's what you have to do, adding a msgrcv() call using IPC_NOWAIT to
the loop is no big deal.
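
As a rough sketch of what one pass of such a loop might do for the
message queue, assuming a queue identifier obtained earlier from
msgget(): the struct demo_msg layout and the helper name try_receive()
are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message layout; SysV requires the leading long mtype. */
struct demo_msg {
    long mtype;
    char mtext[64];
};

/*
 * Non-blocking receive of the next message of any type.
 * Returns 1 if a message was stored in *m, 0 if the queue was empty,
 * and -1 on a real error (errno is set).
 */
int try_receive(int qid, struct demo_msg *m)
{
    ssize_t n = msgrcv(qid, m, sizeof m->mtext, 0, IPC_NOWAIT);
    if (n >= 0)
        return 1;
    /* With IPC_NOWAIT an empty queue is reported as ENOMSG. */
    return (errno == ENOMSG) ? 0 : -1;
}
```

The surrounding loop would call try_receive() next to its non-blocking
read() calls on the TTYs and sleep briefly when every source came back
empty.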

BTW, using poll() or select() on message queue identifiers is merely
non-portable, not impossible -- select() does work on them in at least
one more modern implementation (AIX).

Also, BTW, it seems I am indeed wrong about never being able to use
poll() or select() on POSIX message queues -- POSIX does specifically
allow message queue descriptors to be implemented as file descriptors
and so it should be possible to implement them in such a way that poll()
or select() will do the right thing for them.  On the other hand this
can't be relied upon by a portable application.  Luckily, POSIX message
queues have a way, mq_notify(), to establish a notify callback function
that will be called, just like a signal handler, whenever a message
appears in an empty queue.  (The notifier does have to be re-established
every time it triggers, and that must be done before the queue is read;
otherwise a message could appear before the notifier is set, and the
notifier would not trigger again until the queue had been emptied.)

Now if you want something along these lines that has a bizarre API with
questionable utility, then have a look at POSIX shm_open() and
shm_unlink().  In *BSD they are pretty much the functional equivalent of
open() and unlink(), but on POSIX-compatible systems that do not support
the "Memory Mapped Files" option, shm_open() must be used to obtain the
file descriptor for mmap(), and shm_unlink() must be used to remove a
shared memory "object".  To quote the rationale from P1003.1-2001:

     On implementations where memory objects are implemented using the
     existing file system, the shm_open() function may be implemented
     using a macro that invokes open(), and the shm_unlink() function
     may be implemented using a macro that invokes unlink().

There had to have been some pretty strange politics going on to have
forced the creation of the POSIX shared memory objects API even when the
old POSIX mmap() was already a well-known option!
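
For comparison, here is a minimal sketch of the shm_open() route,
assuming a system that provides POSIX shared memory objects.  The helper
name map_shared() is hypothetical, and on some systems the program must
be linked with -lrt.  Apart from the two shm_ calls it is the same
open(), ftruncate(), mmap() sequence one would use on an ordinary file.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create (or open) the named shared memory object,
 * size it to len bytes and map it read/write.
 * Returns the mapping, or NULL on error.
 */
void *map_shared(const char *name, size_t len)
{
    void *p;
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);

    if (fd == -1)
        return NULL;
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return (p == MAP_FAILED) ? NULL : p;
}
```

The object stays around, like a file, until some process calls
shm_unlink() on the same name; existing mappings remain usable until
they are unmapped.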

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>