Subject: Re: POSIX.4 real-time extensions?
To: Alex Barclay <>
From: ozan s. yigit <>
List: tech-kern
Date: 07/05/2001 14:57:22
> Is anything being done for the real time clock (timer_create etc.)

i'm interested in looking into rt clock extensions because it is a nice
and tricky area. as time permits of course.

as a side note, here is an overview/summary of all POSIX.4 features as found
in the standard. this was put together relatively quickly, and needs to be
beefed up a little bit, so any comments/corrections/suggestions would be
appreciated.

cheers...	oz
ozan s. yigit			sun microsystems [global eservices eng] || +1 [905] 415 2878
Sometimes it's better to light a flamethrower than curse the darkness.
				-- Terry Pratchett ("Men At Arms")

--- snip snip ---

: To unbundle, sh this file
echo x - 1>&2
sed 's/^X//' > <<'@@@End of'
X.\" $Id$
X.\" various handy macros hl 
X.. P1 P1 .2i
X.if \\n(.$ .nr P1 \\$1 dT 8 \\n(P1u -1		\" reduce the size a whole point
X.ft C
X.vs -.5p	\" squeeze C a bit closer
X.sp .5 t \\n(dT*\\w'x'u
X.ta 1u*\\ntu 2u*\\ntu 3u*\\ntu 4u*\\ntu 5u*\\ntu 6u*\\ntu 7u*\\ntu 8u*\\ntu 9u*\\ntu 10u*\\ntu 11u*\\ntu 12u*\\ntu 13u*\\ntu 14u*\\ntu
X.. P2
X.ft 1
X.sp .5 +1
X.vs +.5p
X.\" CW uses the typewriter/courier font. CW
X.\" Footnote numbering [by Henry Spencer]
X.\" <text>\*f for a footnote number..
X.\" .FS
X.\" \*F <footnote text>
X.\" .FE
X.ds f \\u\\s-2\\n+f\\s+2\\d f 0 1
X.ds F \\n+F. F 0 1
@@@End of
echo x - 1>&2
sed 's/^X//' > <<'@@@End of'
X.nr PS 11
X.nr VS 13
X.ds LH POSIX.4 Overview
X.ds CH "
X.ds RH "Page % l
XAn Overview of POSIX.4
XOzan S. Yigit
XDepartment of Computer Science
XYork University
XPOSIX.4 is a set of real-time extensions to POSIX.1. This document summarizes
Xthese extensions, with a brief summary of each function in various functional areas
Xof these extensions.
X.sp 2
X'''\fIAncient Principle of \fBWYGIWYGAINGW:\fR
X'''\fIWhat You Get Is What You\'re Given, And It\'s No Good Whining.\fR
X'''\fI--Terry Pratchett et al. (The Science of Discworld)\fR
XPOSIX, the Portable Operating System Interface, is a family of ANSI and ISO
Xstandards. The first of these standards, POSIX.1 (IEEE Std 1003.1), attempts
Xto define a standard operating system interface based on the UNIX operating
Xsystem documentation, to support application portability at the source level.
XPOSIX.4 (officially called IEEE Std 1003.1b-1993), approved in September 1993,
Xis a set of real-time extensions to POSIX.1. This document summarizes these
Xextensions, with a brief summary of each function in various functional areas
Xof these extensions.
XPOSIX.4 uses the following definition of 
X.I realtime
Xfor its scope:
XRealtime in operating systems: the ability of the operating system to
Xprovide a required level of service in a bounded response time.
XThe key elements [P1003.1b] defining the scope are:
Xdefining a sufficient set of functionality to cover a significant
Xpart of the realtime application program domain
Xdefining sufficient performance constraints and performance related
Xfunctions to allow a realtime application to achieve deterministic
Xresponse from the system
Xand in addition:
Xdefining interfaces that do not preclude high-performance implementations
Xon traditional uniprocessor realtime systems.
XPOSIX.4 real-time extensions are grouped into several functional areas.
XThe following are the real-time functional areas, and their scope
Xas defined in [P1003.1b]:
X.IP "\fBsemaphores\fR"
Xa minimum synchronization primitive to serve as a basis for more
Xcomplex synchronization mechanisms to be defined by the application program
X.IP "\fBprocess memory locking\fR"
Xa performance improvement facility to bind processes into the
Xhigh-performance random access memory of the system. This avoids
Xlatencies introduced by the OS in storing unreferenced parts of the
Xprogram in the secondary memory devices (e.g. paging).
X.IP "\fBmemory mapped files and shared memory\fR"
Xa performance improvement facility to allow for programs to access
Xfiles as a part of the program images and for separate application programs to
Xhave portions of their program image generally accessible
X.IP "\fBpriority scheduling\fR"
Xa performance and determinism improvement facility to allow applications
Xto determine the order in which processes that are ready to run
Xare granted access to processor resources
X.IP "\fBrealtime signal extension\fR"
Xa determinism improvement facility that augments the signal mechanism
Xof historical implementations to enable asynchronous signal
Xnotifications to an application to be queued without impacting
Xcompatibility with the existing signal interface
X.IP "\fBtimers\fR"
Xa functionality and determinism improvement facility to
Xincrease the resolution and capabilities of the time base
X.IP "\fBinterprocess communication\fR"
Xa functionality enhancement to add a high-performance, deterministic
Xinterprocess communication facility for local communication.
X.IP "\fBsynchronized input and output\fR"
Xa determinism and robustness improvement mechanism to enhance the
Xdata input and output mechanisms so that an application can ensure that the
Xdata being manipulated is physically present on secondary mass
Xstorage devices
X.IP "\fBasynchronous input and output\fR"
Xa functionality enhancement to allow an application process to
Xqueue data input and output commands with asynchronous notification
Xof completion. [This facility includes in its scope the requirements
Xof supercomputer applications]
XWe summarize each one of these functional areas in turn, and show the
Xprogramming interfaces for each area.
XPOSIX semaphores provide a minimum synchronization primitive between multiple
Xprocesses that share memory or mapped files. A semaphore allows guarded access
Xto a resource or allows processes to wait for some change to happen. POSIX
Xsemaphores are
X.I counted
Xsemaphores; processes
X.I wait
Xon or
X.I post
Xa semaphore
X(Edsger Dijkstra's P (proberen) and V (verhogen) operations [UNPV22E]),
Xwhich respectively decrement and
Xincrement a counter associated with the semaphore. POSIX.4 defines
Xtwo types of semaphores:
X.I named
Xand
X.I unnamed
X(also called
X.I memory-based ).
XNamed semaphores require names constructed like a normal file pathname. To
Xrun on all systems, a name must start with a "/" character (but may not
Xcontain other "/" characters in the name)\*f.  A memory-based semaphore only
Xrequires a specific memory address.
X.FS
X\*F These naming restrictions also apply to message queue and shared memory
Xobject names.
X.FE
XP1003.1b Semaphore Functions (\fC_POSIX_SEMAPHORES\fR)
X.IP \fCsem_init\fR
XInitializes an \fIunnamed\fR semaphore at a given location.
X[semaphore can be used for \fCsem_wait\fR, \fCsem_trywait\fR, \fCsem_post\fR
Xand \fCsem_destroy\fR]
X.IP \fCsem_open\fR
XOpens/creates a \fInamed\fR semaphore.  [semaphore can be used for
X\fCsem_wait\fR, \fCsem_trywait\fR, \fCsem_post\fR and \fCsem_close\fR]
X.IP \fCsem_getvalue\fR
XGets the value of a specified semaphore without
Xaffecting the state of the semaphore.
X.IP \fCsem_post\fR
XUnlocks a (named or unnamed) semaphore by incrementing its value.
XIf the resulting value is
Xpositive, then no processes were blocked waiting for the
Xsemaphore to be unlocked. If the resulting value is zero, one of the
Xprocesses blocked in \fCsem_wait\fR is allowed to return.
X.IP \fCsem_wait\fR
XPerforms a semaphore lock on a (named or unnamed) semaphore.
XIf the semaphore
Xvalue is zero, this function waits until it either locks the
Xsemaphore, or the call is interrupted by a signal.
X.IP \fCsem_trywait\fR
XPerforms a semaphore lock on a (named or unnamed) semaphore
Xonly if it
Xis currently not locked, that is the semaphore value is
Xcurrently positive. Otherwise, it returns without waiting
Xto lock the semaphore.
X.IP \fCsem_unlink\fR
XRemoves a specified named semaphore. Processes that have
Xthe semaphore open can continue to use it. The semaphore
Xwill actually be removed only after all processes close it.
X.IP \fCsem_close\fR
XTerminates access to a named semaphore. This does NOT
Xremove a semaphore. 
X.IP \fCsem_destroy\fR
XDestroys an unnamed semaphore at a given location.
X[Note that if the system detects other processes still
Xusing the semaphore, it may return an EBUSY error for this
Xcall]
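X.LP
XThe unnamed-semaphore calls listed above can be exercised in a single
Xprocess. The following is only a minimal sketch: the helper name
X\fCsem_demo\fR is made up for illustration, and it assumes a system that
Xsupports unnamed semaphores (\fC_POSIX_SEMAPHORES\fR).

```c
#define _POSIX_C_SOURCE 200809L
#include <semaphore.h>

/* Hypothetical helper (not part of POSIX.4): walks one unnamed semaphore
 * through init, wait, trywait, post and getvalue.
 * Returns the final semaphore value, or -1 on any failure. */
int sem_demo(void)
{
	sem_t sem;
	int value = -1;

	if (sem_init(&sem, 0, 1) == -1)		/* pshared = 0, initial value 1 */
		return -1;
	if (sem_wait(&sem) == -1)		/* 1 -> 0, does not block */
		goto out;
	if (sem_trywait(&sem) == 0)		/* expected to fail: value is 0 */
		goto out;
	if (sem_post(&sem) == -1)		/* 0 -> 1 */
		goto out;
	if (sem_getvalue(&sem, &value) == -1)
		value = -1;
out:
	sem_destroy(&sem);
	return value;
}
```

X.LP
XIf every step behaves as described above, \fCsem_demo()\fR returns 1.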
XProcess Memory Locking
XMemory locking guarantees the memory residence of portions of the address
Xspace. Under POSIX.4, a process can lock all or a portion (range) of the
Xpages mapped to its address space. These per-process memory locks
Xare not inherited across a fork() and all memory locks owned by a process
Xare unlocked upon exec() or process termination.
XP1003.1b Memory Locking Functions  (\fC_POSIX_MEMLOCK\fR and \fC_POSIX_MEMLOCK_RANGE\fR)
X.IP \fCmlockall\fR
XLocks all of the pages mapped by a process's address space
Xin physical memory. These pages cannot be paged or swapped
Xto disk. [With the \fCMCL_FUTURE\fR flag, pages mapped in the future are locked as well]
X.IP \fCmunlockall\fR
XUnlocks all currently mapped pages of a process's address space.
X.IP \fCmlock\fR
XLocks a specified section (a range between two addresses)
Xof a process's address space in memory. These pages cannot
Xbe paged or swapped to disk.
X.IP \fCmunlock\fR
XUnlocks a specified (previously locked) section of a
Xprocess's address space.
XMemory Mapped Files and Shared Memory
XMemory mapping establishes a mapping between the address space of
Xthe process and a memory object represented by a file descriptor. The
Xdescriptor may point to an actual file, or it may be a shared memory
X(see \fCshm_open\fR) segment. [As a result, shared memory is first opened
Xand then mapped; what should take one step takes two. This is
Xdue to the previously existing \fCmmap\fR call] [UNPV22E]
XP1003.1b Memory Mapping Functions (\fC_POSIX_MAPPED_FILES\fR)
X.IP \fCmmap\fR
XEstablishes a virtual mapping between the address space of a process
Xand a specified memory object. This allows the contents of the
Xobject to appear as a part of the process's memory.
X.IP \fCmunmap\fR
XRemoves any mappings for a specified section of the address space.
X.IP \fCmprotect\fR
XChanges the access protections (read, write, execute, or none) for a mapped section
Xof the address space.
X.IP \fCmsync\fR
XSynchronizes all modified data in the specified section
Xof the address space with the underlying object. If the
Xunderlying object is a file, the mapped section is written
Xto permanent storage.
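X.LP
XAs a rough illustration of \fCmmap\fR, \fCmsync\fR and \fCmunmap\fR working
Xtogether, the sketch below maps a temporary file, stores a string through
Xthe mapping, and reads it back with ordinary file I/O. The helper name
X\fCmmap_demo\fR and the use of \fCtmpfile\fR as the backing object are
Xassumptions made for this example only.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: writes through a shared file mapping and checks
 * that the data is visible through read().
 * Returns 0 if the file contains what was stored, -1 otherwise. */
int mmap_demo(void)
{
	static const char msg[] = "posix.4";
	char buf[sizeof msg];
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	FILE *fp = tmpfile();
	char *p;
	int fd, rc = -1;

	if (fp == NULL)
		return -1;
	fd = fileno(fp);
	if (ftruncate(fd, (off_t)len) == -1)	/* give the file one page */
		goto out;
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out;
	memcpy(p, msg, sizeof msg);		/* store via the mapping */
	if (msync(p, len, MS_SYNC) == 0 &&	/* push changes to the file */
	    lseek(fd, 0, SEEK_SET) == 0 &&
	    read(fd, buf, sizeof buf) == (ssize_t)sizeof buf &&
	    memcmp(buf, msg, sizeof msg) == 0)
		rc = 0;
	munmap(p, len);
out:
	fclose(fp);
	return rc;
}
```

X.LP
XA shared memory object obtained from \fCshm_open\fR would be mapped the
Xsame way, with its descriptor in place of the temporary file's.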
XP1003.1b Shared Memory Functions (\fC_POSIX_SHARED_MEMORY_OBJECTS\fR)
X.IP \fCshm_open\fR
XOpens or creates a shared-memory object and returns
Xa file descriptor. [This descriptor is used with mmap
Xto map the object into a process's address space]
X.IP \fCshm_unlink\fR
XDestroys the named shared memory object, and removes
Xthe name of the shared-memory object. Processes which
Xhave opened or mmaped the object can still use it, until
Xall processes close the object and munmap it.
XPriority Scheduling
XIn a real-time operating environment, we would like to do one or more
Xof the following: [POSIX.4]
X.IP \(bu
Xmake sure something happens at or before a specific time
X.IP \(bu
Xmake sure something happens before something else
X.IP \(bu
Xmake sure something is not delayed if it is not
Xdesigned to be delayed
X.IP \(bu
Xmake sure scheduling guarantees are met
XPOSIX.4 provides priority-based scheduling policies. These are
X(conceptually) lists of processes, one list per
Xpriority. A POSIX.4 scheduling policy defines allowable operations
X(eg. moving processes between and within lists) on this set of lists.
XAssociated with each policy is a priority range. POSIX.4 defines three
Xscheduling policies:
X.IP \fCSCHED_FIFO\fR
XPreemptive, priority-based scheduling.
XThis is a common scheduling policy found in many
Xreal-time systems. This policy is usually implemented
Xas an array of FIFO queues, one queue per priority level.
XUnder this policy:
X.IP [1]
Xwhen a running process is preempted, it
Xbecomes the head of the process list for its priority.
X.IP [2]
Xwhen a blocked process becomes runnable, it becomes
Xthe tail of the process list for its priority.
X.IP [3]
Xa running process can call \fCsched_setscheduler\fR
Xto modify a specified process's policy and priority.
XIf that process is running or runnable, it becomes
Xthe tail of the process list for its new priority.
X.IP [4]
Xa running process can call \fCsched_setparam\fR to modify
Xthe priority of a specified process.
X.IP [5]
Xwhen a running process calls \fCsched_yield\fR, it becomes
Xthe tail of the process list for its priority.
X.IP [6]
Xat no other time is the position of a process
Xin a process list with this policy affected.
X.IP \fCSCHED_RR\fR
XPreemptive, priority-based round-robin scheduling with
Xquanta. This is identical to \fCSCHED_FIFO\fR with the addition that
Xeach process has a time quantum. A running process gets preempted
X[and inserted at the end of the process list for the same
Xpriority level] if it runs longer than its quantum, and
Xother processes of the same priority level are waiting
Xin the queue. A process under this policy that is preempted
Xand subsequently resumes execution completes the unexpired
Xportion of its quantum.
X.IP \fCSCHED_OTHER\fR
XThis is an implementation-defined scheduler.
XP1003.1b Priority Scheduling Functions (\fC_POSIX_PRIORITY_SCHEDULING\fR)
X.IP \fCsched_getscheduler\fR
XReturns the scheduling
Xpolicy (identifier \fCSCHED_FIFO\fR, \fCSCHED_RR\fR, \fCSCHED_OTHER\fR)
Xof a specified process
X.IP \fCsched_getparam\fR
XReturns the scheduling
Xpriority of a specified process
X.IP \fCsched_get_priority_max\fR
XReturns the maximum
Xpriority value allowed for a scheduling policy
X.IP \fCsched_get_priority_min\fR
XReturns the minimum
Xpriority value allowed for a scheduling policy
X.IP \fCsched_rr_get_interval\fR
XReturns the current quantum (timespec) for the
Xround-robin scheduling policy
X.IP \fCsched_setscheduler\fR
XSets the scheduling policy and priority
Xof a specified process
X.IP \fCsched_setparam\fR
XSets the scheduling
Xpriority of a specified process
X.IP \fCsched_yield\fR
XYields execution to another process.
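X.LP
XChanging a policy usually needs privileges, but the priority range of a
Xpolicy can be queried by any process. The sketch below does only that;
Xthe helper name \fCsched_priority_levels\fR is an assumption for this
Xexample, and it treats a return of -1 from either call as an error.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>

/* Hypothetical helper: number of distinct priority levels that a
 * scheduling policy offers, or -1 if the range cannot be queried. */
int sched_priority_levels(int policy)
{
	int lo = sched_get_priority_min(policy);
	int hi = sched_get_priority_max(policy);

	if (lo == -1 || hi == -1)	/* both calls return -1 on error */
		return -1;
	return hi - lo + 1;
}
```

X.LP
XA process would typically pick a value between the minimum and maximum
Xfor \fCSCHED_FIFO\fR or \fCSCHED_RR\fR before calling
X\fCsched_setscheduler\fR; POSIX.4 requires at least 32 levels for each of
Xthese two policies.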
XRealtime Signal Extension
XSignals are an integral part of the POSIX world for exception
Xhandling, process notifications, interprocess communication, etc,
Xas defined in POSIX.1. In POSIX.4, a new range of signals is
Xdefined (\fCSIGRTMIN\fR to \fCSIGRTMAX\fR) for application use. POSIX.4
Xsignals can be queued and carry extra data, such as an integer
Xor pointer value. Realtime signals are delivered in
Xorder, lowest-numbered signal first [so one would use
Xthe low numbers for higher-priority signals], and are
Xreceived faster.
XP1003.1b Realtime Signal Functions (\fC_POSIX_REALTIME_SIGNALS\fR)
X.IP \fCsigaction\fR
XSpecifies the action a process takes
Xwhen a particular signal is delivered (POSIX.4
X.IP \fCsigqueue\fR
XSends a specified signal, plus identifying
Xinformation, to a process. If the resources are
Xavailable, the signal is queued
Xfor the receiving process.
X.IP \fCsigtimedwait\fR
XWaits for a signal for a specified amount
Xof time and, if the signal is delivered within that time,
Xreturns the signal number and any identifying information
Xprovided by the signaller.
X.IP \fCsigwaitinfo\fR
XWaits indefinitely for a signal and, upon its delivery,
Xreturns the signal number and any identifying information
Xprovided by the signaller.
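X.LP
XThe queued data value can be seen with a process that signals itself.
XThis is a sketch only: the helper name \fCsigqueue_demo\fR, the choice of
X\fCSIGRTMIN\fR and the one second timeout are all assumptions made for the
Xexample.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: queues SIGRTMIN to the calling process with an
 * integer payload and collects it synchronously.
 * Returns the payload received, or -1 on failure. */
int sigqueue_demo(int payload)
{
	sigset_t set, old;
	siginfo_t info;
	union sigval val;
	struct timespec timeout = { 1, 0 };	/* give up after one second */
	int rc = -1;

	sigemptyset(&set);
	sigaddset(&set, SIGRTMIN);
	/* Block the signal so that it stays queued until we ask for it. */
	if (sigprocmask(SIG_BLOCK, &set, &old) == -1)
		return -1;
	val.sival_int = payload;
	if (sigqueue(getpid(), SIGRTMIN, val) == 0 &&
	    sigtimedwait(&set, &info, &timeout) == SIGRTMIN &&
	    info.si_code == SI_QUEUE)		/* sent by sigqueue */
		rc = info.si_value.sival_int;
	sigprocmask(SIG_SETMASK, &old, NULL);
	return rc;
}
```

X.LP
XFor example, \fCsigqueue_demo(42)\fR is expected to return 42. A handler
Xinstalled through \fCsigaction\fR with \fCSA_SIGINFO\fR would see the same
Xvalue in its \fCsiginfo_t\fR argument.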
XPOSIX.4 defines a set of clock and timer functions that meet the
Xrequirements of many real-time applications. These clocks and timers are
Xsimilar to those found in the Berkeley and AT&T UNIX systems, but with
Ximprovements: they support additional clocks [All POSIX.4 systems 
X.I must
Xsupport \fCCLOCK_REALTIME\fR], allow greater time resolution,
Ximplementation-defined timers, and more flexible timer signal delivery.
XP1003.1b Clock Functions (\fC_POSIX_TIMERS\fR)
X.IP \fCclock_getres\fR
XReturns the resolution of the specified clock.
XEvery POSIX.4 system must support at least one
Xclock, identified as \fCCLOCK_REALTIME\fR. This clock
Xmust support a resolution of at least 50Hz, or
X20,000,000 nanoseconds.
X.IP \fCclock_gettime\fR
XReturns the current value for the specified clock
X.IP \fCclock_settime\fR
XSets the specified clock to the specified value
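X.LP
XThe resolution requirement can be checked at run time. The following is
Xa small sketch; the helper name \fCrealtime_resolution_ns\fR is made up
Xfor this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: resolution of CLOCK_REALTIME in nanoseconds,
 * or -1 if clock_getres fails. */
long realtime_resolution_ns(void)
{
	struct timespec res;

	if (clock_getres(CLOCK_REALTIME, &res) == -1)
		return -1;
	return (long)res.tv_sec * 1000000000L + res.tv_nsec;
}
```

X.LP
XOn a conforming system the result should be positive and no larger than
X20,000,000.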
XP1003.1b Timer Functions (\fC_POSIX_TIMERS\fR)
X.IP \fCnanosleep\fR
XCauses the calling process to suspend execution for a specified number
Xof nanoseconds.  [This is a higher-resolution version of \fIsleep(3)\fR] The
Xspecified value may be rounded up to the resolution of the system clock.
X.IP \fCtimer_create\fR
XCreates an interval timer based on a particular system clock.  [Usually
X\fCCLOCK_REALTIME\fR] \fCtimer_create\fR returns a unique timer ID used in
Xlater calls to identify the timer.
X.IP \fCtimer_gettime\fR
XReturns the amount of time before the specified timer is due to expire
Xand the repetition value (i.e. the interval between successive expirations).
X.IP \fCtimer_settime\fR
XSets the value of the specified timer to either an offset from the
Xcurrent clock setting or to an absolute time value.
X.IP \fCtimer_delete\fR
XRemoves a previously created timer, and frees up its resources.
X.IP \fCtimer_getoverrun\fR
XReturns the timer expiration overrun count for the specified timer.
X[this is the number of timer expirations that occurred between the time the
Xtimer expiration signal was queued, and the time at which the signal
Xwas delivered]
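X.LP
XA one-shot timer ties several of these calls together. The sketch below
Xis illustrative only: the helper name \fConeshot_timer_demo\fR, the use of
X\fCSIGRTMIN\fR for notification and the two second safety margin are
Xassumptions. The delay must be greater than zero, because a zero value
Xdisarms the timer.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: arms a one-shot CLOCK_REALTIME timer that expires
 * after ms milliseconds (ms > 0) and waits for its signal.
 * Returns 0 if the expiration was received, -1 otherwise. */
int oneshot_timer_demo(long ms)
{
	sigset_t set, old;
	siginfo_t info;
	struct sigevent sev;
	struct itimerspec its;
	struct timespec timeout;
	timer_t tid;
	int rc = -1;

	sigemptyset(&set);
	sigaddset(&set, SIGRTMIN);
	if (sigprocmask(SIG_BLOCK, &set, &old) == -1)
		return -1;

	memset(&sev, 0, sizeof sev);
	sev.sigev_notify = SIGEV_SIGNAL;	/* notify with a queued signal */
	sev.sigev_signo = SIGRTMIN;
	if (timer_create(CLOCK_REALTIME, &sev, &tid) == -1)
		goto out;

	memset(&its, 0, sizeof its);		/* zero interval: no repetition */
	its.it_value.tv_sec = ms / 1000;
	its.it_value.tv_nsec = (ms % 1000) * 1000000L;
	timeout.tv_sec = ms / 1000 + 2;		/* upper bound on the wait */
	timeout.tv_nsec = 0;
	if (timer_settime(tid, 0, &its, NULL) == 0 &&	/* 0: relative time */
	    sigtimedwait(&set, &info, &timeout) == SIGRTMIN &&
	    info.si_code == SI_TIMER)		/* generated by a timer */
		rc = 0;
	timer_delete(tid);
out:
	sigprocmask(SIG_SETMASK, &old, NULL);
	return rc;
}
```

X.LP
XPassing \fCTIMER_ABSTIME\fR instead of 0 to \fCtimer_settime\fR would make
Xthe value an absolute time on the timer's clock.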
XInterprocess Communication
XPOSIX.4 message queues are intended as a flexible and efficient means
Xof communication between multiple processes. They allow sending and
Xreceiving of messages, prioritize messages, and provide
Xasynchronous process notification (for only 
X.I one
Xprocess). Processes
Xcan query the number of messages in the queue, the length of the queue
Xor the maximum size of a message.
XP1003.1b Message Functions (\fC_POSIX_MESSAGE_PASSING\fR)
X.IP \fCmq_open\fR
XOpens a named message queue. [Message queue
Xnaming rules are the same as semaphores]
X.IP \fCmq_getattr\fR
XGets the attributes of a message queue.
X.IP \fCmq_notify\fR
XRequests that a process be notified when a
Xmessage is available on an empty message queue
X.IP \fCmq_receive\fR
XReceives a message from the message queue
X.IP \fCmq_send\fR
XSends a message on a message queue
X.IP \fCmq_setattr\fR
XSets the attributes of a message queue. Only
Xthe flags attribute can be set, which includes
Xa flag (\fCO_NONBLOCK\fR) that alters the behaviour
Xof \fCmq_send\fR and \fCmq_receive\fR.
X.IP \fCmq_close\fR
XCloses a message queue. Message queues are
Xpersistent; messages remain in the queue even
Xafter a queue is closed.
X.IP \fCmq_unlink\fR
XRemoves a message queue. A message queue only
Xgoes away after all processes that have the queue
Xopen close it.
XSynchronized Input and Output
XPOSIX.4 provides for
X.I synchronized
Xinput and output. When I/O is synchronized, it is considered complete
Xonly when the underlying device is properly updated; for example, a
X.I synchronized
Xwrite does not complete until the data is written to disk (or tape etc).
X[This is different than
X.I synchronous
Xinput and output, which just means I/O takes place while the caller waits.]
XP1003.1b Synchronized I/O Functions (\fC_POSIX_SYNCHRONIZED_IO\fR)
X.IP \fCfcntl\fR
XControls operations on files and memory objects.
X[especially \fCO_DSYNC\fR and \fCO_SYNC\fR]
X.IP \fCfdatasync\fR
XFlushes modified data only (possibly leaving the
Xcontrol information inconsistent) from the buffer cache,
Xproviding operation completion with data integrity.
X.IP \fCfsync\fR
XFlushes modified data and file control information
Xfrom the buffer cache, providing operation completion
Xwith file integrity.
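X.LP
XThe difference between the two calls shows up only in what they
Xguarantee, not in how they are called. A minimal sketch follows; the
Xhelper name \fCdurable_write_demo\fR and the temporary file are
Xassumptions for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes a record to a temporary file and forces
 * it out of the buffer cache. Returns 0 on success, -1 on failure. */
int durable_write_demo(const char *record)
{
	size_t len = strlen(record);
	FILE *fp = tmpfile();
	int fd, rc = -1;

	if (fp == NULL)
		return -1;
	fd = fileno(fp);
	if (write(fd, record, len) == (ssize_t)len &&
	    fdatasync(fd) == 0 &&	/* data only: data integrity */
	    fsync(fd) == 0)		/* data and control info: file integrity */
		rc = 0;
	fclose(fp);
	return rc;
}
```

X.LP
XA real application would normally call only one of the two, or open the
Xfile with \fCO_DSYNC\fR or \fCO_SYNC\fR so that every \fCwrite\fR is
Xsynchronized.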
XAsynchronous Input and Output
XPOSIX.4 asynchronous input and output extensions provide
Xthe ability to perform I/O in parallel with other operations of an
Xapplication, as needed in many real-time applications. When an
Xasynchronous \fCread\fR or \fCwrite\fR call is issued, the operating
Xsystem queues the request and immediately returns the control to the
Xapplication. I/O is performed in parallel with the application.
XOptionally, the application can be notified of I/O
Xcompletion with a signal. [important: if \fC_POSIX_PRIORITIZED_IO\fR
Xand \fC_POSIX_PRIORITY_SCHEDULING\fR are defined, then asynchronous I/O
Xis queued in priority order, using the current scheduling priority]
XP1003.1b Asynchronous I/O Functions (\fC_POSIX_ASYNCHRONOUS_IO\fR)
X.IP \fCaio_cancel\fR
XTries to cancel one or more asynchronous I/O requests pending against a
Xfile descriptor.
X.IP \fCaio_error\fR
XReturns the error status for a specified asynchronous operation.
X.IP \fCaio_fsync\fR
XAsynchronously writes system buffers containing
Xa file's modified data to permanent storage.
X.IP \fCaio_read\fR
XInitiates an asynchronous read request on the specified file descriptor
X.IP \fCaio_return\fR
XRetrieves the return status of a completed I/O operation.
X.IP \fCaio_suspend\fR
XSuspends the calling process until at least
Xone of the specified asynchronous I/O requests has completed
X.IP \fCaio_write\fR
XInitiates an asynchronous write request to the specified
Xfile descriptor
X.IP \fClio_listio\fR
XInitiates a list of I/O requests
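X.LP
XThe usual life cycle of a request is \fCaio_write\fR (or \fCaio_read\fR),
Xthen \fCaio_suspend\fR or polling with \fCaio_error\fR, then
X\fCaio_return\fR. The sketch below follows that order without signal
Xnotification; the helper name \fCaio_write_demo\fR and the temporary file
Xare assumptions for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: queues one asynchronous write to a temporary
 * file, waits for it to finish, and returns the number of bytes
 * written, or -1 on failure. */
long aio_write_demo(const char *data)
{
	struct aiocb cb;
	const struct aiocb *list[1];
	FILE *fp = tmpfile();
	long n = -1;

	if (fp == NULL)
		return -1;
	memset(&cb, 0, sizeof cb);
	cb.aio_fildes = fileno(fp);
	cb.aio_buf = (void *)data;	/* must stay valid until completion */
	cb.aio_nbytes = strlen(data);
	cb.aio_offset = 0;
	cb.aio_sigevent.sigev_notify = SIGEV_NONE;	/* no signal on completion */
	list[0] = &cb;

	if (aio_write(&cb) == 0) {
		/* The caller could do other work here; we simply wait. */
		while (aio_error(&cb) == EINPROGRESS)
			if (aio_suspend(list, 1, NULL) == -1 && errno != EINTR)
				break;
		if (aio_error(&cb) == 0)
			n = (long)aio_return(&cb);
	}
	fclose(fp);
	return n;
}
```

X.LP
XNote that \fCaio_return\fR should be called only once per request, and
Xonly after \fCaio_error\fR no longer reports \fCEINPROGRESS\fR.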
X.IP "[P1003.1b]" 6m
XIEEE Std 1003.1b-1993 IEEE Standard For Information Technology,
X\fIPortable Operating System Interface (POSIX) Part 1: System Application
XProgram Interface, Amendment 1: Realtime Extension\fR, IEEE Press,
X1994, New York.
X.IP "[POSIX.4]" 6m
XBill O. Gallmeister, \fIPOSIX.4: Programming For The Real World\fR,
XO'Reilly & Associates, Jan 1995.
X.IP "[UNPV22E]" 6m
XW. Richard Stevens, \fIUnix Network Programming Volume 2: Interprocess
XCommunications\fR (2nd ed) Prentice Hall, 1999.
XThe following POSIX 1003.1b-1993 compile-time symbolic constants 
Xmay be defined in \fC<unistd.h>\fR, and indicate which optional facilities
Xare present.
XAs this list indicates, almost the entire real time functionality
Xis optional. Only the real-time, queued signals are not optional.
XExample feature checking code:
X	#define _POSIX_C_SOURCE 199309
X	#include <unistd.h>
X	...
X	struct p4def {
X		char *p4option;
X		int support;
X	} p4sup[] = {
X		{ "asynchronous io",
X	#ifdef _POSIX_ASYNCHRONOUS_IO
X		1
X	#else
X		0
X	#endif
X		},
X		{ "fsync", 
X	#ifdef _POSIX_FSYNC
X		1
X	#else
X		0
X	#endif
X		},
X	...
X	for (i = 0; i < TABSIZE; i++)
X		printf("%s: %s\n", p4sup[i].p4option,
X		       p4sup[i].support ? "supported" : "not supported");
XThis feature checking code prints the following under Solaris 7:
Xasynchronous io: supported
Xfsync: supported
Xmapped files: supported
Xmemlock: supported
Xmemlock range: supported
Xmemory protection: supported
Xmessage passing: supported
Xprioritized i/o: not supported
Xpriority scheduling: supported
Xreal-time signals: supported
Xsemaphores: supported
Xshared memory: supported
Xsynchronized i/o: supported
Xtimers: supported
XSee Gallmeister's excellent book [POSIX.4] for more examples.
@@@End of