Subject: Re: Sempahore on NetBSD work or no ?
To: Nathan J. Williams <nathanw@wasabisystems.com>
From: Zoltan ZSIDO <zsido@westel900.net>
List: netbsd-help
Date: 01/28/2003 19:44:51
On 28 Jan 2003, Nathan J. Williams wrote:

> Date: 28 Jan 2003 13:09:24 -0500
> From: Nathan J. Williams <nathanw@wasabisystems.com>
> To: Zoltan ZSIDO <zsido@westel900.net>
> Cc: David Maxwell <david@vex.net>, Jason R Thorpe <thorpej@wasabisystems.com>,
>      Daniel Dias Gonçalves <f22@proveritauna.com.br>,
>      current-users@netbsd.org, netbsd-help@netbsd.org
> Subject: Re: Sempahore on NetBSD work or no ?
>
> Zoltan ZSIDO <zsido@westel900.net> writes:
>
> > If You try select(2) in a threaded application for example, it will cause
> > alarm(3) signal to be delivered on event to the issuer thread if it runs
> > on HP-UX or OSF/1, but will result different behaviour on Solaris.
>
> Wow, that's bogus. That sounds like they tried to implement the
> select() timeout with the alarm timer, and didn't get it quite
> right. I'd believe older HP-UX had that problem, but OSF/1 (now Tru64)
> generally has a solid threads implementation...

It is still DEC OSF/1 to me :-) And yes, it has the best pthread
implementation I've ever seen.

>
> > (Strictly speaking Solaris is closer to the standard, the others are nicer
> > to the programmer, and needless to say all of them are compliant.)
>
> Getting SIGALRM is better than getting the 0 timeout return? And is
> compliant? I don't think so.
>

Sorry, I was ambiguous. With DEC's and HP's implementations you will not
suffer from problems caused by select(2); I've seen this 'problem' only on
Solaris. The compliance I meant is about signal handling in threaded
programs. The specification does not say that the thread which issued the
alarm(3) call will be the one signalled by SIGALRM. It says only that the
signal is delivered to the process, so whichever thread is able to handle
the signal will catch it, or the process will be killed if no thread is
willing to catch it. All of the mentioned implementations conform to this
specification, but - as an undocumented feature - on OSF/1 and HP-UX the
signal is always delivered to the issuing thread, while on Solaris this is
not guaranteed.

If you have two select(2) calls in progress at the same time and, because
of the internal thread scheduling, SIGALRM is always routed to the same
thread, then theoretically you will never return from the other select(2)
call. Since select(2) can be interrupted by any signal, you will normally
enclose it in a loop that catches such interruptions and re-enters
select(2) if the time has not yet elapsed. So one thread eats the event
meant for the other select(2) call, and there is no way to wake up the
right thread.

(Theoretically you can write your own global alarm handler, daisy-chain
all threads that issue a function call which uses SIGALRM, and redeliver
the signal via pthread_kill(3) to the next thread on the chain if the
previous one didn't need it, but that is a little bit complicated.)

Zoltan