Subject: Re: ltsleep and PCATCH - The Untold Story
To: Stephan Uphoff <ups@stups.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 06/27/2003 15:11:39

On Fri, 27 Jun 2003, Stephan Uphoff wrote:

> The manual page sleep(9) states for ltsleep:
>    ...
>    If the flag PCATCH is OR'ed into priority the process checks for
>    posted signals before and after sleeping.
>    ...
>    If a ltsleep() returns as a result of a signal, the return value
>    is ERESTART if the signal has the SA_RESTART property
>    (see sigaction(2)), and EINTR otherwise.
>
> What it does not say is that with PCATCH the calling lwp might be
> stopped inside ltsleep (to be exact, in issignal, called from ltsleep).
> This can happen if:
>     (1) The process receives a signal while it is being traced, or
>     (2) The process receives a signal that is not ignored, has the
>         default signal handler installed, and has the STOP property.
>
> The problem with this is that a process might be stopped while holding
> resources.
> A good example of this is an interruptible NFS mount.
> A suspend (^Z) of a process waiting inside NFS can cause the process
> to be stopped while holding a vnode lock.
> Combined with the usual vnode lock crabbing, this has the potential to
> lock up the entire system.
>
> Maybe we need a PDONTSTOP? :-)

Probably.

We also should be thinking about what locks we hold while sleeping, etc.
Thanks for starting the thread on this.

As I understand things, this is not a new development; we probably had
this issue before we had LWPs.
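
To make the hazard concrete, here is a rough userland model in C. It is
not kernel code and does not call ltsleep(); every name and flag value
in it (M_PCATCH, M_PDONTSTOP, model_sleep, struct model_lwp) is made up
for the illustration. In particular, the PDONTSTOP behaviour shown is
only one guess at what such a flag could mean: the sleep returns to the
caller as if interrupted instead of stopping in place.

```c
#include <stdbool.h>

/* Flag values for this model only; they are NOT the kernel's values. */
#define M_PCATCH    0x100   /* models PCATCH: check signals around the sleep */
#define M_PDONTSTOP 0x200   /* models the proposed flag; hypothetical */

enum sleep_result { SLEPT_NORMALLY, STOPPED_IN_SLEEP, INTERRUPTED };

struct model_lwp {
	bool signal_pending;  /* a signal has been posted to the process */
	bool traced;          /* case (1): the process is being traced */
	bool signal_stops;    /* case (2): not ignored, default handler, STOP */
	int  locks_held;      /* e.g. a vnode lock taken before sleeping */
};

/* Decide what the modelled sleep does for the given flags. */
static enum sleep_result
model_sleep(const struct model_lwp *l, int flags)
{
	/* Without PCATCH the sleep does not look at signals at all. */
	if ((flags & M_PCATCH) == 0)
		return SLEPT_NORMALLY;

	/* Only the two cases from the original mail lead to a stop. */
	if (!l->signal_pending || !(l->traced || l->signal_stops))
		return SLEPT_NORMALLY;

	/* Assumed semantics: hand control back to the caller instead. */
	if (flags & M_PDONTSTOP)
		return INTERRUPTED;

	return STOPPED_IN_SLEEP;
}

/* The dangerous combination: stopped inside the sleep with locks held. */
static bool
stops_holding_locks(const struct model_lwp *l, int flags)
{
	return model_sleep(l, flags) == STOPPED_IN_SLEEP && l->locks_held > 0;
}
```

In this model, an lwp with a pending ^Z and one lock held ends up
stopped with the lock held when it sleeps with M_PCATCH alone, and does
not when M_PDONTSTOP is also set. The real fix might just as well be
auditing which locks are held across interruptible sleeps.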

Take care,

Bill