Subject: Re: ltsleep and PCATCH - The Untold Story
To: Stephan Uphoff <ups@stups.com>
From: Bill Studenmund <wrstuden@netbsd.org>
List: tech-kern
Date: 06/27/2003 15:11:39
On Fri, 27 Jun 2003, Stephan Uphoff wrote:
> The manual page sleep(9) states for ltsleep:
> ...
> If the flag PCATCH is OR'ed into priority the process checks for
> posted signals before and after sleeping.
> ...
> If a ltsleep() returns as a result of a signal, the return value
> is ERESTART if the signal has the SA_RESTART property
> (see sigaction(2)), and EINTR otherwise.
>
> What it does not say is that with PCATCH the calling lwp might be
> stopped in ltsleep (to be exact in issignal called from ltsleep).
> This can happen if:
> (1) The process is receiving a signal and the signal
> is being traced
> (2) The process receives a signal that is not ignored, has
> the default signal handler installed and has the STOP
> property.
>
> The problem with this is that processes might be stopped while holding
> resources.
> A good example of this is NFS interruptible mounts.
> A suspend (^Z) on a process waiting inside NFS can cause the process
> to be stopped while holding a vnode lock.
> Combined with the usual vnode lock crabbing this has the potential to lock
> up the entire system.
>
> Maybe we need a PDONTSTOP? :-)
Probably.
We also should be thinking about what locks we hold while sleeping, etc.
Thanks for starting the thread on this.
As I understand things, this is not a new development; we probably had
this issue before we had LWPs.
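
To make the two cases above concrete, here is a toy userland model in C. It
is not kernel code: every name in it (m_ltsleep, M_PCATCH, M_PDONTSTOP, and
so on) is hypothetical, and the behaviour given to M_PDONTSTOP is only one
possible semantics, assumed here for illustration (the sleep returns as
interrupted instead of stopping, so the caller can unwind and drop its locks).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in the NetBSD kernel. */
#define M_PCATCH    0x1  /* models PCATCH: look at posted signals */
#define M_PDONTSTOP 0x2  /* models the flag proposed in this thread */

enum m_signal {
    M_SIG_NONE,          /* nothing posted */
    M_SIG_CAUGHT,        /* signal with a handler installed */
    M_SIG_STOP_DEFAULT,  /* case (2): default action, STOP property */
    M_SIG_TRACED         /* case (1): signal being traced */
};

enum m_outcome { M_SLEPT, M_INTERRUPTED, M_STOPPED_IN_SLEEP };

struct m_lwp {
    int locks_held;         /* e.g. vnode locks held across the sleep */
    enum m_signal pending;  /* signal posted while sleeping */
};

/* Model of what happens to the lwp inside the sleep call. */
enum m_outcome
m_ltsleep(const struct m_lwp *l, int flags)
{
    if (l == NULL || !(flags & M_PCATCH) || l->pending == M_SIG_NONE)
        return M_SLEPT;  /* signals not examined, or none posted */

    bool would_stop = l->pending == M_SIG_STOP_DEFAULT ||
        l->pending == M_SIG_TRACED;

    /* Without the assumed flag, the lwp stops right here. */
    if (would_stop && !(flags & M_PDONTSTOP))
        return M_STOPPED_IN_SLEEP;

    /* Otherwise the caller gets an error return and can unwind. */
    return M_INTERRUPTED;
}

/* The hazard: stopped inside the sleep while still holding locks. */
bool
m_is_hazard(const struct m_lwp *l, int flags)
{
    return m_ltsleep(l, flags) == M_STOPPED_IN_SLEEP && l->locks_held > 0;
}
```

In this model, an lwp holding one lock with a default-action stop signal
pending is a hazard under M_PCATCH alone, and stops being one when
M_PDONTSTOP is added or when no locks are held across the sleep.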
Take care,
Bill