Subject: kern/35932: _lwp_wait can return EDEADLK when it should not
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org>
From: None <ad@netbsd.org>
List: netbsd-bugs
Date: 03/06/2007 02:20:01
>Number:         35932
>Category:       kern
>Synopsis:       _lwp_wait can return EDEADLK when it should not
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 06 02:20:01 +0000 2007
>Originator:     Andrew Doran
>Release:        NetBSD 4.99.13
>Organization:
The NetBSD Project
>Environment:
N/A
>Description:
Three threads in a program make the following syscalls:

    1    _lwp_wait(0, &foo)
    2    _lwp_wait(3, &foo)
    3    _lwp_exit()

When thread 3 exits, thread 2 should be notified. However, depending on
the order of execution, the kernel can conclude that threads 1 and 2 are
about to deadlock: after thread 3 exits, both would be sitting in
_lwp_wait. Sometimes thread 1 or thread 2 then returns with error EDEADLK,
even though thread 3's exit is enough to let a waiter make progress.
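
The false positive can be illustrated with a toy model. This is only a
sketch: it assumes the deadlock test amounts to "every remaining LWP is
blocked in _lwp_wait", which is a simplification and not a copy of the
kernel code. The names proc_model, naive_deadlock_check and
replay_report_scenario are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-process counters; not the kernel's actual bookkeeping. */
struct proc_model {
    int nlwps;      /* LWPs that have not yet exited */
    int nwaiting;   /* LWPs currently blocked in _lwp_wait */
};

/* Assumed naive rule: if every remaining LWP is waiting, report deadlock. */
static int
naive_deadlock_check(const struct proc_model *p)
{
    assert(p->nwaiting <= p->nlwps);
    return (p->nwaiting == p->nlwps) ? EDEADLK : 0;
}

/* Replays the three calls from the report in the unlucky order. */
int
replay_report_scenario(void)
{
    struct proc_model p = { .nlwps = 3, .nwaiting = 0 };

    p.nwaiting++;   /* thread 1: _lwp_wait(0, &foo) */
    p.nwaiting++;   /* thread 2: _lwp_wait(3, &foo) */
    p.nlwps--;      /* thread 3: _lwp_exit() */

    /* Both survivors are waiting, yet thread 3's exit can satisfy one. */
    return naive_deadlock_check(&p);
}
```

In this model replay_report_scenario() returns EDEADLK, matching the
spurious error seen in practice.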
>How-To-Repeat:
1. Install audio/bmp from pkgsrc.
2. Start playing an mp3.
3. Exit bmp while the mp3 is playing.
4. Occasionally, it will core dump and report "deadlock avoided".

Thanks to xtraeme@ for noting the problem.
>Fix:
Maybe:

1. Note the number of LWPs waiting for a specific LID.
2. Take this into account when checking for deadlock.
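
One possible reading of the two steps above, as a toy model rather than a
kernel patch: keep a per-LID count of specific waiters, and when an LWP
exits, treat its waiters as about to be woken so that they no longer count
as blocked. All names here (proc_model, model_wait, model_exit,
deadlock_check, MAXLWP) are hypothetical, and the real fix may look quite
different.

```c
#include <assert.h>
#include <errno.h>

#define MAXLWP 8

/* Hypothetical bookkeeping; not the kernel's struct proc. */
struct proc_model {
    int waiters[MAXLWP + 1]; /* per-LID count of specific waiters; slot 0 unused */
    int nlwps;               /* LWPs that have not yet exited */
    int nwaiting;            /* LWPs blocked in _lwp_wait, any or specific */
    int nwakeup;             /* waiters whose specific target has exited */
};

/* An LWP blocks in _lwp_wait; target 0 means "any LWP". */
static void
model_wait(struct proc_model *p, int target)
{
    assert(target >= 0 && target <= MAXLWP);
    p->nwaiting++;
    if (target != 0)
        p->waiters[target]++;   /* step 1: note waiters for a specific LID */
}

/* An LWP exits; everyone waiting for it specifically is about to wake. */
static void
model_exit(struct proc_model *p, int lid)
{
    assert(lid > 0 && lid <= MAXLWP);
    p->nlwps--;
    p->nwakeup += p->waiters[lid];
    p->waiters[lid] = 0;
}

/* Step 2: only waiters that cannot be woken count towards deadlock. */
static int
deadlock_check(const struct proc_model *p)
{
    int blocked = p->nwaiting - p->nwakeup;
    return (blocked >= p->nlwps) ? EDEADLK : 0;
}

/* The scenario from the report: no deadlock should be reported. */
int
scenario_from_report(void)
{
    struct proc_model p = { .nlwps = 3 };

    model_wait(&p, 0);   /* thread 1: _lwp_wait(0, &foo) */
    model_wait(&p, 3);   /* thread 2: _lwp_wait(3, &foo) */
    model_exit(&p, 3);   /* thread 3: _lwp_exit() */
    return deadlock_check(&p);
}

/* Two LWPs waiting for each other: a genuine deadlock. */
int
scenario_mutual_wait(void)
{
    struct proc_model p = { .nlwps = 2 };

    model_wait(&p, 2);   /* thread 1 waits for thread 2 */
    model_wait(&p, 1);   /* thread 2 waits for thread 1 */
    return deadlock_check(&p);
}
```

In this model scenario_from_report() returns 0 while scenario_mutual_wait()
still returns EDEADLK, so the real deadlock case remains detected.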