Subject: pthread assertion "next != 0"
To: None <current-users@netbsd.org>
From: Arto Huusko <arto.huusko@utu.fi>
List: current-users
Date: 02/12/2003 22:17:00
On my freshly upgraded -current (sources from two days ago) I got the
following pthread assertion:

assertion "next != 0" failed: file
"/source/current/src/lib/libpthread/pthread_run.c", line 118, function
"pthread__next"

This is one of my own programs, and the situation is this:

 - The app is a GTK+ 2 program (yes, all packages have been rebuilt, so
   the issue is not a mismatch with old thread libs)
 - When the assert fires, the process contains only one thread (and has
   never contained any other threads)
 - The general context here is this:
   * gdk_threads_enter()
   * gtk_main()
   * gtk signal handler ->
   * code which calls gdk_threads_enter()
 (I'm not sure whether I'm allowed to call gdk_threads_enter() while I
 already hold the GTK+ lock, but that is a separate issue.)

I currently assume that my program is buggy, and that I shouldn't call
gdk_threads_enter() if the thread has already done that.

However, there is a problem: after the assertion fires, the program
hangs. Neither SIGKILL nor SIGINT makes the process go away.
ps output confirms only one LWP, whose wchan is simply -

Anyway, here is a trace (obtained by attaching to the process with
gdb). Only the last bits are included:

#0  0x752f3a6e in ?? ()
#1  0x4859d01e in pthread__sched_sleepers () from /usr/lib/libpthread.so.0
#2  0x4859cce1 in pthread_rwlock_unlock () from /usr/lib/libpthread.so.0
#3  0x48628b73 in fflush () from /usr/lib/libc.so.12
#4  0x485f5f9d in _cleanup () from /usr/lib/libc.so.12
#5  0x4862ec18 in abort () from /usr/lib/libc.so.12
#6  0x48601606 in __assert13 () from /usr/lib/libc.so.12
#7  0x4859cec4 in pthread__next () from /usr/lib/libpthread.so.0
#8  0x4859ce04 in pthread__block () from /usr/lib/libpthread.so.0
#9  0x4859f47c in pthread_mutex_lock () from /usr/lib/libpthread.so.0
#10 0x4859f319 in pthread_mutex_lock () from /usr/lib/libpthread.so.0
#11 0x4831b415 in gdk_threads_enter () from /usr/pkg/lib/libgdk-x11-2.0.so.200

As said above, this happened inside gtk_main() when the user clicked
a button. gdk_threads_enter() was called just before gtk_main(), so
I'm assuming the GTK lock is already held at that point...