Subject: lib/20580: assertion failures in pthreads
To: None <gnats-bugs@gnats.netbsd.org>
From: None <darrenr@pobox.com>
List: netbsd-bugs
Date: 03/04/2003 17:38:41
>Number:         20580
>Category:       lib
>Synopsis:       assertion failures in pthreads
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    lib-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 04 17:39:00 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Darren Reed
>Release:        1.6N
>Organization:
>Environment:
NetBSD netbsd 1.6N NetBSD 1.6N (GENERIC) #0: Tue Feb 11 00:12:08 UTC 2003     autobuild@tgm.daemon.org:/autobuild/HEAD/i386/OBJ/autobuild/HEAD/src/sys/arch/i386/compile/GENERIC i386

>Description:
The program was compiled with gcc-3_3-branch as of 4 March 2003
on a 1.6N system, using the February xx snapshot for i386.  It is
written in C++ and uses the STL.  After running for some time, it
eventually dies like this:

assertion "qhead->pt_spinlocks == 0" failed: file "/autobuild/HEAD/src/lib/libpthread/pthread_run.c", line 225, function "pthread__sched_bulk"
assertion "thread->pt_type == PT_THREAD_NORMAL" failed: file "/autobuild/HEAD/src/lib/libpthread/pthread_run.c", line 136, function "pthread__sched"
Abort (core dumped)

gdb 5.3 shows the following stack trace:
(gdb) where
#0  0x081272cb in kill ()
#1  0x08151a80 in abort ()
#2  0x08133933 in __assert13 ()
#3  0x080bb8c4 in pthread__sched ()
#4  0x080bb66b in pthread_rwlock_unlock ()
#5  0x0814cd8b in fflush ()
#6  0x0812df2c in _cleanup ()
#7  0x08151a70 in abort ()
#8  0x08133933 in __assert13 ()
#9  0x080bbbf5 in pthread__sched_bulk ()
#10 0x080bbb91 in pthread__sched_idle2 ()
#11 0x080ba000 in pthread__upcall ()

>How-To-Repeat:
Running the application under ktrace seems to exacerbate the problem
enough that it crashes almost instantly, with a stack trace like this:

#0  0x0812727b in _sys_nanosleep ()
#1  0x080b6422 in nanosleep ()
#2  0x080c100b in ot::Thread::Sleep(long, long) (millis=5000, nanos=0)
    at base/Thread.cpp:187
#3  0x080c0f67 in ot::Thread::Sleep(long) (millis=5000) at base/Thread.cpp:143
#4  0x0806c973 in std::CSystemHealthMonitorThread::run() (this=0x8379100)
    at CSystemHealthMonitorThread.cpp:67
#5  0x080c12f9 in ot::Thread::doRun() (this=0x8379100) at base/Thread.cpp:266
#6  0x080c1165 in CelThreadFunc (pArg=0x8379100) at base/Thread.cpp:239
#7  0x080b6d0d in pthread_create ()
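
The trace above shows a thread started by pthread_create() that is
blocked in nanosleep() via ot::Thread::Sleep().  A minimal sketch of a
test program that exercises the same path might look like the code
below.  It is not the application's code and has not been confirmed to
trigger the assertions; sleep_for(), worker() and run_sleepers() are
hypothetical names introduced only for illustration.

```cpp
#include <pthread.h>
#include <time.h>

#include <cassert>
#include <cerrno>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ot::Thread::Sleep(long, long) in the trace.
// Sleeps for millis milliseconds plus nanos nanoseconds.
// Returns 0 on success, -1 if nanosleep() fails for a reason other
// than being interrupted by a signal.
static int sleep_for(long millis, long nanos)
{
    assert(millis >= 0 && nanos >= 0);

    struct timespec req;
    req.tv_sec = millis / 1000;
    req.tv_nsec = (millis % 1000) * 1000000L + nanos;
    // Normalise so that tv_nsec stays below one second.
    req.tv_sec += req.tv_nsec / 1000000000L;
    req.tv_nsec %= 1000000000L;

    struct timespec rem;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;  // interrupted: sleep for the remaining time
    }
    return 0;
}

// Thread body: sleep repeatedly so the thread keeps blocking and
// waking up.  Returns NULL on success, non-NULL on failure.
static void *worker(void *arg)
{
    long iterations = *static_cast<long *>(arg);
    for (long i = 0; i < iterations; ++i) {
        if (sleep_for(1, 0) != 0)
            return arg;
    }
    return NULL;
}

// Start nthreads sleeping threads, wait for all of them, and return
// the number of failures (creation, join, or sleep errors).
static int run_sleepers(int nthreads, long iterations)
{
    assert(nthreads >= 0);

    std::vector<pthread_t> tids(nthreads);
    int failures = 0;
    int started = 0;

    for (int i = 0; i < nthreads; ++i) {
        if (pthread_create(&tids[i], NULL, worker, &iterations) != 0) {
            ++failures;
            break;
        }
        ++started;
    }
    // All started threads are joined before iterations goes out of scope.
    for (int i = 0; i < started; ++i) {
        void *result = NULL;
        if (pthread_join(tids[i], &result) != 0 || result != NULL)
            ++failures;
    }
    return failures;
}
```

Calling run_sleepers() from main() with several threads and a large
iteration count, linked against libpthread and run under ktrace, would
approximate the conditions described in this report.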

The end of the ktrace output looks like this:
   466 unixcapture CALL  select(0x100,0x4873ff48,0,0,0x4873ff18)
   466 unixcapture CALL  nanosleep(0x4877fed0,0)
   466 unixcapture CALL  sa_yield
   466 unixcapture CALL  sa_yield
   466 unixcapture RET   nanosleep 0
   466 unixcapture CALL  write(0x2,0x484ff324,0x8f)
   466 unixcapture GIO   fd 2 wrote 143 bytes
       "assertion "qhead->pt_spinlocks == 0" failed: file "/autobuild/HEAD/src\
        /lib/libpthread/pthread_run.c", line 225, function "pthread__sched_bul\
        k"
       "
   466 unixcapture RET   write 143/0x8f
   466 unixcapture CALL  __sigprocmask14(0x3,0x484ffbfc,0)
   466 unixcapture RET   __sigprocmask14 0
   466 unixcapture CALL  write(0x2,0x484ff224,0x95)
   466 unixcapture GIO   fd 2 wrote 149 bytes
       "assertion "thread->pt_type == PT_THREAD_NORMAL" failed: file "/autobui\
        ld/HEAD/src/lib/libpthread/pthread_run.c", line 136, function "pthread\
        __sched"
       "
   466 unixcapture RET   write 149/0x95
   466 unixcapture CALL  __sigprocmask14(0x3,0x484ffafc,0)
   466 unixcapture RET   __sigprocmask14 0
   466 unixcapture CALL  getpid
   466 unixcapture RET   getpid 466/0x1d2
   466 unixcapture CALL  kill(0x1d2, SIGABRT)
   466 unixcapture RET   kill 0
   466 unixcapture PSIG  SIGABRT SIG_DFL
   466 unixcapture NAMI  "unixcapture.core"
   466 unixcapture RET   select -1 errno 4 Interrupted system call
   466 unixcapture RET   nanosleep -1 errno 4 Interrupted system call
   466 unixcapture RET   sa_yield 0

>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted: