Subject: Re: lib/35969 (ghc-6.4.2 from pkgsrc fails to compile)
To: None <ad@NetBSD.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,>
From: Andrew Doran <ad@NetBSD.org>
List: netbsd-bugs
Date: 03/11/2007 18:45:02
The following reply was made to PR lib/35969; it has been noted by GNATS.

From: Andrew Doran <ad@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: lib/35969 (ghc-6.4.2 from pkgsrc fails to compile)
Date: Sun, 11 Mar 2007 18:43:57 +0000

  PID         LID S     FLAGS       STRUCT LWP *            UAREA * WAIT
  12719         3 3      0x84         0xcc1aee20         0xcc118ce0 parked
                2 3      0x84         0xcc1ae820         0xcc144ce0 select
                1 3      0x84         0xcc1b8000         0xcc1e8ce0 parked
 
 So there are two parked threads with no pending wakeup (LW_UNPARKED),
 which is expected. User-level backtraces from gcore:
 
 #0  0xbbad4b6b in _lwp_park () from /usr/lib/libc.so.12
 #1  0xbbb9befa in pthread__park () from /usr/lib/libpthread.so.0
 #2  0xbbb9b45d in pthread_cond_wait () from /usr/lib/libpthread.so.0
 #3  0x08f6bacc in waitCondition ()
 #4  0x0931500c in ?? ()
 #5  0x093058c8 in cached_trec_headers ()
 #6  0x00000000 in ?? ()
 
 #0  0xbbad4b6b in _lwp_park () from /usr/lib/libc.so.12
 #1  0xbbb9befa in pthread__park () from /usr/lib/libpthread.so.0
 #2  0xbbb9b45d in pthread_cond_wait () from /usr/lib/libpthread.so.0
 #3  0x08f6bacc in waitCondition ()
 
 #0  0xbbad3fbf in select () from /usr/lib/libc.so.12
 #1  0xbbb9a585 in select () from /usr/lib/libpthread.so.0
 #2  0x08df41d1 in s5hN_ret ()
 #3  0x08f707b9 in StgRun ()
 
 Looking at thread 0:
 
 (gdb) frame 1
 #1  0xbbb9befa in pthread__park () from /usr/lib/libpthread.so.0
 (gdb) frame 2
 #2  0xbbb9b45d in pthread_cond_wait () from /usr/lib/libpthread.so.0
 (gdb) info frame
 Stack level 2, frame at 0xbfbfd8d0:
  eip = 0xbbb9b45d in pthread_cond_wait; saved eip 0x8f6bacc
  called by frame at 0xbfbfd8d4, caller of frame at 0xbfbfd890
  Arglist at 0xbfbfd8c8, args:
  Locals at 0xbfbfd8c8, Previous frame's sp is 0xbfbfd8d0
  Saved registers:
   ebx at 0xbfbfd8bc, ebp at 0xbfbfd8c8, esi at 0xbfbfd8c0, edi at 0xbfbfd8c4, eip at 0xbfbfd8cc
 
 Arguments:
 
 (gdb) x/20a 0xbfbfd8c8
 0xbfbfd8c8:     0x9315000       0x8f6bacc <waitCondition+16>    0x931500c       0x93058c8 <sched_mutex>
 0xbfbfd8d8:     0x0     0x0     0x0     0x93058cc <sched_mutex+4>
 0xbfbfd8e8:     0x10000 0x8f64c90 <waitForCapability+29>        0x931500c       0x93058c8 <sched_mutex>
 0xbfbfd8f8:     0x0     0x9306208 <stg_END_TSO_QUEUE_closure>   0xbaab5000      0x9306208 <stg_END_TSO_QUEUE_closure>
 0xbfbfd908:     0xbaef0038      0x8f6e8a6 <schedule+84> 0x93058c8 <sched_mutex> 0xbfbfd958
 
 First arg is a condition variable (magic 0x55550005):
 
 (gdb) x/20a 0x931500c
 0x931500c:      0x55550005      0x0     0xbc000000      0xbc000034
 0x931501c:      0x93058c8 <sched_mutex> 0x0     0x0     0x0
 0x931502c:      0x0     0x0     0x0     0x0
 0x931503c:      0x0     0x0     0x0     0x0
 0x931504c:      0x0     0x0     0x0     0x0
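
For anyone repeating this on another core, a minimal sketch of the same decoding step follows. It is not part of libpthread. The helper names (`gdb_words`, `decode_cv`) are hypothetical, and the 32-bit field order (magic, lock, queue first, queue last, mutex) is an assumption inferred from the dump above rather than taken from the headers.

```python
import re

CV_MAGIC = 0x55550005  # condition variable magic quoted in this message


def gdb_words(dump):
    """Return the data words from gdb `x/a` output as integers."""
    words = []
    for line in dump.strip().splitlines():
        # Text before the first ':' is the row address (plus an optional symbol).
        _, _, data = line.partition(":")
        # Symbol annotations such as <sched_mutex+4> contain no 0x literals.
        words += [int(w, 16) for w in re.findall(r"0x[0-9a-fA-F]+", data)]
    return words


def decode_cv(words):
    # Assumed layout, inferred from the dump: magic, lock, first, last, mutex.
    magic, _lock, first, _last, mutex = words[:5]
    if magic != CV_MAGIC:
        raise ValueError("not a condition variable")
    return {"first_waiter": first, "mutex": mutex}


dump = """
0x931500c:      0x55550005      0x0     0xbc000000      0xbc000034
0x931501c:      0x93058c8 <sched_mutex> 0x0     0x0     0x0
"""
cv = decode_cv(gdb_words(dump))
print({name: hex(value) for name, value in cv.items()})
```

Under those assumptions the first waiter is the pthread at 0xbc000000 and the recorded mutex is sched_mutex.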
 
 And the pthread noted in it (magic 0x11110001):
 
 (gdb) x/50a 0xbc000000
 0xbc000000:     0x11110001      0x0     0x1     0x1
 0xbc000010:     0x0     0x0     0x0     0x0
 0xbc000020:     0x0     0x1     0x19    0x0
 			^ pt_sleeponq
 
 0xbc000030:     0xb400002c      0x0     0x9315014       0x9315014
 					^ pt_sleepobj	^ pt_sleepq
 
 It appears to be happily asleep and has not been awoken, so nothing is
 wrong here. The mutex is still noted in the CV, so no wakeup has occurred.
 Looking at thread 2:
 
 (gdb) thread 2
 [Switching to thread 2 (process 209327)]#0  0xbbad4b6b in _lwp_park () from /usr/lib/libc.so.12
 (gdb) info frame
 Stack level 1, frame at 0xb3fffec8:
  eip = 0xbbb9befa in pthread__park; saved eip 0xbbb9b45d
  called by frame at 0xb3ffff08, caller of frame at 0xb3fffe98
  Arglist at 0xb3fffec0, args:
  Locals at 0xb3fffec0, Previous frame's sp is 0xb3fffec8
  Saved registers:
   ebx at 0xb3fffeb4, ebp at 0xb3fffec0, esi at 0xb3fffeb8, edi at 0xb3fffebc, eip at 0xb3fffec4
 (gdb) x/10a 0xb3fffec0
 0xb3fffec0:     0xb3ffff00      0xbbb9b45d <pthread_cond_wait+249>      0xb0000000      0x9304ed0 <thread_ready_cond+4>
 0xb3fffed0:     0x9304ed4 <thread_ready_cond+8> 0x0     0x1     0x9304ed0 <thread_ready_cond+4>
 
 Different CV; however it seems that no thread is asleep on its queue
 in this case.
 
 (gdb) x/10a 0x9304ecc
 0x9304ecc <thread_ready_cond>:  0x55550005      0x0     0x0     0x9304ed4 <thread_ready_cond+8>
 0x9304edc <thread_ready_cond+16>:       0x0     0x0     0x16    0x0
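
The "no thread is asleep on its queue" reading can be checked mechanically. The sketch below assumes the waiter queue is a BSD TAILQ whose head sits 8 bytes into the CV (inferred from the `thread_ready_cond+8` annotation); `tailq_is_empty` is a hypothetical helper, not a libpthread function.

```python
def tailq_is_empty(head_addr, tqh_first, tqh_last):
    # An empty BSD TAILQ has a NULL first pointer and a last pointer
    # that refers back to the head's own first-pointer slot.
    return tqh_first == 0 and tqh_last == head_addr


CV_ADDR = 0x9304ecc       # thread_ready_cond
HEAD_ADDR = CV_ADDR + 8   # thread_ready_cond+8, assumed offset of the queue head

# Words 3 and 4 of the thread_ready_cond dump: first = 0x0, last = 0x9304ed4.
print(tailq_is_empty(HEAD_ADDR, 0x0, 0x9304ed4))

# For comparison, the CV at 0x931500c still has a waiter queued.
print(tailq_is_empty(0x931500c + 8, 0xbc000000, 0xbc000034))
```

The last pointer pointing back at the head is what distinguishes a properly emptied queue from one that was merely zeroed.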
 
 Finding the thread from the locals to see what it is doing:
 
 (gdb) x/10a 0xb3fffec0
 0xb3fffec0:     0xb3ffff00      0xbbb9b45d <pthread_cond_wait+249>      0xb0000000      0x9304ed0 <thread_ready_cond+4>
 0xb3fffed0:     0x9304ed4 <thread_ready_cond+8> 0x0     0x1     0x9304ed0 <thread_ready_cond+4>
 0xb3fffee0:     0x0     0xbbb9b372 <pthread_cond_wait+14>
 
 Here's the thread:
 
 (gdb) x/50a 0xb0000000
 0xb0000000:     0x11110001      0x2     0x3     0x1
 0xb0000010:     0x0     0x1     0x0     0x0
 0xb0000020:     0x0     0x0     0x0     0xb4000000
 			^ pt_sleeponq
 
 0xb0000030:     0xbbb9fca8 <pthread__allqueue>  0x0     0x9304ed4 <thread_ready_cond+8> 0x0
 						^ pt_sleepobj	^ pt_sleepq
 
 The thread is not sleeping and is not on a sleep queue, yet it's parked.
 pt_sleepq indicates that it was last waiting on the CV from the
 stack trace. From the CV above, the mutex pointer is NULL, meaning that
 a wakeup has occurred: with no more waiters on the CV, the mutex pointer
 has been cleared. So there is some synchronization failure occurring
 between removing the LWP from its sleep queue and unparking it.
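
To make the reasoning explicit, here is a small sketch that classifies the two parked threads from the values read out of the dumps. The `Observed` record and `diagnose` function are hypothetical names for illustration only; the rule set simply restates the argument above and is not how libpthread or the kernel detects anything.

```python
from dataclasses import dataclass


@dataclass
class Observed:
    parked: bool    # kernel lists the LWP as waiting in "parked"
    sleeponq: int   # pt_sleeponq from the pthread dump
    cv_first: int   # first waiter recorded in the CV it last slept on
    cv_mutex: int   # mutex pointer recorded in that CV


def diagnose(o):
    if not o.parked:
        return "not parked"
    if o.sleeponq and o.cv_first and o.cv_mutex:
        return "asleep, no wakeup yet"
    if not o.sleeponq and not o.cv_first and not o.cv_mutex:
        # Waker took it off the queue and cleared the CV, yet it is still parked.
        return "dequeued but never unparked"
    return "inconclusive"


# pthread at 0xbc000000, waiting on the CV at 0x931500c
first = Observed(parked=True, sleeponq=1, cv_first=0xbc000000, cv_mutex=0x93058c8)
# pthread at 0xb0000000, last waiting on thread_ready_cond
second = Observed(parked=True, sleeponq=0, cv_first=0x0, cv_mutex=0x0)

print(diagnose(first))
print(diagnose(second))
```

The second case is the suspicious one: every user-level sign of a completed wakeup is present, but the LWP never left the parked state.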