Subject: Re: "parked" processes hell: debug available..
To: George Michaelson <ggm@apnic.net>
From: Tobias Nygren <tnn@NetBSD.org>
List: current-users
Date: 09/14/2007 12:11:27

On Fri, 14 Sep 2007 12:17:47 +1000
George Michaelson <ggm@apnic.net> wrote:

> I'm in a world of hurt with claws-mail going into "parked" state if I
> do an (as yet unknown) number of things. There is no return from this place.
> 
> It's fully understood current is not a dependency platform, so I know
> I'm in caveat emptor. I just want to help track this down, if it's
> helpful.
> 
> If I can supply a debug state to somebody to help debug this, let me
> know. It started about 4 days ago, I have re-made/installed world
> multiple times via build.sh since then, keeping myself up-to-date.
> 
> gnumeric can also go to the same place, doing transient window popups.
> 
> I haven't re-made any GTK/GNOME/X related libraries. I am not sure
> where the threads dependency for claws or gnumeric comes in. As yet,
> there is no "parking" going on in Firefox, which is where I believe this
> has previously been seen.
> 
> The LDD state on the binary's direct dependencies tells me this:
> 
> /usr/pkg/lib/libcairo.so.2      	-lpthread.0 => /usr/lib/libpthread.so.0
> /usr/pkg/lib/libetpan.so.11     	-lpthread.0 => /usr/lib/libpthread.so.0
> /usr/pkg/lib/libgdk-x11-2.0.so.0        -lpthread.0 => /usr/lib/libpthread.so.0
> /usr/pkg/lib/libgdk_pixbuf-2.0.so.0     -lpthread.0 => /usr/lib/libpthread.so.0
> /usr/pkg/lib/libgthread-2.0.so.0        -lpthread.0 => /usr/lib/libpthread.so.0
> /usr/pkg/lib/libgtk-x11-2.0.so.0        -lpthread.0 => /usr/lib/libpthread.so.0
> /usr/pkg/lib/libpangocairo-1.0.so.0     -lpthread.0 => /usr/lib/libpthread.so.0
> 
> Do I have to re-make the back-end lib chain to remove something?
> 
> And a heads-up to anyone else who might be living on the edge: it's a
> bit crumbly on the cliff-top at the moment.

More datapoints ...
I have thread-related problems on sparc64. After updating kernel and
userland, named works for a while but tends to wedge after a few
minutes with one thread "parked".
spamassassin (a Perl program) either wedges in the "pause" state or
segfaults like this:

#0  0x0000000040c078f8 in pthread_mutex_lock () from /usr/lib/libpthread.so.0
#1  0x0000000040e4eaa4 in malloc () from /usr/lib/libc.so.12
#2  0x000000004058b8cc in Perl_savesharedpv ()
   from /usr/pkg/lib/perl5/5.8.0/sparc64-netbsd-thread-multi/CORE/libperl.so
#3  0x000000004056b544 in Perl_newSTATEOP ()
[...]

This build was cross-compiled from amd64, and I'm not confident about
the stability of that box. I'm now doing a native sparc64 build, but
it'll take a while to finish. Btw, the csl-alignment branch merge only
affected the ABI on hp700, right?

-T