Subject: Re: PostgreSQL taking a *lot* of CPU time.
To: None <darcy@netbsd.org>
From: Berteun Damman <berteun@gmail.com>
List: netbsd-help
Date: 01/15/2005 23:03:43
On Thu, 30 Dec 2004 07:04:10 -0500, D'Arcy J.M. Cain <darcy@netbsd.org> wrote:
> This is really a PostgreSQL question, not a NetBSD one.  
> Note that it is possible that you will eventually be sent back here
> depending on what they find out but they need to be your first stop.

I've been sent back (probably) to the NetBSD list. I tried a few
times to attach gdb to the spinning process, but backtraces didn't
work. What did work was killing it with -ABRT and using the core
file for a backtrace, which gives:

(gdb) bt
#0  0x483bbb2b in pthread__lock_ras_end () from /usr/lib/libpthread.so.0
#1  0x00000001 in ?? ()
#2  0x483bbc35 in pthread_spinlock () from /usr/lib/libpthread.so.0
#3  0x483b81ae in pthread_sigmask () from /usr/lib/libpthread.so.0
#4  0x08130522 in reaper ()
#5  <signal handler called>
#6  0xbfbf001f in ?? ()
#7  0x0812f488 in ServerLoop ()
#8  0x0812ee67 in PostmasterMain ()
#9  0x08106d3a in main ()
#10 0x0806d682 in ___start ()
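
To make frames #3 and #4 easier to read: as far as I know, reaper() is
the postmaster's SIGCHLD handler, and it adjusts the signal mask while it
collects dead children. Below is a minimal sketch of that call pattern in
Python. It is not PostgreSQL's C source; the names reaped and run_demo
are made up for illustration, and the exact mask handling is an
assumption.

```python
import os
import signal
import time

reaped = []  # (pid, wait status) pairs collected by the handler


def reaper(signum, frame):
    """SIGCHLD handler: block SIGCHLD, reap exited children, restore mask."""
    # This call corresponds to the pthread_sigmask() frame in the backtrace.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        while True:
            try:
                pid, status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                break  # no children left at all
            if pid == 0:
                break  # children exist, but none has exited yet
            reaped.append((pid, status))
    finally:
        # Put the previous mask back before returning to the main loop.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


def run_demo(timeout=5.0):
    """Fork a child that exits at once and wait for reaper() to collect it."""
    signal.signal(signal.SIGCHLD, reaper)
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately, which raises SIGCHLD in the parent
    deadline = time.monotonic() + timeout
    while pid not in dict(reaped) and time.monotonic() < deadline:
        time.sleep(0.01)  # stand-in for the server loop
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    return pid
```

Calling run_demo() should leave the child's pid in reaped. If the mask
call inside the handler never returned, the child would stay a zombie and
the parent would make no progress, which is what the trace above looks
like to me.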

Before that, ps auxww | grep pgsql showed something like:
pgsql    15786 94.8  0.3  4380   568 p2 R+    4:13PM   5:13.13 /usr/pkg/bin/postmaster -i -D /usr/pkg/pgsql/data/ (postgres)
pgsql    24309  0.0  0.0  5368     4 p2 IW+   4:13PM   0:00.01 postmaster: stats buffer process    (postgres)
pgsql    25177  0.0  0.0  4420     4 p2 IW+   4:13PM   0:00.01 postmaster: stats collector process    (postgres)
pgsql    29008  0.0  0.0     0     0 p2 ZW+        -   0:00.00 (postgres)

Attaching gdb to the running process always gives:
#0  0x483bbb2e in pthread__lock_ras_end () from /usr/lib/libpthread.so.0
Error accessing memory address 0x483bbb26: Operation not permitted.

Note that reproducing this is not hard, but it takes quite some time,
because I have to let PostgreSQL run for a while doing nothing, and
this can take 30-120 minutes or so. After I inspect the process with
gdb it always gets killed, so I have to restart it each time.
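
For anyone who wants to try the core-file route on a process of their
own, here is a rough sketch of the kill -ABRT step in Python. The busy
loop is only a hypothetical stand-in for the spinning postmaster, and the
function name is made up. Whether a core file is actually written depends
on the core size limit (ulimit -c) and on system settings, which this
sketch does not touch.

```python
import os
import signal
import subprocess
import sys


def abort_spinning_child():
    """Start a busy-looping child, send it SIGABRT, return its exit code."""
    # Hypothetical stand-in for the spinning process: a loop that never ends.
    child = subprocess.Popen([sys.executable, "-c", "while True: pass"])
    # Same effect as running `kill -ABRT <pid>` from a shell.
    os.kill(child.pid, signal.SIGABRT)
    child.wait()
    # Popen reports death by signal N as a return code of -N.
    return child.returncode
```

The resulting core file, if there is one, can then be loaded into gdb
together with the binary to get a backtrace.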

My system is NetBSD 2.0 release (but now with my own kernel), still
running PostgreSQL 7.4.6 from pkgsrc.

I've discussed it on the PostgreSQL mailing list, mainly with Tom
Lane, and he said:

> Oh, that's interesting.  That does look like it might be a NetBSD bug.
> That says that sigprocmask (or sigsetmask, whichever you have) is
> getting stuck.  Which would be a libc problem not our problem.

Berteun