Subject: Re: Fixing GDB
To: Nathan J. Williams <nathanw@wasabisystems.com>
From: Rui Paulo <rpaulo@NetBSD.org>
List: tech-userlevel
Date: 07/25/2005 22:47:14
On 2005.07.25 17:16:33 +0000, Nathan J. Williams wrote:
| Rui Paulo <rpaulo@NetBSD.org> writes:
| 
| > Currently, our gdb is unable to trace/debug pthreaded applications.
| > Debugging applications is a common task, especially on an operating
| > system used for research, such as NetBSD.
| > 
| > It doesn't matter if the program uses pthread_* calls or not,
| > it stops on errors from the pthread shared library. The only way to debug
| > a pthreaded application is to link it statically, which is sub-optimal.
| > 
| > I plan to take a look at gdb code, but since it's the first time I'm going
| > to look at it, I'm not expecting to come up with a fix right now.
| 
| The issues here are largely initialization-order ones regarding the
| way that the thread target (nbsd-thread.c) decides that an application
| is threaded and interposes the thread-aware debugging routines. To
| start a program, GDB calls a "start_inferior()" routine. During the
| startup, various shared libraries are loaded, and each of those
| triggers an event. The nbsd-thread.c code watches for the loading of
| libpthread to activate things, but getting the point of activation
| correct is difficult, since it has to work for all combinations of
| static and dynamic linking with starting an inferior process, attaching
| to an existing process, and reading a core file. Mixing in the thread
| routines too late will result in various spurious traps as the
| non-thread-aware code; mixing them in too early will cause problems as
| the thread library isn't sufficiently initialized to cope with the
| libpthread_dbg operations yet.

Hmm, so does debugging a statically linked process work or not?
proton% gdb -q ./pthread                                                    [~]
(no debugging symbols found)...(gdb) r
Starting program: /home/rpaulo/pthread 
^C
Program received signal SIGINT, Interrupt.
0x0804832c in routine ()
(gdb) The program is running.  Exit anyway? (y or n) y

proton% file ./pthread                                                      [~]
./pthread: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for NetBSD 3.99.7, statically linked, not stripped

This program creates another thread of execution and leaves the main process
waiting in pthread_join().
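
For reference, a minimal sketch of what such a test program could look like.
This is not the original ./pthread source: struct work, run_worker() and the
spin limit are hypothetical names added for illustration, and only routine()
is taken from the gdb output.

```c
/* Hypothetical reconstruction of a pthread test program; an
 * illustration only, not the source of the ./pthread binary. */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* limit == 0 means "spin forever", which is the useful setting when
 * the process is meant to be interrupted or attached to from gdb. */
struct work {
	unsigned long limit;	/* iterations to perform, 0 = unbounded */
	unsigned long done;	/* iterations performed so far */
};

/* Thread body. Named routine() only because that is the symbol gdb
 * reported; the real function may do something else entirely. */
static void *
routine(void *arg)
{
	struct work *w = arg;
	volatile unsigned long *done = &w->done;

	while (w->limit == 0 || *done < w->limit)
		(*done)++;
	return w;
}

/* Create one extra thread and block in pthread_join() until it
 * exits. Returns 0 on success or the pthread error number. */
int
run_worker(struct work *w)
{
	pthread_t tid;
	void *ret = NULL;
	int error;

	w->done = 0;
	error = pthread_create(&tid, NULL, routine, w);
	if (error != 0) {
		fprintf(stderr, "pthread_create: %s\n", strerror(error));
		return error;
	}
	error = pthread_join(tid, &ret);
	if (error != 0) {
		fprintf(stderr, "pthread_join: %s\n", strerror(error));
		return error;
	}
	assert(ret == w);	/* routine() hands its argument back */
	return 0;
}
```

Calling run_worker() from main() with limit set to 0 and linking with
something like "cc -static -o pthread pthread.c -lpthread" should give a
statically linked binary that sits in routine() until interrupted; the exact
build line is an assumption.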

Shouldn't gdb print something like "Switching to LWP x"?

Also, attaching to an already running process seems to work.
proton% ./pthread &                                                         [~]
[1] 7335
proton% gdb -q ./pthread 7335                                               [~]
(no debugging symbols found)...Attaching to program: /home/rpaulo/pthread, process 7335
0x0804832c in routine ()
(gdb) 

| It's a great ball of wax, lemme tell you.

Seems so.

| (Also, it would be nice to use a more modern version of GDB. Versions
| after 6.1 require some restructuring of the thread code to adapt to
| major changes in the register-fetching interface, which I'm working on
| in my Copious Free Time).

By importing gdb 6.x, can we make it easier to fix this problem?

Thanks,
		-- Rui Paulo