tech-kern archive
5.1 vs gdb
I've run into an issue with gdb on 5.1, and ktrace leads me to think
it's likely a kernel issue (hence this list). It wouldn't surprise me
too much if I were wrong, though; feel free to point me elsewhere if
appropriate.
The surface manifestation is straightforward:
% cat gdbtest.c
int main(void);
int main(void)
{
return(0);
}
% cc -o gdbtest gdbtest.c -g
% gdb gdbtest
GNU gdb 6.5
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386--netbsdelf"...
(gdb) run
Starting program: /home/mouse/gdbtest
at which point nothing I've tried will wake it up, except for
SIGKILLing gdb from another shell, which produces a "sorry, pid %d was
killed: orphaned traced process" message from the kernel and a "Killed"
from my shell, neither of which is surprising.
ps shows the gdb process, a copy of my shell, and a dead zombie, as in
10467 ttyp7 ZW+ 0:00.00 (linktarget)
10702 ttyp7 I 0:00.01 gdb gdbtest
24466 ttyp7 IX+ 0:00.00 -local/bin/mcsh -c exec /home/mouse/gdbtest
My shell does run linktarget as part of its startup script, so its
presence is not that surprising; its presence as a zombie for more than
the barest moment is what's surprising. Running gdb under ktrace -i
makes me think the SIGCHLD the shell is waiting for is getting lost:
25022 1 linktarget CALL write(1,0xbb902000,7)
25022 1 linktarget GIO fd 1 wrote 7 bytes
"/local\n"
25022 1 linktarget RET write 7
25022 1 linktarget CALL exit(0)
24312 1 mcsh GIO fd 4 read 7 bytes
"/local\n"
24312 1 mcsh RET read 7
24312 1 mcsh CALL read(4,0xbfbf1c90,0x4000)
24312 1 mcsh GIO fd 4 read 0 bytes
""
24312 1 mcsh RET read 0
24312 1 mcsh CALL close(4)
24312 1 mcsh RET close 0
24312 1 mcsh CALL __sigprocmask14(1,0xbfbf1c50,0xbfbf1c40)
24312 1 mcsh RET __sigprocmask14 0
24312 1 mcsh CALL __sigprocmask14(3,0xbfbf1c40,0)
24312 1 mcsh RET __sigprocmask14 0
24312 1 mcsh CALL __sigprocmask14(1,0xbfbf1bf4,0xbfbf1be4)
24312 1 mcsh RET __sigprocmask14 0
24312 1 mcsh CALL __sigprocmask14(1,0xbfbf1bf4,0)
24312 1 mcsh RET __sigprocmask14 0
24312 1 mcsh CALL __sigsuspend14(0xbfbf1bf4)
10674 1 gdb RET wait4 24312/0x5ef8
10674 1 gdb CALL ptrace(PT_GETREGS,0x5ef8,0xbfbfe19c,0)
10674 1 gdb RET ptrace 0
10674 1 gdb CALL ptrace(PT_CONTINUE,0x5ef8,1,0x14)
10674 1 gdb RET ptrace 0
24312 1 mcsh RET __sigsuspend14 -1 errno 4 Interrupted system call
24312 1 mcsh CALL __sigprocmask14(1,0xbfbf1bf4,0)
24312 1 mcsh RET __sigprocmask14 0
24312 1 mcsh CALL __sigsuspend14(0xbfbf1bf4)
10674 1 gdb CALL wait4(0xffffffff,0xbfbfe408,0,0)
(I SIGKILL gdb at this point)
10674 1 gdb RET wait4 RESTART
10674 1 gdb PSIG SIGKILL SIG_DFL: code=SI_USER sent by pid=14918, uid=101)
24312 1 mcsh RET __sigsuspend14 -1 errno 4 Interrupted system call
24312 1 mcsh PSIG SIGKILL SIG_DFL: code=SI_NOINFO
The PT_CONTINUE call does make it look as though gdb is doing the right
thing here but signal delivery isn't happening.
Running that mcsh -c exec command under control of ktrace _without_ gdb
being involved produces
25339 1 linktarget CALL write(1,0xbb902000,7)
25339 1 linktarget GIO fd 1 wrote 7 bytes
"/local\n"
25339 1 linktarget RET write 7
25339 1 linktarget CALL exit(0)
25061 1 mcsh GIO fd 4 read 7 bytes
"/local\n"
25061 1 mcsh RET read 7
25061 1 mcsh CALL read(4,0xbfbf1cb0,0x4000)
25061 1 mcsh GIO fd 4 read 0 bytes
""
25061 1 mcsh RET read 0
25061 1 mcsh CALL close(4)
25061 1 mcsh RET close 0
25061 1 mcsh CALL __sigprocmask14(1,0xbfbf1c70,0xbfbf1c60)
25061 1 mcsh RET __sigprocmask14 0
25061 1 mcsh CALL __sigprocmask14(3,0xbfbf1c60,0)
25061 1 mcsh RET __sigprocmask14 0
25061 1 mcsh CALL __sigprocmask14(1,0xbfbf1c14,0xbfbf1c04)
25061 1 mcsh RET __sigprocmask14 0
25061 1 mcsh CALL __sigprocmask14(1,0xbfbf1c14,0)
25061 1 mcsh RET __sigprocmask14 0
25061 1 mcsh CALL __sigsuspend14(0xbfbf1c14)
25061 1 mcsh RET __sigsuspend14 -1 errno 4 Interrupted system call
25061 1 mcsh PSIG SIGCHLD caught handler=0x806a110 mask=(2,20): code=CLD_EXITED child pid=25339, uid=101, status=0, utime=0, stime=0)
25061 1 mcsh CALL wait4(0xffffffff,0xbfbf1818,3,0xbfbf17d0)
25061 1 mcsh RET wait4 25339/0x62fb
and everything carries on correctly.
So it looks to me as though something's busted somewhere around
PT_CONTINUE and signal delivery, at least in the case of SIGCHLD.
Any thoughts?
I have a workaround - "env SHELL=/bin/sh gdb ..." - that presumably
works because I have no startup script for /bin/sh, so it doesn't need
SIGCHLD to work. (In passing, is there an equivalent setting from
within gdb? I haven't found one, but gdb's documentation is remarkably
difficult to use. The most I've found is a variable that says whether
to use a shell, not what shell to use. I tried gdb's environment
setting but that didn't help.)
/~\ The ASCII Mouse
\ / Ribbon Campaign
X Against HTML mouse%rodents-montreal.org@localhost
/ \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B