tech-kern archive


Introducing PT_LWPSTATUS + PT_LWPNEXT, obsoleting PT_LWPINFO



PT_LWPINFO is a legacy ptrace(2) operation that was originally intended
to retrieve the thread (LWP) information inside a traced process.

In practice, this call was designed to work as an iterator over threads
that retrieves the LWP id and event information. The event information
is returned in a raw format (PL_EVENT_NONE, PL_EVENT_SIGNAL,
PL_EVENT_SUSPENDED).

Problems:

1. PT_LWPINFO shares the operation name with PT_LWPINFO from FreeBSD
that works differently and is used for different purposes:

 - On FreeBSD PT_LWPINFO returns pieces of information for the suspended
thread, not the next thread in iteration.

 - FreeBSD uses a custom interface for iterating over threads (actually
retrieving the threads is done with PT_GETNUMLWPS + PT_GETLWPLIST).

 - There is almost no overlapping correct usage of PT_LWPINFO on NetBSD
and PT_LWPINFO on FreeBSD, and this causes confusion and misuse of the
interfaces (recently I fixed such misuse in the DTrace code).

2. pl_event can only report whether a signal was emitted to all threads
or to a single one. There is no information on whether this is a per-LWP
or a per-PROC signal, no siginfo_t information attached, etc.

3. Syncing our behavior with FreeBSD would mean complete breakage of our
PT_LWPINFO users, and it is not actually needed: we already receive the
full siginfo_t through the Linux-like PT_GET_SIGINFO, so there is no
reason to reimplement siginfo_t inside ptrace_lwpinfo in the FreeBSD
style. (In fact, FreeBSD wanted to follow NetBSD and adopt some of our
APIs in ptrace(2) and signals.)

4. In practice, our PT_LWPINFO is only useful for listing the LWP ids
in a traced process.

5. The PT_LWPINFO semantics cannot be used in core files as-is (our
PT_LWPINFO returns the next LWP, not the requested one), and pl_event is
at least redundant with netbsd_elfcore_procinfo.cpi_siglwp while still
being less powerful (it cannot distinguish a per-LWP signal from a
per-PROC signal in a single-threaded application).

6. PT_LWPINFO is already documented in the BUGS section of ptrace(2),
as it has further flaws. It is basically the only weak part of our
ptrace(2) API.



Proposed solution:

1. Remove PT_LWPINFO from the public ptrace(2) API, keep it only as a
hidden namespaced symbol for legacy purposes.

2. Introduce PT_LWPSTATUS, which queries the kernel about an exact
thread and retrieves useful information about that LWP.

3. Introduce PT_LWPNEXT with the iteration semantics from PT_LWPINFO,
namely return the next LWP.

4. Ship with per-LWP information in core(5) files as "PT_LWPSTATUS@nnn".

5. Fix flattening the signal context in netbsd_elfcore_procinfo in
core(5) files and move per-LWP signal information to per-LWP structure
"PT_LWPSTATUS@nnn".

6. Do not bother with FreeBSD-like PT_GETNUMLWPS + PT_GETLWPLIST calls,
as this is a micro-optimization. We intend to retrieve the list of
threads once on attach/exec and later track them through the LWP events
(PTRACE_LWP_CREATE, PTRACE_LWP_EXIT). It is more valuable to stay
compatible with the current usage of PT_LWPINFO.

7. Keep the existing ATF tests for PT_LWPINFO to avoid rot.


PT_LWPSTATUS and PT_LWPNEXT operate over newly introduced "struct
ptrace_lwpstatus". This structure is inspired by:
 - SmartOS lwpstatus_t,
 - struct ptrace_lwpinfo from NetBSD,
 - struct ptrace_lwpinfo from FreeBSD

and by their usage in existing open-source software.


#define PL_LNAMELEN	20	/* extra 4 for alignment */

struct ptrace_lwpstatus {
	lwpid_t		pl_lwpid;		/* LWP described */
	sigset_t	pl_sigpend;		/* LWP signals pending */
	sigset_t	pl_sigmask;		/* LWP signal mask */
	char		pl_name[PL_LNAMELEN];	/* LWP name, may be empty */
	void		*pl_private;		/* LWP private data */
	/* Add fields at the end */
};


 - pl_lwpid is taken from PT_LWPINFO.

 - pl_event is removed entirely as useless, misleading and harmful.

 - pl_sigpend and pl_sigmask are mainly intended to untangle the
per-LWP signal state from the cpi_sig* fields of "struct
netbsd_elfcore_procinfo" (fixing an "XXX" in the kernel code).

 - pl_name is a quick-to-use way to retrieve the LWP name, replacing
sysctl() queries (previous algorithm: retrieve the number of LWPs,
retrieve all LWPs, iterate over them, find the matching id, copy the
LWP name); pl_name will also supply the LWP name information that is
currently missing from core(5) files.

 - pl_private implements the currently missing interface to read the
TLS base value.

In the end I decided to avoid a write-mode version of PT_LWPSTATUS
that rewrites the signals, the name or the private pointer. These
options are practically unused in existing open-source software. There
are 2 exceptions that I am familiar with, but both are specific to
kludges overusing ptrace(2). If these operations are ever really
needed, they can be implemented without a write-mode version of
PT_LWPSTATUS, by patching the guest's code.
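
To make the difference between the two read-only calls concrete, here
is a small userland model of their semantics. It is only a sketch: the
sketch_* names, the fake tracee table and the reduced structure are
hypothetical stand-ins invented for illustration (the real definitions
live in <sys/ptrace.h> on NetBSD and the real calls go through
ptrace(2)), and returning -1 for an unknown LWP id is an assumption
about error handling, not a statement about the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; they mirror only the fields used below. */
typedef int sketch_lwpid_t;
#define SKETCH_LNAMELEN 20

struct sketch_lwpstatus {
	sketch_lwpid_t	pl_lwpid;			/* LWP described */
	char		pl_name[SKETCH_LNAMELEN];	/* may be empty */
};

/* Fake tracee: a fixed table of LWPs instead of a real process. */
struct sketch_tracee {
	const struct sketch_lwpstatus *lwps;
	size_t nlwps;
};

/*
 * PT_LWPNEXT-like semantics: on input pl_lwpid names the previous LWP
 * (0 to start), on output pl describes the next one; pl_lwpid == 0
 * signals the end of the list.  Returns -1 for an unknown id (assumed).
 */
static int
sketch_lwpnext(const struct sketch_tracee *t, struct sketch_lwpstatus *pl)
{
	size_t i = 0;

	assert(t != NULL && pl != NULL);
	if (pl->pl_lwpid != 0) {
		while (i < t->nlwps && t->lwps[i].pl_lwpid != pl->pl_lwpid)
			i++;
		if (i == t->nlwps)
			return -1;
		i++;	/* step past the previous LWP */
	}
	if (i == t->nlwps)
		memset(pl, 0, sizeof(*pl));	/* no more LWPs */
	else
		*pl = t->lwps[i];
	return 0;
}

/* PT_LWPSTATUS-like semantics: describe exactly the LWP in pl_lwpid. */
static int
sketch_lwpstatus(const struct sketch_tracee *t, struct sketch_lwpstatus *pl)
{
	assert(t != NULL && pl != NULL);
	for (size_t i = 0; i < t->nlwps; i++) {
		if (t->lwps[i].pl_lwpid == pl->pl_lwpid) {
			*pl = t->lwps[i];
			return 0;
		}
	}
	return -1;
}

/* Caller-side loop; returns the number of LWPs seen. */
static size_t
sketch_collect_lwps(const struct sketch_tracee *t, sketch_lwpid_t *out,
    size_t cap)
{
	struct sketch_lwpstatus pl;
	size_t n = 0;

	memset(&pl, 0, sizeof(pl));	/* pl_lwpid = 0: start iterating */
	while (sketch_lwpnext(t, &pl) != -1 && pl.pl_lwpid != 0) {
		if (n < cap)
			out[n] = pl.pl_lwpid;
		n++;
	}
	return n;
}
```

A debugger would run the collecting loop once on attach/exec and then
use the exact-thread query whenever it needs the name or signal state
of one particular LWP.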


Diff fixing the build against the in-sources GDB is as follows:

diff --git a/external/gpl3/gdb/dist/gdb/nbsd-nat.c b/external/gpl3/gdb/dist/gdb/nbsd-nat.c
index e7a2da1134b3..775ea0a15d82 100644
--- a/external/gpl3/gdb/dist/gdb/nbsd-nat.c
+++ b/external/gpl3/gdb/dist/gdb/nbsd-nat.c
@@ -145,10 +145,10 @@ nbsd_nat_target::thread_alive (ptid_t ptid)
 {
   if (ptid.lwp_p ())
     {
-      struct ptrace_lwpinfo pl;
+      struct ptrace_lwpstatus pl;

       pl.pl_lwpid = ptid.lwp ();
-      if (ptrace (PT_LWPINFO, ptid.pid (), (caddr_t) &pl, sizeof pl)
+      if (ptrace (PT_LWPSTATUS, ptid.pid (), (caddr_t) &pl, sizeof pl)
 	  == -1)
 	return 0;
     }
@@ -255,10 +255,10 @@ static void
 nbsd_add_threads (pid_t pid)
 {
   int val;
-  struct ptrace_lwpinfo pl;
+  struct ptrace_lwpstatus pl;

   pl.pl_lwpid = 0;
-  while ((val = ptrace (PT_LWPINFO, pid, (void *)&pl, sizeof(pl))) != -1
+  while ((val = ptrace (PT_LWPNEXT, pid, (void *)&pl, sizeof(pl))) != -1
     && pl.pl_lwpid != 0)
     {
       ptid_t ptid = ptid_t (pid, pl.pl_lwpid, 0);


This patch switches LLDB to the new calls:
http://netbsd.org/~kamil/patch-00208-lldb.txt


The following patch
http://netbsd.org/~kamil/patch-00207-pt_lwpstatus.2.txt implements:

 - PT_LWPSTATUS + PT_LWPNEXT
 - obsoletes PT_LWPINFO
 - adds ATF regression tests
 - implements core(5) support for "PT_LWPSTATUS@nnn"
 - compat32 support
 - switches GDB to new calls

Delayed to be done after merging this into HEAD: man-page update.

It's late in the -9 phase of development and it is fine to keep this
change for NetBSD-10.
