PT_LWPINFO is a legacy ptrace(2) operation that was originally intended
to retrieve thread (LWP) information from a traced process. In practice,
the call was designed to work as an iterator over threads, retrieving the
LWP id and event information. The event information is returned in a raw
format (PL_EVENT_NONE, PL_EVENT_SIGNAL, PL_EVENT_SUSPENDED).

Problems:

1. PT_LWPINFO shares its operation name with PT_LWPINFO from FreeBSD,
   which works differently and is used for different purposes:

   - On FreeBSD, PT_LWPINFO returns information about the suspended
     thread, not the next thread in an iteration.
   - FreeBSD uses a separate interface for iterating over threads (the
     threads are actually retrieved with PT_GETNUMLWPS + PT_GETLWPLIST).
   - There is almost no overlapping correct usage of PT_LWPINFO on NetBSD
     and PT_LWPINFO on FreeBSD, and this causes confusion and misuse of
     the interfaces (I recently fixed such misuse in the DTrace code).

2. pl_event can merely report whether a signal was emitted to all threads
   or to a single one. There is no information on whether this is a
   per-LWP or a per-PROC signal, no siginfo_t information attached, etc.

3. Syncing our behavior with FreeBSD would completely break our
   PT_LWPINFO users, and it is not actually needed, as we receive the full
   siginfo_t through the Linux-like PT_GET_SIGINFO instead of
   reimplementing siginfo_t inside ptrace_lwpinfo in the FreeBSD style.
   (Actually, FreeBSD wanted to follow NetBSD and adopt some of our APIs
   in ptrace(2) and signals.)

4. The usability of our PT_LWPINFO is reduced to listing the LWP ids in a
   traced process.

5. The PT_LWPINFO semantics cannot be used in core files as-is (our
   PT_LWPINFO returns the next LWP, not the requested one), and pl_event
   is redundant with netbsd_elfcore_procinfo.cpi_siglwp and still less
   powerful (it cannot distinguish per-LWP and per-PROC signals in a
   single-threaded application).

6. PT_LWPINFO is already documented in the BUGS section of ptrace(2), as
   it contains more flaws.

This is basically the only weak part of our ptrace(2) API.

Proposed solution:

1. Remove PT_LWPINFO from the public ptrace(2) API and keep it only as a
   hidden, namespaced symbol for legacy purposes.

2. Introduce PT_LWPSTATUS, which queries the kernel about an exact thread
   and retrieves useful information about that LWP.

3. Introduce PT_LWPNEXT with the iteration semantics of PT_LWPINFO,
   namely returning the next LWP.

4. Ship per-LWP information in core(5) files as "PT_LWPSTATUS@nnn".

5. Fix the flattening of the signal context in netbsd_elfcore_procinfo in
   core(5) files and move per-LWP signal information to the per-LWP
   structure "PT_LWPSTATUS@nnn".

6. Do not bother with FreeBSD-like PT_GETNUMLWPS + PT_GETLWPLIST calls,
   as this is a micro-optimization. We intend to retrieve the list of
   threads once on attach/exec and later track them through the LWP
   events (PTRACE_LWP_CREATE, PTRACE_LWP_EXIT). It is more valuable to
   stay compatible with the current usage of PT_LWPINFO.

7. Keep the existing ATF tests for PT_LWPINFO to avoid rot.

PT_LWPSTATUS and PT_LWPNEXT operate on the newly introduced
"struct ptrace_lwpstatus". This structure is inspired by:

 - SmartOS lwpstatus_t,
 - struct ptrace_lwpinfo from NetBSD,
 - struct ptrace_lwpinfo from FreeBSD

and by their usage in existing open-source software.

#define PL_LNAMELEN 20  /* extra 4 for alignment */

struct ptrace_lwpstatus {
        lwpid_t         pl_lwpid;               /* LWP described */
        sigset_t        pl_sigpend;             /* LWP signals pending */
        sigset_t        pl_sigmask;             /* LWP signal mask */
        char            pl_name[PL_LNAMELEN];   /* LWP name, may be empty */
        void            *pl_private;            /* LWP private data */
        /* Add fields at the end */
};

 - pl_lwpid is taken from PT_LWPINFO.
 - pl_event is removed entirely as useless, misleading and harmful.
 - pl_sigpend and pl_sigmask are mainly intended to untangle the cpi_sig*
   fields of netbsd_elfcore_procinfo by carrying the per-LWP signal state
   in "struct ptrace_lwpstatus" (fixing an "XXX" in the kernel code).
 - pl_name is a quick-to-use API for retrieving the LWP name, replacing
   sysctl() queries (previous algorithm: retrieve the number of LWPs,
   retrieve all LWPs, iterate over them, find the matching id, copy the
   LWP name). pl_name will also carry the currently missing LWP name
   information in core(5) files.
 - pl_private implements the currently missing interface for reading the
   TLS base value.

In the end I decided to avoid a write-mode version of PT_LWPSTATUS that
rewrites the signals, the name or the private pointer. These options are
practically unused in existing open-source software. There are 2
exceptions that I am familiar with, but both are specific to kludges
overusing ptrace(2). If these operations are ever really needed, they can
be implemented without a write-mode version of PT_LWPSTATUS, by patching
the guest's code.

The diff fixing the build of the in-tree GDB is as follows:

diff --git a/external/gpl3/gdb/dist/gdb/nbsd-nat.c b/external/gpl3/gdb/dist/gdb/nbsd-nat.c
index e7a2da1134b3..775ea0a15d82 100644
--- a/external/gpl3/gdb/dist/gdb/nbsd-nat.c
+++ b/external/gpl3/gdb/dist/gdb/nbsd-nat.c
@@ -145,10 +145,10 @@ nbsd_nat_target::thread_alive (ptid_t ptid)
 {
   if (ptid.lwp_p ())
     {
-      struct ptrace_lwpinfo pl;
+      struct ptrace_lwpstatus pl;
 
       pl.pl_lwpid = ptid.lwp ();
-      if (ptrace (PT_LWPINFO, ptid.pid (), (caddr_t) &pl, sizeof pl)
+      if (ptrace (PT_LWPSTATUS, ptid.pid (), (caddr_t) &pl, sizeof pl)
          == -1)
        return 0;
     }
@@ -255,10 +255,10 @@ static void
 nbsd_add_threads (pid_t pid)
 {
   int val;
-  struct ptrace_lwpinfo pl;
+  struct ptrace_lwpstatus pl;
 
   pl.pl_lwpid = 0;
-  while ((val = ptrace (PT_LWPINFO, pid, (void *)&pl, sizeof(pl))) != -1
+  while ((val = ptrace (PT_LWPNEXT, pid, (void *)&pl, sizeof(pl))) != -1
     && pl.pl_lwpid != 0)
     {
       ptid_t ptid = ptid_t (pid, pl.pl_lwpid, 0);

This patch switches LLDB:

  http://netbsd.org/~kamil/patch-00208-lldb.txt

The following patch:

  http://netbsd.org/~kamil/patch-00207-pt_lwpstatus.2.txt

does the following:

 - implements PT_LWPSTATUS + PT_LWPNEXT
 - obsoletes PT_LWPINFO
 - adds ATF regression tests
 - implements core(5) support for "PT_LWPSTATUS@nnn"
 - adds compat32 support
 - switches GDB to the new calls

Delayed until after merging this into HEAD: the man-page update.

It is late in the -9 phase of development, so it is fine to hold this
change for NetBSD-10.
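
To make the difference between the two proposed operations concrete, here
is a small user-space sketch of their semantics. It is only a model and
makes no real ptrace(2) call: every sk_* name, the reduced structure and
the table of LWPs are hypothetical, and the -1 return for an unknown LWP
is an assumption of the model, not a statement about the kernel. The
iteration protocol (start with pl_lwpid = 0, stop when pl_lwpid comes
back as 0) follows the nbsd_add_threads loop in the GDB diff above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_LNAMELEN 20 /* mirrors PL_LNAMELEN from the proposal */

/* Hypothetical, reduced stand-in for the proposed struct ptrace_lwpstatus. */
struct sk_lwpstatus {
    int  pl_lwpid;             /* LWP described (lwpid_t in the proposal) */
    char pl_name[SK_LNAMELEN]; /* LWP name, may be empty */
};

/* Invented "traced process": three LWPs, one of them unnamed. */
static const struct sk_lwpstatus sk_lwps[] = {
    { 1, "main" }, { 2, "worker" }, { 5, "" },
};
#define SK_NLWPS (sizeof(sk_lwps) / sizeof(sk_lwps[0]))

/*
 * Models PT_LWPSTATUS: describe exactly the LWP named in pl->pl_lwpid.
 * Returns 0 on success, -1 if no such LWP exists (model assumption).
 */
int sk_lwpstatus(struct sk_lwpstatus *pl)
{
    for (size_t i = 0; i < SK_NLWPS; i++) {
        if (sk_lwps[i].pl_lwpid == pl->pl_lwpid) {
            *pl = sk_lwps[i];
            return 0;
        }
    }
    return -1;
}

/*
 * Models PT_LWPNEXT: describe the LWP that follows pl->pl_lwpid.
 * pl_lwpid == 0 on entry starts the walk; pl_lwpid == 0 on return
 * means there are no more LWPs.
 */
int sk_lwpnext(struct sk_lwpstatus *pl)
{
    size_t i = 0;

    if (pl->pl_lwpid != 0) {
        while (i < SK_NLWPS && sk_lwps[i].pl_lwpid != pl->pl_lwpid)
            i++;
        if (i == SK_NLWPS)
            return -1;          /* unknown LWP (model assumption) */
        i++;                    /* step to the successor */
    }
    if (i < SK_NLWPS)
        *pl = sk_lwps[i];
    else
        memset(pl, 0, sizeof(*pl)); /* end of list: pl_lwpid = 0 */
    return 0;
}

/* Walks all LWPs the way a debugger would on attach; returns the count. */
size_t sk_collect_lwps(int *ids, size_t cap)
{
    struct sk_lwpstatus pl;
    size_t n = 0;

    assert(ids != NULL || cap == 0);
    memset(&pl, 0, sizeof(pl)); /* pl_lwpid = 0 starts the iteration */
    while (sk_lwpnext(&pl) != -1 && pl.pl_lwpid != 0) {
        if (n < cap)
            ids[n] = pl.pl_lwpid;
        n++;
    }
    return n;
}
```

In this model a debugger would call sk_collect_lwps() once on attach and
afterwards use sk_lwpstatus() with a known LWP id, for example to read
pl_name without the sysctl() walk described earlier.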