tech-kern archive


Re: ptrace(2) interface for hardware watchpoints (breakpoints)



On Thu, Dec 15, 2016 at 19:51:35 +0100, Kamil Rytarowski wrote:

> On 15.12.2016 16:42, Valery Ushakov wrote:
> > Again, you don't provide any details.  What extra logic?  Also, what
> > are these few dozens of instructions you are talking about?  I.e. what
> > is that extra work you have to do for a process-wide watchpoint that
> > you don't have to do for an lwp-specific watchpoint on each return to
> > userland?
> 
> 1. Complexity in the interface: the ptrace_watchpoint structure would
> need an extra field to specify per-thread or per-process scope. And if
> someone then wants per-thread watchpoints stored inside the process
> structure, the kernel would need a list of watchpoints that scales as
> the number of possible watchpoints times the number of threads.
> 
> 2. Complexity on return to userland: userret(9) would need to lock the
> process structure and check every watchpoint to see whether it is
> process-wide or dedicated to the current thread.

Why would you need all this?  Consider the case where the debug
registers are part of the mcontext: then the very act of restoring the
context enables the corresponding watchpoints for the lwp.  When the
debug registers are not part of the mcontext, the only difference is
that after restoring the mcontext you also set the debug registers
from some other structure.
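
As a sketch, assuming hypothetical structure and field names
(mc_dbregs, hw_dbregs): if the debug registers are carried in the
mcontext, the normal context restore already reloads them, so no
separate watchpoint pass is needed on the way back to userland:

```c
/*
 * Sketch only: mc_dbregs/hw_dbregs are made-up names standing in for
 * whatever the real MD context layout would be.
 */
#include <string.h>

struct dbregs {
	unsigned dr[4];		/* watchpoint addresses */
	unsigned control;	/* enable/match bits */
};

struct mcontext {
	unsigned gpr[16];
	struct dbregs mc_dbregs;	/* debug registers as ordinary context */
};

struct cpu_regs {
	unsigned gpr[16];
	struct dbregs hw_dbregs;	/* what actually reaches the hardware */
};

/* Restoring the context restores the watchpoints too. */
void
restore_mcontext(struct cpu_regs *cpu, const struct mcontext *mc)
{
	memcpy(cpu->gpr, mc->gpr, sizeof cpu->gpr);
	cpu->hw_dbregs = mc->mc_dbregs;	/* no separate watchpoint pass */
}
```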

E.g. sh3 uses the User Break Controller to implement single-stepping:
effectively a kind of watchpoint that triggers after every instruction,
without matching any address bits, ASID, etc.  The UBC register that
enables the watchpoint is set from a field in the trapframe, just like
any other register.

So at ptrace(2) time, to set a process-wide watchpoint, you go over
all existing lwps and set up their trapframes accordingly.  For new
lwps created after the watchpoint is set, you do that at lwp creation
time.  But when an lwp returns to userland, there's no overhead.
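
A minimal sketch of that scheme, using made-up names (tf_dbcr, p_dbcr,
DBCR_WP0_ENABLE stand in for whatever the real MD trapframe field and
enable bit would be):

```c
/* Hypothetical names throughout; this only illustrates the shape. */
#include <stddef.h>

#define DBCR_WP0_ENABLE	0x1u

struct lwp {
	unsigned tf_dbcr;	/* debug-control bits saved in the trapframe */
	struct lwp *l_next;
};

struct proc {
	struct lwp *p_lwps;	/* list of this process's lwps */
	unsigned p_dbcr;	/* bits to copy into lwps created later */
};

/* ptrace(2) time: walk every existing lwp and patch its trapframe. */
void
proc_set_watchpoint(struct proc *p, unsigned bits)
{
	p->p_dbcr |= bits;
	for (struct lwp *l = p->p_lwps; l != NULL; l = l->l_next)
		l->tf_dbcr |= bits;
}

/* lwp creation time: inherit the process-wide bits. */
void
lwp_init_dbregs(const struct proc *p, struct lwp *l)
{
	l->tf_dbcr |= p->p_dbcr;
}
```

Nothing here runs in userret(); the bits are already in the trapframe
when the lwp returns to userland.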


> I originally implemented it per process, but finally decided to throw
> the per-process vs. per-thread logic out of the kernel and expose
> watchpoints (technically, bitmasks of the available debug registers)
> to userland.
> 
> It's easier to check a per-lwp local structure with at most 4 fields
> than to lock a list and iterate over N elements. Every thread also has
> a dedicated flag bit indicating whether it has watchpoints attached.
> 
> From the userland point of view the management is equivalent, with the
> difference that the debugger needs to catch thread creation and apply
> the desired watchpoints to the new thread.
> 
> Why bitmasks and not raw registers? At some level the kernel has to
> check that the composed combination is valid, so separating the
> user-settable bits of the registers into a bitmask is needed anyway.
> And while that can be done in the kernel, why not export it to
> userland?
> 
> I've found that this makes it easier to reuse in third-party software.
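
For illustration, the kernel-side validity check described above could
look like this (USER_DBCR_MASK is a hypothetical MD constant naming the
user-settable bits; on x86 it would cover the DR7 bits a debugger may
legitimately set):

```c
/* Sketch: validate a user-supplied debug-control bitmask. */
#include <stdbool.h>

#define USER_DBCR_MASK	0x0000ffffu	/* user-settable bits, MD-defined */

bool
dbregs_validate(unsigned user_bits)
{
	/* Reject anything outside the exported bitmask. */
	return (user_bits & ~USER_DBCR_MASK) == 0;
}
```

Exporting USER_DBCR_MASK to userland lets a debugger perform the same
check before making the syscall.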

-uwe

