tech-kern archive


Re: ptrace(2) interface for hardware watchpoints (breakpoints)



On Thu, Dec 15, 2016 at 19:51:35 +0100, Kamil Rytarowski wrote:

> On 15.12.2016 16:42, Valery Ushakov wrote:
> > Again, you don't provide any details.  What extra logic?  Also, what
> > are these few dozens of instructions you are talking about?  I.e. what
> > is that extra work you have to do for a process-wide watchpoint that
> > you don't have to do for an lwp-specific watchpoint on each return to
> > userland?
> 
> 1. Complexity in the interface: the ptrace_watchpoint structure would
> need an extra field to specify per-thread or per-process scope. And if
> someone then wants per-thread watchpoints stored inside the process
> structure, the kernel would need a list of watchpoints that scales as
> the number of possible watchpoints times the number of threads.
> 
> 2. Complexity on return to userland: userret(9) would need to lock the
> process structure and check every watchpoint to see whether it is
> process-wide or dedicated to the current thread.

Why would you need all this?  Consider the case where the debug
registers are part of the mcontext: then the very act of restoring the
context enables the corresponding watchpoints for the lwp.  When the
debug registers are not part of the mcontext, the only difference is
that after restoring the mcontext you also set the debug registers
from some other structure.
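
As a sketch, assuming hypothetical structure and field names
(mc_dbregs, hw_dbregs): if the debug registers are carried in the
mcontext, the normal context restore already reloads them, so no
separate watchpoint pass is needed on the way back to userland:

```c
/*
 * Sketch only: mc_dbregs/hw_dbregs are made-up names standing in for
 * whatever the real MD context layout would be.
 */
#include <string.h>

struct dbregs {
	unsigned dr[4];		/* watchpoint addresses */
	unsigned control;	/* enable/match bits */
};

struct mcontext {
	unsigned gpr[16];
	struct dbregs mc_dbregs;	/* debug registers as ordinary context */
};

struct cpu_regs {
	unsigned gpr[16];
	struct dbregs hw_dbregs;	/* what actually reaches the hardware */
};

/* Restoring the context restores the watchpoints too. */
void
restore_mcontext(struct cpu_regs *cpu, const struct mcontext *mc)
{
	memcpy(cpu->gpr, mc->gpr, sizeof cpu->gpr);
	cpu->hw_dbregs = mc->mc_dbregs;	/* no separate watchpoint pass */
}
```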

E.g. sh3 uses the User Break Controller to implement single-stepping:
effectively a kind of watchpoint that triggers after every instruction,
without matching any address bits, ASID, etc.  The UBC register that
enables the watchpoint is set from a field in the trapframe, just like
any other register.

So at ptrace(2) time, to set a process-wide watchpoint, you go over
all existing lwps and set up their trapframes accordingly.  For new
lwps created after the watchpoint is set, you do that at lwp creation
time.  But when an lwp returns to userland, there's no overhead.
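
A minimal sketch of that scheme, using made-up names (tf_dbcr, p_dbcr,
DBCR_WP0_ENABLE stand in for whatever the real MD trapframe field and
enable bit would be):

```c
/* Hypothetical names throughout; this only illustrates the shape. */
#include <stddef.h>

#define DBCR_WP0_ENABLE	0x1u

struct lwp {
	unsigned tf_dbcr;	/* debug-control bits saved in the trapframe */
	struct lwp *l_next;
};

struct proc {
	struct lwp *p_lwps;	/* list of this process's lwps */
	unsigned p_dbcr;	/* bits to copy into lwps created later */
};

/* ptrace(2) time: walk every existing lwp and patch its trapframe. */
void
proc_set_watchpoint(struct proc *p, unsigned bits)
{
	p->p_dbcr |= bits;
	for (struct lwp *l = p->p_lwps; l != NULL; l = l->l_next)
		l->tf_dbcr |= bits;
}

/* lwp creation time: inherit the process-wide bits. */
void
lwp_init_dbregs(const struct proc *p, struct lwp *l)
{
	l->tf_dbcr |= p->p_dbcr;
}
```

Nothing here runs in userret(); the bits are already in the trapframe
when the lwp returns to userland.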


> I originally implemented it per process, but finally decided to throw
> the per-process vs. per-thread logic out of the kernel and expose
> watchpoints (technically, bitmasks of the available debug registers)
> to userland.
> 
> It's easier to check a per-lwp local structure with at most 4 fields
> than to lock a list and iterate over N elements. Every thread also has
> a dedicated flag bit indicating whether it has watchpoints attached.
> 
> From the userland point of view the management is equivalent, with the
> difference that the debugger needs to catch thread creation and apply
> the desired watchpoints to the new thread.
> 
> Why bitmasks and not raw registers? At some level the kernel has to
> check that the composed combination is valid, so separating the
> user-settable bits of the registers into a bitmask is needed anyway.
> And while that can be done in the kernel, why not export it to
> userland?
> 
> I've found that this makes it easier to reuse in third-party software.
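
For illustration, the kernel-side validity check described above could
look like this (USER_DBCR_MASK is a hypothetical MD constant naming the
user-settable bits; on x86 it would cover the DR7 bits a debugger may
legitimately set):

```c
/* Sketch: validate a user-supplied debug-control bitmask. */
#include <stdbool.h>

#define USER_DBCR_MASK	0x0000ffffu	/* user-settable bits, MD-defined */

bool
dbregs_validate(unsigned user_bits)
{
	/* Reject anything outside the exported bitmask. */
	return (user_bits & ~USER_DBCR_MASK) == 0;
}
```

Exporting USER_DBCR_MASK to userland lets a debugger perform the same
check before making the syscall.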

-uwe

