tech-toolchain archive


Re: LLDB/NetBSD May



On 03.05.2017 23:10, Christos Zoulas wrote:
> On May 3, 10:25pm, n54%gmx.com@localhost (Kamil Rytarowski) wrote:
> | B.
> | I'm still verifying single stepping of LWPs in processes with multiple
> | threads. I have an impression that something is fragile there.
> 
> Ok. Let me know when you have a reproducible problem...
> 

The problem looks similar to the one seen with PT_RESUME and PT_SUSPEND
(the per-LWP operations).

With multiple LWPs, when the tracee creates a thread and then raises a
signal for the tracer, the process cannot be single-stepped: one thread
apparently never starts, or dies (?), and _lwp_wait() (for a reasonable
lwpid_t value: 2) returns EDEADLK.

The sequence in the tracee is:

_lwp_makecontext()
_lwp_create()
raise(signal)
_lwp_wait()
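
For reference, the tracee side of that sequence can be sketched roughly
as below. This is only a sketch under assumptions: the helper names
(lwp_main, run_tracee_sequence), the 64 KiB stack size and the flags
value 0 are illustrative and not taken from the failing test, and the
tracer side (a parent using PT_STEP or PT_SETSTEP) is not shown. The
_lwp_*() calls exist only on NetBSD, so other systems get a stub that
returns -1.

```c
#include <errno.h>
#include <signal.h>
#include <stdlib.h>

#ifdef __NetBSD__
#include <lwp.h>
#include <ucontext.h>

/* Entry point of the new LWP: exit at once so _lwp_wait() can reap it. */
static void
lwp_main(void *arg)
{
	(void)arg;
	_lwp_exit();
}

/*
 * Tracee side only. Returns 0 if the LWP was created and reaped,
 * -1 with errno set otherwise (EDEADLK is the failure described above).
 */
int
run_tracee_sequence(int sig)
{
	ucontext_t uc;
	lwpid_t lid;
	size_t ssize = 64 * 1024;	/* assumed stack size */
	void *stack = malloc(ssize);

	if (stack == NULL)
		return -1;

	_lwp_makecontext(&uc, lwp_main, NULL, NULL, stack, ssize);
	if (_lwp_create(&uc, 0, &lid) != 0) {
		free(stack);
		return -1;
	}

	raise(sig);	/* the tracer is expected to intercept this signal */

	if (_lwp_wait(lid, NULL) != 0)
		return -1;	/* stack not freed: the LWP may still be using it */

	free(stack);
	return 0;
}
#else
/* Stub for systems without the NetBSD LWP interface. */
int
run_tracee_sequence(int sig)
{
	(void)sig;
	errno = ENOSYS;
	return -1;
}
#endif
```

Without a tracer attached, the raised signal takes its default action,
so this is only meaningful when called from a traced child.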

This is not restricted to PT_SETSTEP; the same happens with PT_STEP.

I will go down this rabbit hole and debug it until the bug is squashed.
It will take a while, but understanding what is going on is worthwhile
in itself (besides the benefit of fixing it).

Another report was filed for PT_RESUME... there is pressure from the
community:

"Several ptrace_wait test cases fail under DEBUG+LOCKDEBUG"
http://gnats.netbsd.org/52213

> | C.
> | LLDB tests trigger dmesg errors (default GENERIC kernel), there are
> | entries like:
> | fill_vmentry: vp 0xfffffe87288967e8 error 2
> | fill_vmentry: vp 0xfffffe86e1a15930 error 2
> | fill_vmentry: vp 0xfffffe87047f8bd8 error 2
> | fill_vmentry: vp 0xfffffe87051af7e0 error 2
> | fill_vmentry: vp 0xfffffe86ef0b63f0 error 2
> 
> This is DIAGNOSTIC and it is tangentially related to your favorite
> friend (F_GETPATH) :-)
> 
> Let me explain what's wrong here. Getting from a file descriptor
> to a vnode is always a success (if the file descriptor refers to one)
> (vp is the pointer to a vnode here).
> Getting from a vnode to a path is not (here you get 2 ENOENT from
> vnode_to_path):
> 
> 1. The file is removed so there is no path (what I suspect is happening here).
> 2. There is more than one path and it is not deterministic which one you get
>    (usually does not matter, but it does when you don't have permission to
>     get to the one returned but you have to the other)
> 3. vnode_to_path() uses the reverse-namei cache to do its deed. This can
>    lose in 2 different ways:
> 	- cache eviction: not really an issue unless there is memory pressure
> 	  (still need to handle it, but infrequent).
> 	- path component length... The dreaded NCHNAMLEN (31) constant which
> 	  is the component namelength limit for the current namei cache
> 	  implementation (we should really fix that one day).
> 
> This is why I keep saying forget adding F_GETPATH unless you can make it
> work reliably first :-)
> 
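
Case 1 above (the file is removed, so there is no path left) is easy to
observe from userland. A minimal sketch, assuming a writable /tmp; the
function name and the temporary file template are made up for the
illustration, and nothing here calls into vnode_to_path():

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, remove its only name, then fstat() the
 * still-open descriptor. Returns the link count seen after unlink(),
 * or -1 on error.
 */
long
nlink_after_unlink(void)
{
	char path[] = "/tmp/vnode_demo_XXXXXX";	/* hypothetical template */
	struct stat st;
	long nlink;
	int fd = mkstemp(path);

	if (fd == -1)
		return -1;
	if (unlink(path) == -1 || fstat(fd, &st) == -1) {
		close(fd);
		return -1;
	}
	nlink = (long)st.st_nlink;	/* no name refers to the file any more */
	close(fd);
	return nlink;
}
```

The descriptor stays valid for fstat(), read() and write() after the
unlink(), which is exactly the situation where a descriptor still maps
to a vnode but there is no pathname left to report.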

Thank you for the analysis. These reports aren't fatal to the stability
of the system. Once I have sorted out the noise from the tests, I will
take a closer look at this.
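
To check whether the test paths can even run into the NCHNAMLEN limit
mentioned above, something like the following helper could be used. It
is a sketch only: the function name and DEMO_NCHNAMLEN are hypothetical,
the value 31 is copied from the message above rather than from kernel
headers, and treating "longer than 31 bytes" as the cut-off is an
assumption about how the cache applies the limit.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_NCHNAMLEN 31	/* value quoted above; not from kernel headers */

/*
 * Return 1 if every '/'-separated component of path is at most
 * limit bytes long, 0 otherwise.
 */
int
components_fit(const char *path, size_t limit)
{
	while (*path != '\0') {
		size_t len = strcspn(path, "/");	/* length of this component */

		if (len > limit)
			return 0;
		path += len;
		while (*path == '/')	/* skip one or more separators */
			path++;
	}
	return 1;
}
```

For example, components_fit("/usr/lib/libc.so", DEMO_NCHNAMLEN) reports
that all components fit, while a path with a 32-byte file name does not.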



