tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: sysctl_doeproc() race



On 11.03.2018 13:00, Robert Elz wrote:
>     Date:        Sun, 11 Mar 2018 11:06:33 +0100
>     From:        Martin Husemann <martin%duskware.de@localhost>
>     Message-ID:  <20180311100633.GE23416%mail.duskware.de@localhost>
> 
>   | I don't get this part - how would we end up with the new process using
>   | the same pid?
> 
> From what I can see from glancing at the code, the issue is an attempt to
> monitor an unrelated process - one that is neither a child, nor being ptrace'd.
> 
> That process can exit, its zombie be cleaned up, and then a new process
> created which happens to have the same pid as the previous one had
> (most of this is intended to happen quite quickly, but there's no guarantee
> of that - the process doing the monitoring could be suspended and wake up
> days after the process it was looking for vanished).
> 
> All this is inherantly unreliable, and nothing that is done, beyond adding a 
> whole new mechanism to hold a process that some other process has an
> interest it, will ever fix it.
> 

POSIX people told me that polling of a process entity is reliable only
for parent (whether a real one or a tracer in non-POSIX extensions).

> Kamil:   What I don't understand is how you were ever getting the process
> returned twice?   You're using the sysctl to look for a specific pid right.?
> 

Yes.

> When that pid is found, the sysctl code should simply copy out the relevant
> datea, and return.   There's no point searching further in the lists, one pid
> can only exist once at a time - once found it is found....
> 

We can use pid lookup in this particular case calling proc_find_raw().

I treat this as just an optimization (something nice to have, but not
important now).

This will not solve the bug for other sysctl_doeproc() cases and a fix
is still needed.

> If that is not the way the sysctl lookup code is working then we should
> probably fix it.   There cannot be 2 processes with pid N at any one
> instant, so looking for a specific pid should only ever be able to return 1
> (or "not found" of course).
> 

Correct.

> It is possible to not find it, depending on what kind of locking the finding
> code is doing (for this, just being an "observation" interface, I'd assume the
> minimum possible) even though it exists, if the lists are changing underneath
> the search - but given the nature of what happens to a process, a search
> of zombproc, allproc, zombproc (stopping when found) will either find the 
> process or the process does not exist - and possibly just allproc followed
> by zombproc searches would work as well.
> 

We already use markers so I prefer to stick to this solution and it
makes this code reliable.

I was testing the reproducer for the patch of mine and after a long time
of concurrent execution of the tests I have not observed a single crash.
There are also no ATF regressions observed.

For the completeness I'm going to test the approach proposed by Christos
whether it is stable, and I plan to commit the patch of mine as a
personal preference.

> kre
> 


Attachment: signature.asc
Description: OpenPGP digital signature



Home | Main Index | Thread Index | Old Index