tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: sysctl_doeproc() race



On 13.03.2018 09:17, Robert Elz wrote:
> Not a surprise there either - where it might make a difference is when
> the number of processes is more in the thousands.   I'm not sure how
> 20-30 processes counts as a "regular running system" though, my
> laptop has > 130 processes at the minute, and it is more or less idling
> (replying to this e-mail is its current real task) - other times it wil be
> doing a build, perhaps with -j4 or -j8, and be running many more.
> 

I was testing it on a headless desktop. Benchmarking with -j4 or -j8 in
the background couldn't be reliable so the system was idle.

In general I think that refactoring for speed should be done together
with restructuring the locking, structures etc... and when the
bottle-neck could be detected.

>   | I expect an interface like sysctl(3) to get a quick snapshot of the current
>   | system, 
> 
> and I think that's a faulty expectation 

I agree that it's not perfect in the SMP world.

> 
> Are there any other ATF tests that use this sysctl?    Clearly
> nothing else should be affected by altering the way it works...
> 

I was running all the available ATF test for each patch.

I was also testing in parallel a loops like: "while [ 1 ]; do ps aux;
done", "top" and a testcase provided by myself with ptrace(2).

> What I'd suggest is adding one more test, where you create
> processes much like in the ptrace tests, but without any ptrace
> involvement, where the child is simply fork/wait (with random
> small delays - of the order of milliseconds - between those steps)
> and its child simply does an exit after a small random delay,
> with the 1st (direct) sending messages containing the pid
> of the current grandchild to the ATF created process which
> will be in a loop doing continuous sysctl() looking for that
> exact process.
> 

I'm planning to add race* tests in t_ptrace_wait*. I'm encouraged that
we already have tests for races so spending like 30sec on this won't
hurt, given the benefit of regress validation.

A test out of the ptrace(2) context is fine too. I will add it, perhaps
something without sending pid_t messages to make it simpler.

> I also suspect that if you test again now, with the sole change to
> the current code being that the order of the list scans be
> reverted to the way it was (allproc and then zombproc) that
> the issues you were having with the ptrace() tests will all be
> gone.
> 

Yes, however in theory we still can miss a process (in range lookup of
process groups) - during proc_lock everything can happen. I will resist
and prefer the new approach as being less prone to multiple observations.

> kre
> 


Attachment: signature.asc
Description: OpenPGP digital signature



Home | Main Index | Thread Index | Old Index