tech-kern archive


Re: sysctl_doeproc() race



    Date:        Tue, 13 Mar 2018 12:04:42 +0100
    From:        Kamil Rytarowski <n54%gmx.com@localhost>
    Message-ID:  <5be4edf8-4722-e586-ea3e-0021862a400a%gmx.com@localhost>

  | I was testing it on a headless desktop. Benchmarking with -j4 or -j8 in
  | the background couldn't be reliable so the system was idle.

"Not reliable" in the sense that two results aren't directly comparible, no,
of course not, but like any other real world measurement, collecting enough
results while controlling just one variable allows conclusions to be drawn
about the effects of that variable, regardless of what else is happening
(provided it isn't being deliberately controlled as well.)    But I'm no
statistician, so exactly what needs to be done is beyond my competence.

  | In general I think that refactoring for speed should be done together [...]

Sure - I was not suggesting that anything should be done to this code to
make it run faster - I had seen that "could just use ..." comment ages ago,
and decided (as I assume its original author did, even more ages previously)
that it just wasn't worth the effort for this interface - typically nothing is
going to care how fast it is.

  | I was running all the available ATF test for each patch.

Yes, understood - and a good idea - the point was that if changing just
this function (ie: affecting just sysctl of proc...) affected most of the other
tests it would indicate that something was very broken indeed.
You could add
	KASSERT(false);
into the sysctl code for proc and the tests (except yours, I suspect) would
never notice...

  | I was also testing in parallel a loops like: "while [ 1 ]; do ps aux;
  | done", "top" and a testcase provided by myself with ptrace(2).

Perhaps a better parallel operation would be something like

	while :
	do cat
	done </dev/null

(ie: just creating lots of processes that go away, very quickly.)
Run two or three of those in parallel, while the ptrace (and other)
tests are running.    I'm not expecting any issues from that, but
it is a better test than just running ps in parallel.
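
(Purely as an illustration of the same idea - none of this is in the
original suggestion, and the shell loop above is quite sufficient - a
small C program that forks children which exit immediately would churn
the proc list even faster, since no exec is involved:)

	/*
	 * Sketch only: fork children that exit at once, so processes are
	 * created and destroyed as fast as the machine will go while the
	 * tests run.
	 */
	#include <err.h>
	#include <errno.h>
	#include <sys/types.h>
	#include <sys/wait.h>
	#include <unistd.h>

	int
	main(void)
	{
		for (;;) {
			pid_t pid = fork();

			if (pid == -1)
				err(1, "fork");
			if (pid == 0)
				_exit(0);	/* child: vanish immediately */
			while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
				continue;	/* parent: reap, then repeat */
		}
		/* NOTREACHED */
	}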

  | Yes, however in theory we still can miss a process (in range lookup of
  | process groups) - during proc_lock everything can happen.

Yes, the variants for other than KERN_PROC_PID are even more
likely to miss things - but aside from processes turning into zombies,
all that depends upon the thing we are looking for changing during
the scan - and that's a true race, which the sysctl code has no
defined right to win.    In the "turn into zombies" case, nothing is
different, and again, I think I'd generally prefer to see such processes
twice rather than not see them at all (though I agree once would be
better).  But making that happen either requires freezing updates, which
would not be a good idea, or adding a new "processed" flag that would
need to be set in each process (with whatever locking that would
require) and then cleaned again after the sysctl loop finishes
(or cleaned in a separate loop before the main loop starts.)
I don't think that's worth it - the code doing all these other scans
already expects multiple results, but has no real idea how many
there will be, so getting one or two "extra" thrown in should not
bother it.   Where it matters (which will not be often I suspect) it
just needs to deal with the duplicates.
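
(To make that last point concrete - this is a sketch of mine, not
anything from the existing code, and it assumes the usual NetBSD
KERN_PROC2 / struct kinfo_proc2 idiom - a consumer that simply skips
duplicate PIDs in a KERN_PROC_ALL scan needs nothing more than:)

	/*
	 * Sketch only: fetch the full process list via KERN_PROC2 and
	 * ignore any PID already seen in this scan, on the theory that a
	 * process appearing twice is possible and harmless.
	 */
	#include <sys/param.h>
	#include <sys/sysctl.h>

	#include <err.h>
	#include <stdbool.h>
	#include <stdio.h>
	#include <stdlib.h>

	int
	main(void)
	{
		int mib[6] = { CTL_KERN, KERN_PROC2, KERN_PROC_ALL, 0,
		    (int)sizeof(struct kinfo_proc2), 0 };
		size_t len;

		/* First call: ask how much space the result needs. */
		if (sysctl(mib, 6, NULL, &len, NULL, 0) == -1)
			err(1, "sysctl (size)");

		len += len / 8;		/* slack for processes born meanwhile */
		struct kinfo_proc2 *kp = malloc(len);
		if (kp == NULL)
			err(1, "malloc");

		mib[5] = (int)(len / sizeof(*kp));
		if (sysctl(mib, 6, kp, &len, NULL, 0) == -1)
			err(1, "sysctl (fetch)");

		size_t n = len / sizeof(*kp);
		for (size_t i = 0; i < n; i++) {
			bool dup = false;

			/* O(n^2) duplicate check - fine for a sketch. */
			for (size_t j = 0; j < i && !dup; j++)
				dup = (kp[j].p_pid == kp[i].p_pid);
			if (!dup)
				printf("%d %s\n", (int)kp[i].p_pid,
				    kp[i].p_comm);
		}
		free(kp);
		return 0;
	}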

kre


