tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: sysctl_doeproc() race



On 11.03.2018 17:25, Robert Elz wrote:
>     Date:        Sun, 11 Mar 2018 16:02:57 +0100
>     From:        Kamil Rytarowski <n54%gmx.com@localhost>
>     Message-ID:  <10b0efe2-35bb-d1e8-ed28-587f643bdfd7%gmx.com@localhost>
> 
>   | POSIX people told me that polling of a process entity is reliable only
>   | for parent (whether a real one or a tracer in non-POSIX extensions).
> 
> I agree with that - what's more, that is all it reasonably can be.
> 
>   | We can use pid lookup in this particular case calling proc_find_raw().
> 
> Yes, there's an XXX comment in the code already, but I agree:
> 
>   | I treat this as just an optimization (something nice to have, but not
>   | important now).
> 
> An alternative would be to simply break out of the loop after having
> found the process in the case of  (op == KERN_PROC_PID)
> 
>   | This will not solve the bug for other sysctl_doeproc() cases and a fix
>   | is still needed.
> 
> Is it really?    Those cases won't have the issue you saw of there
> being insufficient buffer space (or the code will handle that already)
> as the number of processes that match could be anything, the buffer
> will not (should not) have space for just one (unlike the KERN_PROC_PID
> case where only 1 should be possible).
> 

I consider the status of two processes (alive and zombie) returned at
once as a bug and non-logical snapshot of the list of processes. POSIX
specifies lifetime of a process and it's either alive or zombie, not
both. There is a lack of specification what happens in between:

Process:
"A live process (see Live Process) or a zombie process (see Zombie
Process )."

I expect an interface like sysctl(3) to get a quick snapshot of the
current system, and an observation of 0 or 1 makes sense to me, while 2
does not. We are not cloning a process and making the clone a zomie, and
killing the living one.

We could get a list of duplicates for KERN_PROC_PGRP, KERN_PROC_SESSION
etc. I don't think that KERN_PROC_PID is fundamentally different.

Assuming that a process died and is invisible, before becoming a zombie
is a good [to me] metaphor.

We could also return ESRCH for 0 detected results to make handling of
this case easier and repeat the polling function.

Regarding short-circuit, if we want to go this route now, I propose to
go immediately to proc_find_raw() and make it O(1) instead of O(n).

I will prepare a patch once we will agree whether to return 0-1 or 1-2
entries for a single process.

>   | We already use markers so I prefer to stick to this solution and it
>   | makes this code reliable.
> 
> I don't much like the way you LIST_REMOVE(marker0) - if we add almost
> any fix to short circuit the complete list scan after having found the proc
> entry for the KERN_PROC_PID case your current method will fail.
> 
> Better to simply remove it after the loop ends (no matter why it ends),
> rather than when it is used (and then also do it in the cleanup code
> always, rather than only when !mmmbrains).   That's simpler, and easier
> to understand.
> 

OK!

Attachment: signature.asc
Description: OpenPGP digital signature



Home | Main Index | Thread Index | Old Index