Current-Users archive


Re: Killing a zombie process?



    Date:        Fri, 2 Oct 2015 15:26:42 +0800 (PHT)
    From:        Paul Goyette <paul%vps1.whooppee.com@localhost>
    Message-ID:  <Pine.NEB.4.64.1510021516240.2764%vps1.whooppee.com@localhost>

  | 1. Is it correct for init's p_nstopchild to be zero when it has several
  |     children whose p_state is SSTOP?

Depends whether those children have previously been waited for or not.
Stopped children don't go away when they're waited for, so there needs
to be something to prevent wait() returning the same stopped child
over and over again.   That's p_waited ... so you need to check that
value for each of the stopped children: if it is 0, then something is
broken.  If it is 1 (for all of them) then they're irrelevant, and
matter not at all.
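
The "report a stopped child only once" behaviour can be observed from
userland. The following is a minimal sketch, not kernel code: it assumes
a POSIX system where Python's os.fork() and os.waitpid() are available,
and the function name is made up for illustration. It stops a child,
collects the stop with WUNTRACED, then polls again to show that the same
stop is not handed back a second time.

```python
import os
import signal


def stopped_child_reported_once():
    """Wait twice on one stopped child; return (first_ok, second_pid)."""
    pid = os.fork()
    if pid == 0:
        # Child: stop ourselves. If we are ever continued, just exit.
        try:
            os.kill(os.getpid(), signal.SIGSTOP)
        finally:
            os._exit(0)

    # First wait: WUNTRACED asks for stopped children as well, so this
    # blocks until the child has stopped and then reports that stop.
    first_pid, first_status = os.waitpid(pid, os.WUNTRACED)
    first_ok = first_pid == pid and os.WIFSTOPPED(first_status)

    # Second wait: the child is still stopped, but the stop has already
    # been collected, so a non-blocking wait has nothing new to report
    # and returns pid 0.
    second_pid, _ = os.waitpid(pid, os.WUNTRACED | os.WNOHANG)

    # Clean up: kill the stopped child and reap it.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return first_ok, second_pid


if __name__ == "__main__":
    print(stopped_child_reported_once())
```

A stopped child that had never been collected would instead show up in
the second call, which is the userland analogue of finding p_waited
still 0.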

  | 2. Is the above code in init correct?  Should we really be leaving the
  |     loop when there are more children to examine?

It is an optimisation, and should be correct.

However, it does depend upon p_nstopchild being maintained correctly.

You didn't say whether your zombie process is actually to be found
(somewhere) on the parent's (ie: init's) list of children.

I have no idea how one would discover this (at this point, or given
how long you need to wait for it to happen, perhaps ever) but it would
also be interesting to know whether the zombie was reparented to init
before or after it died.

The common case is for a parent to exit, leaving running children, which
are reparented to init, complete, exit, and init cleans them up.

But it is also possible for a child to die, be ignored by its parent,
which later exits itself, leaving the zombie to be reparented to init.
That's more unusual, but if that is what happened here, it is possible
that there's some bug in the processing of that case.
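
For anyone wanting to exercise that less common path deliberately, here
is a rough sketch of a reproducer. It is an assumption-laden
illustration, not a known test case for this bug: the function name is
hypothetical, and it relies on os.waitid() with WNOWAIT being available
so that the intermediate parent can see that its child is dead without
reaping it. Whether and when the zombie is then cleaned up depends on
whatever process it gets reparented to (normally init).

```python
import os


def make_orphaned_zombie():
    """Leave a zombie behind an exiting parent.

    Returns (zombie_pid, parent_exit_code).
    """
    r, w = os.pipe()
    parent = os.fork()
    if parent == 0:
        # Intermediate parent.
        try:
            os.close(r)
            child = os.fork()
            if child == 0:
                os._exit(0)  # dies first, nobody has waited for it yet
            # Block until the child has exited, but leave it unreaped
            # (WNOWAIT), so it is definitely a zombie at this point.
            os.waitid(os.P_PID, child, os.WEXITED | os.WNOWAIT)
            os.write(w, str(child).encode())
        finally:
            # Exit without reaping: the zombie is now reparented.
            os._exit(0)

    os.close(w)
    data = os.read(r, 32)
    os.close(r)
    _, status = os.waitpid(parent, 0)
    return int(data), os.WEXITSTATUS(status)


if __name__ == "__main__":
    zombie_pid, code = make_orphaned_zombie()
    print("zombie pid", zombie_pid, "parent exit code", code)
```

Checking afterwards (with ps, for example) whether the reported pid
lingers as a zombie shows whether the new parent reaped it.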

kre


