Current-Users archive


Re: Killing a zombie process?



    Date:        Wed, 30 Sep 2015 18:29:20 +0800 (PHT)
    From:        Paul Goyette <paul%vps1.whooppee.com@localhost>
    Message-ID:  <Pine.NEB.4.64.1509301818410.22549%vps1.whooppee.com@localhost>

  | Well, a quick read through sbin/init.c shows that sometimes it waits 
  | with WNOHANG and sometimes it doesn't.

The point is rather that init reaps lots of zombie processes all the time;
that it would occasionally miss just one of them seems unlikely at best,
whatever flags it gives wait().
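
For reference, the usual non-blocking reap pattern looks roughly like the
sketch below. It is an illustration in Python (whose os.waitpid() wraps the
same system call), not the code in sbin/init.c; the function names
reap_available() and spawn_and_reap() are made up for this example.

```python
import os
import time


def reap_available():
    """Collect every child that is already a zombie, without blocking.

    Returns a dict mapping each reaped pid to its exit code.
    """
    reaped = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # ECHILD: this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped[pid] = os.WEXITSTATUS(status)
    return reaped


def spawn_and_reap(codes, timeout=5.0):
    """Fork one child per exit code, then poll until all are reaped."""
    expected = {}
    for code in codes:
        pid = os.fork()
        if pid == 0:
            os._exit(code)  # child: exit immediately and become a zombie
        expected[pid] = code

    collected = {}
    deadline = time.monotonic() + timeout
    while len(collected) < len(expected) and time.monotonic() < deadline:
        collected.update(reap_available())
        time.sleep(0.01)
    return expected, collected
```

Whether the call blocks or not only changes when the caller learns about an
exited child, not which children it is able to see.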

Far more likely (IMO) is that the process in question is special somehow,
and the most likely kind of special that would cause wait() to fail to see
it is that the process isn't on init's child process list.   There might be
other possibilities, if the kernel wait code sometimes ignores zombie
processes for some other reason (some other resource still owned, or whatever).
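
The "not on the child list" case is easy to show from userland: wait only
ever considers the caller's own children, and anything else fails with
ECHILD. A minimal sketch in Python follows; is_waitable() and demo() are
hypothetical helper names, and this says nothing about how the kernel
maintains that list internally.

```python
import os


def is_waitable(pid):
    """Return True if waitpid() accepts `pid` as one of our children.

    Note: if `pid` is a child that has already exited, this call reaps it.
    """
    try:
        os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        return False  # ECHILD: not a child of the calling process
    return True


def demo():
    """Check a live child, our own pid, and the child again once reaped."""
    r, w = os.pipe()
    child = os.fork()
    if child == 0:
        os.close(w)
        os.read(r, 1)  # block until the parent closes its end of the pipe
        os._exit(0)
    os.close(r)

    live_child = is_waitable(child)    # a running child is visible to wait
    own_pid = is_waitable(os.getpid())  # a process is never its own child

    os.close(w)           # let the child exit
    os.waitpid(child, 0)  # blocking wait: reap it
    after_reap = is_waitable(child)    # no longer on our child list

    return live_child, own_pid, after_reap
```

A zombie in that last state, from init's point of view, would never be
returned by wait no matter how init calls it.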

  | Well, for the previous occurrence, I waited many hours, and the zombie 
  | was still there.  (It might even have been as much as a couple of days.)

Of course, it won't be time based, with your shutdown just happening to
occur at the magic interval ... rather, the shutdown will be causing some
other condition to occur (or be removed) which then allows the zombie
process to complete its transition into full zombiehood (???) and for
init to then clean it up.

  | If I get really brave, I might even use gdb to attach to init(8) and see 
  | which of the several waitpid() calls is active.

I think I'd start with the proc structure of the zombie itself, and see
if there's anything unusual about it, see if all of the process's resources
(like its kernel stack) have truly been freed already, and if not, just where
that process is sitting.   Since the zombie sits there essentially
forever (it seems), it ought to be fairly easy to check this just using
gdb on /dev/kmem without interrupting normal operations at all (i.e. risk free).

On the other hand, checking init's child queue that way would be hard, as it
is in a constant state of churn.

kre


