Subject: kern/21690: reaper stuck in anfree state, processes turn into zombies after completion
To: None <gnats-bugs@gnats.netbsd.org>
From: Lubomir Sedlacik <salo@Xtrmntr.org>
List: netbsd-bugs
Date: 05/26/2003 19:55:46
>Number:         21690
>Category:       kern
>Synopsis:       reaper stuck in anfree state, processes turn into zombies after completion
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon May 26 17:56:00 UTC 2003
>Closed-Date:
>Last-Modified:
>Originator:     Lubomir Sedlacik
>Release:        NetBSD 1.6T Sun May 25 14:25:07 CEST 2003
>Organization:
>Environment:
Architecture: i386
Machine: i386
>Description:

after more than 24 hours of running, every process ends up in the dead state
after completion.  the reaper kernel thread is stuck in the anfree state
(WCHAN anfree in the ps output below).

what i was doing when it happened:
- in the middle of compiling mplayer, with the pkgsrc tree on nfs (but not
  WRKOBJDIR)
- phoenix 0.6 running
- playing mp3 files with madplay from nfs mounted directory

how it happened:
- phoenix, unused for a while, was paged out during the mplayer compilation
  (which uses -pipe)
- i switched back to the virtual desktop with the browser, it didn't reappear
  (empty window frame) -- the common issue with pthreads
- i killed phoenix with SIGKILL
- the next process ended in the dead state after completion
  (ps auxwww | grep pho)

since then it happens to every process.  currently:

  90 processes:  56 sleeping, 33 dead, 1 on processor

ps -axl
  UID   PID  PPID CPU PRI NI   VSZ   RSS WCHAN    STAT TT      TIME COMMAND
    0     0     0   0 -18  0     0 21364 schedule DKs  ??   0:00.06 [swapper]
    0     1     0   0  10  0    56     4 wait     SWs  ??   0:00.07 init 
    0     2     0   0  -6  0     0 21364 sccomp   DK   ??   0:00.00 [atapibus0]
    0     3     0   0  10  0     0 21364 usbevt   DK   ??   0:00.01 [usb0]
    0     4     0   0  10  0     0 21364 usbtsk   DK   ??   0:00.00 [usbtask]
    0     5     0   0  10  0     0 21364 pmsreset DK   ??   0:00.00 [pms0]
    0     6     0   1  10  0     0 21364 cardslot DK   ??   0:00.00 [cardslot0]
    0     7     0   0  10  0     0 21364 cardslot DK   ??   0:00.00 [cardslot1]
    0     8     0   0  10  0     0 21364 apmev    DK   ??   0:03.07 [apm0]
    0     9     0   0 -18  0     0 21364 lfswrite DK   ??   0:00.00 [lfs_writer]
    0    10     0   0 -18  0     0 21364 pgdaemon DK   ??   0:02.14 [pagedaemon]
    0    11     0   0 -18  0     0 21364 anfree   DK   ??   0:12.47 [reaper]
    0    12     0   0  18  0     0 21364 syncer   DK   ??   0:41.33 [ioflush]
    0    13     0   0 -18  0     0 21364 aiodoned DK   ??   0:00.36 [aiodoned]
  100    44   668   1 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100   102   668   0   2  4  1600  3604 select   SN   ??  15:19.64 gkrellm --geometry -0-0 --m2 -nc 
  100   163   668   0   2  0  1900  2500 select   S    ??   0:24.09 rxvt 
    0   167   168   0   2  0   520   780 select   Ss   ??   0:05.92 SCREEN (screen-3.9.15)
    0   187     1   0   2  0   184     4 select   SWs  ??   0:00.03 /usr/sbin/syslogd -s 
  100   235     1   1  10  0   148     4 wait     SWs  ??   0:00.01 /bin/sh -c /usr/pkg/bin/esd -terminate -nobeeps -as 2 -spawnfd 35 
  100   236   235   1 -22  0     0     0 -        ZW   ??   0:00.00 (esd)
    0   283     0   0  10  0     0 21364 nfsidl   SK   ??   0:00.22 [nfsio]
    0   312     1   0   2  0    40     4 select   SWs  ??   0:06.02 /usr/sbin/apmd -q 
    0   314     1   0   2  0   280     4 select   IW   ??   0:00.02 /usr/X11R6/bin/xfstt --encoding iso8859-2 iso8859-1 ascii --user nobody 
  100   408   668   0 -22  4     0     0 -        ZWN  ??   0:00.00 (MozillaFirebird-)
    0   410     0   0  10  0     0 21364 nfsidl   SK   ??   0:00.13 [nfsio]
    0   432     1   7   2  0   380     4 select   IWs  ??   0:00.00 /usr/sbin/sshd 
    0   439     1  12  18  0   180     4 pause    IWs  ??   0:00.02 /usr/X11R6/bin/xdm -r /etc/X11/xdm.Xresources -config /etc/X11/xdm.confi
    0   445   439   4  10  0   352     4 wait     IWs  ??   0:00.04 xdm: :0 
    0   465   439   0   2  0 14624 15304 select   Ss   ??   7:28.05 /usr/X11R6/bin/X vt05 -nolisten tcp -auth /usr/X11R6/lib/X11/xdm/authdir
  100   508   668   0 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
    0   535     0   0  10  0     0 21364 nfsidl   SK   ??   0:02.41 [nfsio]
32767   537   314   0   2  0 21300     4 netio    SW   ??   0:00.02 /usr/X11R6/bin/xfstt --encoding iso8859-2 iso8859-1 ascii --user nobody 
    0   595     1   0   2  0   248     4 select   SW   ??   0:00.10 xconsole -daemon -notify -verbose -exitOnFail -bd #666677 
  100   668   445   0   2  0   220   740 select   Ss   ??   0:00.61 pwm 
  100   685   668   0 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100   754   668   0   2  0  1896  2568 select   S    ??   0:12.63 rxvt 
    0   828     0   0  10  0     0 21364 nfsidl   SK   ??   0:01.32 [nfsio]
  100  1263   668   0 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100  4456   668   0 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100  4753   668  20 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100 16382   668   0  28  0   588  1808 -        S    ??   0:00.09 rxvt 
  100 23139   668   0 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100 26276   668   0   2  0   588  1808 select   S    ??   0:00.36 rxvt 
  100 29016   668   0 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100 29349   668   0 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100 29478   668  26 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100   162   163   0  18  0  1216     4 pause    IWs  p1   0:00.02 zsh 
  100   168   162   0  18  0   356   952 pause    S+   p1   0:00.34 screen (screen-3.9.15)
  100   884   754   0  18  0  1160     4 pause    SWs  p2   0:00.01 zsh 
  100   528   167   0   3  0  1252   556 ttyin    Ss+  p3   0:00.11 /usr/pkg/bin/zsh 
    0   587 20444  32  10  0   276     4 wait     SW+  p4   0:00.02 /usr/pkg/bin/gmake -C mp3lib 
  100   625   167   0  18  0  1228     4 pause    IWs  p4   0:00.02 /usr/pkg/bin/zsh 
    0   755   625   0  18  0  1232     4 pause    SW   p4   0:00.12 zsh 
    0  2740   587  32  10  0   120     4 wait     SW+  p4   0:00.03 /usr/bin/cc -c -O4 -march=i686 -mcpu=i686 -pipe -ffast-math -fomit-frame
    0  9942   755  31  10  0   776     4 wait     SW+  p4   0:00.33 make update 
    0 17045  9942  31  10  0   148     4 wait     SW+  p4   0:00.01 /bin/sh -ec [ ! -s /var/tmp/pkgsrc/audio/libsndfile/work/.DDIR ] || for 
    0 19138 17045  31  10  0   924     4 wait     SW+  p4   0:00.55 make OPSYS NetBSD OS_VERSION 1.6T LOWER_OPSYS netbsd CPU_FLAGS -march=pe
    0 20444 21405  32  10  0   316     4 wait     SW+  p4   0:00.07 /usr/pkg/bin/gmake CCOPTIONS=-march=pentiumpro -f Makefile all 
    0 21405 29363  31  10  0   152     4 wait     SW+  p4   0:00.01 /bin/sh -ec cd /var/tmp/pkgsrc/graphics/mplayer/work/MPlayer-0.90 && /us
    0 24647  2740  36 -22  0     0     0 -        ZW+  p4   0:00.00 (cc1)
    0 24922  2740  18 -22  0     0     0 -        ZW+  p4   0:00.00 (as)
    0 29013 19138  31  10  0   148     4 wait     SW+  p4   0:00.01 /bin/sh -ec cd /cvs/pkgsrc/graphics/mplayer && make  HOST_OSTYPE=NetBSD-
    0 29363 29013  31  10  0   916     4 wait     SW+  p4   0:00.44 make HOST_OSTYPE NetBSD-1.6T-i386 _SRC_TOP_  OPSYS NetBSD OS_VERSION 1.6
  100   449  4512   0  29  0   124   576 -        R+   p5   0:00.01 ps -axl 
  100   686     1   0 -22  0     0     0 -        ZW   p5-  0:00.00 (zsh)
  100  1142     1   0 -22  0     0     0 -        ZW   p5-  0:00.00 (zsh)
  100  1645     1   2 -22  0     0     0 -        ZW   p5-  0:00.00 (ps)
  100  2682     1   0 -22  0     0     0 -        ZW   p5-  0:00.00 (zsh)
  100  4512 16382   0  18  0  1216   960 pause    Ss   p5   0:00.02 zsh 
  100  8490     1   0 -22  0     0     0 -        ZW   p5-  0:00.00 (zsh)
  100  9684     1   0 -22  0     0     0 -        ZW   p5-  0:00.00 (madplay)
  100 22352     1   1 -22  0     0     0 -        ZW   p5-  0:00.00 (netstat)
  100 26047     1   1 -22  0     0     0 -        ZW   p5-  0:00.00 (ps)
  100 28027     1   0 -22  0     0     0 -        ZW   p5-  0:00.00 (zsh)
  100  1475     1   0 -22  0     0     0 -        ZW   p6-  0:00.00 (zsh)
  100  3172     1   0 -22  0     0     0 -        ZW   p6-  0:00.00 (grep)
  100 20064     1   2 -22  0     0     0 -        ZW   p6-  0:00.00 (ps)
  100  3904     1   2 -22  0     0     0 -        ZW   p7-  0:00.00 (mount)
  100  7316     1   0 -22  0     0     0 -        ZW   p7-  0:00.00 (uname)
  100 17231     1   0 -22  0     0     0 -        ZW   p7-  0:00.00 (zsh)
  100 28266     1   0 -22  0     0     0 -        ZW   p7-  0:00.00 (zsh)
  100  1551     1   0 -22  0     0     0 -        ZW   p8-  0:00.00 (zsh)
  100 15665     1   0 -22  0     0     0 -        ZW   p8-  0:00.00 (top)
  100  5202 26276   0  18  0  1216     4 pause    SWs  p9   0:00.02 zsh 
  100 15688  5202   0   2  0   264   980 select   S+   p9   0:02.36 top 
    0   446     1  14   3  0    48     4 ttyin    IWs+ E0   0:00.01 /usr/libexec/getty Pc ttyE0 
    0   501     1   8   3  0    48     4 ttyin    IWs+ E1   0:00.01 /usr/libexec/getty Pc ttyE1 
    0   530     1   8   3  0    48     4 ttyin    IWs+ E2   0:00.01 /usr/libexec/getty Pc ttyE2 
    0   559     1   8   3  0    48     4 ttyin    IWs+ E3   0:00.01 /usr/libexec/getty Pc ttyE3 
  100  5232     1   1 -22  0     0     0 -        ZW   pA-  0:00.00 (zsh)

as you can see, the phoenix process itself is a zombie too:

  100   408   668   0 -22  4     0     0 -        ZWN  ??   0:00.00 (MozillaFirebird-)
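
for anyone who wants to tally the states in a listing like the one above, here
is a minimal sketch in python.  it assumes the 13-column `ps -axl` layout shown
in this report and treats any STAT starting with Z as a zombie.  summarize_ps
and SAMPLE are hypothetical names used for illustration only, not part of any
NetBSD tool; SAMPLE holds a few rows copied from the listing.

```python
from collections import Counter

# A few rows copied from the `ps -axl` listing in this report.
SAMPLE = """\
  UID   PID  PPID CPU PRI NI   VSZ   RSS WCHAN    STAT TT      TIME COMMAND
    0    11     0   0 -18  0     0 21364 anfree   DK   ??   0:12.47 [reaper]
  100    44   668   1 -22  0     0     0 -        ZW   ??   0:00.00 (rxvt)
  100   163   668   0   2  0  1900  2500 select   S    ??   0:24.09 rxvt
"""


def summarize_ps(text):
    """Tally process states from `ps -axl` output and list the zombies.

    Assumes the column order UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT
    TT TIME COMMAND, as in the listing above.
    """
    states = Counter()   # first letter of STAT -> number of processes
    zombies = []         # (pid, ppid, command) for each zombie
    kthread_wchan = {}   # wait channel of bracketed kernel threads
    for line in text.splitlines():
        fields = line.split(None, 12)
        # Skip the header and anything that is not a full process row.
        if len(fields) < 13 or not fields[1].isdigit():
            continue
        pid, ppid = int(fields[1]), int(fields[2])
        wchan, stat, command = fields[8], fields[9], fields[12].strip()
        states[stat[0]] += 1
        if stat.startswith("Z"):
            zombies.append((pid, ppid, command))
        if command.startswith("["):
            # Later rows with the same name (e.g. [nfsio]) overwrite earlier ones.
            kthread_wchan[command] = wchan
    return states, zombies, kthread_wchan


if __name__ == "__main__":
    states, zombies, kthread_wchan = summarize_ps(SAMPLE)
    print("states:", dict(states))
    print("reaper wchan:", kthread_wchan.get("[reaper]"))
    for pid, ppid, command in zombies:
        print("zombie pid %d (parent %d): %s" % (pid, ppid, command))
```

feeding it the complete listing instead of SAMPLE should also show which
zombies still have their original parent and which have been handed to init
(PPID 1).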

>How-To-Repeat:
no idea.  let me know what information i should provide if it happens again.
>Fix:
n/a
>Release-Note:
>Audit-Trail:
>Unformatted: