Subject: Re: kern/35224: kernel hangs in mclpl after heavy net load in the sparc64 port (eventually also other ports)
To: Martin Husemann <martin@duskware.de>
From: Stephan Pietzko <stephan.pietzko@uni-konstanz.de>
List: netbsd-bugs
Date: 12/10/2006 08:22:11
Martin Husemann <martin@duskware.de> wrote:
> 
> This sounds like a mbuf leak. Check netstat output, maybe you can spot where
> the mbuf are lingering.
> 
> > the machine is crashing once a day
> Is this related? If not, please file a separate PR.

Nope - this is related.

> What kind of crash is it? A panic should print a message before rebooting, we
> need at least that to even start thinking about this.

Sorry, I meant: the daemon freezes in the mclpl wait state, and
because of that I have to reboot. This happens every few days, or
sometimes several times a day. I called this 'the machine is crashing
once a day', but it is still the mclpl problem.

I have saved lsof, ps and top output from several occurrences of this
problem:
----------------------------------------------------------------------
root@nepal:/root> grep -in mclpl *.txt
050206top.txt:9: 25980 www      -22    0  2224K  169M mclpl    237:43 0.00%  0.00% <thttpd> 
110706mclpl_top.txt:9: 4011 www      -22    0  3840K   11M mclpl 54.8H  0.00%  0.00% <lighttpd> 
170506mclpl_top.txt:10: 11711 www      -22    0  2008K   13M mclpl 188:25  0.00%  0.00% <lighttpd>
270406mclpl_top.txt:9: 29157 www      -22    0  4424K   25M mclpl 22.4H  0.00%  0.00% <lighttpd>
grr.txt:9: 370 www      -22    0  2368K   25M mclpl    283:54  0.00% 0.00% <lighttpd>
170506mclpl_ps.txt:21: www  11711  0.0  0.0 2008 12984 ?      DW Sat11PM 188:25.32 /usr/pk 500 11711     1  13 -22  0 2008 12984 mclpl DW   ?      188:25.32 /usr/pkg/sbin/lighttpd -f /usr/pkg/etc/lighttpd/lighttpd.conf 
270406mclpl_ps.txt:22: www  29157  0.0  0.0 4424 25904 ?      DW 16Apr06 1342:15.24 /usr/pk 500 29157     1   6 -22  0 4424 25904 mclpl DW   ?      1342:15.24 /usr/pkg/sbin/lighttpd -f /usr/pkg/etc/lighttpd/lighttpd.conf 
050206ps.txt:19: www  25980  0.0  0.0 2224 173248 ?      DWs  Fri05PM 237:43.33 /usr/pk 500 25980     1   8 -22  0 2224 173248 mclpl    DWs ?      237:43.33 /usr/pkg/sbin/thttpd -C /usr/pkg/etc/thttpd.conf 
110706mclpl_ps.txt:18:500  4011     1   3 -22  0 3840 11088 mclpl DW   ?      3287:04.29 /usr/pkg/sbin/lighttpd -f /usr/pkg/etc/lighttpd/lighttpd.conf 
----------------------------------------------------------------------
I just grepped the httpd lines out of the top and ps output captured
while the daemon was hung. I don't know whether this helps.
As the next step I will try Pavel's idea:
'netstat -mssv with a kernel built with "options MBUFTRACE"'

tnx Stephan Pietzko