NetBSD-Users archive
Random lockups on an email server - possibly kern/50168
I have a server farm at my small ISP running NetBSD 7.0 and pf.  All
the servers seem to be rock solid except the email server which has
random lockups.  The system is still partly alive: it responds to pings,
and if I am running screen I can switch between the different screens,
but none of them will run anything and even a simple carriage return
will not display a new prompt.
It sounds like kern/50168 (Frequent lockups and panics with NetBSD
7/amd64, may be ipfilter-related) but I run pf, not ipf.
I have a little script that captures memory usage every minute and
stores it in a log.  It writes the time followed by MemTotal, MemFree,
MemShared, SwapTotal, SwapFree, Cached and Buffers from /proc/meminfo.
Here's what it looked like when it hung and was rebooted.
Wed Mar 16 13:39:00 2016  31806    721      0  32787  32787  27565  25744
Wed Mar 16 13:40:00 2016  31806    718      0  32787  32787  27568  25744
Wed Mar 16 13:41:00 2016  31806    739      0  32787  32787  27549  25746
Wed Mar 16 13:42:00 2016  31806    733      0  32787  32787  27555  25748
Wed Mar 16 13:43:00 2016  31806    763      0  32787  32787  27528  25754
Wed Mar 16 13:44:00 2016  31806    720      0  32787  32787  27568  25754
Wed Mar 16 13:45:00 2016  31806    696      0  32787  32787  27591  25756
Wed Mar 16 13:46:00 2016  31806    718      0  32787  32787  27569  25755
Wed Mar 16 13:47:00 2016  31806    721      0  32787  32787  27566  25752
Wed Mar 16 13:48:01 2016  31806    736      0  32787  32787  27552  25756
Wed Mar 16 13:49:00 2016  31806    794      0  32787  32787  27497  25756
Wed Mar 16 13:50:00 2016  31806    819      0  32787  32787  27471  25755
Wed Mar 16 13:51:00 2016  31806    834      0  32787  32787  27457  25754
Wed Mar 16 13:52:00 2016  31806    830      0  32787  32787  27461  25754
Wed Mar 16 13:53:00 2016  31806    836      0  32787  32787  27456  25754
Wed Mar 16 13:54:00 2016  31806    842      0  32787  32787  27450  25755
Wed Mar 16 13:55:01 2016  31806    827      0  32787  32787  27465  25754
Wed Mar 16 13:56:00 2016  31806     66      0  32787  32787  28227  26540
Wed Mar 16 13:57:00 2016  31806     83      0  32787  32787  28207  26476
Wed Mar 16 13:58:00 2016  31806     48      0  32787  32787  28243  26479
Wed Mar 16 13:59:00 2016  31806     75      0  32787  32787  28215  26480
Wed Mar 16 14:36:01 2016  31806  31067      0  32787  32787    475     98
Wed Mar 16 14:37:00 2016  31806  30733      0  32787  32787    745    135
Wed Mar 16 14:38:00 2016  31806  30644      0  32787  32787    821    163
Wed Mar 16 14:39:00 2016  31806  30542      0  32787  32787    915    187
Wed Mar 16 14:40:00 2016  31806  30450      0  32787  32787    993    211
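
A logger of the kind described above could be sketched in Python roughly
as follows.  This is not the script that produced the log; the function
names, the log file name and the kB-to-MB conversion are assumptions made
for illustration, and it expects "Name: value kB" lines such as a
Linux-compatible /proc/meminfo provides.

```python
import time

# Column order as described in the post.
FIELDS = ["MemTotal", "MemFree", "MemShared", "SwapTotal",
          "SwapFree", "Cached", "Buffers"]


def parse_meminfo(text):
    """Return {name: value} for every 'Name: <integer> ...' line."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip header lines and anything without a leading integer value.
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def format_line(values, now=None):
    """Timestamp followed by each field in MB (assumes input in kB)."""
    stamp = time.strftime("%a %b %d %H:%M:%S %Y", time.localtime(now))
    # Missing fields are reported as 0 rather than aborting the log line.
    cols = "".join("%7d" % (values.get(name, 0) // 1024) for name in FIELDS)
    return stamp + cols


def log_once(meminfo_path="/proc/meminfo", log_path="meminfo.log"):
    """Append one sample to the log; intended to be run once a minute."""
    with open(meminfo_path) as f:
        values = parse_meminfo(f.read())
    with open(log_path, "a") as log:
        log.write(format_line(values) + "\n")
```

Calling log_once() from a cron job once a minute keeps each sample
independent, so a hang shows up as a gap in the timestamps, like the
jump from 13:59 to 14:36 above.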
I could turn off pf, but it could be weeks before another hang happens.
I am considering rebooting on a regular basis (early Sunday morning is
what I had in mind) to see if that makes it more reliable but I have no
indication that this is uptime related.
I also have a "top -osize" running in one of the screens.  Since I can
still switch screens I am hoping that that might show me the culprit if
it is a runaway process.
Can anyone suggest any other avenues to investigate?
-- 
D'Arcy J.M. Cain <darcy%NetBSD.org@localhost>
http://www.NetBSD.org/ IM:darcy%Vex.Net@localhost