Subject: Kernel buffer tracking
To: NetBSD Users's Discussion List <netbsd-users@netbsd.org>
From: Peter Eisch <peter@boku.net>
List: netbsd-users
Date: 04/27/2007 16:45:03
I have a netbsd-3-1 system where the kernel seems to be consuming all buffer
space.  The only thing special about this system is that it's using
FAST_IPSEC for various reasons.  The system has to be rebooted weekly and
will lock up if it isn't rebooted (via serial console) quickly after it
exhausts resources.

I can tell when the system is about to lock up because sshd stops
presenting the SSH banner on tcp:22.  This most recent time I logged in and
tried restarting named and then ipsec:

sysname# /etc/rc.d/named9 start
Starting named.
Apr 27 16:04:29 sysname named[5201]: starting BIND 9.3.4
Apr 27 16:04:29 sysname named[5201]: net.c:70: unexpected error:
Apr 27 16:04:29 sysname named[5201]: socket() failed: No buffer space available
Apr 27 16:04:29 sysname named[5201]: net.c:70: unexpected error:
Apr 27 16:04:29 sysname named[5201]: socket() failed: No buffer space available
Apr 27 16:04:29 sysname named[5201]: not listening on any interfaces
Apr 27 16:04:29 sysname named[5201]: server.c:902: unexpected error:
Apr 27 16:04:29 sysname named[5201]: unable to obtain neither an IPv4 nor an IPv6 dispatch
Apr 27 16:04:29 sysname named[5201]: loading configuration: unexpected error
Apr 27 16:04:29 sysname named[5201]: exiting (due to fatal error)
sysname# /etc/rc.d/racoon stop
Stopping racoon.
Waiting for PIDS: 287.
sysname# /etc/rc.d/ipsec stop
Clearing ipsec manual keys/policies.
sysname# /etc/rc.d/ipsec start
Installing ipsec manual keys/policies.
The result of line 4: No buffer space available.
The result of line 6: No buffer space available.
The result of line 11: No buffer space available.
The result of line 13: No buffer space available.
The result of line 18: No buffer space available.
The result of line 20: No buffer space available.
The result of line 25: No buffer space available.
The result of line 27: No buffer space available.
The result of line 33: No buffer space available.
The result of line 35: No buffer space available.
The result of line 41: No buffer space available.
The result of line 43: No buffer space available.
The result of line 49: No buffer space available.
The result of line 51: No buffer space available.
The result of line 54: No buffer space available.
The result of line 56: No buffer space available.
The result of line 62: No buffer space available.
The result of line 64: No buffer space available.
The result of line 71: No buffer space available.
The result of line 73: No buffer space available.
The result of line 79: No buffer space available.
The result of line 81: No buffer space available.
sysname#

Oddly, I'm not too concerned about actually fixing the system -- I'm looking
for a way to detect the onset of the failure through semi-conventional
methods so I can build some sort of watchdog on the system.
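
The crudest probe I can think of is to attempt the same socket() call that
named fails on and watch for ENOBUFS.  A minimal Python sketch, assuming
Python is available on the box; the name socket_probe and its return strings
are made up for illustration.  It only notices the failure once it has
started and says nothing about which buffer is being exhausted:

```python
import errno
import socket


def socket_probe(family=socket.AF_INET, kind=socket.SOCK_DGRAM,
                 factory=socket.socket):
    """Try to create (and immediately close) one socket.

    Returns "ok" if the socket could be created, "enobufs" if the kernel
    answered "No buffer space available", and "error" for anything else.
    The factory argument exists only so the probe can be tested without
    actually exhausting kernel buffers.
    """
    try:
        sock = factory(family, kind)
    except OSError as exc:
        if exc.errno == errno.ENOBUFS:
            return "enobufs"
        return "error"
    sock.close()
    return "ok"


if __name__ == "__main__":
    # A watchdog could run this from cron and act on anything but "ok".
    print(socket_probe())
```

Run from cron, anything other than "ok" could trigger an alert or a
controlled reboot before the box wedges.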

Is there a way to track, from userland, whichever buffer is being
exhausted?

Thanks,

peter