Re: Trying to understand mbuf-related sysctl parameters

To: Tom Ivar Helbekkmo <tih%hamartun.priv.no@localhost>
Subject: Re: Trying to understand mbuf-related sysctl parameters
From: Greg Troxel <gdt%ir.bbn.com@localhost>
Date: Thu, 06 Nov 2014 08:52:16 -0500

Tom Ivar Helbekkmo <tih%hamartun.priv.no@localhost> writes:

> [mbuf/socketbuf sysctls]

> However, I can't find anything that really tells me what the
> relationship between the first five of these is.  Does it make sense to
> have sbmax greater than the sum of sendbuf_max and recvbuf_max?  Should
> somaxkva rather be nmbclusters * mclbytes (536870912, or four times what
> it is now)?

Good questions, and I can only help somewhat.  I'm going to reorder your
sysctl list...

> kern.mbuf.nmbclusters=262144

nmbclusters is the maximum number of clusters the system will allocate.
In my experience that is a very big number (it works out to about 0.5G
of clusters), but with that much RAM, why not.  Clusters are attached to
mbufs to use instead of the very small (< 256 bytes) internal space.
Many drivers always put received packets in clusters, because they have
pre-set-up DMA in a receive chain.  Some drivers (bnx) pre-allocate 512
clusters for receiving, and this puts pressure on cluster allocation
in systems with, e.g., 8 interfaces.
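
To put numbers on that, here is a rough back-of-the-envelope sketch.  The
2048-byte cluster size is an assumption (it is what the 536870912
product you quoted implies), and pool_bytes is just a name made up for
illustration:

```python
# Rough cluster-pool arithmetic; nothing here queries the running kernel.
NMBCLUSTERS = 262144   # kern.mbuf.nmbclusters from the quoted sysctl output
MCLBYTES = 2048        # ASSUMED cluster size, implied by 262144 * 2048 = 536870912


def pool_bytes(nclusters, cluster_size=MCLBYTES):
    """Bytes of cluster storage for a given number of clusters."""
    return nclusters * cluster_size


# Upper bound on memory the cluster pool can grow to.
total = pool_bytes(NMBCLUSTERS)

# Clusters held in receive rings by 8 interfaces that each
# pre-allocate 512 clusters (the bnx-style case).
rx_reserved = 8 * 512

print(f"cluster pool ceiling: {total} bytes ({total / 2**30:.2f} GiB)")
print(f"8 interfaces x 512 clusters = {rx_reserved} clusters, "
      f"{pool_bytes(rx_reserved) // 2**20} MiB, "
      f"{100 * rx_reserved / NMBCLUSTERS:.2f}% of the pool")
```

With a limit this large the receive rings are a small fraction of the
pool; with a much smaller nmbclusters the same 4096 clusters would
matter a lot more.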

> kern.sbmax=4194304

I suspect kern.sbmax is a hard limit on how big any socket buffer can
be, even if programs use socket options to set it.  Separately from the
limits below, I am pretty sure programs can change their own buffer
sizes, but I didn't find the man page in a few seconds.  In practice,
though, almost all buffer size tweaking is done by sysctls.
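
For reference, the usual per-socket knob is the standard
SO_RCVBUF/SO_SNDBUF socket option.  A minimal sketch of a program
asking for a buffer size and reading back what it got (request_rcvbuf
is a made-up helper name; whether kern.sbmax is what caps the request
is my guess above, not something this code demonstrates):

```python
import socket


def request_rcvbuf(sock, nbytes):
    """Ask for a receive buffer of nbytes and return the size the kernel reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)
    # The reported value may differ from the request; some kernels
    # round or adjust it, and an over-limit request may raise OSError.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    granted = request_rcvbuf(s, 65536)
    print("requested 65536, kernel reports", granted)
finally:
    s.close()
```

Comparing the requested and reported values on the machine in question
is a quick way to see whether a limit is being applied.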

> kern.somaxkva=134217728

I don't know; I'd go reading sources to figure it out.  That looks like
128M, which I can't immediately map to anything that makes sense.

> net.inet.tcp.sendspace=262144
> net.inet.tcp.recvspace=262144
> net.inet.udp.sendspace=262144
> net.inet.udp.recvspace=262144

These are the default socket buffer sizes.  You will need to raise the
tcp ones if you are getting slow transfers due to a high
bandwidth*delay product.
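
To make the bw*delay point concrete, here is a small sketch of the
arithmetic.  The 1 Gbit/s link and 100 ms round-trip time are made-up
example numbers, and the function names are only for illustration:

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth*delay product: bytes in flight needed to keep the path full."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def max_throughput_bits_per_s(window_bytes, rtt_ms):
    """Best-case TCP throughput when at most window_bytes are in flight per RTT."""
    return window_bytes * 8 * 1000 / rtt_ms


# Hypothetical path: 1 Gbit/s with a 100 ms round-trip time.
needed = bdp_bytes(1_000_000_000, 100)

# Ceiling with the 262144-byte buffers from the quoted sysctl output.
ceiling = max_throughput_bits_per_s(262144, 100)

print(f"buffer needed to fill the path: {needed} bytes")
print(f"ceiling with a 262144-byte buffer: {ceiling / 1e6:.1f} Mbit/s")
```

On a path like that, a 262144-byte buffer caps a single connection far
below line rate, which is the slow-transfer symptom.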

You left out

 net.inet.tcp.recvbuf_auto = 1
 net.inet.tcp.sendbuf_auto = 1

which let the buffers autoscale, basically upping them when they seem to
be getting full.  This mostly works well; I am running with it and not
having pain.  But you may find that it's slower to ramp up the buffer
size than you want.

> net.inet.tcp.sendbuf_max=1048576
> net.inet.tcp.recvbuf_max=1048576

These set the limit on what the autoscaling algorithm will increase to.

sendspace/recvspace are only relevant for connections to/from the
machine itself, and only tend to matter for relatively distant high
speed transfers.  The symptom is slow TCP transfer rates, but otherwise
it's hard to notice.

Machines that function as gateways, have lots of interfaces, or lots of
open connections tend to need a lot of clusters.   If you haven't run
out you have enough, more or less.

