tech-kern: Re: Increasing SHMMAXPGS

Subject: Re: Increasing SHMMAXPGS
To: None <tech-kern@netbsd.org>
From: Curt Sampson <cjs@cynic.net>
List: tech-kern
Date: 07/05/2002 16:28:24
So I just had a look at this, and read the old thread by this name on
tech-kern, and here's sort of a summary:

1. Removing segments whose creator has died/crashed/whatever should
be an option that defaults to "off" in all cases. Systems such as
PostgreSQL actually use information from old segments to help clean up
after a crash.

2. There's no reason for SHMMAXPGS at all under the new VM system; these
are just normal memory pages backed with an anonymous pager, and use up
resources in the same way.

3. We probably do want some sort of limit on how many shm pages a
process can allocate in total, for the same reason we want that limit
for anything else. I note (via experimentation) that mmap applies the
datasize limit to anonymous memory (i.e., backed by swap, not a file),
though not to file maps. So it seems reasonable, since shared memory is
basically exactly the same thing (except that other processes can attach
these "anonymous" regions), that we should apply the same limit.

4. The tricky part here is that shared memory persists after a process
is gone. If an administrator finds his box running out of memory, he
might go and kill some memory-hogging processes and discover that he's
still out of memory, and not think to do an ipcs to find out how much of
that is "dead" SysV shared memory.
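
The persistence described above can be shown with a short sketch using
the standard shmget/shmat/shmctl calls. A child creates a segment,
writes to it and exits; the parent then attaches the orphaned segment by
id and finds the data still there. The function name and the message
string are made up for illustration. The sketch removes the segment with
IPC_RMID at the end; without that step it would stay visible in ipcs
until someone removed it by hand.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a segment created by a now-dead child
 * can still be attached and still holds the child's data, 0 if it cannot,
 * and -1 on any system-call error.
 */
int shm_survives_creator_exit(void)
{
    static const char msg[] = "left behind by creator";
    int fds[2];

    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: create a segment, fill it, report its id, then die. */
        close(fds[0]);
        int cid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
        if (cid == -1)
            _exit(1);
        char *cp = shmat(cid, NULL, 0);
        if (cp == (char *)-1) {
            shmctl(cid, IPC_RMID, NULL);
            _exit(1);
        }
        memcpy(cp, msg, sizeof msg);
        shmdt(cp);
        if (write(fds[1], &cid, sizeof cid) != (ssize_t)sizeof cid) {
            shmctl(cid, IPC_RMID, NULL);
            _exit(1);
        }
        _exit(0);   /* deliberately no IPC_RMID: the segment is orphaned */
    }

    /* Parent: learn the id, wait until the creator is really gone. */
    close(fds[1]);
    int id = -1;
    ssize_t n = read(fds[0], &id, sizeof id);
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (n != (ssize_t)sizeof id || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        return -1;

    /* The creator is dead; see whether its segment and data remain. */
    int result = 0;
    char *p = shmat(id, NULL, 0);
    if (p != (char *)-1) {
        result = (memcmp(p, msg, sizeof msg) == 0);
        shmdt(p);
    }
    shmctl(id, IPC_RMID, NULL);   /* clean up what the child left behind */
    return result;
}
```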

The best way I can think of to deal with this is to copy FreeBSD's
kern.ipc.shmmax sysctl (which is a global maximum for sysv shared memory
segments), and set that to some reasonable value, say, half of RAM,
and also (if it's not there already) account segment creation towards
the process's datasize limit. That way admins who aren't shared memory
clueful will not have too much damage done to their systems, and those
who are can crank up the control (or set it to -1, meaning no limit) and
get all the shared memory they need. But this seems to me still slightly
kludgy, so I'm open to other suggestions.
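
To make the proposal concrete, here is a rough sketch of the admission
check it implies, written as plain C rather than kernel code. It treats
the cap as a system-wide total in bytes, which is one reading of the
description above and not necessarily how FreeBSD implements
kern.ipc.shmmax. All names (struct shm_policy, shm_create_allowed,
default_shmmax) are hypothetical, and the per-process figures stand in
for whatever the kernel already tracks against the datasize limit.

```c
#include <stdint.h>

#define SHMMAX_UNLIMITED (-1)   /* "-1, meaning no limit" */

/* Hypothetical global state for the sketch; all sizes are in bytes. */
struct shm_policy {
    int64_t shmmax;      /* system-wide cap, or SHMMAX_UNLIMITED */
    int64_t shm_in_use;  /* bytes held by all existing segments */
};

/* Suggested default from the proposal: half of physical RAM. */
int64_t default_shmmax(int64_t ram_bytes)
{
    return ram_bytes / 2;
}

/*
 * Returns 1 if a new segment of `size` bytes may be created, 0 otherwise.
 * Two checks apply: the global cap (unless disabled) and the creating
 * process's datasize limit. Inputs are assumed to be non-negative with
 * in-use figures no larger than their limits.
 */
int shm_create_allowed(const struct shm_policy *g,
                       int64_t proc_data_used,
                       int64_t proc_data_limit,
                       int64_t size)
{
    if (size <= 0)
        return 0;

    /* Global cap: skipped entirely when the admin sets it to -1. */
    if (g->shmmax != SHMMAX_UNLIMITED &&
        size > g->shmmax - g->shm_in_use)
        return 0;

    /* Per-process accounting against the datasize limit. */
    if (size > proc_data_limit - proc_data_used)
        return 0;

    return 1;
}
```

Note that the global check looks only at segments that still exist, so
"dead" segments left by killed processes keep counting against the cap
until they are removed. That is exactly the case the cap is meant to
bound.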

5. Some programs, it appears to me, don't expect shared memory to be
pageable. A particular example would be database programs, which use
shared memory for buffering blocks from disk. Needless to say, the naive
user who cranks up his shared memory segments to be most of system RAM
in the hope of improving his Oracle buffering performance is in for a
bit of a surprise!

FreeBSD solves this by adding a kern.ipc.shm_use_phys sysctl which makes
all shm segments allocated from that point on non-pageable. I can't say
I particularly like this solution, but it works, and I can't think
of anything else better enough to justify not being compatible with
FreeBSD. Presumably it would account towards the RLIMIT_MEMLOCK limits.
I could take a look at what would be involved in implementing this, if
anyone is interested.

Anyway, thoughts? If I just went ahead and did some or all of this,
would anyone object?

Note that this doesn't deal with any of the limits related to semaphores
or any of that, but I don't really feel up to taking that on right now.

cjs
-- 
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC