Re: fork performance

To: Masao Uebayashi <uebayasi%gmail.com@localhost>
Subject: Re: fork performance
From: Lars Heidieker <lars%heidieker.de@localhost>
Date: Mon, 22 Oct 2012 23:05:02 +0200

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/22/2012 02:27 AM, Masao Uebayashi wrote:
> On Thu, Oct 18, 2012 at 4:39 PM, Lars Heidieker <lars%heidieker.de@localhost>
> wrote:
>> On 10/18/2012 09:15 AM, David Laight wrote:
>>> On Thu, Oct 18, 2012 at 12:52:47PM +1300, Lloyd Parkes wrote:
>>>> So, with a slightly closer look, a guess and some tests to 
>>>> verify my guess, and I think I have found my performance
>>>> problem converting the NetBSD CVS repositories to Mercurial.
>>>> 
>>>> The CVS server forks once for each command it receives, and
>>>> it receives a lot of commands. NetBSD fork(2) seems to be
>>>> much slower than OS X fork(2).
>>> 
>>> I've seen things that show that a processes memory page list
>>> isn't getting its entries merged - so there are a lot of items
>>> to process during fork().  (cat something in /proc ...)
>>> 
>>> The malloc netbsd uses (that uses mmap() instead of sbrk()) 
>>> probably makes this much more significant. Especially if a big
>>> C++ program - like a python interpreter - is doing the
>>> forks().
>>> 
>>> David
>>> 
>> 
>> Hi,
>> 
>> currently the amap layer limits the size of amaps to 255 *
>> PAGE_SIZE see:
>> http://nxr.netbsd.org/xref/src/sys/uvm/uvm_amap.c#494
>> 
>> that's why the map entries for anon memory don't get merged.
> 
> What happens if larger page size is used?
> 

Hi,

I think we can change the check at
(http://nxr.netbsd.org/xref/src/sys/uvm/uvm_amap.c#494) to:
if ((slotneed * sizeof(*newsl)) > PAGE_SIZE) {

This will give us 1024 slots, so with a 4k PAGE_SIZE we have a reach of
4 MB. It also re-enables a code path in amap_copy that breaks up a large
entry (larger than the 255-slot limit) into chunks. That code has been
disabled for some years (since rev 1.59, afaik). With the raised limit
in place it has been stable so far on my test machine.
(This is only a short-term solution, though, just a step in the right
direction.)
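
To make the arithmetic explicit, here is a small user-space sketch of
the proposed limit. It is not the kernel code: the ex_ names are made
up for illustration, and the 4-byte slot entry and 4 KiB page are
assumptions chosen to match the 1024-slot figure above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a 4-byte slot entry (assumption). */
typedef int32_t ex_slot_t;

/* Largest slot count whose array still fits in a single page. */
static size_t
ex_max_slots(size_t page_size)
{
	assert(page_size % sizeof(ex_slot_t) == 0);
	return page_size / sizeof(ex_slot_t);
}

/* Bytes of anonymous memory that one such amap can cover. */
static size_t
ex_reach(size_t page_size)
{
	return ex_max_slots(page_size) * page_size;
}

/* Same shape as the proposed test: nonzero means "too large". */
static int
ex_too_large(size_t slotneed, size_t page_size)
{
	return slotneed * sizeof(ex_slot_t) > page_size;
}
```

Under these assumptions ex_max_slots(4096) is 1024 and ex_reach(4096)
is 4 MiB. As for larger page sizes: the same arithmetic with 8 KiB
pages gives 2048 slots and a reach of 16 MiB, so the limit grows with
the square of the page size.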

I think we need to cap those allocations in size at the largest kmem
cache we have, which currently is PAGE_SIZE.

In the longer term, a proper solution might be to change the allocation
strategy of the amap to something like a two-level page table once we
get larger than PAGE_SIZE.
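
A rough user-space sketch of what such a two-level layout could look
like follows. Everything in it is hypothetical (the ex_ names, the
single array of 4-byte entries, malloc/calloc in place of kernel
allocators); it only illustrates the idea that no leaf allocation is
ever larger than one page.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_PAGE_SIZE	4096u	/* assumed page size */
#define EX_LEAF_SLOTS	(EX_PAGE_SIZE / sizeof(int32_t))	/* 1024 */

struct ex_amap2 {
	size_t nslots;		/* total slots covered */
	size_t nleaves;		/* entries in the directory */
	int32_t **leaves;	/* directory; leaves allocated on demand */
};

static struct ex_amap2 *
ex_amap2_create(size_t nslots)
{
	struct ex_amap2 *am = malloc(sizeof(*am));

	if (am == NULL)
		return NULL;
	am->nslots = nslots;
	am->nleaves = (nslots + EX_LEAF_SLOTS - 1) / EX_LEAF_SLOTS;
	am->leaves = calloc(am->nleaves ? am->nleaves : 1,
	    sizeof(*am->leaves));
	if (am->leaves == NULL) {
		free(am);
		return NULL;
	}
	return am;
}

/* Store val in slot; allocates the page-sized leaf on first use. */
static int
ex_amap2_set(struct ex_amap2 *am, size_t slot, int32_t val)
{
	size_t li = slot / EX_LEAF_SLOTS;

	assert(slot < am->nslots);
	if (am->leaves[li] == NULL) {
		am->leaves[li] = calloc(EX_LEAF_SLOTS, sizeof(int32_t));
		if (am->leaves[li] == NULL)
			return -1;
	}
	am->leaves[li][slot % EX_LEAF_SLOTS] = val;
	return 0;
}

/* Read a slot; untouched leaves read as 0. */
static int32_t
ex_amap2_get(const struct ex_amap2 *am, size_t slot)
{
	const int32_t *leaf;

	assert(slot < am->nslots);
	leaf = am->leaves[slot / EX_LEAF_SLOTS];
	return leaf != NULL ? leaf[slot % EX_LEAF_SLOTS] : 0;
}

static void
ex_amap2_destroy(struct ex_amap2 *am)
{
	for (size_t i = 0; i < am->nleaves; i++)
		free(am->leaves[i]);
	free(am->leaves);
	free(am);
}
```

In this sketch each leaf is exactly one page, so it would fit the
largest kmem cache, and lookups cost one extra pointer dereference.
The directory itself can still outgrow a page for very large mappings,
which the sketch does not address.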

The limit was introduced 7 1/2 years ago in revision 1.59. In that
time frame allocations were done via malloc(9), which had a hard
upper limit of 64k etc.; for some years now they have been done via
kmem(9). So things have changed quite a bit since.

(yamt@: I have added you, as you introduced the limit.)

kind regards,
Lars

- -- 
- ------------------------------------

Mystical explanations:
Mystical explanations are considered deep;
the truth is that they are not even superficial.

   -- Friedrich Nietzsche
   [ Die Fröhliche Wissenschaft, Book 3, 126 ]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://www.enigmail.net/

iEYEARECAAYFAlCFtPoACgkQcxuYqjT7GRZqXgCgkOVP9DrCSOUvjfnztCUjK9qw
m2cAn3IoWU0COIIBUDhnF70/tG9Y9tUM
=KJU9
-----END PGP SIGNATURE-----

Follow-Ups:
- Re: fork performance
  - From: YAMAMOTO Takashi

References:
- fork performance
  - From: Lloyd Parkes
- Re: fork performance
  - From: David Laight
- Re: fork performance
  - From: Lars Heidieker
- Re: fork performance
  - From: Masao Uebayashi
