Re: making kmem more efficient
On 03/01/2012 07:22 PM, Eduardo Horvath wrote:
> On Thu, 1 Mar 2012, Lars Heidieker wrote:
>
>> On 03/01/2012 06:04 PM, Eduardo Horvath wrote:
>>> On Thu, 1 Mar 2012, Lars Heidieker wrote:
>>>
>>>> Hi,
>>>>
>>>> this splits the lookup table into two parts, one for smaller
>>>> allocations and one for larger ones. This has the following
>>>> advantages:
>>>>
>>>> - smaller lookup tables (less cache line pollution)
>>>> - makes large kmem caches possible, currently up to
>>>>   min(16384, 4*PAGE_SIZE)
>>>> - smaller caches allocate from larger pool-pages if that reduces
>>>>   the wastage
>>>>
>>>> any objections?
>>>
>>> Why would you want to go larger than PAGE_SIZE? At that point
>>> wouldn't you just want to allocate individual pages and map them
>>> into the VM space?
>>>
>>> Eduardo
>>>
>>
>> Allocations larger than PAGE_SIZE are infrequent (at the moment),
>> that's true.
>> Supporting larger pool-pages makes some caches more efficient,
>> e.g. 320/384, and makes some sizes possible at all, like 3072 bytes.
>
> How does it make this more efficient? And why would you want to have a
> 3KB pool? How many 3KB allocations are made?
>
By choosing a different pool-allocator "pagesize", in a way that more
allocations can be made from the same amount of memory.
E.g. from the stats below there are about 170 allocations that went to
the 3k pool and 640 that went to the 4k pool.
The 3k pool is backed by 12k pool pages (four objects per page, no
waste), so the entire set of 3k allocations takes about 512k; had they
gone to 4k pools, they would have taken 680k.
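Roughly sketched, the pool-page choice amounts to picking the candidate
page size with the best utilization (a minimal sketch in C; the helper
name and the candidate list are my assumptions, not necessarily what
the patch does):

    /*
     * Pick, among a few candidate pool-page sizes, the one that
     * wastes the least memory for a given object size; on a tie the
     * smaller page wins.  Hypothetical helper, for illustration.
     */
    static size_t
    km_choose_poolpage(size_t objsize)
    {
        static const size_t cand[] = {
            PAGE_SIZE, 2 * PAGE_SIZE, 3 * PAGE_SIZE, 4 * PAGE_SIZE
        };
        size_t best = cand[0];

        for (u_int i = 1; i < __arraycount(cand); i++) {
            /* bytes actually usable per pool page */
            size_t used_c = (cand[i] / objsize) * objsize;
            size_t used_b = (best / objsize) * objsize;

            /* used_c/cand[i] > used_b/best, without floating point */
            if (used_c * best > used_b * cand[i])
                best = cand[i];
        }
        return best;
    }

For an object size of 3072 this picks 12k pool pages (zero waste) over
4k ones (1k wasted per page), which is where the 512k vs. 680k above
comes from; for 320 and 384 it picks 16k and 12k pages, matching the
PageSz column in the stats below.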
>> With these allocators in place, the larger-than-PAGE_SIZE caches are
>> a trivial extension.
>
> That's not really the issue. It's easy to increase the kernel code size.
> The question is whether the increase in complexity and code size is offset
> by a commensurate performance improvement.
>
Well, in terms of code size there is an increase of about 500 bytes
(amd64) and a decrease of nearly 3k of data due to the smaller tables.
The splitting of the lookup table and the large caches are orthogonal
changes.
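For concreteness, a sketch of what the split lookup can look like (the
names, step sizes and split point here are illustrative assumptions,
not the committed code):

    #define KM_SMALL_SHIFT  3       /* small table: 8-byte steps */
    #define KM_SMALL_MAX    1024    /* assumed split point */
    #define KM_BIG_SHIFT    10      /* big table: 1024-byte steps */
    #define KM_BIG_MAX      MIN(16384, 4 * PAGE_SIZE)

    /* several slots may point at the same pool_cache, e.g. in the
       small table the slots for sizes 97..112 would all map to the
       kmem-112 cache */
    static pool_cache_t km_small_cache[KM_SMALL_MAX >> KM_SMALL_SHIFT];
    static pool_cache_t km_big_cache[KM_BIG_MAX >> KM_BIG_SHIFT];

    static inline pool_cache_t
    km_size_to_cache(size_t size)
    {
        if (size <= KM_SMALL_MAX)
            return km_small_cache[(size - 1) >> KM_SMALL_SHIFT];
        if (size <= KM_BIG_MAX)
            return km_big_cache[(size - 1) >> KM_BIG_SHIFT];
        return NULL;    /* too big for any cache */
    }

With these assumed parameters the two tables hold 128 + 16 pointers
where a single fine-grained table covering the whole range would need
2048, which is the kind of saving the smaller data size comes from.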
>> All caches that are multiples of PAGE_SIZE come for free: they don't
>> introduce any additional memory overhead in terms of footprint, and
>> on memory pressure they can always be freed from the cache. With them
>> in place,
>
> But you do have the overhead of the pool itself.
>
>> you save the TLB shoot-downs caused by mapping and un-mapping them,
>> as well as the allocation and deallocation of the page frames.
>> So they are more than a magnitude faster.
>
> Bold claims. Do you have numbers that show the performance improvement?
>
I changed the allocfree test in /usr/src/regress/sys/kern/allocfree to
run only the kmem test, and ran it on current and on current plus these
changes (with caches up to 16k). There is a clear cut-off once
allocations leave the caches.
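The timed part of the test is essentially a tight alloc/free loop of
the following shape (a paraphrased sketch, not the actual test source;
the real test also fans the loop out over NCPU cpus, which is where the
scaling columns come from):

    /* time `count' kmem_alloc/kmem_free pairs, return ns per pair */
    static u_int
    measure_kmem(size_t size, u_int count)
    {
        struct timespec s, e;
        int64_t ns;

        nanotime(&s);
        for (u_int i = 0; i < count; i++) {
            void *p = kmem_alloc(size, KM_SLEEP);
            kmem_free(p, size);
        }
        nanotime(&e);

        ns = (int64_t)(e.tv_sec - s.tv_sec) * 1000000000 +
            (e.tv_nsec - s.tv_nsec);
        return (u_int)(ns / count);
    }

Read this way, the KMEM column below is the cost of one alloc/free pair
in nanoseconds.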
=> using nanotime() for timings
SIZE NCPU KMEM
4096 1 62
4096 2 62
4096 3 62
4096 4 62
=> using nanotime() for timings
SIZE NCPU KMEM
8192 1 62
8192 2 62
8192 3 62
8192 4 62
=> using nanotime() for timings
SIZE NCPU KMEM
16384 1 63
16384 2 62
16384 3 61
16384 4 61
=> using nanotime() for timings
SIZE NCPU KMEM
32768 1 2176
32768 2 4633
32768 3 7292
32768 4 10089
Without those caches the cut-off is already after 4096:
=> using nanotime() for timings
SIZE NCPU KMEM
4096 1 62
4096 2 60
4096 3 61
4096 4 61
=> using nanotime() for timings
SIZE NCPU KMEM
8192 1 1960
8192 2 2334
8192 3 3076
8192 4 3953
=> using nanotime() for timings
SIZE NCPU KMEM
16384 1 2211
16384 2 3099
16384 3 4283
16384 4 6101
=> using nanotime() for timings
SIZE NCPU KMEM
32768 1 2748
32768 2 4661
32768 3 7079
32768 4 9923
>>
>> Just some stats of a system (not up long) with those changes:
>> collected with "vmstat -mvWC"
>>
>>> kmem-1024   1024     5789    0     5158   631    73   584   408   176   4096   584     0   inf    3 0x800  89.6%
>>> kmem-112     112     1755    0      847   908    28    27     1    26   4096    26     0   inf    0 0x800  95.5%
>
> Interesting numbers. What exactly do they mean? Column headers would
> help decipher them.
>
They were collected with vmstat -mvWC.
Keep in mind that the request and release numbers don't directly
reflect the kmem_alloc/kmem_free calls, as a cache layer sits above
those pools.
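Schematically, the allocation path looks like this (simplified sketch,
reusing the hypothetical km_size_to_cache() helper from above; the real
kmem_alloc() has more bookkeeping):

    /* most calls are satisfied from the pool_cache's per-CPU layer and
       never reach the pool below it, so the pool's Requests/Releases
       columns only count the misses that fall through */
    void *
    kmem_alloc(size_t size, km_flag_t kmflags)
    {
        pool_cache_t pc = km_size_to_cache(size);

        KASSERT(pc != NULL);
        return pool_cache_get(pc,
            (kmflags & KM_SLEEP) ? PR_WAITOK : PR_NOWAIT);
    }

That is also why the CpuLayer hit rates in the cache statistics further
down are mostly around 99%: nearly every request is served without ever
touching the pool layer.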
> Memory resource pool statistics
> Name        Size Requests Fail Releases InUse Avail Pgreq Pgrel Npage PageSz Hiwat Minpg Maxpg Idle Flags   Util
> kmem-1024   1024    15095    0    13264  1831     1  1401   943   458   4096   627     0   inf    0 0x800  99.9%
> kmem-112     112     3524    0     2624   900    36    29     3    26   4096    27     0   inf    0 0x800  94.7%
> kmem-12288 12288       70    0        0    70     0    70     0    70  12288    70     0   inf    0 0x800 100.0%
> kmem-128     128     8234    0     7022  1212   132    55    13    42   4096    49     0   inf    1 0x800  90.2%
> kmem-16       16     5781    0     4628  1153   127     5     0     5   4096     5     0   inf    0 0xc00  90.1%
> kmem-160     160     1930    0     1598   332    18    21     7    14   4096    14     0   inf    0 0x800  92.6%
> kmem-16384 16384       33    0       26     7    12    33    14    19  16384    19     0   inf   12 0x800  36.8%
> kmem-192     192     1559    0     1400   159     9    17     9     8   4096     8     0   inf    0 0x800  93.2%
> kmem-2048   2048     8135    0     7165   970     0  1706  1221   485   4096   629     0   inf    0 0x800 100.0%
> kmem-224     224     1397    0     1312    85     5    13     8     5   4096     6     0   inf    0 0x800  93.0%
> kmem-24       24     2167    0     1925   242    98     2     0     2   4096     2     0   inf    0 0xc00  70.9%
> kmem-256     256     1220    0     1148    72    24    14     8     6   4096     6     0   inf    1 0x800  75.0%
> kmem-3072   3072      173    0      144    29     3    11     3     8  12288     8     0   inf    0 0x800  90.6%
> kmem-32       32     3930    0     3406   524   116     5     0     5   4096     5     0   inf    0 0xc00  81.9%
> kmem-320     320     1861    0     1346   515    46    13     2    11  16384    11     0   inf    0 0x800  91.4%
> kmem-384     384     1537    0     1228   309    11    12     2    10  12288    11     0   inf    0 0x800  96.6%
> kmem-40       40     7695    0     7282   413   301    47    40     7   4096    47     0   inf    0 0xc00  57.6%
> kmem-4096   4096      666    0      621    45     0   161   116    45   4096    77     0   inf    0 0x800 100.0%
> kmem-448     448     2207    0     2062   145     8    33    16    17   4096    18     0   inf    0 0x800  93.3%
> kmem-48       48     2980    0     2003   977    43    12     0    12   4096    12     0   inf    0 0xc00  95.4%
> kmem-512     512     1098    0     1034    64     0    26    18     8   4096    11     0   inf    0 0x800 100.0%
> kmem-56       56     3280    0     2938   342    23     6     1     5   4096     6     0   inf    0 0xc00  93.5%
> kmem-6144   6144       11    0        0    11     1     6     0     6  12288     6     0   inf    0 0x800  91.7%
> kmem-64       64    14076    0    12210  1866   374    44     9    35   4096    39     0   inf    0 0x800  83.3%
> kmem-768     768     2676    0     2414   262    10    26     9    17  12288    18     0   inf    0 0x800  96.3%
> kmem-8         8     5497    0     4572   925    99     3     1     2   4096     3     0   inf    0 0xc00  90.3%
> kmem-80       80     4528    0     2498  2030    61    41     0    41   4096    41     0   inf    0 0x800  96.7%
> kmem-8192   8192       20    0        2    18     0    20     2    18   8192    20     0   inf    0 0x800 100.0%
> kmem-96       96     1993    0     1802   191    19     9     4     5   4096     5     0   inf    0 0x800  89.5%
>
> In use 582889K, total allocated 647504K; utilization 90.0%
> Pool cache statistics.
> Name       Spin GrpSz Full Emty PoolLayer CacheLayer  Hit%  CpuLayer  Hit%
> kmem-1024     6    15    1  100     15095     186772  91.9  16169927  98.8
> kmem-112      0    15    0    5      3524       9220  61.8   4030143  99.8
> kmem-12288    0    15    0    0        70         71   1.4        73   2.7
> kmem-128      9    15    3   46      8234     109619  92.5  15181413  99.3
> kmem-16       2    15   16   11      5781     183275  96.8  24750264  99.3
> kmem-160      0    15    0    2      1930       2251  14.3   1234841  99.8
> kmem-16384    0    15    0    0        33         43  23.3       149  71.1
> kmem-192      0    15    0    1      1559       1914  18.5    689746  99.7
> kmem-2048     0    15    0   50      8135      51194  84.1  12149279  99.6
> kmem-224      0    15    0    1      1397       1619  13.7    777989  99.8
> kmem-24       0    15    1    2      2167       3107  30.3 236624170 100.0
> kmem-256      0    15    0    1      1220       1539  20.7    478705  99.7
> kmem-3072     0    15    0    0       173        318  45.6   1998896 100.0
> kmem-32       0    15    7    5      3930      36806  89.3  16574380  99.8
> kmem-320      0    15    0    1      1861       2124  12.4   3647732  99.9
> kmem-384      0    15    0    0      1537       1761  12.7    683765  99.7
> kmem-40       0    15    5    5      7695      23743  67.6  10202262  99.8
> kmem-4096     0    15    0    0       666        837  20.4   8356242 100.0
> kmem-448      0    15    2    3      2207       9816  77.5   8100626  99.9
> kmem-48       0    15    0    2      2980       4016  25.8   1117971  99.6
> kmem-512      0    15    0    0      1098       1283  14.4    434717  99.7
> kmem-56       0    15    4    5      3280      22229  85.2   6312997  99.6
> kmem-6144     0    15    0    0        11         12   8.3        16  25.0
> kmem-64       6    15   10   93     14076     483922  97.1  30823520  98.4
> kmem-768      1    15    0    4      2676       6056  55.8   2189849  99.7
> kmem-8        6    15   11   12      5497     247578  97.8  90875483  99.7
> kmem-80       0    15    1    4      4528       8552  47.1   4691866  99.8
> kmem-8192     0    15    0    0        20         22   9.1        32  31.2
> kmem-96       0    15    1    1      1993       2897  31.2   1019786  99.7
Lars