tech-kern archive

Re: kmem-pool-uvm



hi,

> Hi,
> 
> On 04/14/11 09:05, YAMAMOTO Takashi wrote:
>> why do you want to make subr_kmem use uvm_km directly?
>> to simplify the code?
>> i don't want to see that change, unless there's a clear benefit.
>>
> The reason was to simplify the code, yes, and to reduce redundancy:
> in the current implementation vmem allocates PAGE_SIZE chunks from the
> uvm_km backend for requests <= PAGE_SIZE, so the vacache is not used,
> and more importantly vmem is essentially just re-recording the address
> allocations already made by uvm_map.
> With the changes I see about 15% fewer kernel map entries.
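
(for illustration, a rough sketch of the kind of direct uvm_km path being
discussed here; this is only my sketch, not code from the patch, and
kmem_page_alloc is a made-up name:)

        #include <sys/param.h>
        #include <uvm/uvm_extern.h>

        /*
         * sketch only: serve a wired kmem request straight from uvm_km,
         * so the kernel_map vacache is usable and no second bookkeeping
         * of the same range is kept in a vmem arena.
         */
        void *
        kmem_page_alloc(size_t size)
        {
                vaddr_t va;

                va = uvm_km_alloc(kernel_map, round_page(size), 0,
                    UVM_KMF_WIRED | UVM_KMF_WAITVA);
                return (va != 0) ? (void *)va : NULL;
        }
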
>> let me explain some background. currently there are a number of
>> kernel_map related problems:
>>
>> A-1. vm_map_entry is unnecessarily large for KVA allocation purpose.
>>
>> A-2. kernel-map-entry-merging is there to solve A-1. but it introduced
>> the allocate-for-free problem. ie. to free memory, you might need to
>> split map-entries and thus allocate some memory.
>>
>> A-3. to solve A-2, there is the map-entry-reservation mechanism. it's
>> complicated and broken.
>>
>> B. kernel fault handling is complicated because it needs memory allocation
>> (eg. vm_anon) which needs some trick to avoid deadlock.
>>
>> C. KVA allocation is complicated because it needs memory allocation
>> (eg. vm_map_entry) which needs some trick to avoid deadlock.
>>
>> most of the above can be solved by separating KVA allocation and
>> kernel fault handling. (except C, which will merely be moved to a
>> different place.)
>>
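
(to make A-2 concrete: freeing a sub-range out of the middle of a merged
entry forces the entry to be split, and the split needs a fresh
vm_map_entry.  a sketch with made-up helper names; the real logic lives
in uvm_map.c:)

        /*
         * illustration of A-2: you may have to allocate memory in order
         * to free memory.  lookup_entry(), split_entry() and
         * remove_entries() are made-up names, not the uvm_map functions.
         */
        static void
        unmap_subrange(struct vm_map *map, vaddr_t start, vaddr_t end)
        {
                struct vm_map_entry *e = lookup_entry(map, start);

                if (e->start < start || end < e->end) {
                        /*
                         * the freed range is only part of a merged entry,
                         * so the entry has to be split first, which
                         * allocates a new vm_map_entry.  the
                         * map-entry-reservation mechanism (A-3) exists
                         * only to make sure this allocation can succeed.
                         */
                        split_entry(map, e, start, end);
                }
                remove_entries(map, start, end);
        }
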
> A-1: vmem_btag is slightly less than half the size of
> vm_map_entry...
> A-2 solves A-1, and A-3 solves A-2, but with the pitfall of
> reintroducing part of A-1: we have fewer map entries in the map, but we
> don't save memory, because all the entries not in the map are cached
> aside for potential merging.
> In this sense it seems broken and complicated to me.
> Reducing the overall number of allocated map entries will help here, as
> the vacaches do.
> 
> C seems to be inevitable; it's only a question of where it happens...
> 
> B is a result of having pageable memory, which can fault, and
> non-pageable memory in the same map, together with the need to allocate
> non-pageable memory in the event of a page fault.
> 
>> i implemented subr_vmem so that eventually it can be used as the primary
>> KVA allocator. ie. when allocating from kernel_map, allocate KVA from
>> kernel_va_arena first and then, if and only if necessary, register it to
>> kernel_map for fault handling. it probably allows us to remove VACACHE
>> stuff, too. kmem_alloc will be backed by a vmem arena which is backed by
>> kernel_va_arena.
>>
> Originally I thought about two options. Option one is what my patch
> does; option two is this:
> 
> If vmem is made the primary kva allocator, we should carve out a
> kernel heap entirely controlled by vmem, probably one special
> vm_map_entry in the kernel_map that spans the heap or a submap that
> never has any map_entries.
> Essentially separating pageable and non-pageable memory allocations,
> this would allow for removing the vacaches in the kernel-maps as well
> as the map-entry-reservation mechanism.
> 
> Questions that follow:
> - how to size it properly...

is this about limiting total size for a particular allocation?

> - this might be the kmem_map? or two heaps, an interrupt-safe one and
> a non-interrupt-safe one?

because kernel_va_arena would have its quantum cache disabled,
most users would use another arena stacked on top of it.
(like what we currently have as kmem_arena.)
interrupt-safe allocations can either use kernel_va_arena directly or
use another arena, eg. kmem_arena_intrsafe.
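
roughly, the stacking could look like this. (just a sketch; the exact
vmem_create() argument list and the kva_import/kva_release helpers are
assumptions here, not code from the tree.)

        /*
         * sketch only.  kernel_va_arena hands out raw KVA and has its
         * quantum cache disabled (qcache_max == 0); kmem_arena stacks on
         * top of it via import/release functions and enables the quantum
         * cache for small allocations.  an interrupt-safe variant
         * (kmem_arena_intrsafe) would be a third arena created the same
         * way at a suitable ipl.
         */
        kernel_va_arena = vmem_create("kva", vm_map_min(kernel_map),
            vm_map_max(kernel_map) - vm_map_min(kernel_map), PAGE_SIZE,
            NULL, NULL, NULL, 0 /* no qcache */, VM_SLEEP, IPL_NONE);

        kmem_arena = vmem_create("kmem", 0, 0, PAGE_SIZE,
            kva_import, kva_release, kernel_va_arena,
            8 * PAGE_SIZE /* qcache_max */, VM_SLEEP, IPL_NONE);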

> 
> I think having two "allocators" (vmem and the vm_map_entries
> themselves) controlling the kernel_map isn't a good idea, as both have
> to be kept in sync: at least every allocation that is made with
> vm_map_entries needs to be made in vmem as well. There is no clear
> responsibility for either.

i agree that having two allocators for KVA is bad.
my idea is to have just one (kernel_va_arena).
no KVA allocation would be done via vm_map_entries for kernel_map;
kernel_map would be kept merely for fault handling.

essentially kva allocation would be:

        va = vmem_alloc(kernel_va_arena, ...);
        if (pageable)
                create kernel_map entry for the va
        else
                ...
        return va;

> 
> Option two is more challenging and will solve problem B and the A
> problems, while option one solves most of the A problems and leaves B
> untouched.

sure, it's more challenging and involves more work.
(so it isn't finished yet. :-)

YAMAMOTO Takashi

> 
> Lars
> 

