Subject: Re: Add a MAP_ALIGNED flag for mmap(2).
To: Matt Thomas <matt@3am-software.com>
From: Andrew Brown <atatat@atatdot.net>
List: tech-kern
Date: 03/01/2003 00:39:12
>>  > Currently, ld.elf_so doesn't honor the alignment specified in an
>>  > ELF file's psections.  This is due to the lack of ability to request
>>  > an aligned block of addresses from mmap(2).  I propose we add a
>>  > MAP_ALIGNED flag which would mean that the addr argument to mmap(2)
>>  > would be the required alignment of the block.  Supplying both
>>  > MAP_ALIGNED|MAP_FIXED would cause an error EINVAL to be returned.
>>  >
>>  > Any thought on this proposal?
>>
>>Sounds good.  I'm a little concerned that it means you can't pass a
>>"please map it in this general neighborhood of the address space", but
>>I definitely want the ability to specify alignment.
>
>Define general area.  1MB? 16MB? 128MB?

the hint is lost, so you can't say "in the middle of the address
space".  you get only the closest address to the default that's
available.

>>One question -- what happens if mmap can't satisfy the alignment
>>constraint?  Is it just a hint or a strict requirement?
>
>I'd say it's a strict requirement, like MAP_FIXED is.

yeah.

one thought: i note that the flags argument to mmap() only uses about
six bits (i dug around a bit, but i didn't see anywhere that used the
other bit positions for other stuff).  given that the requested
alignment would (a) presumably be related to pages and (b) would also
probably be in terms of a power of two thereof, what about stuffing
MAP_ALIGN into there, along with the bits from
log2(alignment>>PGSHIFT)?

the 64 bit address space of, eg, the sparc64, when combined with the
13 bit page shift (8k pages), leaves 51 bits.  to count to 51, you
need only 6 bits, so that would be a total of seven bits (the
MAP_ALIGN flag itself plus six for the log2 value) stuffed into
flags.  there's space for it, and you don't lose the hint value.
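
a rough sketch of that packing, for illustration only.  every name and
bit position here (XMAP_ALIGN, XMAP_ALIGN_SHIFT, XMAP_ALIGN_MASK,
xalign_to_flags, xflags_to_align) is made up for the example and is
not taken from any real <sys/mman.h>; PGSHIFT is assumed to be 13.

```c
#include <assert.h>
#include <stdint.h>

/* hypothetical values, chosen only so the seven bits fit in an int */
#define PGSHIFT          13                  /* assumed: 8k pages */
#define XMAP_ALIGN       0x01000000          /* "alignment requested" bit */
#define XMAP_ALIGN_SHIFT 25                  /* where the 6-bit field starts */
#define XMAP_ALIGN_MASK  (0x3f << XMAP_ALIGN_SHIFT)

/*
 * encode a power-of-two alignment of at least one page as flag bits.
 * returns -1 if the alignment can't be expressed this way.
 */
static int
xalign_to_flags(uint64_t alignment)
{
	uint64_t pages;
	int lg = 0;

	if (alignment < (UINT64_C(1) << PGSHIFT) ||
	    (alignment & (alignment - 1)) != 0)
		return -1;

	/* lg = log2(alignment >> PGSHIFT) */
	for (pages = alignment >> PGSHIFT; pages > 1; pages >>= 1)
		lg++;

	/* lg is at most 63 - PGSHIFT, so it fits in six bits */
	assert(lg <= 0x3f);
	return XMAP_ALIGN | (lg << XMAP_ALIGN_SHIFT);
}

/* decode the flag bits back into a byte alignment; 0 means "none asked" */
static uint64_t
xflags_to_align(int flags)
{
	int lg;

	if ((flags & XMAP_ALIGN) == 0)
		return 0;
	lg = (flags & XMAP_ALIGN_MASK) >> XMAP_ALIGN_SHIFT;
	return UINT64_C(1) << (lg + PGSHIFT);
}
```

a caller would or the result of xalign_to_flags() into the usual flags
and still be free to pass a hint in addr.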

that said, you could allow both MAP_FIXED and MAP_ALIGN together, and
refuse the request only when the hint isn't properly aligned, rather
than disallowing the combination outright.
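
the check for that case could be as small as the following sketch.
xcheck_fixed_aligned is a hypothetical helper, not an existing kernel
function, and returning EINVAL is just an assumption borrowed from the
original proposal.

```c
#include <errno.h>
#include <stdint.h>

/*
 * hypothetical helper: with both "fixed" and "aligned" requested,
 * accept the fixed address only if it already sits on the requested
 * power-of-two boundary.  returns 0 on success, EINVAL otherwise.
 */
static int
xcheck_fixed_aligned(uint64_t addr, uint64_t alignment)
{
	/* alignment must be a non-zero power of two */
	if (alignment == 0 || (alignment & (alignment - 1)) != 0)
		return EINVAL;

	/* low bits of addr must all be clear */
	if ((addr & (alignment - 1)) != 0)
		return EINVAL;

	return 0;
}
```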

two cents.

-- 
|-----< "CODE WARRIOR" >-----|
codewarrior@daemon.org             * "ah!  i see you have the internet
twofsonet@graffiti.com (Andrew Brown)                that goes *ping*!"
werdna@squooshy.com       * "information is power -- share the wealth."