tech-kern archive


Re: [Milkymist port] virtual memory management



On 10/02/14 18:10, Eduardo Horvath wrote:
On Sun, 9 Feb 2014, Yann Sionneau wrote:

Thank you for your answer Matt,

On 09/02/14 19:49, Matt Thomas wrote:
On Feb 9, 2014, at 10:07 AM, Yann Sionneau <yann.sionneau%gmail.com@localhost> 
wrote:

Since the kernel runs with the MMU on, using virtual addresses, it cannot
dereference physical pointers, so it cannot add/modify/remove PTEs,
right?
Wrong.  See above.
You mean that the TLB contains entries which map a physical address to itself?
like 0xabcd.0000 is mapped to 0xabcd.0000? Or you mean all RAM is always
mapped but to the (0xa000.000+physical_pframe) kind of virtual address you
mention later in your reply?
What I did for BookE is reserve the low half of the kernel address space
for VA=PA mappings.  The kernel resides in the high half of the address
space.  I did this because the existing PPC port did strange things with
BAT registers to access physical memory and copyin/copyout operations and
I couldn't come up with a better way to do something compatible with the
BookE MMU.  It did limit the machine to 2GB RAM, which wasn't a problem
for the 405GP.

Also, the user address space is not shared with the kernel address space
as on most machines.  Instead, user processes get access to their own 4GB
address space, and the kernel has 2GB to play with when you deduct the 2GB
VA==PA region.  (It's the same sort of thing I did for sparc64 way back
when it was running 32-bit userland.  But it doesn't need VA==PA mappings
and can access physical and userland addresses while the kernel address
space is active.  Much nicer design.)

When a BookE machine takes an MMU miss fault, the fault handler examines
the faulting address.  If the high bit is zero, it synthesizes a TLB entry
where the physical address is the same as the virtual address.  If the
high bit is set, it walks the page tables to find the TLB entry.
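
A minimal sketch of that decision in C, for illustration only.  The names
(struct tlb_entry, is_identity_mapped, synthesize_identity_entry) and the
4 KB page size are assumptions, not taken from the BookE port, and the
page-table walk for the high half is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12u          /* assumed 4 KB pages */
#define KERNEL_HALF 0x80000000u  /* high bit of a 32-bit virtual address */

/* Hypothetical, simplified TLB entry: one virtual page -> one physical page. */
struct tlb_entry {
    uint32_t vpn;  /* virtual page number */
    uint32_t pfn;  /* physical page number */
};

/* True when the faulting address lies in the low (VA == PA) half. */
static bool is_identity_mapped(uint32_t va)
{
    return (va & KERNEL_HALF) == 0;
}

/* Build the entry for a low-half miss: the physical page equals the virtual page. */
static struct tlb_entry synthesize_identity_entry(uint32_t va)
{
    assert(is_identity_mapped(va));  /* high-half misses need a page-table walk */
    struct tlb_entry e = { va >> PAGE_SHIFT, va >> PAGE_SHIFT };
    return e;
}
```

A miss handler built this way would call is_identity_mapped() first and
fall back to the page tables only when it returns false.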

This did make the copyin/copyout operations a bit complicated since it
requires flipping the MMU between two contexts while doing the copy
operation.

Also, is it possible to make sure that everything (in kernel space) is
mapped so that virtual_addr = physical_addr - RAM_START_ADDR +
virtual_offset
In my case RAM_START_ADDR is 0x40000000 and I am trying to use
virtual_offset of 0xc0000000 (everything in my kernel ELF binary is mapped
at virtual address starting at 0xc0000000)
If I can ensure that this formula is always correct I can then use a very
simple macro to translate "statically" a physical address to a virtual
address.
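
A sketch of what such a static translation could look like in C, using the
two constants given above and the 128 MB RAM size from this thread.  The
helper names pa_to_kva and kva_to_pa are hypothetical, and the code assumes
the whole of RAM really is mapped linearly at the offset.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_START_ADDR 0x40000000u  /* physical base of RAM */
#define VIRTUAL_OFFSET 0xc0000000u  /* virtual base of the kernel mapping */
#define RAM_SIZE       0x08000000u  /* 128 MB */

/* Physical address -> kernel virtual address (hypothetical helper). */
static inline uint32_t pa_to_kva(uint32_t pa)
{
    assert(pa >= RAM_START_ADDR && pa - RAM_START_ADDR < RAM_SIZE);
    return pa - RAM_START_ADDR + VIRTUAL_OFFSET;
}

/* Kernel virtual address -> physical address (hypothetical helper). */
static inline uint32_t kva_to_pa(uint32_t va)
{
    assert(va >= VIRTUAL_OFFSET && va - VIRTUAL_OFFSET < RAM_SIZE);
    return va - VIRTUAL_OFFSET + RAM_START_ADDR;
}
```

With these values the first byte of RAM, 0x40000000, translates to
0xc0000000, and the last byte, 0x47ffffff, to 0xc7ffffff.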
Not knowing how much ram you have, I can only speak in generalities.
I have 128 MB of RAM.
But in general you reserve a part of the address space for direct mapped
memory and then place the kernel above that.

For instance, you might have 512MB of RAM which you map at 0xa000.0000
and then have the kernel's mapped va space start at 0xc000.0000.
So if I understand correctly, the first page of physical ram (0x4000.0000) is
mapped at virtual address 0xa000.0000 *and* at 0xc000.0000 ?
Isn't it a problem that a physical address is mapped twice in the same process
(here the kernel)?
My caches are VIPT, couldn't it cause cache aliasing issues?
If the MMU is always on while the kernel is running, and covers all of the
KVA, then you could relocate the kernel text and data segments wherever
you want them to be.  If you want to put the kernel text and data segments
in the direct-mapped range, you can easily do that.  If you want it
elsewhere, that should work too.
So if I understand correctly I could implement the following scheme:

Let my linker place the kernel ELF virtual addresses at 0xc000.0000.
Load the kernel at the base of RAM (0x4000.0000).
Then reserve this memory region as a "window" over physical RAM: 0xc000.0000-0xc800.0000 (RAM size is 128 MB), by doing something like "physseg[0].avail_start = 0xc800.0000;" in pmap_bootstrap().

Then in my tlb miss handlers I could do:

if (fault happened in kernel mode &&  /* we don't want user space to access all RAM through the kernel window */
    miss_vaddr >= 0xc0000000 && miss_vaddr < 0xc8000000)  /* <= this would be kind of like your test of the high bit of the faulting vaddr */
{
    /* create the mapping for accessing the window */
    reload_tlb_with(atop(miss_vaddr), atop(miss_vaddr - 0xc0000000 + 0x40000000) | some_flags);
    return_from_exception;
} else {
    - access the page table to reload the TLB
    - the page table contains only physical addresses
    - but I can dereference those using the 0xc000.0000-0xc800.0000 window, knowing that a nested TLB miss can happen
}

Does this sound reasonable ?


The cache aliasing issues in VIPT caches only occur if the cache way size
is larger than the page size.  If you're designing your own hardware,
don't do that.  Otherwise, remember to only access a page through a single
mapping and you won't have aliasing issues.  And flush the page from the
cache whenever establishing a new mapping.
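
The condition above can be checked with a little arithmetic; here is a
rough sketch in C.  The function names are hypothetical and the cache
geometries in the usage note are made-up examples, not the actual
Milkymist configuration.  The way size is taken to be the total cache
size divided by the associativity.

```c
#include <assert.h>
#include <stdint.h>

/* Size of one cache way: total cache size divided by associativity. */
static uint32_t cache_way_size(uint32_t cache_size, uint32_t ways)
{
    assert(ways != 0 && cache_size % ways == 0);
    return cache_size / ways;
}

/*
 * Number of page "colors": how many different places in a way one
 * physical page can land, depending on the virtual address used.
 * A result of 1 means two mappings of a page always index the same lines.
 */
static uint32_t cache_colors(uint32_t cache_size, uint32_t ways, uint32_t page_size)
{
    uint32_t way = cache_way_size(cache_size, ways);
    assert(page_size != 0);
    return way > page_size ? way / page_size : 1;
}

/* Nonzero when a VIPT cache of this geometry can alias. */
static int may_alias(uint32_t cache_size, uint32_t ways, uint32_t page_size)
{
    return cache_colors(cache_size, ways, page_size) > 1;
}
```

For example, with 4 KB pages a hypothetical 16 KB direct-mapped cache has
4 colors and can alias, while a 16 KB 4-way cache has a 4 KB way and
cannot.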
Well, lm32 caches are configurable, but for the Milkymist SoC they are configured small enough that there is no alias problem. To handle all possible lm32 cache sizes, I guess I need to add a macro that the machine headers will define if cache alias issues are possible. But then, if I am using the 0xc000.0000-0xc800.0000 window during my TLB miss handler, I guess I will have no choice but to invalidate the cache: any fault while reading this window will add a TLB entry for the window, which could cause a physical page to be mapped twice and therefore cause alias issues (in the scenario where the caches are too big).

Eduardo
Thanks for all your explanations. If everything I said here is correct (which would mean I understood your answer correctly), then I think I'm ready to implement all this :)

Best regards,

--
Yann Sionneau

