
Re: Some questions about x86/xen pmap



Andrew Doran wrote:
> Hi,
>
> Sorry for taking so long to reply.

No problem, I have been short on time myself lately ;) Thanks for your reply.

>> Just to set the background: to make Xen save/restore/migrate work under a
>> NetBSD domU, I have to make some slight modifications to the pmap, for
>> two reasons:
>> - Xen (and xentools) do not handle alternative recursive mappings (the
>>   APDP entries in our PD).

> I'm planning to eliminate these for native x86. Instead of using the
> alternate space, native x86 will temporarily switch the pmap onto the CPU.
> I discussed this briefly with Manuel and he suggested that there may be
> problems with this approach for Xen. My patch leaves it unchanged for Xen.

Alright, then I am adding Manuel to the loop, as I would like to know why removing the APDP would be a problem compared to temporarily switching pmaps. Is it a performance or a design issue, NetBSD-specific or something else?

>> - [...] using a MFN, the thread has to acquire a reader lock, and release
>>   it when it has finished with it. The suspension thread acquires a
>>   writer lock, so that every other thread that wants to manipulate MFNs
>>   has to wait for the suspension to release the writer, which is done
>>   during resume, when it is deemed safe.

> I think I have seen a patch where you did this?

It can (somewhat) be found in my branch, jym-xensuspend, which shows how I am using rwlock(9) to protect these parts.

It will change though, as it depends on the APDP issue, and on the way the x86 pmap is structured. I'd like to get the locking right before having to re-design it due to modifications that could affect suspend/resume in unexpected places.
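For reference, here is a minimal sketch of that pattern with rwlock(9); the lock and function names are placeholders of mine, not the actual code from the branch:

#include <sys/rwlock.h>

/*
 * Hypothetical global lock serializing MFN use against suspension;
 * initialized once with rw_init(&xen_suspend_lock) at boot.
 */
static krwlock_t xen_suspend_lock;

/* Any thread about to manipulate MFNs takes the lock as a reader. */
void
mfn_user_example(void)
{
	rw_enter(&xen_suspend_lock, RW_READER);
	/* ... use MFNs safely; they cannot change under us ... */
	rw_exit(&xen_suspend_lock);
}

/* The suspension thread takes it as a writer, so all MFN users wait. */
void
suspend_example(void)
{
	rw_enter(&xen_suspend_lock, RW_WRITER);
	/* save/migrate; the hypervisor may renumber MFNs meanwhile */
	/* the writer hold is released at resume, when it is safe again */
	rw_exit(&xen_suspend_lock);
}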

> There used to be a global RW lock in the pmap and we worked quite hard to
> get rid of it. Global RW locks are bad news. Even if there is no
> contention on the lock itself, they cause lots of cache line ping-ponging
> between CPUs and lots of writebacks to main memory. For heavyweight stuff
> that's OK, but the pmap really gets hammered on at runtime. As an
> alternative I would suggest locking all existing pmaps and also preventing
> new pmaps from being created while your code runs. Would that work for
> you?

Looks like it would. I will have a look.

>> - in pmap_unmap_ptes() (used to unlock the pmap mapped earlier by
>>   pmap_map_ptes()), the APDP is only unmapped when MULTIPROCESSOR is
>>   defined. While that is now the case for GENERIC, we currently have no
>>   SMP support in port-xen. Why is it necessary to clear the entry for MP
>>   but not for UP?

> From memory this is a NULL assignment? I don't know.

My point was to move the APDP unmap code outside of the MULTIPROCESSOR #define. From my current understanding of the pmap, APDP entries are also used on UP; we simply do not need the global TLB flush that keeps them in sync with other CPUs in the MP case.

The same goes for the pmap_pte_flush() call, which is a NOP on native x86 (under Xen, it is used to flush the queue in which MMU update operations are batched).

The initial goal was to expand my locking to include protection for the pmap_map_ptes()/pmap_unmap_ptes() calls, so that APDP entries are cleared by pmap_unmap_ptes(). That way, I could avoid iterating over the pmaps during save, since I could guarantee that all APDP entries are left unmapped.
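As a sketch (assuming the APDP_PDE macro and the pmap_pte_set()/pmap_pte_flush() helpers of the x86 pmap), the tail of pmap_unmap_ptes() would then run unconditionally instead of under #ifdef MULTIPROCESSOR:

	/*
	 * Clear the APDP entry on both UP and MP, so that no pmap is
	 * left with a stale alternate recursive mapping.  pmap_pte_flush()
	 * is a NOP on native x86; under Xen it flushes the queued MMU
	 * updates to the hypervisor.
	 */
	pmap_pte_set(APDP_PDE, 0);
	pmap_pte_flush();
	/* Only the cross-CPU TLB shootdown would remain MP-specific. */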

>> - what would be the fastest way to iterate through pmaps? My current code
>>   is inspired by the one found in pmap_extract(), but it requires mapping
>>   and unmapping each pmap to check for non-zero APDP slots. As these are
>>   not that common, this may be costly when the pmap list is lengthy. I
>>   tried to access the PD of each pmap directly through their pm_pdir
>>   member, but in some cases it ends badly with a fault.

> I believe that there is a global list of pmaps. Have a look at
> pmap_growkernel().

Correct, thanks :)
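For the archives, the walk looks roughly like this; a sketch assuming the global pmaps list and the pmaps_lock mutex used by pmap_growkernel() (holding pmaps_lock should also keep pmap_create() from inserting new pmaps, which matches the suggestion above):

#include <sys/mutex.h>
#include <sys/queue.h>

/* pmaps and pmaps_lock are the x86 pmap's global list and its mutex. */
void
walk_pmaps_example(void)
{
	struct pmap *pm;

	mutex_enter(&pmaps_lock);
	LIST_FOREACH(pm, &pmaps, pm_list) {
		/* per-pmap work, e.g. checking for stale APDP entries */
	}
	mutex_exit(&pmaps_lock);
}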

>> - rwlock(9) clearly states that callers must not /recursively/ acquire
>>   read locks. Is a thread allowed to acquire a reader lock multiple
>>   times, provided it is not done recursively?

> In short: no. You can do it with rw_tryenter(), but note that
> rw_tryenter() can't be called in a loop. Why: even if you hold the lock
> read held, it may be "closed" for new read holds, because a writer is
> waiting on the lock and the RW lock code has decided that the writer
> should get priority. It's to prevent "starvation".

Good to know. I was not sure about the ownership + counter use of our rwlock(9) implementation.
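To illustrate for later readers, a sketch of mine (placeholder code, not from the rwlock(9) sources):

#include <sys/rwlock.h>

void
reader_example(krwlock_t *lock)
{
	rw_enter(lock, RW_READER);

	/*
	 * A second rw_enter(lock, RW_READER) here could deadlock: if a
	 * writer is queued, new read holds are refused so that the writer
	 * is not starved, and we would sleep on a lock we already hold.
	 * rw_tryenter() fails instead of sleeping, but retrying it in a
	 * loop merely spins for as long as the writer is kept waiting.
	 */
	if (rw_tryenter(lock, RW_READER)) {
		/* second hold acquired; drop it */
		rw_exit(lock);
	}

	rw_exit(lock);
}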

--
Jean-Yves Migeon
jeanyves.migeon@free.fr


