Re: pmap_extract(9) (was Re: xmd(4) (Re: XIP))
On Thu, Nov 11, 2010 at 12:48:53AM +0900, Masao Uebayashi wrote:
> On Mon, Nov 08, 2010 at 08:53:12AM -0800, Matt Thomas wrote:
> > On Nov 8, 2010, at 8:07 AM, Masao Uebayashi wrote:
> > > On Mon, Nov 08, 2010 at 10:48:45AM -0500, Thor Lancelot Simon wrote:
> > >> On Mon, Nov 08, 2010 at 11:32:34PM +0900, Masao Uebayashi wrote:
> > >>>
> > >>> I don't like the "it's MD, period" attitude. That solves nothing.
> > >>
> > >> We've had pmaps which have tried to pretend they were pmaps for some
> > >> other architecture (that is, that some parts of the pmap weren't
> > >> best left MD). For example, we used to have a lot of pmaps in our
> > >> tree that sort of treated the whole world like a 68K MMU.
> > >>
> > >> Performance has not been so great. And besides, what -are- you going
> > >> to do, in an MI way, about synchronization against hardware lookup?
> > >
> > > Do you mean synchronization among processors?
> > No. For instance, on PPC OEA processors the CPU will write back to
> > the reverse page table entries to update the REF/MOD bits. This
> > requires the pmap to use the PPC equivalent of LL/SC to update PTEs.
> > For normal page tables with hardware lookup, like ARM, the MMU will
> > read the L1 page table to find the address of the L2 page tables
> > and then read the actual PTE. All of this happens without any sort
> > of locking so updates need to be done in a lockless manner to have
> > a coherent view of the page tables.
> > On a TLB-based MMU, the TLB miss handler will run without locking,
> > which requires an always-coherent page lookup (typically a page table)
> > where entries (either PTEs or page table pointers) are updated
> > using lockless primitives (CAS). This is even more critical as we
> > deal with more MP platforms, where lookups on one CPU may be happening
> > in parallel with updates on another.
> So, in either design, we have to carefully update page tables by
> atomic operations.
> But even with that done, the whole fault resolution can be done
> in one shot in slow paths - like paging (I/O) or COW. There are
> consistency issues between VAs sharing one PA, or CPUs sharing one VA.
> And we resolve this dirty work one piece at a time. My concern is more
> about the order of those operations.
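
To make the "lockless update" point above concrete, here is a minimal
sketch of a compare-and-swap loop that changes some bits of a PTE while
preserving whatever REF/MOD bits a hardware walker or a miss handler on
another CPU may set concurrently. It is not taken from any NetBSD pmap.
The PTE layout, the PTE_* constants and the pte_update() name are all
hypothetical, and C11 atomics stand in for whatever primitive (LL/SC,
CAS) a given port would really use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 32-bit PTE layout, for illustration only; real layouts are MD. */
typedef uint32_t pte_t;
#define PTE_VALID     0x001u
#define PTE_PROT_MASK 0x006u	/* assumed protection field */
#define PTE_MOD       0x080u	/* assumed "modified" bit */
#define PTE_REF       0x100u	/* assumed "referenced" bit */

/*
 * Replace the bits selected by 'mask' with 'bits', leaving every other
 * bit as it was at the moment the update succeeded.  If another agent
 * changes the PTE between our read and our write, the compare-and-swap
 * fails, 'old' is reloaded with the current value, and we retry.
 * Returns the PTE value observed just before the successful update.
 */
static pte_t
pte_update(_Atomic pte_t *ptep, pte_t mask, pte_t bits)
{
	pte_t old = atomic_load(ptep);
	pte_t new;

	do {
		new = (old & ~mask) | (bits & mask);
	} while (!atomic_compare_exchange_weak(ptep, &old, new));

	return old;
}
```

A caller changing protection would pass PTE_PROT_MASK as the mask. The
returned old value tells it whether PTE_REF or PTE_MOD was set at the
instant of the swap, which is the information a plain load followed by
a plain store could lose.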