tech-kern archive


Re: XIP (Rev. 2)

On Tue, Nov 09, 2010 at 05:35:38PM +0000, Eduardo Horvath wrote:
> On Wed, 10 Nov 2010, Masao Uebayashi wrote:
> > On Tue, Nov 09, 2010 at 03:18:37PM +0000, Eduardo Horvath wrote:
> > > There are two issues I see with the design and I don't understand how 
> > > they are addressed:
> > > 
> > > 1) On machines where the cache is responsible for handling ECC, how do 
> > > you prevent a user from trying to mount a device XIP, causing a data 
> > > error and a system crash?
> > 
> > Sorry, I don't understand this situation...  How does this differ
> > from user mapped RAM pages with ECC?
> 
> Ok, I'll try to explain the hardware.
> 
> In an ECC setup you have extra RAM bits to store the ECC data.  That data 
> is generated when data is written to RAM and checked when it's read back 
> from RAM.  This is usually done in the memory controller so the extra data 
> is not stored in the cache.  The ECC domain is RAM.
> 
> If your machine uses ECC in the cache, then the ECC information is 
> generated and checked when the data is inserted and removed from the 
> cache.  The ECC domain is not RAM but cache.  In this case if you try to 
> set the bit in the PTE to enable caching for an address that does not 
> provide ECC bits, such as a FLASH PROM, when the data enters the cache it 
> has no ECC information and the cache generates a fault.
> 
> On these machines the cache can only be enabled for RAM.

OK, so we have to tell pmap that the device page doesn't support
ECC, right?  This is yet another example of why we should allocate
per-physical-segment (memory or I/O device) metadata, and pass that
to pmap(9).

Currently we pass only a physical address to pmap_enter().  pmap has
no idea whether it comes from memory or an I/O device, so it has to
look it up in the vm_page arrays, or consult some MD database such
as x86's MTRRs.
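To illustrate the lookup described above, here is a minimal sketch of
what "find out whether a bare PA is RAM" amounts to.  All names
(toy_paddr_t, toy_ram_range, toy_find_ram_range) are hypothetical and
the table contents are made up; this is not the actual UVM or pmap
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_paddr_t;	/* hypothetical stand-in for paddr_t */

/* Hypothetical description of one managed RAM range: [start, end). */
struct toy_ram_range {
	toy_paddr_t start;
	toy_paddr_t end;
};

/*
 * Return the index of the RAM range containing pa, or -1 if pa is
 * not managed RAM (for example, a device address).  This search is
 * the work the callee has to redo when it is handed only a PA.
 */
static int
toy_find_ram_range(const struct toy_ram_range *tbl, size_t n, toy_paddr_t pa)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (pa >= tbl[i].start && pa < tbl[i].end)
			return (int)i;
	}
	return -1;
}

/* Made-up example layout: two RAM banks with a hole between them. */
static const struct toy_ram_range toy_ram[] = {
	{ 0x00000000u, 0x08000000u },
	{ 0x10000000u, 0x18000000u },
};
```

The caller already knew which segment the page came from; the search
only exists because that knowledge was dropped on the way down.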

I plan to pass not PA but page metadata to pmap_enter() so that
it gets more information about PA - if it's from RAM or device.
That code would look like:

        int pmap_enter(
                struct pmap *pmap,
                vaddr_t va,
                struct vm_physseg *phys,
                off_t offset,
                vm_prot_t prot,
                int flags);

Here you can get the PA from (phys->start + offset), and you can also
learn more by looking at the vm_physseg.  pmap_enter() checks whether
the given PA is RAM or I/O, and decides cacheability and other
attributes such as ECC.  (Not sure whether you can set cache and
ECC-on-cache separately, though.)
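A rough sketch of that decision, under stated assumptions: the
structure below is a hypothetical toy_physseg, not the real struct
vm_physseg, and the has_ecc field, the cache_checks_ecc argument and
the caching policy are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_paddr_t;	/* hypothetical stand-in for paddr_t */

enum toy_seg_kind { TOY_SEG_RAM, TOY_SEG_DEVICE };

/* Hypothetical per-segment metadata; not the real struct vm_physseg. */
struct toy_physseg {
	toy_paddr_t start;	/* first physical address of the segment */
	toy_paddr_t size;	/* length in bytes */
	enum toy_seg_kind kind;	/* RAM or I/O device */
	bool has_ecc;		/* assumed: segment supplies ECC bits */
};

/* What the toy "pmap" would program into a PTE. */
struct toy_mapping {
	toy_paddr_t pa;
	bool cacheable;
};

/*
 * Compute the PA as start + offset and pick cacheability from the
 * segment metadata.  cache_checks_ecc models a machine whose cache
 * is the ECC domain.  Returns 0 on success, -1 if offset is outside
 * the segment.
 */
static int
toy_pmap_enter(const struct toy_physseg *phys, toy_paddr_t offset,
    bool cache_checks_ecc, struct toy_mapping *out)
{
	assert(phys != NULL && out != NULL);

	if (offset >= phys->size)
		return -1;

	out->pa = phys->start + offset;
	if (phys->kind == TOY_SEG_RAM)
		out->cacheable = true;
	else	/* device: cache only if no ECC fault can result */
		out->cacheable = !cache_checks_ecc || phys->has_ecc;
	return 0;
}
```

With this shape no lookup by PA is needed; a flash segment without
ECC bits simply ends up uncached on a machine whose cache checks ECC.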

To do this, we have to:

- Allocate physical segment metadata in advance.  This is what the
  new uvm_page_physload_device() is for.

- Pass that metadata from the device driver to pmap_enter(), via the
  fault handler.

This is a future plan.  The current code doesn't care about this,
but I'm pretty sure I can address such an issue very nicely.

> > > 2) How will this work with mfs and memory disks where you really want to 
> > > use XIP always but the pages are standard, managed RAM?
> > 
> > This is a good question.  What you need to do is:
> > 
> > - Provide a block device interface (mount)
> > 
> > - Provide a vnode pager interface (page fault)
> > 
> > You'll allocate managed RAM pages in the memory disk driver, and
> > keep them.  When a file is accessed, fault handler asks vnode pager
> > to give relevant pages back to it.
> > 
> > My current code assumes XIP backend is always a contiguous MMIO
> > device.  Both the physical pages and their metadata (vm_page) are
> > contiguous, so we can look up the matching vm_pages
> > (genfs_getpages_xip).
> >
> > If you want to use managed RAM pages, you need to manage a collection
> > of vm_pages, presented as a range.  This is exactly what uvm_object
> > is for.  I think it's natural that device drivers own uvm_object, and
> > return their pages back to other subsystems, or "loan" pages to
> > other uvm_objects like vnode.  The problem is, the current I/O
> > subsystem and UVM are not integrated very well.
> > 
> > So, the answer is, you can't do that now, but it's a known problem.
> > 
> > (Extending uvm_object and using it everywhere is the way to go.)
> 
> Hm.  Does this mean two separate XIP implementations are needed for I/O 
> devices and managed RAM?

It should not.  I'm trying my best to avoid it, by honoring the
abstraction.

What it should be is that device drivers always return metadata,
i.e., either managed RAM pages or (manageable) device pages.  A
device driver is free to map its own linear space onto these pages.
Currently drivers simply return a paddr_t back to the pager
(udv_fault) as a cookie, by calling either bus_space_mmap(9) or
bus_dmamem_mmap(9).
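As a sketch of "return metadata instead of a paddr_t cookie", a
memory-disk style driver could translate an offset in its own linear
space into a page descriptor.  Everything here (toy_page,
toy_md_softc, toy_md_mmap, the 4096-byte page size) is hypothetical
and only meant to show the shape of the interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE	4096u	/* assumed page size for the sketch */

typedef uint64_t toy_paddr_t;	/* hypothetical stand-in for paddr_t */

/* Hypothetical page metadata, standing in for a vm_page or device page. */
struct toy_page {
	toy_paddr_t pa;		/* physical address of the page */
	int is_ram;		/* nonzero for managed RAM */
};

/* Hypothetical driver state: backing pages need not be contiguous. */
struct toy_md_softc {
	struct toy_page *pages;
	size_t npages;
};

/*
 * Translate a byte offset in the driver's linear space into the
 * metadata of the page backing it, or NULL if out of range.  The
 * caller gets the whole descriptor, not just a physical address.
 */
static const struct toy_page *
toy_md_mmap(const struct toy_md_softc *sc, uint64_t off)
{
	assert(sc != NULL);

	size_t idx = (size_t)(off / TOY_PAGE_SIZE);
	if (idx >= sc->npages)
		return NULL;
	return &sc->pages[idx];
}
```

The same interface would serve a contiguous MMIO device, which is
the point of having a single abstraction for both cases.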

The problem is, again, that the current UVM device pager design
discards the metadata of pages.  Thus pmap has to find the needed
info on its own.

What we need first is abstraction of physical pages.
