tech-kern archive

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]

Re: CVS commit: src/sys/uvm



On Jan 3, 2011, at 9:01 PM, Masao Uebayashi wrote:

> I take silence as "no objection".

Not so safe.

> On Thu, Dec 23, 2010 at 12:48:04PM +0900, Masao Uebayashi wrote:
>> On Wed, Dec 22, 2010 at 05:37:57AM +0000, YAMAMOTO Takashi wrote:
>>> hi,
>>> 
>>>> Could you ack this discussion?
>>> 
>>> sorry for dropping a ball.
>>> 
>>>> 
>>>> On Tue, Dec 07, 2010 at 01:19:46AM +0900, Masao Uebayashi wrote:
>>>>> On Thu, Nov 25, 2010 at 11:32:39PM +0000, YAMAMOTO Takashi wrote:
>>>>>> [ adding cc: tech-kern@ ]
>>>>>> 
>>>>>> hi,
>>>>>> 
>>>>>>> On Wed, Nov 24, 2010 at 11:26:39PM -0800, Matt Thomas wrote:
>>>>>>>> 
>>>>>>>> On Nov 24, 2010, at 10:47 PM, Masao Uebayashi wrote:
>>>>>>>> 
>>>>>>>>> On Thu, Nov 25, 2010 at 05:44:21AM +0000, YAMAMOTO Takashi wrote:
>>>>>>>>>> hi,
>>>>>>>>>> 
>>>>>>>>>>> On Thu, Nov 25, 2010 at 04:18:25AM +0000, YAMAMOTO Takashi wrote:
>>>>>>>>>>>> hi,
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi, thanks for review.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Thu, Nov 25, 2010 at 01:58:04AM +0000, YAMAMOTO Takashi wrote:
>>>>>>>>>>>>>> hi,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - what's VM_PHYSSEG_OP_PG?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> It's to lookup vm_physseg by "struct vm_page *", relying on that
>>>>>>>>>>>>> "struct vm_page *[]" is allocated linearly.  It'll be used to 
>>>>>>>>>>>>> remove
>>>>>>>>>>>>> vm_page::phys_addr as we talked some time ago.
>>>>>>>>>>>> 
>>>>>>>>>>>> i'm not sure if commiting this unused uncommented code now helps 
>>>>>>>>>>>> it.
>>>>>>>>>>>> some try-and-benchmark cycles might be necessary given that
>>>>>>>>>>>> vm_page <-> paddr conversion could be performace critical.
>>>>>>>>>>> 
>>>>>>>>>>> If you really care performance, we can directly pass "struct vm_page
>>>>>>>>>>> *" to pmap_enter().
>>>>>>>>>>> 
>>>>>>>>>>> We're doing "struct vm_page *" -> "paddr_t" just before 
>>>>>>>>>>> pmap_enter(),
>>>>>>>>>>> then doing "paddr_t" -> "vm_physseg" reverse lookup again in
>>>>>>>>>>> pmap_enter() to check if a given PA is managed.  What is really
>>>>>>>>>>> needed here is, to lookup "struct vm_page *" -> "vm_physseg" once
>>>>>>>>>>> and you'll know both paddr_t and managed or not.
>>>>>>>>>> 
>>>>>>>>>> i agree that the current code is not ideal in that respect.
>>>>>>>>>> otoh, i'm not sure if passing vm_physseg around is a good idea.
>>>>>>>>> 
>>>>>>>>> It's great you share the interest.
>>>>>>>>> 
>>>>>>>>> I chose vm_physseg, because it was there.  I'm open to alternatives,
>>>>>>>>> but I don't think you have many options...
>>>>>>>> 
>>>>>>>> Passing vm_page * doesn't work if the page isn't managed since there
>>>>>>>> won't be a vm_page for the paddr_t.
>>>>>>>> 
>>>>>>>> Now passing both paddr_t and vm_page * would work and if the pointer
>>>>>>>> to the vm_page, it would be an unmanaged mapping.  This also gives the
>>>>>>>> access to mdpg without another lookup.
>>>>>>> 
>>>>>>> What if XIP'ed md(4), where physical pages are in .data (or .rodata)?
>>>>>>> 
>>>>>>> And don't forget that you're the one who first pointed out that
>>>>>>> allocating vm_pages for XIP is a pure waste of memory. ;)
>>>>>> 
>>>>>> i guess matt meant "if the pointer to the vm_page is NULL,".
>>>>>> 
>>>>>>> 
>>>>>>> I'm allocating vm_pages, only because of phys_addr and loan_count.
>>>>>>> I believe vm_pages is unnecessary for read-only XIP segments.
>>>>>>> Because they're read-only, and stateless.
>>>>>>> 
>>>>>>> I've already concluded that the current "managed or not" model
>>>>>>> doesn't work for XIP.  I'm pretty sure that my vm_physseg + off_t
>>>>>>> model can explain everything.  I'm rather waiting for a counter
>>>>>>> example how vm_physseg doesn't work...
>>>>>> 
>>>>>> i guess your suggestion is too vague.
>>>>>> where do you want to use vm_physseg * + off_t instead of vm_page * ?
>>>>>> getpages, pmap_enter, and?  how their function prototypes would be?
>>>>> 
>>>>> The basic idea is straightforward; always allocate vm_physseg for
>>>>> memories/devices.  If a vm_physseg is used as general purpose
>>>>> memory, you allocate vm_page[] (as vm_physseg::pgs).  If it's
>>>>> potentially mapped as cached, you allocate pvh (as vm_physseg:pvh).
>>> 
>>> can you provide function prototypes?
>> 
>> I have no real code for this big picture at this moment.  Making
>> vm_physseg available as reference is the first step.  This only
>> changes uvm_page_physload() to return a pointer:
>> 
>>      -void uvm_page_physload();
>>      +void *uvm_page_physload();
>> 
>> But this makes XIP pager MUCH cleaner.  The reason has been explained
>> many times.
>> 
>> Making fault handlers and pagers to use vm_physseg * + off_t is
>> the next step, and I don't intend to work on it now.  I just want
>> to explain the big picture.
>> 
>>> 
>>>>> 
>>>>> Keep vm_physseg * + off_t array on stack.  If UVM objects uses
>>>>> vm_page (e.g. vnode), its pager looks up vm_page -> vm_physseg *
>>>>> + off_t *once* and cache it on stack.
>>> 
>>> do you mean something like this?
>>>     struct {
>>>             vm_physseg *hoge;
>>>             off_t fuga;
>>>     } foo [16];
>> 
>> Yes.
>> 
>> Or cache vm_page * with it, like:
>> 
>>      struct vm_id {
>>              vm_physseg *seg;
>>              off_t off;
>>              vm_page *pg;
>>      };
>> 
>>      uvm_fault()
>>      {
>>              vm_id pgs[];
>>              :
>>      }
>> 
>> Vnode pager (genfs_getpages) takes vm_page's by looking up
>> vnode::v_uobj's list, or uvn_findpages().
>> 
>> When it returns back to fault handler, we have to lookup vm_physseg
>> for each page.  Then fill the "seg" slot above (assuming we'll
>> remove vm_page::phys_addr soon).
>> 
>> Fault handler calls per-vm_page operations iff vm_page slot is filled.
>> XIP pages are not pageq'ed.  XIP pages don't need vm_page, but
>> cached because it's vnode.
>> 
>> (Just in case, have you read my paper?)

Still on my TODO list.  I've been preparing to merge the mips64 branch the my 
mpc85xx code.

One thing I've been thinking about is how to use an MMU's capability to map >1 
page with a single TLB entry or PTE.

If we are going with vm_physseg as a fundamental VM data structure, maybe we 
should think it about how to use it a little more.

should it have a 

struct vm_page *(*ps_off2pg)(struct vm_physseg *, u_long frame_no);
struct vm_page_md *(*ps_off2md)(struct vm_physseg *, u_long frame_no);

or similar functions?


Home | Main Index | Thread Index | Old Index