Subject: Re: pmap problems in netbsd-4 on arm omap
To: None <tech-kern@netbsd.org>
From: Bucky Katz <bucky@picovex.com>
List: tech-kern
Date: 09/05/2007 14:47:30
Bucky Katz <bucky@picovex.com> writes:
I now have more detail on the problem, included below, and a concrete
question:
are the mappings always supposed to be anonymous for asynch io?
----------------------------------------------------------------------
The problem seems to always happen when the fault handler promotes a
page shortly after asynch io unbusies a bunch of pages. I can't
figure out how asynch io is supposed to remove mappings at the pmap
level. Are the mappings always supposed to be anonymous for asynch
io?
I see that the function uvm_aio_aiodone marks the pages that can be
freed here:
    /*
     * do accounting for pagedaemon i/o and arrange to free
     * the pages instead of just unbusying them.
     */
    if (pg->flags & PG_PAGEOUT) {
        pg->flags &= ~PG_PAGEOUT;
        uvmexp.paging--;
        uvmexp.pdfreed++;
        pg->flags |= PG_RELEASED;
    }
and then uvm_page_unbusy is called, which calls uvm_pagefree on the
pages with PG_RELEASED set:
    if (pg->flags & PG_RELEASED) {
        UVMHIST_LOG(ubchist, "releasing pg %p", pg,0,0,0);
        KASSERT(pg->uobject != NULL ||
            (pg->uanon != NULL && pg->uanon->an_ref > 0));
        pg->flags &= ~PG_RELEASED;
        // XXXX debug added XXXX
        if (pmap_has_mappings(pg->phys_addr)) {
            printf("uvm_page_unbusy : physical page 0x%08x has "
                "mappings\n", (unsigned int)pg->phys_addr);
        }
        uvm_pagefree(pg);
    }
and then it returns, with no pmap_remove() as far as I can tell.
Here is my debug log and the panic:
uvm_page_unbusy : physical page 0x1107b000 has mappings
XXXXX : placing page 0x1107b000 with mappings on freelist
uvm_page_unbusy : physical page 0x1107a000 has mappings
XXXXX : placing page 0x1107a000 with mappings on freelist
uvm_page_unbusy : physical page 0x11079000 has mappings
XXXXX : placing page 0x11079000 with mappings on freelist
uvm_page_unbusy : physical page 0x11078000 has mappings
XXXXX : placing page 0x11078000 with mappings on freelist
uvm_page_unbusy : physical page 0x11077000 has mappings
XXXXX : placing page 0x11077000 with mappings on freelist
uvm_page_unbusy : physical page 0x11076000 has mappings
XXXXX : placing page 0x11076000 with mappings on freelist
uvm_page_unbusy : physical page 0x11075000 has mappings
XXXXX : placing page 0x11075000 with mappings on freelist
uvm_page_unbusy : physical page 0x11074000 has mappings
XXXXX : placing page 0x11074000 with mappings on freelist
uvm_page_unbusy : physical page 0x11073000 has mappings
XXXXX : placing page 0x11073000 with mappings on freelist
uvm_page_unbusy : physical page 0x11072000 has mappings
XXXXX : placing page 0x11072000 with mappings on freelist
uvm_page_unbusy : physical page 0x11071000 has mappings
XXXXX : placing page 0x11071000 with mappings on freelist
uvm_page_unbusy : physical page 0x11070000 has mappings
XXXXX : placing page 0x11070000 with mappings on freelist
uvm_page_unbusy : physical page 0x1106f000 has mappings
XXXXX : placing page 0x1106f000 with mappings on freelist
panic: pmap_zero_page: page (0x11074000) has mappings
0 -> panic+0x110
1 -> pmap_zero_page_generic+0x148
2 -> uvm_pagealloc_strat+0x2a4
3 -> uvmfault_promote+0x168
4 -> uvm_fault_internal+0x12b8
5 -> data_abort_handler+0x31c
6 -> address_exception_entry+0x50
7 -> 0x253815
8 -> 0x11770c
9 -> 0x116210
10 -> 0x1162a8
11 -> 0x159fc4
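
To make the sequence in the log easier to reason about, here is a
minimal user-space sketch of it. This is a toy model, not kernel code:
every toy_* name is hypothetical, nmappings only stands in for a
non-empty pv list, and the unmap_first flag is just an illustration of
"remove the mappings before freeing". In the real kernel, pmap(9)
documents pmap_page_protect(pg, VM_PROT_NONE) as the call that removes
all mappings of a managed page; whether the asynch io path is supposed
to rely on such a call having happened earlier is exactly the open
question.

```c
/* Toy user-space model of the failure; not NetBSD kernel code. */
#include <assert.h>
#include <stdbool.h>

enum toy_status { TOY_OK = 0, TOY_NOT_FREE = -1, TOY_HAS_MAPPINGS = -2 };

struct toy_page {
    unsigned long phys_addr;  /* physical address of the page */
    int nmappings;            /* stands in for a non-empty pv list */
    bool released;            /* stands in for PG_RELEASED */
    bool on_freelist;         /* true once the page has been freed */
};

/* Analogue of the debug helper: nonzero while any mapping remains. */
static int toy_has_mappings(const struct toy_page *pg)
{
    return pg->nmappings != 0;
}

/* Analogue of removing every mapping of a page at the pmap level. */
static void toy_remove_all_mappings(struct toy_page *pg)
{
    pg->nmappings = 0;
}

/*
 * Analogue of the unbusy path: a released page goes onto the free
 * list. With unmap_first == false nothing touches the mappings, which
 * is the behaviour observed in the log. Returns true if the page was
 * freed while it still had mappings.
 */
static bool toy_unbusy(struct toy_page *pg, bool unmap_first)
{
    bool stale = false;

    if (pg->released) {
        pg->released = false;
        if (unmap_first)
            toy_remove_all_mappings(pg);
        stale = toy_has_mappings(pg) != 0;
        pg->on_freelist = true;
    }
    return stale;
}

/*
 * Analogue of allocate-and-zero. Returns TOY_HAS_MAPPINGS at the
 * point where the real pmap_zero_page would panic.
 */
static enum toy_status toy_alloc_zeroed(struct toy_page *pg)
{
    if (!pg->on_freelist)
        return TOY_NOT_FREE;
    if (toy_has_mappings(pg))
        return TOY_HAS_MAPPINGS;
    pg->on_freelist = false;
    return TOY_OK;
}
```

In this model, a page with released set and nmappings == 1 that goes
through toy_unbusy(pg, false) lands on the free list with its mapping
intact, and the later toy_alloc_zeroed(pg) reports TOY_HAS_MAPPINGS.
With toy_unbusy(pg, true) the same allocation succeeds.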
> Hi,
>
> One of our developers is working on a new omap dev board and running
> into problems with pmap issues. He asked the following questions, and
> I'm afraid I don't know the answers. Any help is most welcome.
>
> He is seeing a pmap panic periodically:
>
> pmap_zero_page: page xxxx has mappings.
>
> Preliminary investigation involved utilizing uvm_hist and indicated a
> physical page first has a managed mapping. It then got an anonymous
> mapping. The anonymous mapping was later removed, and the physical
> page was placed on the free queue. Later, the page is selected from
> uvm_pagealloc_strat, and pmap_zero_page is called, resulting in the
> panic.
>
> Secondary investigation involved trying to add a panic where the
> page is actually placed on the free queue. The comment for the
> function uvm_pagefree states that it assumes all valid mappings of pg
> are gone. I created a pmap_has_mappings() function which basically
> returns nonzero if (vm_page*)pg->mdpage.pvh_list != NULL. This causes
> a panic just starting up /etc/init.
>
> I found that uvm_km_free() for instance, orders the function calls as
> such:
>
> uvm_km_pgremove(addr, addr + size);
> pmap_remove(pmap_kernel(), addr, addr + size);
>
>
> Therefore, my testing in uvm_pagefree is done prior to pmap removing
> the managed mapping. Is there a reason for the above ordering?
>
> For testing purposes, I've reversed the order, and I get further
> along, but eventually hit a similar problem in uvm_page_unbusy(). I'm
> working around this, now.
>
> I'm not sure if any locks are acquired by uvm_km_free() and I'm
> wondering if there may be a locking hole somewhere where a physical
> page may be placed on the free list before its pmap layer mapping is
> removed. Has anyone encountered such problems before?