Current-Users archive
Re: Panic on evbarm triggered by dumpfs
On Wed, 15 Jan 2014, Petri Laakso wrote:
> I was able to do that on second try. lots of output below.
> Yesterday I disabled logging, but it didn't change anything.
>
> Now when I tried hard, I was able to crash the system without shutdown.
>
> # Second try @ single user read-only
>
> [repeated calls to dumpfs, only picked up interesting ones below]
> # dumpfs /dev/rld0a
> dumpfs: /dev/rld0a: could not find superblock, skipped
> # dumpfs /dev/rld0a
> [1] Bus error dumpfs /dev/rld0a
> # dumpfs /dev/rld0a
> dumpfs: (null): could not find superblock, skipped
> # dumpfs /dev/rld0a
> dumpfs: /dev/rld0a: could not find superblock, skipped
> # scan_ffs /dev/rld0a
> Disk: STORAGE DEVICE fictitious
> Total sectors on disk: 3932160
>
> panic: pool_get: pvepl: page empty
> Stopped in pid 46.1 (scan_ffs) at netbsd:cpu_Debugger+0x4: bx r14
> db> bt
> 0xcbab9c1c: netbsd:vpanic+0x10
> 0xcbab9c34: netbsd:printf_nolog
> 0xcbab9c6c: netbsd:pool_get+0x304
> 0xcbab9cc8: netbsd:pmap_enter+0x748
> 0xcbab9d00: netbsd:vmapbuf+0xbc
> 0xcbab9d60: netbsd:physio+0x28c
> 0xcbab9d80: netbsd:ldread+0x40
> 0xcbab9da0: netbsd:cdev_read+0x40
> 0xcbab9e04: netbsd:spec_read+0x6c
> 0xcbab9e14: netbsd:ufsspec_read+0x44
> 0xcbab9e3c: netbsd:VOP_READ+0x38
> 0xcbab9e64: netbsd:vn_read+0x84
> 0xcbab9eb4: netbsd:dofileread+0x84
> 0xcbab9eec: netbsd:sys_pread+0xa0
> 0xcbab9f80: netbsd:syscall+0x88
> 0xcbab9fac: netbsd:swi_handler+0x9c
> db>
I don't think this really has much to do with the filesystem other than it
triggering the latent problem.
This is coming from pmap_enter(), which is trying to allocate a pv structure
to hold the physical->virtual mapping information for a page that is
probably being added to the kernel pmap. The pool being used to allocate
the pv entries is upset about something. The two places it will panic
are here:
	if (pp->pr_roflags & PR_NOTOUCH) {
#ifdef DIAGNOSTIC
		if (__predict_false(ph->ph_nmissing ==
		    pp->pr_itemsperpage)) {
			mutex_exit(&pp->pr_lock);
			panic("pool_get: %s: page empty", pp->pr_wchan);
		}
#endif
		v = pr_item_notouch_get(pp, ph);
	} else {
		v = pi = LIST_FIRST(&ph->ph_itemlist);
		if (__predict_false(v == NULL)) {
			mutex_exit(&pp->pr_lock);
			panic("pool_get: %s: page empty", pp->pr_wchan);
		}
#ifdef DIAGNOSTIC
		if (__predict_false(pp->pr_nitems == 0)) {
			mutex_exit(&pp->pr_lock);
			printf("pool_get: %s: items on itemlist, nitems %u\n",
			    pp->pr_wchan, pp->pr_nitems);
			panic("pool_get: nitems inconsistent");
		}
#endif
Unfortunately both have the same panic string so it's difficult to tell
them apart. I think PR_NOTOUCH is probably not set so it's likely the
second panic.
Can you toggle DIAGNOSTIC and see if there is a change in the behavior?
If you hit the first panic, disabling DIAGNOSTIC will make it go away
(although things may crash a different way later). If you hit the second
panic, DIAGNOSTIC may give more useful information about something going
wrong earlier.
Anyway, let's assume you're hitting the second panic. A pool should
maintain a pointer to a page that has free entries. The page has a pool
header which has a list of the free entries on that page. In this case
the list is empty. This is probably due to something stomping on the page
in question.
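
To make that invariant concrete, here is a minimal user-space sketch, not
the kernel code. The struct and function names (item, page_header,
classify_page) are made up for illustration; only nmissing and
itemsperpage correspond to fields in the excerpt above. It assumes the
free items on a page form a simple NULL-terminated list and checks
whether the list length agrees with the page's own bookkeeping:

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's pool structures. */
struct item {
	struct item *next;		/* next free item on the same page */
};

struct page_header {
	struct item *free_list;		/* free items on this page */
	unsigned nmissing;		/* items currently handed out */
	unsigned itemsperpage;		/* capacity of the page */
};

enum page_state { PAGE_HAS_FREE, PAGE_FULL, PAGE_INCONSISTENT };

/*
 * Compare the free list against the counters. An empty list on a page
 * that still claims to have free items is the "stomped on" case.
 */
enum page_state
classify_page(const struct page_header *ph)
{
	unsigned nfree = 0;
	const struct item *it;

	/* Bound the walk so a cyclic (corrupted) list cannot loop forever. */
	for (it = ph->free_list; it != NULL && nfree <= ph->itemsperpage;
	    it = it->next)
		nfree++;

	if (ph->nmissing > ph->itemsperpage ||
	    nfree != ph->itemsperpage - ph->nmissing)
		return PAGE_INCONSISTENT;

	return nfree == 0 ? PAGE_FULL : PAGE_HAS_FREE;
}
```

A page with capacity 4, two items handed out and an empty free list
comes back as PAGE_INCONSISTENT, which is the kind of state the panic
above would be reporting if my guess is right.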
The ARM pv pool is special in the sense that it has a custom page
allocator so it can be used before the VM subsystem is initialized.
Here's the routine in question:
static void *
pmap_bootstrap_pv_page_alloc(struct pool *pp, int flags)
{
	extern void *pool_page_alloc(struct pool *, int);
	vaddr_t new_page;
	void *rv;

	if (pmap_initialized)
		return (pool_page_alloc(pp, flags));

	if (free_bootstrap_pages) {
		rv = free_bootstrap_pages;
		free_bootstrap_pages = *((void **)rv);
		return (rv);
	}

	new_page = uvm_km_alloc(kernel_map, PAGE_SIZE, 0,
	    UVM_KMF_WIRED | ((flags & PR_WAITOK) ? 0 : UVM_KMF_NOWAIT));

	KASSERT(new_page > last_bootstrap_page);
	last_bootstrap_page = new_page;
	return ((void *)new_page);
}
It may be that the page in question is one of the bootstrap pages and
it's been lost and is being used in some other way.
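
The reason a lost bootstrap page would be dangerous is visible in the pop
above: the free list is threaded through the first word of each free page
(free_bootstrap_pages = *((void **)rv)). Below is a small user-space
sketch of that scheme. The names model_page_free, model_page_alloc and
free_pages_model are made up, and the push side is my assumption of what
the matching free routine does, since it isn't quoted here:

```c
#include <stddef.h>

/* Head of the free list; a stand-in for free_bootstrap_pages. */
static void *free_pages_model;

/*
 * Assumed push: store the old head in the first word of the page being
 * freed, then make that page the new head.
 */
void
model_page_free(void *page)
{
	*(void **)page = free_pages_model;
	free_pages_model = page;
}

/* Pop, mirroring the free_bootstrap_pages branch quoted above. */
void *
model_page_alloc(void)
{
	void *rv = free_pages_model;

	if (rv != NULL)
		free_pages_model = *(void **)rv;	/* next pointer lives in the page */
	return rv;
}
```

If anything writes to the first word of a page while it sits on this
list, the next pop installs that value as the list head, and the
allocator will later hand out whatever address it happens to be.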
Anyway, if you can figure out the address of the pool header that's causing
the problem and dump its contents, maybe we can get some idea about what's
stepping on that page.
Eduardo