NetBSD-Bugs archive
Re: port-evbarm/55639: Assertion "anon != NULL && anon->an_ref != 0" fails on evbarm-earmv7hf
The following reply was made to PR port-evbarm/55639; it has been noted by GNATS.
From: Chuck Silvers <chuq%chuq.com@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc:
Subject: Re: port-evbarm/55639: Assertion "anon != NULL && anon->an_ref != 0"
fails on evbarm-earmv7hf
Date: Thu, 3 Sep 2020 21:10:09 -0700
On Thu, Sep 03, 2020 at 06:55:00AM +0000, Andreas Gustafsson wrote:
> >Synopsis: Assertion "anon != NULL && anon->an_ref != 0" fails on evbarm-earmv7hf
...
> panic: kernel diagnostic assertion "uvm_pagelookup(uobj, offset) == NULL || ((a->ar_flags & UVM_PAGE_ARRAY_FILL_DIRTY) != 0 && !uvm_obj_page_dirty_p(pg))" failed: file "/tmp/bracket/build/2020.08.14.09.06.15-evbarm-earmv7hf/src/sys/uvm/uvm_vnode.c", line 321
you're talking about two different assertions here.
the one about "uvm_pagelookup ..." was fixed by rev 1.117 of uvm_vnode.c.
the one about "anon != NULL ..." is completely different.
I can reproduce the latter amap corruption problem, but only on certain
arm boards. a jetson tk1 does not hit it, but a cubietruck hits it quite easily.
it's good to know that the emulated system in qemu can also hit it.
it looks like the qemu configuration used by anita is trying to have
two CPUs, but the second one isn't actually there:
[ 1.0000000] cpu1 at cpus0: disabled (unresponsive)
that's helpful in that it tells us the bug is not an MP race.
the nature of the amap corruption that I've seen on cubietruck is
a bit-flip in one of the entries in the amap's am_slots[] array,
which causes different symptoms depending on exactly what is in the amap.
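To illustrate what "a bit-flip" means for an array entry, here is a minimal user-space sketch in C that checks whether an observed value differs from the expected one in exactly one bit. The helper names (is_single_bit_flip, flipped_bit_index) are hypothetical and are not part of the kernel or of the debug code mentioned in this message.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if 'observed' differs from 'expected'
 * in exactly one bit position.
 */
bool
is_single_bit_flip(uint32_t expected, uint32_t observed)
{
	uint32_t diff = expected ^ observed;

	/* A nonzero value with a single bit set is a power of two. */
	return diff != 0 && (diff & (diff - 1)) == 0;
}

/*
 * Hypothetical helper: index of the flipped bit, or -1 if the two
 * values do not differ by exactly one bit.
 */
int
flipped_bit_index(uint32_t expected, uint32_t observed)
{
	uint32_t diff = expected ^ observed;
	int idx = 0;

	if (!is_single_bit_flip(expected, observed))
		return -1;
	while ((diff & 1) == 0) {
		diff >>= 1;
		idx++;
	}
	return idx;
}
```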
I wrote some debug code to fully validate an amap immediately after
locking it and immediately before unlocking it, and this problem is
detected by the check immediately after locking the amap,
i.e. the bit is being flipped while the amap is not locked,
so it's very unlikely that the code that operates on amaps
is causing the corruption.
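The kind of check described above can be sketched as follows. This is a simplified user-space model, not the actual debug code: the toy_amap and toy_anon structures are hypothetical stand-ins that only borrow field names (am_slots, am_bckptr, am_anon, an_ref) from the kernel, and the invariants checked are assumptions about what "fully validate" might cover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an anon; only the reference count is modeled. */
struct toy_anon {
	int an_ref;
};

/* Hypothetical stand-in for an amap. */
struct toy_amap {
	int am_nslot;			/* number of usable slots */
	int am_nused;			/* number of slots in use */
	int *am_slots;			/* dense list of in-use slot numbers */
	int *am_bckptr;			/* slot number -> index into am_slots */
	struct toy_anon **am_anon;	/* slot number -> anon */
};

/* Return true if the model's arrays are mutually consistent. */
bool
toy_amap_validate(const struct toy_amap *am)
{
	if (am->am_nused < 0 || am->am_nused > am->am_nslot)
		return false;
	for (int i = 0; i < am->am_nused; i++) {
		int slot = am->am_slots[i];

		/* A flipped bit can push the slot number out of range... */
		if (slot < 0 || slot >= am->am_nslot)
			return false;
		/* ...or make it point at a slot owned by another entry... */
		if (am->am_bckptr[slot] != i)
			return false;
		/* ...or at a slot with no live anon behind it. */
		const struct toy_anon *anon = am->am_anon[slot];
		if (anon == NULL || anon->an_ref == 0)
			return false;
	}
	return true;
}

/* Intended to be called right after locking and right before unlocking. */
void
toy_amap_assert_valid(const struct toy_amap *am)
{
	assert(toy_amap_validate(am));
}
```

If the check fails immediately after the lock is taken, the damage happened while nobody held the lock, which is the reasoning used above.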
I wrote some more debug code to make the mappings of all of the
amap arrays read-only while the amap is not locked, but then
I don't hit the problem.
today I tried running the atf tests on cubietruck again with
the uvm/radixtree commit that you reference reverted, and I still hit
the same assertion in amap_wipeout() that the anita harness did.
so it appears that this is an old bug, which is perhaps made
more likely to trigger an assertion by recent changes.
-Chuck