NetBSD-Bugs archive
Re: kern/50186: sparc memfault panic after 7.99.21 ARP changes
The following reply was made to PR kern/50186; it has been noted by GNATS.
From: Ryota Ozaki <ozaki-r%netbsd.org@localhost>
To: gnats-bugs%netbsd.org@localhost
Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Subject: Re: kern/50186: sparc memfault panic after 7.99.21 ARP changes
Date: Tue, 1 Sep 2015 15:52:45 +0900
Hi,
On Tue, Sep 1, 2015 at 12:40 PM, <jdbaker%mylinuxisp.com@localhost> wrote:
>>Number: 50186
>>Category: kern
>>Synopsis: sparc memfault panic after 7.99.21 ARP changes
>>Confidential: no
>>Severity: critical
>>Priority: high
>>Responsible: kern-bug-people
>>State: open
>>Class: sw-bug
>>Submitter-Id: net
>>Arrival-Date: Tue Sep 01 03:40:00 +0000 2015
>>Originator: John D. Baker
>>Release: NetBSD/sparc-7.99.21
>>Organization:
>>Environment:
> NetBSD jean.technoskunk.fur 7.99.21 NetBSD 7.99.21 (JEAN) #0: Mon Aug 31 20:21:50 CDT 2015 sysop%skuld.technoskunk.fur@localhost:/d0/build/current/obj/sparc/sys/arch/sparc/compile/JEAN sparc
>
> NetBSD jean.technoskunk.fur 7.99.21 NetBSD 7.99.21 (GENERIC) #19: Mon Aug 31 20:03:50 CDT 2015 sysop%skuld.technoskunk.fur@localhost:/d0/build/current/obj/sparc/sys/arch/sparc/compile/GENERIC sparc
>
>>Description:
> Following the changes to ARP cache handling beginning with the
> following commit:
>
> http://mail-index.netbsd.org/source-changes/2015/08/31/msg068612.html
>
> sparc platform will panic after an indeterminate time (probably when
> about to expire an ARP entry) as follows:
>
> From custom kernel JEAN:
>
> cpu0: data fault: pc=0xf008350c addr=0x10 sfsr=0x326<PERR=0x0,LVL=0x3,AT=0x1,FT=0x1,FAV,OW>
> panic: kernel fault
> Stopped in pid 0.5 (system) at netbsd:cpu_Debugger+0x4: or %o7, %g0, %g1
> db> bt
> cpu_Debugger(0xf03a4758, 0xf99efd20, 0xf0432400, 0xf04331a8, 0xf0433000, 0x104) at netbsd:panic+0x20
> panic(0xf03a4758, 0x0, 0xf008350c, 0x10, 0xf99efd40, 0xf040cc00) at netbsd:mem_access_fault4m+0x5a4
> mem_access_fault4m(0x9, 0x326, 0x10, 0xf99efde0, 0xf0409ff0, 0xf0a0d540) at netbsd:memfault_sun4m+0xe8
> memfault_sun4m(0xf0b366ac, 0x1, 0x0, 0xf041e318, 0xf0a0d544, 0xf0a0d544) at netbsd:arptimer+0x6c
> arptimer(0xf0b36600, 0xf0a0d540, 0xf0b39008, 0x0, 0xf0b366ac, 0xf0437800) at netbsd:callout_softclock+0x154
> callout_softclock(0xf041e31c, 0x1000000, 0x10000, 0xf041e318, 0xf0b36600, 0xf0083478) at netbsd:softint_thread+0x94
> softint_thread(0xf0a0d540, 0x3000, 0x2000, 0x0, 0x0, 0xf99e8218) at netbsd:lwp_trampoline+0x8
> db>
>
>
> From GENERIC:
>
> cpu0: data fault: pc=0xf00a626c addr=0x10 sfsr=0x326<PERR=0x0,LVL=0x3,AT=0x1,FT=0x1,FAV,OW>
> panic: kernel fault
> Stopped in pid 0.5 (system) at netbsd:cpu_Debugger+0x4: or %o7, %g0, %g1
> db> bt
> cpu_Debugger(0xf03efb58, 0xf9ac7d20, 0xf0482c00, 0xf0483a58, 0xf0483800, 0x104) at netbsd:panic+0x20
> panic(0xf03efb58, 0x0, 0xf00a626c, 0x10, 0xf9ac7d40, 0xf045c800) at netbsd:mem_access_fault4m+0x5b0
> mem_access_fault4m(0x9, 0x326, 0x10, 0xf9ac7de0, 0xf0459b20, 0xf0a60540) at netbsd:memfault_sun4m+0xe8
> memfault_sun4m(0xf0b8852c, 0x1, 0x0, 0xf04712a0, 0xf0a60544, 0xf0a60544) at netbsd:arptimer+0x6c
> arptimer(0xf0b88480, 0xf0a60540, 0xf0b8c808, 0x0, 0xf0b8852c, 0xf0488800) at netbsd:callout_softclock+0x154
> callout_softclock(0xf04712a4, 0x1000000, 0x10000, 0xf04712a0, 0xf0b88480, 0xf00a61d8) at netbsd:softint_thread+0x94
> softint_thread(0xf0a60540, 0x3000, 0x2000, 0x0, 0x0, 0xf9ac0218) at netbsd:lwp_trampoline+0x8
> db>
>
> Machine is SPARCstation 5, 110MHz, 256MB RAM. Operating diskless.
> (NetBSD-7.0_RC3 on local disk)
>
> I hope to confirm this observation on another system, but it is
> engaged in another task at this time.
>>How-To-Repeat:
> Build sparc release from 201509010100 or later and boot GENERIC.
>>Fix:
>
I investigated where it happens:
----
$ ~/git/netbsd-src/work.tools/sparc--netbsdelf/bin/nm -n work.sparc/sys/arch/sparc/compile/GENERIC/netbsd |grep arptimer
f00a61d8 t arptimer
$ ruby -e 'puts (0xf00a61d8 + 0x6c).to_s(16)'
f00a6244
$ ~/git/netbsd-src/work.tools/sparc--netbsdelf/bin/objdump -d -S work.sparc/sys/arch/sparc/compile/GENERIC/netbsd.gdb |grep -10 f00a6244
ifp = lle->lle_tbl->llt_ifp;
f00a6234: c2 06 20 40 ld [ %i0 + 0x40 ], %g1
callout_stop(&lle->la_timer);
f00a6238: 90 10 00 1b mov %i3, %o0
f00a623c: 40 03 34 68 call f01733dc <callout_stop>
f00a6240: f4 00 60 10 ld [ %g1 + 0x10 ], %i2
/* XXX: LOR avoidance. We still have ref on lle. */
LLE_WUNLOCK(lle);
f00a6244: 40 02 f7 63 call f0163fd0 <rw_exit>
f00a6248: 90 10 00 1c mov %i4, %o0
/*
* Free an arp entry.
*/
static void arptfree(struct llentry *la)
{
struct rtentry *rt = la->la_rt;
f00a624c: f6 06 20 b0 ld [ %i0 + 0xb0 ], %i3
KASSERT(rt != NULL);
----
Hmm, is the fault at the call to rw_exit, or just before or after it?
I'm not familiar with sparc, so my investigation may be wrong.
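For what it's worth, the same arithmetic can be applied to the pc= value
printed in the data fault lines. The following is only a rough Ruby sketch,
not a diagnosis. It assumes that pc= is the address of the faulting
instruction, and that the last argument shown for callout_softclock in the
JEAN trace (0xf0083478) is that kernel's arptimer address, as appears to be
the case for GENERIC, where the same slot (0xf00a61d8) matches the nm output.
The helper name fault_offsets is made up for this example.

```ruby
# Offset used by the backtrace in both reports ("arptimer+0x6c").
BT_OFFSET = 0x6c

# Returns [offset of pc within the function,
#          distance of pc past the backtrace address].
def fault_offsets(func_start, fault_pc)
  bt_addr = func_start + BT_OFFSET
  [fault_pc - func_start, fault_pc - bt_addr]
end

kernels = {
  # GENERIC: arptimer address from nm, pc from the "data fault" line.
  "GENERIC" => [0xf00a61d8, 0xf00a626c],
  # JEAN: arptimer address is an ASSUMPTION (taken from the
  # callout_softclock arguments), pc from the "data fault" line.
  "JEAN"    => [0xf0083478, 0xf008350c],
}

kernels.each do |name, (start, pc)|
  off, past = fault_offsets(start, pc)
  puts "#{name}: pc = arptimer+0x#{off.to_s(16)} " \
       "(0x#{past.to_s(16)} past arptimer+0x#{BT_OFFSET.to_s(16)})"
end
```

Under those assumptions both kernels give arptimer+0x94, which is 0x28
bytes past the arptimer+0x6c shown in the backtrace, so the listing around
f00a626c may be worth a look as well.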
Thanks,
ozaki-r