NetBSD-Bugs archive


Re: kern/49793: nd6 panic on interface transition



The following reply was made to PR kern/49793; it has been noted by GNATS.

From: Ryota Ozaki <ozaki-r%netbsd.org@localhost>
To: billc%netbsd.org@localhost, gnats-bugs%netbsd.org@localhost
Cc: kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost
Subject: Re: kern/49793: nd6 panic on interface transition
Date: Mon, 30 Mar 2015 00:23:24 +0900

 Hi,
 
 The issue is that the interrupt handler calls if_link_state_change,
 which may acquire an adaptive mutex. AFAIK, the issue still exists
 in -current, and some other (many?) drivers have the same issue.
 
 One way to fix the issue is to let if_link_state_change do its task
 asynchronously by using a softint, something like FreeBSD does.
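 
 As a rough illustration of the softint idea above, here is a userland
 C model, not kernel code. Every name in it (ifnet_model,
 link_state_change, link_state_softint, model_intr) is hypothetical;
 in a real kernel the deferral would go through softint(9)
 (softint_establish/softint_schedule) and the locking details would
 differ. The interrupt path only records the new state; the work that
 may take a sleepable lock runs later from the deferred handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; these are not NetBSD kernel APIs. */
enum link_state { LINK_UNKNOWN, LINK_DOWN, LINK_UP };

struct ifnet_model {
    enum link_state cur;      /* state last reported to upper layers */
    enum link_state pending;  /* state recorded by the interrupt path */
    bool scheduled;           /* true while deferred work is queued */
    int notifications;        /* number of upper-layer notifications */
};

/* Models "currently running in hard interrupt context". */
static bool in_hard_interrupt;

/* Models work that may take a sleepable lock (e.g. routing updates). */
static void notify_upper_layers(struct ifnet_model *ifp)
{
    /* Plays the role of the LOCKDEBUG check that fired in the PR. */
    assert(!in_hard_interrupt);
    ifp->notifications++;
}

/* Safe to call from the interrupt path: record the state and queue work. */
void link_state_change(struct ifnet_model *ifp, enum link_state s)
{
    ifp->pending = s;
    ifp->scheduled = true;
}

/* Deferred handler: runs later, outside hard interrupt context. */
void link_state_softint(struct ifnet_model *ifp)
{
    if (!ifp->scheduled)
        return;
    ifp->scheduled = false;
    if (ifp->pending == ifp->cur)
        return;               /* no net change, nothing to report */
    ifp->cur = ifp->pending;
    notify_upper_layers(ifp);
}

/* Models a driver interrupt handler that observes a link transition. */
void model_intr(struct ifnet_model *ifp, enum link_state s)
{
    in_hard_interrupt = true;
    link_state_change(ifp, s);  /* no sleepable work happens here */
    in_hard_interrupt = false;
}
```

 Note that in this sketch several changes arriving before the deferred
 handler runs collapse into the last one, so an UP/DOWN/UP flap may
 produce no notification at all. Whether that is acceptable is a design
 question the model does not answer.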
 
 Any other ideas?
   ozaki-r
 
 On Fri, Mar 27, 2015 at 6:35 PM,  <billc%netbsd.org@localhost> wrote:
 >>Number:         49793
 >>Category:       kern
 >>Synopsis:       nd6 panic on interface transition
 >>Confidential:   no
 >>Severity:       serious
 >>Priority:       medium
 >>Responsible:    kern-bug-people
 >>State:          open
 >>Class:          sw-bug
 >>Submitter-Id:   net
 >>Arrival-Date:   Fri Mar 27 09:35:00 +0000 2015
 >>Originator:     Cryo
 >>Release:        NetBSD-7
 >>Organization:
 > NetBSD.org
 >>Environment:
 > NetBSD eeep.satx.warped.com 7.0_BETA NetBSD 7.0_BETA (DRMKMS) #0: Tue Mar 17 16:57:06 MDT 2015  root%glurmo.warped.com@localhost:/usr/build/netbsd-7_i386/obj/sys/arch/i386/compile/DRMKMS i386
 >>Description:
 > eeep# cat crash.txt
 > Mar 25 11:00:01 eeep syslogd[611]: restart
 > Mar 25 22:47:06 eeep dhcpcd[367]: ath0: fe80::aeb3:13ff:fe89:8ed7 is unreachable, expiring it
 > Mar 25 22:47:06 eeep dhcpcd[367]: ath0: fe80::aeb3:13ff:fe89:8ed7 is reachable again
 > Mar 26 03:57:42 eeep syslogd[685]: restart
 > Mar 26 03:57:42 eeep /netbsd: ath0: link state DOWN (was UP)
 > Mar 26 03:57:42 eeep /netbsd: panic: LOCKDEBUG: Mutex error: lockdebug_wantlock: acquiring sleep lock from interrupt context
 > Mar 26 03:57:42 eeep /netbsd: cpu0: Begin traceback...
 > Mar 26 03:57:42 eeep /netbsd: vpanic(c0e82fcf,dacf09fc,dacf0a1c,c0975a4e,c0e82fcf,c0e3e600,c0daab73,c0e8320c,1,c0daab73) at netbsd:vpanic+0x121
 > Mar 26 03:57:42 eeep /netbsd: snprintf(c0e82fcf,c0e3e600,c0daab73,c0e8320c,1,c0daab73,c10fe5e0,c4eced40,c10fe654,dacf0a78) at netbsd:snprintf
 > Mar 26 03:57:42 eeep /netbsd: lockdebug_more(c0e8320c,1,0,c4ecef4c,2,10,c10c85be,c4eced40,101,2) at netbsd:lockdebug_more
 > Mar 26 03:57:42 eeep /netbsd: mutex_enter(c10fe654,c3cf40d0,dacf0ab0,c08b1ae0,101,c3d0f018,c10fe654,2,dacf0ac4,dacf0b8c) at netbsd:mutex_enter+0x46b
 > Mar 26 03:57:42 eeep /netbsd: pool_get(c10fe5e0,2,c4ecef4c,2,c3cf408c,c10c85be,c3cf40d0,dacf0c00,c3bde408,6) at netbsd:pool_get+0x4f
 > Mar 26 03:57:42 eeep /netbsd: rtrequest1(1,dacf0b8c,dacf0c00,0,c3d0f018,c3cf40d0,dacf0c04,0,0,0) at netbsd:rtrequest1+0x207
 > Mar 26 03:57:42 eeep /netbsd: rtrequest(1,c3d0f018,c3cf40d0,dacf0c04,101,dacf0c00,0,0,0,337f980) at netbsd:rtrequest+0x43
 > Mar 26 03:57:42 eeep /netbsd: nd6_prefix_onlink(c3d0f00c,0,0,0,0,1,2,c3acf030,dacf0c70,c0570aed) at netbsd:nd6_prefix_onlink+0xfb
 > Mar 26 03:57:42 eeep /netbsd: pfxlist_onlink_check(0,1,2,1,dacf0c9c,c043dbdb,c3acf030,c0e0d0ce,c3acf044,c0e3999b) at netbsd:pfxlist_onlink_check+0x1eb
 > Mar 26 03:57:42 eeep /netbsd: in6_if_link_down(c3acf030,c0e0d0ce,c3acf044,c0e3999b,c0e38f6f,6,c3acf488,c420a00c,c3acf030,dacf0ccc) at netbsd:in6_if_link_down+0xc
 > Mar 26 03:57:42 eeep /netbsd: if_link_state_change(c3acf030,1,0,0,4,dacf0ccc,87e54b56,c3acf488,3,4) at netbsd:if_link_state_change+0x128
 > Mar 26 03:57:42 eeep /netbsd: ieee80211_notify_node_leave(c3acf488,c420a00c,c1020ec0,db918000,24,1,c420a00c,c3acf000,4,c3c54000) at netbsd:ieee80211_notify_node_leave+0xea
 > Mar 26 03:57:42 eeep /netbsd: ieee80211_newstate(c3acf488,3,a0,c110376c,c3516f80,7,0,c110376c,dacf0d48,c0b41a9d) at netbsd:ieee80211_newstate+0x40f
 > Mar 26 03:57:42 eeep /netbsd: ath_newstate(c3acf488,3,a0,c3516f82,c06b12a9,0,c10c8a60,c3bf9264,c10c8a60,c3bf9264) at netbsd:ath_newstate+0x4cb
 > Mar 26 03:57:42 eeep /netbsd: ieee80211_recv_mgmt(c3acf488,c3e02000,c420a00c,a0,1f,5c4b,1f,a0,c3acf488,c3acf488) at netbsd:ieee80211_recv_mgmt+0x1be
 > Mar 26 03:57:42 eeep /netbsd: ath_recv_mgmt(c3acf488,c3e02000,c420a00c,a0,1f,5c4b,c3acfb24,c3acfb20,c3acfb24,c3acfb24) at netbsd:ath_recv_mgmt+0x46
 > Mar 26 03:57:42 eeep /netbsd: ieee80211_input(c3acf488,c3e02000,c420a00c,1f,5c4b,0,db929490,c3acffa4,0,c3ad02d4) at netbsd:ieee80211_input+0x8bf
 > Mar 26 03:57:42 eeep /netbsd: ath_rx_proc(c3acf000,1,c4eced40,1,c380cf58,0,dacf0f6c,c05da518,c3acf000,c380cf58) at netbsd:ath_rx_proc+0x5b8
 > Mar 26 03:57:42 eeep /netbsd: ath_intr(c3acf000,c380cf58,0,c3bded08,c01082a5,c380cf58,ddea5fa4,0,0,0) at netbsd:ath_intr+0x25b
 > Mar 26 03:57:42 eeep /netbsd: intr_biglock_wrapper(c380cf58,ddea5fa4,0,0,0,0,0,0,0,0) at netbsd:intr_biglock_wrapper+0x1f
 > Mar 26 03:57:42 eeep /netbsd: --- switch to interrupt stack ---
 > Mar 26 03:57:42 eeep /netbsd: Xintr_ioapic_level7() at netbsd:Xintr_ioapic_level7+0xb5
 > Mar 26 03:57:42 eeep /netbsd: --- interrupt ---
 > Mar 26 03:57:42 eeep /netbsd: 81cd377:
 > Mar 26 03:57:42 eeep /netbsd: cpu0: End traceback...
 > Mar 26 03:57:42 eeep /netbsd:
 > Mar 26 03:57:42 eeep /netbsd: dumping to dev 0,1 offset 2984
 >
 > Mar 26 03:57:47 eeep savecore: reboot after panic: panic: LOCKDEBUG: Mutex error: lockdebug_wantlock: acquiring sleep lock from interrupt context
 > Mar 26 03:57:47 eeep savecore: system went down at Thu Mar 26 03:52:05 2015
 > Mar 26 03:57:47 eeep savecore: /var/crash/bounds: No such file or directory
 > Mar 26 03:57:48 eeep savecore: writing compressed core to /var/crash/netbsd.0.core.gz
 >
 >
 >
 > ===============
 >
 >
 > eeep# cat crash2.txt
 > ath0: link state UP (was UNKNOWN)
 > ath0: device timeout (txq 1, txintrperiod 5)
 > ath0: device timeout (txq 1, txintrperiod 4)
 > ath0: device timeout (txq 1, txintrperiod 3)
 > ath0: device timeout (txq 1, txintrperiod 2)
 > ath0: link state DOWN (was UP)
 > panic: LOCKDEBUG: Mutex error: lockdebug_wantlock: acquiring sleep lock from interrupt context
 > cpu0: Begin traceback...
 > vpanic(c0e82fcf,dacf09fc,dacf0a1c,c0975a4e,c0e82fcf,c0e3e600,c0daab73,c0e8320c,1,c0daab73) at netbsd:vpanic+0x121
 > snprintf(c0e82fcf,c0e3e600,c0daab73,c0e8320c,1,c0daab73,c10fe5e0,ccb1daa0,c10fe654,dacf0a78) at netbsd:snprintf
 > lockdebug_more(c0e8320c,1,0,ccb1dcac,2,10,c10c85be,ccb1daa0,101,2) at netbsd:lockdebug_more
 > mutex_enter(c10fe654,c3d81610,dacf0ab0,c08b1ae0,101,c49c9b98,c10fe654,2,dacf0ac4,dacf0b8c) at netbsd:mutex_enter+0x46b
 > pool_get(c10fe5e0,2,ccb1dcac,2,c3d815cc,c10c85be,c3d81610,dacf0c00,c3bdf408,6) at netbsd:pool_get+0x4f
 > rtrequest1(1,dacf0b8c,dacf0c00,0,c49c9b98,c3d81610,dacf0c04,0,0,0) at netbsd:rtrequest1+0x207
 > rtrequest(1,c49c9b98,c3d81610,dacf0c04,101,dacf0c00,0,0,0,337f980) at netbsd:rtrequest+0x43
 > nd6_prefix_onlink(c49c9b8c,0,0,0,0,1,2,c3ad0030,dacf0c70,c0570aed) at netbsd:nd6_prefix_onlink+0xfb
 > pfxlist_onlink_check(0,1,2,1,dacf0c9c,c043dbdb,c3ad0030,c0e0d0ce,c3ad0044,c0e3999b) at netbsd:pfxlist_onlink_check+0x1eb
 > in6_if_link_down(c3ad0030,c0e0d0ce,c3ad0044,c0e3999b,c0e38f6f,6,c3ad0488,c3fc700c,c3ad0030,dacf0ccc) at netbsd:in6_if_link_down+0xc
 > if_link_state_change(c3ad0030,1,0,0,4,dacf0ccc,29bc7621,c3ad0488,2,4) at netbsd:if_link_state_change+0x128
 > ieee80211_notify_node_leave(c3ad0488,c3fc700c,c1020ec0,db91e000,24,1,c3fc700c,c3ad0000,4,c3c55000) at netbsd:ieee80211_notify_node_leave+0xea
 > ieee80211_newstate(c3ad0488,2,c0,c110376c,c3517f80,7,0,c110376c,dacf0d48,c0b41a9d) at netbsd:ieee80211_newstate+0x37d
 > ath_newstate(c3ad0488,2,c0,c3517f82,c06b12a9,0,c10c8a60,c3bfa264,c10c8a60,c3bfa264) at netbsd:ath_newstate+0x4cb
 > ieee80211_recv_mgmt(c3ad0488,c3f9a100,c3fc700c,c0,2a,1b7d,2a,c0,c3ad0488,c3ad0488) at netbsd:ieee80211_recv_mgmt+0x8be
 > ath_recv_mgmt(c3ad0488,c3f9a100,c3fc700c,c0,2a,1b7d,c3ad0b24,c3ad0b20,c3ad0b24,c3ad0b24) at netbsd:ath_recv_mgmt+0x46
 > ieee80211_input(c3ad0488,c3f9a100,c3fc700c,2a,1b7d,0,daf3b0e8,c3ad0fa4,0,c3ad12d4) at netbsd:ieee80211_input+0x8bf
 > ath_rx_proc(c3ad0000,1,ccb1daa0,1,c3e1d080,0,dacf0f6c,c05da518,c3ad0000,c3e1d080) at netbsd:ath_rx_proc+0x5b8
 > ath_intr(c3ad0000,c3e1d080,0,c3bdfd08,c01082a5,c3e1d080,dc678fa4,0,ffffffff,ffffffff) at netbsd:ath_intr+0x25b
 > intr_biglock_wrapper(c3e1d080,dc678fa4,0,ffffffff,ffffffff,0,0,ffffffff,ffffffff,0) at netbsd:intr_biglock_wrapper+0x1f
 > --- switch to interrupt stack ---
 > Xintr_ioapic_level7() at
 > eeep#
 >
 >
 >>How-To-Repeat:
 > Wait for DHCP to timeout, or for the interface to transition (unexpectedly) which causes a fetch for a new address on a down interface?  I can't tell from the traceback if the interface is actually back up or still down on the 2nd crash. 1st crash it appears to have come back up and is reachable again.  2nd one looks like it died as going down.
 >
 >>Fix:
 > pray for stable wifi and a never ending dhcp lease.
 >
 

