NetBSD-Bugs archive


Re: kern/47881 (kernel diagnostic assertion "c->c_cpu->cc_lwp == curlwp || c->c_cpu->cc_active != c" failed)



Okay, I got it.

The assertion failure on CALLOUT_PENDING occurs because
the callout (mld_timeo) calls callout_schedule _during_
callout_halt.

[players]
- [A] An LWP that calls callout_halt and then callout_destroy
- [B] Callout (mld_timeo)

[events]
- [A] holds softnet_lock
- [B] is executed and waits for softnet_lock
- [A] calls callout_halt and releases softnet_lock
- [B] resumes and calls callout_schedule
  - it sets CALLOUT_PENDING
- [B] finishes and releases softnet_lock
- [A] resumes and calls callout_destroy
  - it fails the assertion on CALLOUT_PENDING

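So the fix is to set in6m_timer to IN6M_TIMER_UNDEF before calling
callout_halt, and to make mld_timeo return early when it sees that value: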

diff --git a/sys/netinet6/mld6.c b/sys/netinet6/mld6.c
index 33b8d89..3cd2951 100644
--- a/sys/netinet6/mld6.c
+++ b/sys/netinet6/mld6.c
@@ -195,6 +195,8 @@ mld_starttimer(struct in6_multi *in6m)
 {
        struct timeval now;

+       KASSERT(in6m->in6m_timer != IN6M_TIMER_UNDEF);
+
        microtime(&now);
        in6m->in6m_timer_expire.tv_sec = now.tv_sec + in6m->in6m_timer / hz;
        in6m->in6m_timer_expire.tv_usec = now.tv_usec +
@@ -227,6 +229,9 @@ mld_timeo(void *arg)
        mutex_enter(softnet_lock);
        KERNEL_LOCK(1, NULL);

+       if (in6m->in6m_timer == IN6M_TIMER_UNDEF)
+               goto out;
+
        in6m->in6m_timer = IN6M_TIMER_UNDEF;

        switch (in6m->in6m_state) {
@@ -238,6 +243,7 @@ mld_timeo(void *arg)
                break;
        }

+out:
        KERNEL_UNLOCK_ONE(NULL);
        mutex_exit(softnet_lock);
 }
@@ -741,6 +747,8 @@ in6_delmulti(struct in6_multi *in6m)
                 */
                sockaddr_in6_init(&sin6, &in6m->in6m_addr, 0, 0, 0);
                if_mcast_op(in6m->in6m_ifp, SIOCDELMULTI, sin6tosa(&sin6));
+               in6m->in6m_timer = IN6M_TIMER_UNDEF;
+               callout_halt(&in6m->in6m_timer_ch, softnet_lock);
                callout_destroy(&in6m->in6m_timer_ch);
                free(in6m, M_IPMADDR);
        }
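
The interleaving described in [events], and the effect of the guard in
the patch, can be replayed in a small single-threaded user-space model.
This is only a sketch: struct timer_model, model_starttimer, model_timeo,
replay and TIMER_UNDEF are hypothetical stand-ins, not kernel code, and
the model assumes that the callout's state machine decides to re-arm the
timer.

```c
#include <assert.h>
#include <stdbool.h>

#define TIMER_UNDEF (-1)        /* stand-in for IN6M_TIMER_UNDEF */

/* Toy model of one multicast timer; not the kernel structures. */
struct timer_model {
	int  timer;             /* stand-in for in6m->in6m_timer */
	bool pending;           /* stand-in for CALLOUT_PENDING */
};

/* Stand-in for mld_starttimer(): re-arms the callout. */
static void
model_starttimer(struct timer_model *m)
{
	assert(m->timer != TIMER_UNDEF);  /* mirrors the added KASSERT */
	m->pending = true;                /* effect of callout_schedule */
}

/* Stand-in for mld_timeo(), i.e. player [B]. */
static void
model_timeo(struct timer_model *m, bool guarded)
{
	if (guarded && m->timer == TIMER_UNDEF)
		return;                   /* teardown in progress: do not re-arm */
	m->timer = TIMER_UNDEF;
	/* Assumption: the state machine decides to restart the timer. */
	m->timer = 10;
	model_starttimer(m);
}

/*
 * Replays the interleaving and returns the pending flag as [A] would
 * see it on entering callout_destroy.
 */
static bool
replay(bool guarded)
{
	struct timer_model m = { .timer = 10, .pending = false };

	/* [A] holds the lock; [B] has fired and is blocked on it. */
	if (guarded)
		m.timer = TIMER_UNDEF;    /* patched in6_delmulti marks the timer dead */
	/* [A] calls callout_halt, which drops the lock; [B] runs to completion. */
	model_timeo(&m, guarded);
	/* [A] resumes and would now call callout_destroy. */
	return m.pending;
}
```

In this model replay(false) leaves the pending flag set, which is the
condition callout_destroy asserts against, while replay(true) leaves it
clear because the handler returns before re-arming.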


On Tue, Nov 11, 2014 at 6:13 PM, Ryota Ozaki <ozaki-r%netbsd.org@localhost> wrote:
> Oops. I've got another assertion failure:
>
> panic: kernel diagnostic assertion "(c->c_flags & CALLOUT_PENDING) ==
> 0" failed: file
> "/disk3/home/ozaki-r/git/netbsd-src/sys/kern/kern_timeout.c", line 312
> fatal breakpoint trap in supervisor mode
> trap type 1 code 0 rip ffffffff80198abd cs 8 rflags 246 cr2
> ffff800002f7ffe0 ilevel 6 rsp fffffe8001ce5778
> curlwp 0xfffffe80031206e0 pid 9911.1 lowest kstack 0xfffffe8001ce22c0
> Stopped in pid 9911.1 (ifconfig) at     netbsd:breakpoint+0x5:  leave
> db{0}> bt
> breakpoint() at netbsd:breakpoint+0x5
> vpanic() at netbsd:vpanic+0x13c
> kern_assert() at netbsd:kern_assert+0x4f
> callout_destroy() at netbsd:callout_destroy+0xd8
> in6_delmulti() at netbsd:in6_delmulti+0x189
> in6_leavegroup() at netbsd:in6_leavegroup+0x15
> in6_purgeaddr() at netbsd:in6_purgeaddr+0x59
> if_purgeaddrs() at netbsd:if_purgeaddrs+0x35
> in6_purgeif() at netbsd:in6_purgeif+0x19
> udp6_purgeif_wrapper() at netbsd:udp6_purgeif_wrapper+0x35
>
> It seems that callout_schedule is called between callout_halt
> and callout_destroy. Do we need mutual exclusion between
> callout_schedule and callout_halt/callout_destroy somehow?
>
> On Tue, Nov 11, 2014 at 5:44 PM, Martin Husemann <martin%duskware.de@localhost> wrote:
>>
>> On Tue, Nov 11, 2014 at 05:42:56PM +0900, Ryota Ozaki wrote:
>> > BTW, there are many callout_stop & callout_destroy. Should we replace them
>> > with callout_halt & callout_destroy?
>>
>>
>> IMHO we should - maybe bring this up on tech-kern and do a full tree sweep?
>
> I'll do it later.
>
> Thanks,
>   ozaki-r

