tech-kern archive


Re: assertion "spc->spc_migrating == NULL" failed



hi,

> Hello,
> working with a source code based on the matt-nb5-mips64 branch,
> I can reproduce this panic:
> panic: kernel diagnostic assertion "spc->spc_migrating == NULL" failed: file "/dsk/l1/misc/bouyer/tmp/src/sys/kern/kern_synch.c", line 656
> mttycn_pollc 1 ipl 0x6
> Stopped in pid 0.4 (system) at  netbsd:cpu_Debugger+0x4:        jr      ra
>                 bdslot: nop
> db{0}> tr
> cpu_Debugger+4 (c04bd000,b300,10,c0407c00) ra c02192ac sz 0
> panic+1d4 (c04bd000,c02de430,c02f1450,c02f1360) ra c02cac78 sz 48
> __kernassert+48 (c04bd000,c02de430,c02f1450,c02f1360) ra c01f74a4 sz 32
> mi_switch+640 (c04bd000,c02de430,c02f1450,c02f1360) ra c01f3130 sz 64
> sleepq_block+f0 (c04bd000,c02de430,c02f1450,c02f1360) ra c0202f54 sz 48
> turnstile_block+2d0 (c04bd000,c02de430,c02f1450,c02f1360) ra c01e254c sz 56
> mutex_vector_enter+268 (c04bd000,c02de430,c02f1450,c02f1360) ra c026e2cc sz 64
> wapbl_biodone+48 (c04bd000,c02de430,c02f1450,c02f1360) ra c0255638 sz 48
> biodone2+a4 (c04bd000,c02de430,c02f1450,c02f1360) ra c02557c8 sz 32
> biointr+ac (c04bd000,c02de430,c02f1450,c02f1360) ra c01f3acc sz 32
> softint_dispatch+c4 (c04bd000,c02de430,c02f1450,c02f1360) ra c0295fe4 sz 72
> softint_fast_dispatch+80 (0,c02de430,c02f1450,c02f1360) ra 0 sz 24
> User-level: pid 0.4
> 
> 
> (The soft int may vary.) Looking at the sources, I see that
> sched_nextlwp() is careful not to propose a new lwp if a migration is in
> progress. But when this KASSERT fires we're not necessarily about to
> switch to a new (non-idle) lwp; rather, the current lwp got woken up by
> another CPU while it was about to switch.
> 
> Shouldn't
>                         KASSERT(spc->spc_migrating == NULL);
>                         if (l->l_target_cpu !=  NULL) { 
>                                 spc->spc_migrating = l; 
>                         }
> be instead:
>                         if (l->l_target_cpu !=  NULL) { 
>                                 KASSERT(spc->spc_migrating == NULL);
>                                 spc->spc_migrating = l; 
>                         }
> 
> I made the above change and it seems to work. Can someone confirm this is
> correct?

i think you're correct.

i have had the attached patch sitting in my local tree for a long time.
i haven't committed it because i haven't been able to reproduce the
problem on my machine yet.

YAMAMOTO Takashi

> 
> -- 
> Manuel Bouyer <bouyer%antioche.eu.org@localhost>
>      NetBSD: 26 years of experience will always make the difference
> --
Index: kern_synch.c
===================================================================
RCS file: /cvsroot/src/sys/kern/kern_synch.c,v
retrieving revision 1.284
diff -u -p -r1.284 kern_synch.c
--- kern_synch.c        2 Nov 2010 15:17:37 -0000       1.284
+++ kern_synch.c        23 Nov 2010 22:16:57 -0000
@@ -654,9 +654,22 @@ mi_switch(lwp_t *l)
                        l->l_stat = LSRUN;
                        lwp_setlock(l, spc->spc_mutex);
                        sched_enqueue(l, true);
-                       /* Handle migration case */
-                       KASSERT(spc->spc_migrating == NULL);
-                       if (l->l_target_cpu !=  NULL) {
+#if 1
+                       if (spc->spc_migrating != NULL) {
+                               printf("%s: bug %p %p %p\n", __func__, l, newl, spc);
+                       }
+#endif
+                       /*
+                        * Handle migration case
+                        *
+                        * spc_migrating != NULL here means that a softint
+                        * which interrupted the idle lwp is blocking.
+                        */
+                       KASSERT(spc->spc_migrating == NULL ||
+                           ((l->l_pflag & LP_INTR) != 0 &&
+                           newl != NULL && (newl->l_flag & LW_IDLE) != 0));
+                       if (l->l_target_cpu != NULL) {
+                               KASSERT((l->l_pflag & LP_INTR) == 0);
                                spc->spc_migrating = l;
                        }
                } else
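
The three variants of the check discussed in this thread (the original
KASSERT, the reordered one from the quoted message, and the one in the patch
above) can be compared with a small user-space sketch. This is not kernel
code: struct model_lwp, struct model_spc, the flag values and the check_*()
helpers are hypothetical stand-ins, and each helper simply returns true where
the corresponding KASSERTs would hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag names are taken from the patch; the values here are arbitrary. */
#define LP_INTR 0x1	/* lwp is a soft interrupt handler */
#define LW_IDLE 0x2	/* lwp is the idle lwp */

/* Hypothetical, stripped-down stand-ins for the kernel structures. */
struct model_lwp {
	int l_pflag;
	int l_flag;
	void *l_target_cpu;	/* non-NULL: lwp is asked to migrate */
};

struct model_spc {
	struct model_lwp *spc_migrating;
};

/* Original code: unconditional KASSERT(spc->spc_migrating == NULL). */
static bool
check_original(const struct model_spc *spc)
{
	return spc->spc_migrating == NULL;
}

/* Reordered variant: assert only when l itself is about to migrate. */
static bool
check_reordered(const struct model_spc *spc, const struct model_lwp *l)
{
	return l->l_target_cpu == NULL || spc->spc_migrating == NULL;
}

/* Patched variant: both KASSERTs from the diff must hold. */
static bool
check_patched(const struct model_spc *spc, const struct model_lwp *l,
    const struct model_lwp *newl)
{
	bool intr = (l->l_pflag & LP_INTR) != 0;

	if (spc->spc_migrating != NULL &&
	    !(intr && newl != NULL && (newl->l_flag & LW_IDLE) != 0))
		return false;
	if (l->l_target_cpu != NULL && intr)
		return false;
	return true;
}

/* Walk through a few scenarios; aborts if the model disagrees. */
void
model_self_check(void)
{
	int cpu_token;	/* only its address is used, as a fake target CPU */
	struct model_lwp pending = { 0, 0, NULL };
	struct model_lwp idle = { 0, LW_IDLE, NULL };
	struct model_lwp softint = { LP_INTR, 0, NULL };
	struct model_lwp plain = { 0, 0, NULL };
	struct model_lwp migrant = { 0, 0, &cpu_token };
	struct model_spc busy = { &pending };	/* migration in progress */
	struct model_spc quiet = { NULL };

	/* Reported case: softint over the idle lwp, migration pending. */
	assert(!check_original(&busy));
	assert(check_reordered(&busy, &softint));
	assert(check_patched(&busy, &softint, &idle));

	/* Ordinary lwp, migration pending: only the reordered check passes. */
	assert(!check_original(&busy));
	assert(check_reordered(&busy, &plain));
	assert(!check_patched(&busy, &plain, &idle));

	/* Ordinary lwp asked to migrate, nothing pending: all three pass. */
	assert(check_original(&quiet));
	assert(check_reordered(&quiet, &migrant));
	assert(check_patched(&quiet, &migrant, &idle));
}
```

In this model the patched check is stricter than the reordered one: it still
rejects a pending migration unless l is a softint and newl is the idle lwp,
and it additionally rejects a softint that has l_target_cpu set.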

