NetBSD-Bugs archive


kern/43561: Thread waiting, CPU idling



>Number:         43561
>Category:       kern
>Synopsis:       Thread waiting, CPU idling
>Confidential:   no
>Severity:       non-critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jul 03 08:35:00 +0000 2010
>Originator:     Witold Jan Wnuk
>Release:        NetBSD-current
>Organization:
>Environment:
NetBSD foster 5.99.33 NetBSD 5.99.33 (FOSTER) #23: Sat Jul  3 09:07:28 CEST 2010  w@foster:/home/w/NetBSD/src/sys/arch/i386/compile/FOSTER i386

>Description:
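On a multiprocessor system, sched_balance() (sys/kern/kern_runq.c) fails to
select a CPU that has exactly one thread in its run queue, so that thread keeps
waiting while another CPU idles. The averaged thread count r_avgcount is
computed with integer arithmetic, and (0 + 1) >> 1 truncates to 0, so such a
CPU never appears busier than an empty one.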
>How-To-Repeat:
Compile the program below and run one copy per CPU, then inspect the CPU idle
time.

int
main(void)
{
        /* Spin forever to keep one CPU fully busy. */
        for (;;)
                continue;
}


The problem described by Sad Clouds in
http://mail-index.netbsd.org/tech-userlevel/2010/04/16/msg003515.html
is likely another manifestation of this bug.

>Fix:
Workaround: add one bit of precision to the r_avgcount calculation by storing
twice the average in a renamed field, r_avgcount1:


Index: sys/kern/kern_runq.c
===================================================================
RCS file: /cvsroot/src/sys/kern/kern_runq.c,v
retrieving revision 1.30
diff -u -r1.30 kern_runq.c
--- sys/kern/kern_runq.c        3 Mar 2010 00:47:30 -0000       1.30
+++ sys/kern/kern_runq.c        3 Jul 2010 07:43:20 -0000
@@ -78,7 +78,7 @@
        uint32_t        r_bitmap[PRI_COUNT >> BITMAP_SHIFT];
        /* Counters */
        u_int           r_count;        /* Count of the threads */
-       u_int           r_avgcount;     /* Average count of threads */
+       u_int           r_avgcount1;    /* Average count of threads x 2 */
        u_int           r_mcount;       /* Count of migratable threads */
        /* Runqueues */
        queue_t         r_rt_queue[PRI_RT_COUNT];
@@ -523,12 +523,12 @@
                ci_rq = ci->ci_schedstate.spc_sched_info;
 
                /* Average count of the threads */
-               ci_rq->r_avgcount = (ci_rq->r_avgcount + ci_rq->r_mcount) >> 1;
+               ci_rq->r_avgcount1 = (ci_rq->r_avgcount1 + (ci_rq->r_mcount << 1)) >> 1;
 
                /* Look for CPU with the highest average */
-               if (ci_rq->r_avgcount > highest) {
+               if (ci_rq->r_avgcount1 > highest) {
                        hci = ci;
-                       highest = ci_rq->r_avgcount;
+                       highest = ci_rq->r_avgcount1;
                }
        }
 
@@ -625,7 +625,7 @@
        }
 
        /* Reset the counter, and call the balancer */
-       ci_rq->r_avgcount = 0;
+       ci_rq->r_avgcount1 = 0;
        sched_balance(ci);
        tci = worker_ci;
        tspc = &tci->ci_schedstate;
@@ -734,7 +734,7 @@
                        return NULL;
 
                /* Reset the counter, and call the balancer */
-               ci_rq->r_avgcount = 0;
+               ci_rq->r_avgcount1 = 0;
                sched_balance(ci);
                cci = worker_ci;
                cspc = &cci->ci_schedstate;
@@ -871,14 +871,14 @@
                ci_rq = spc->spc_sched_info;
 
                (*pr)("Run-queue (CPU = %u):\n", ci->ci_index);
-               (*pr)(" pid.lid = %d.%d, r_count = %u, r_avgcount = %u, "
+               (*pr)(" pid.lid = %d.%d, r_count = %u, r_avgcount1 = %u, "
                    "maxpri = %d, mlwp = %p\n",
 #ifdef MULTIPROCESSOR
                    ci->ci_curlwp->l_proc->p_pid, ci->ci_curlwp->l_lid,
 #else
                    curlwp->l_proc->p_pid, curlwp->l_lid,
 #endif
-                   ci_rq->r_count, ci_rq->r_avgcount, spc->spc_maxpriority,
+                   ci_rq->r_count, ci_rq->r_avgcount1, spc->spc_maxpriority,
                    spc->spc_migrating);
                i = (PRI_COUNT >> BITMAP_SHIFT) - 1;
                do {


