NetBSD-Bugs archive


kern/51615: Userland processes not evenly distributed on all CPUs



>Number:         51615
>Category:       kern
>Synopsis:       Userland processes not evenly distributed on all CPUs
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Nov 08 23:50:00 +0000 2016
>Originator:     hubert%feyrer.de@localhost
>Release:        NetBSD 7.0_STABLE + -current, both as of 20161108
>Organization:
	
>Environment:
	1) NetBSD vmnetbsd.promi.se 7.0_STABLE NetBSD 7.0_STABLE (GENERIC) #0: Tue Nov  8 22:19:10 CET 2016  feyrer@promise.local:/Users/feyrer/work/NetBSD/cvs/src-7/obj.amd64/sys/arch/amd64/compile/GENERIC amd64
	2) NetBSD vmnetbsd.home.feyrer.net 7.99.42 NetBSD 7.99.42 (GENERIC) #4: Tue Nov  8 13:46:43 CET 2016  feyrer@promise.local:/Volumes/netbsd-src-objdestdir/obj.amd64-Darwin-XXX/sys/arch/amd64/compile/GENERIC amd64
System: NetBSD vmnetbsd.promi.se 7.0_STABLE NetBSD 7.0_STABLE (GENERIC) #0: Tue Nov 8 22:19:10 CET 2016 feyrer@promise.local:/Users/feyrer/work/NetBSD/cvs/src-7/obj.amd64/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
	On an amd64 system with two CPU cores running two CPU-bound
	processes, one would expect each process to run on its own
	CPU. This is not the case: both processes compete for one
	CPU while the other CPU is left idle.

	This happens on both NetBSD-current and 7.0_STABLE,
	with sources as of 2016-11-08.

	The bug was first observed in a Xen environment on Amazon AWS.

>How-To-Repeat:
	1) run "top" and type '1' to see all CPUs 
	2) run two CPU-hogging processes at the same time:
	   loop & loop &
	3) Notice two things in top:
	   a) CPU and WCPU are about 50% for both processes, i.e. neither
	      gets a CPU to itself:

  PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
 1222 feyrer    27    0    13M 1344K CPU/1      7:15 54.00% 54.00% sh
  147 feyrer    29    0    13M 1344K RUN/1      7:05 42.97% 42.97% sh

	   b) in the CPU stats at the top of the display, one CPU shows
	      100% user time and the other 0%.
	      Expected is 100% on both:

CPU0 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
CPU1 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle

        I've put a screenshot here that shows two VMware VMs
        with 2 CPU cores each, the left one running -current and 
        the right one running 7.0_STABLE, as can be seen from the top:

        http://www.feyrer.de/Misc/priv/bad-scheduling-7.0_STABLE+7.99.42.png
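
	The "loop" command in step 2 is not defined in this report.
	A minimal stand-in, assuming any busy loop that never blocks
	is enough to trigger the behaviour, could be a shell function
	like the one below. The function name and body are
	hypothetical and not necessarily what the submitter ran:

```sh
#!/bin/sh
# Hypothetical stand-in for the "loop" command used in step 2.
# It spins forever without sleeping or doing I/O, so each instance
# can keep one CPU at 100% user time.
loop() {
    while :; do
        :       # shell no-op builtin; burns CPU without forking
    done
}

# Usage, after defining the function in the current shell:
#   loop & loop &
# Stop both instances afterwards with:
#   kill %1 %2
```

	Started this way, each instance shows up in top as "sh",
	which is consistent with the COMMAND column above.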

>Fix:
	No idea.

	A workaround exists using psrset(8), see
	http://www.feyrer.de/NetBSD/blog.html/nb_20161105_1754.html

>Unformatted:
 	
 	


