Subject: kern/4365: nice 20 does not cause processes to "run only when nothing else in the system wants to"
To: None <gnats-bugs@gnats.netbsd.org>
From: Mika Nystrom <mika@cs.caltech.edu>
List: netbsd-bugs
Date: 10/27/1997 17:24:51
>Number:         4365
>Category:       kern
>Synopsis:       nice 20 does not cause processes to "run only when nothing else in the system wants to"
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 27 17:35:01 1997
>Last-Modified:
>Originator:     Mika Nystrom
>Organization:
	Asynchronous Systems Architecture Project
	Department of Computer Science
	California Institute of Technology
>Release:        NetBSD 1.3_ALPHA 10/25/97	(and all previous releases)
>Environment:
	
System: NetBSD toulouse-lautrec 1.3_ALPHA NetBSD 1.3_ALPHA (PENTAMATIC) #10: Sun Oct 26 05:18:51 PST 1997 root@saxophone.cs.caltech.edu:/usr/src/sys/arch/i386/compile/PENTAMATIC i386


>Description:
	According to the manual page for renice(8), a nice value of
(positive) 20 causes processes to "run only when nothing else in the system
wants to".  This is not the case.  A CPU-intensive process at nice 20 still
gets 10-20% of the CPU while other processes are competing for it.  In
today's environment, nice 20 processes are useful for long-running
background computations on machines that normally serve as interactive
workstations.  Having such computations consistently reduce system
performance by 10-20% is not really acceptable.  Several (derivative?)
OSes have changed this policy from that of the BSD scheduler: under OSF/1
and Solaris, for instance, nice 20 processes do appear to stop completely
when other things want to run.

>How-To-Repeat:
	Renice a compute-intensive process to 20 while running another
compute-intensive process at normal priority.  Notice that the nice 20 
process does not entirely yield the CPU.
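
A minimal sketch of a test program for the steps above, not part of the
original report.  The names spin() and compare_nice() are hypothetical.
It assumes a single-CPU machine (on a multiprocessor each spinner can
simply get its own CPU) and a system where setpriority(2) accepts a nice
value of 20; some systems clamp it to 19.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Burn CPU for about `seconds` of wall-clock time; return the loop count. */
unsigned long
spin(int seconds)
{
	time_t end = time(NULL) + seconds;
	volatile unsigned long n = 0;
	int i;

	while (time(NULL) < end)
		for (i = 0; i < 1000000; i++)
			n++;
	return n;
}

/*
 * Run one spinner at nice 20 (child) and one at the inherited nice
 * value (parent) for the same interval, and print how far each got.
 * Returns 0 on success, -1 if fork(), setpriority() or waitpid() failed.
 */
int
compare_nice(int seconds)
{
	pid_t pid;
	int status;

	fflush(stdout);		/* avoid duplicated buffered output in the child */
	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		if (setpriority(PRIO_PROCESS, 0, 20) == -1)
			_exit(1);
		printf("nice 20: %lu iterations\n", spin(seconds));
		fflush(stdout);
		_exit(0);
	}
	printf("default: %lu iterations\n", spin(seconds));
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling compare_nice(60) from a main() and comparing the two counts gives
a rough measure of the share the nice 20 process received.  If nice 20
behaved as the manual page describes, its count should be close to zero.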

>Fix:
	Rewrite the scheduler from scratch!

	Just kidding (although it's not a bad idea).  My code is a simple
hack that fixes the problem without too many adverse effects.  I dedicate
the lowest-priority run queue to nice 20 processes.  Since these processes
are not interactive, the fact that I basically disable the scheduler for
them is not a big deal: they seem to get roughly equal shares of the CPU
among themselves, and real-time response is not an important issue for them.

In /sys/kern/kern_synch.c:

*** kern_synch.c.orig   Fri Oct 10 05:24:57 1997
--- kern_synch.c        Mon Oct 27 17:00:27 1997
***************
*** 683,689 ****
--- 683,694 ----
        register unsigned int newpriority;
  
        newpriority = PUSER + p->p_estcpu / 4 + 2 * (p->p_nice - NZERO);
+ #ifdef HARDNICE
+       newpriority = min(newpriority, MAXPRI-PPQ-1);
+       if (p->p_nice == (PRIO_MAX + NZERO) ) newpriority = MAXPRI;
+ #else
        newpriority = min(newpriority, MAXPRI);
+ #endif
        p->p_usrpri = newpriority;
        if (newpriority < curpriority)
                need_resched();

Applying this patch and configuring the kernel with

options HARDNICE

in the kernel configuration file has the desired result of suspending
nice 20 processes entirely when other things want to run.

I wasn't sure exactly where the run queues begin and end, which is why
I subtracted an extra 1 in the min() call.
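
The arithmetic in the patch can be checked in user space.  The following
is a sketch only, not kernel code.  The constant values are assumptions
taken from the usual 4.4BSD-derived headers and should be checked against
the tree being patched.  The mapping of a priority to a run queue
(priority divided by PPQ) is likewise an assumption, and the function
names are hypothetical.

```c
/* Assumed values; verify against <sys/param.h> and <sys/resource.h>. */
#define PUSER    50		/* base user-mode priority */
#define MAXPRI   127		/* numerically largest = least favoured */
#define NZERO    20		/* p_nice value meaning "nice 0" */
#define PRIO_MAX 20		/* largest nice value a user can request */
#define NQS      32		/* number of run queues */
#define PPQ      (128 / NQS)	/* priorities per run queue */

/* Priority as computed without HARDNICE.  p_nice is the biased kernel
 * value: NZERO is "nice 0", PRIO_MAX + NZERO is "nice 20". */
int
stock_priority(int estcpu, int p_nice)
{
	int pri = PUSER + estcpu / 4 + 2 * (p_nice - NZERO);

	return pri < MAXPRI ? pri : MAXPRI;
}

/* Priority as computed with HARDNICE: nice 20 is pinned to MAXPRI and
 * everything else is capped below the last run queue. */
int
hardnice_priority(int estcpu, int p_nice)
{
	int pri = PUSER + estcpu / 4 + 2 * (p_nice - NZERO);
	int cap = MAXPRI - PPQ - 1;

	if (p_nice == PRIO_MAX + NZERO)
		return MAXPRI;
	return pri < cap ? pri : cap;
}

/* Run queue index for a priority, assuming PPQ priorities per queue. */
int
run_queue(int pri)
{
	return pri / PPQ;
}
```

Under these assumptions, stock_priority(0, 40) is 90 (queue 22) while
stock_priority(255, 20) is 113 (queue 28), so without the patch a nice 20
process with little accumulated p_estcpu can sit in a better queue than a
normal process that has been computing for a while.  That may account for
the CPU share observed.  With the patch, nice 20 always yields 127
(queue 31) and everything else is capped at 122 (queue 30).  Priority 123
also falls in queue 30, so the extra 1 in the min() call looks harmless
but unnecessary.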
>Audit-Trail:
>Unformatted: