tech-kern: Re: I/O priorities

Subject: Re: I/O priorities
To: None <tech-kern@netbsd.org>
From: Steven J. Dovich <dovich@lethe.tiac.net>
List: tech-kern
Date: 06/21/2002 11:31:12

On Fri, 21 Jun 2002 10:00:56 EDT, John Franklin wrote:
> On Fri, Jun 21, 2002 at 03:31:19PM +0200, Olaf Seibert wrote:
> > In Operating Systems lectures at the university they taught me about the
> > "elevator algorithm".
> > ...
> > I cannot imagine why this basic textbook stuff would not already be
> > implemented...
>
> The requests queued in (c) of your example are coming in faster than the
> disk can service them, and because they're all read/write-block-n+1,
> they're all in front of the current position.  None of them are eligible
> to be serviced in (e).  So, the request in (b) is starved until the very
> large write in (c) is completed.
>
> This would be fine if some process (say, an ls) wasn't blocking on (b).
> (b) AIUI is a swap I/O that's blocking the entire system.
>
> The solution we seek is to elevate the priority of (b) so the system
> isn't blocked.

The root problem is that the scheduler optimizes for the mechanical
limitations that affect I/O performance and does not account for
bandwidth congestion. Current strategies, particularly those from
basic textbooks, do not address congestion, because the pain of a
thrashing system will eventually drive enough activity off the
system to reduce the arrival rate to some form of equilibrium.
John describes the general profile of a workload where the current
algorithm is known to be deficient.

Introducing bandwidth congestion management will let the kernel
detect what will become end-user pain and take appropriate steps
to limit the arrival rate before the system stalls or thrashes.
This requires confining the optimization for mechanical effects to
a local one, with only a limited window of visibility over time.
Requests need to be aged and producer processes throttled;
otherwise users (as we have seen in this thread) will be driven to
throttle someone over the performance of their system.
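
To make the aging point concrete, here is a rough sketch (in Python,
for brevity; it is not the kernel's disksort code). It assumes a
one-way elevator, a writer that queues two sequential blocks per tick
while the disk serves one, and a single swap read sitting behind the
head, as in John's case (b). The names Request, pick_next,
worst_wait_for_swap and max_age are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Request:
    block: int      # target block number
    arrival: int    # tick at which the request was queued

def pick_next(queue, head, now, max_age=None):
    """Choose the next request to service.

    Plain one-way elevator: lowest block at or ahead of the head,
    wrapping to the lowest block overall when nothing is ahead.
    If max_age is set (hypothetical knob), a request that has waited
    longer than max_age ticks is served first, regardless of position.
    """
    if max_age is not None:
        oldest = min(queue, key=lambda r: r.arrival)
        if now - oldest.arrival > max_age:
            return oldest
    ahead = [r for r in queue if r.block >= head]
    return min(ahead or queue, key=lambda r: r.block)

def worst_wait_for_swap(max_age, ticks=200):
    """Race a sequential writer against one swap read behind the head.

    Returns the tick at which the swap read was served, or None if it
    was starved for the whole run.
    """
    head = 1000
    swap = Request(block=10, arrival=0)   # request (b): behind the head
    queue = [swap]
    next_write = head
    for now in range(ticks):
        # The writer queues two blocks per tick; the disk serves one.
        for _ in range(2):
            queue.append(Request(block=next_write, arrival=now))
            next_write += 1
        req = pick_next(queue, head, now, max_age)
        queue.remove(req)
        if req is swap:
            return now
        head = req.block
    return None
```

Without aging, worst_wait_for_swap(None) never serves the swap read,
because there is always a write ahead of the head. With
worst_wait_for_swap(20) the read goes out once it has waited more
than 20 ticks, at the cost of one extra seek. Note that aging alone
only bounds the wait; the queue still grows until the writer itself
is throttled.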

We won't get a smart algorithm until the kernel can assess the
available capacity and determine which producers are getting more
than their fair share of bandwidth. A dumb algorithm that can't
or won't assess those metrics may be sufficient, but will likely
move the problem to another (less understood) workload profile.
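
As a rough illustration of the two metrics mentioned above, the
sketch below keeps a per-window ledger of blocks queued by each
producer and, only when the device is oversubscribed, reports who is
above an equal split. Everything here is an assumption made for the
example: the BandwidthLedger name, the fixed capacity figure (a real
kernel would have to measure it), and the equal-split definition of
"fair share".

```python
from collections import defaultdict

class BandwidthLedger:
    """Per-window accounting of I/O queued by each producer."""

    def __init__(self, capacity):
        self.capacity = capacity       # blocks the device can serve per window
        self.used = defaultdict(int)   # blocks queued per producer this window

    def charge(self, producer, blocks):
        """Record that a producer queued this many blocks."""
        self.used[producer] += blocks

    def congested(self):
        """True when demand in this window exceeds assumed capacity."""
        return sum(self.used.values()) > self.capacity

    def over_share(self):
        """Producers above an equal split of capacity, only when congested."""
        if not self.congested():
            return []
        share = self.capacity / len(self.used)
        return sorted(p for p, n in self.used.items() if n > share)

    def new_window(self):
        """Start a fresh accounting window."""
        self.used.clear()

ledger = BandwidthLedger(capacity=100)
ledger.charge("dd", 180)     # the large sequential writer
ledger.charge("ls", 2)
ledger.charge("pager", 8)
candidates = ledger.over_share()   # producers to consider throttling
```

Here only "dd" is flagged; "ls" and the pager stay well under their
share and would not be slowed down. What to do with the flagged
producers (sleep them, shrink their queue quota) is a separate policy
question that this sketch does not try to answer.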

/sjd