tech-kern: Re: The thorpej

Subject: Re: The thorpej_scsipi branch
To: Manuel Bouyer <bouyer@antioche.lip6.fr>
From: Matthew Jacob <mjacob@feral.com>
List: tech-kern
Date: 12/29/2000 14:37:06
> > > 
> > > What Jim Kahn recommends is to run until you get QFULL, set that as
> > > the limit, then after 1m clear the limit. 
> > 
> > Yeah, I recommended something like Jim's idea a while back; I think I said 30
> > seconds. It adds a bit of hysteresis to Bob Snively's way of doing stuff.
> 
> Maybe we can just clear the limit when the number of commands in the running
> queue is below a mark. This means we need to account for the number of
> commands in the target's queue, and need a way to dynamically set this limit.
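
For illustration, the "clear the limit below a mark" idea quoted above could be
sketched roughly as follows. This is not code from the branch; the class name
WatermarkLimit, its fields, and the floor of 1 on the learned limit are all
assumptions made for the example.

```python
class WatermarkLimit:
    """Hypothetical per-target accounting: learn a limit on QUEUE FULL,
    forget it once the running queue drains below a low-water mark."""

    def __init__(self, max_tags, low_mark):
        self.max_tags = max_tags  # ceiling the adapter/driver allows (assumed)
        self.low_mark = low_mark  # the "mark"; can be changed at run time
        self.limit = None         # None means no learned limit is in force
        self.active = 0           # commands currently in the target's queue

    def effective_limit(self):
        return self.max_tags if self.limit is None else self.limit

    def issue(self):
        """Account for one more command; False means keep it pending."""
        if self.active >= self.effective_limit():
            return False
        self.active += 1
        return True

    def complete(self):
        """A command finished; clear the limit if we are below the mark."""
        self.active -= 1
        if self.limit is not None and self.active < self.low_mark:
            self.limit = None

    def queue_full(self):
        """The newest command was rejected with QUEUE FULL."""
        self.active -= 1                  # the rejected command is not queued
        self.limit = max(1, self.active)  # what the target actually accepted
```

The sketch only covers the accounting; where the pending commands wait and how
low_mark is chosen are left open, as in the discussion.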

And we come back around the merry-go-round again.......


I'll pull in some mail from Sean:

From smd@ebone.net Fri Dec 29 14:36:09 2000
Date: 25 Oct 2000 01:12:54 +0200
From: Sean Doran <smd@ebone.net>
To: mjacob@feral.com
Subject: Re: siop(4) and tagged queuing

Matthew Jacob <mjacob@feral.com> writes:

> 1. Queue the maximum you can until you start getting QUEUE FULL
> 2. Back off the local limit to this value minus one.
> 3. Wait until the next idle period, where idle period is 'no commands running
> for at least 100 ms.' and increment the QUEUE FULL limit by one.
> 4. Repeat #3 until you get a QUEUE FULL and go back to
> #1.
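
The four steps quoted above might look something like this in outline. It is a
sketch only: the class IdleProbeLimit, the explicit "now" timestamps (in
seconds), and the floor of 1 on the limit are assumptions for the example, not
part of the original proposal.

```python
class IdleProbeLimit:
    """Hypothetical tag limit: back off on QUEUE FULL, probe upward by one
    after each idle period of at least 100 ms."""

    IDLE_SECONDS = 0.100  # "no commands running for at least 100 ms"

    def __init__(self, max_tags):
        self.max_tags = max_tags  # assumed hard ceiling
        self.limit = max_tags     # step 1: queue as many as we can
        self.active = 0           # commands currently outstanding
        self.idle_since = 0.0     # when the queue last became empty

    def issue(self, now):
        """Try to start a command at time `now`; False means hold it."""
        # Step 3: one increment per idle period, checked when work resumes.
        if self.active == 0 and now - self.idle_since >= self.IDLE_SECONDS:
            self.limit = min(self.limit + 1, self.max_tags)
        if self.active >= self.limit:
            return False
        self.active += 1
        return True

    def complete(self, now):
        self.active -= 1
        if self.active == 0:
            self.idle_since = now

    def queue_full(self, now):
        """The newest command came back with QUEUE FULL."""
        # Step 2: limit becomes the count that triggered QUEUE FULL minus one.
        self.limit = max(1, self.active - 1)  # floor of 1 is an assumption
        self.active -= 1                      # rejected command is requeued
        if self.active == 0:
            self.idle_since = now
```

Step 4 falls out of the loop between issue() and queue_full(): the limit creeps
up by one per idle period until the target complains again.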

Hm, fine for random, bursty FS activity, but there are two ugly situations:
        - synchronization with step 3 -- heavy traffic to/from disk
          in cycles of 100ms can cause a QUEUE FULL
          at every cycle; it is worse when the
          synchronization is at a multiple of 100ms and you see
          multiple QUEUE FULL messages
        - long-term blasting at a disk may result in
          step 2 backing off to 0, which is probably not desirable

The first is only relevant if there is a system
performance penalty caused by the QUEUE FULL processing,
such that one wants to avoid seeing a QUEUE FULL message
as much as possible.

The second is a chronic overwhelming of a scarce resource,
requiring a control law to keep the bottleneck from saturating.

How about a la RFC 2001 instead?  The first problem
(reacting to congestion is expensive) is what the TCP
control law is all about.
http://www.ietf.org/rfc/rfc2001.txt?number=2001
You get an explicit congestion notification (QUEUE FULL)
rather than an implicit one (drop), but other than that,
it strikes me as the same problem -- maximizing bottleneck
bandwidth without witnessing congestion collapse or
oscillations around there.    
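
To make the analogy concrete, here is a rough sketch of what an RFC 2001 style
window could look like when applied to tags. The mapping is an assumption made
for illustration: the window is the tag limit, a command completion plays the
role of an ACK, and QUEUE FULL plays the role of a loss. The class name
AimdTagWindow and the choice to restart from ssthresh (rather than from 1, as
TCP does after a timeout) are likewise assumptions.

```python
class AimdTagWindow:
    """Hypothetical slow start / congestion avoidance over a tag limit."""

    def __init__(self, max_tags, initial=1):
        self.max_tags = max_tags
        self.cwnd = float(initial)       # current window, in commands
        self.ssthresh = float(max_tags)  # slow start threshold
        self.active = 0                  # commands currently outstanding

    @property
    def limit(self):
        return max(1, min(int(self.cwnd), self.max_tags))

    def issue(self):
        if self.active >= self.limit:
            return False
        self.active += 1
        return True

    def complete(self):
        """A completion grows the window, like an ACK in TCP."""
        self.active -= 1
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start: +1 per completion
        else:
            self.cwnd += 1.0 / self.cwnd  # avoidance: about +1 per window
        self.cwnd = min(self.cwnd, float(self.max_tags))

    def queue_full(self):
        """QUEUE FULL is the explicit congestion signal."""
        self.active -= 1                        # rejected command is requeued
        self.ssthresh = max(self.limit / 2.0, 2.0)
        self.cwnd = self.ssthresh               # multiplicative decrease
```

With these choices the limit doubles per drained window until the first QUEUE
FULL, is then halved, and afterwards grows by roughly one tag per window of
completions.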

WRT the second problem, Van Jacobson likes to talk about
the flywheel control law when discussing TCP and how to
keep from having a long-term "standing" queue in front of
a bottleneck by deliberately signalling (explicitly or
implicitly) in the event of _incipient_ congestion.  This
strikes me as a good way of avoiding the
disk-is-totally-saturated problem, whether that results in
a lot of QUEUE FULLs or merely in long wait times while the
disk goes through the queue-of-tags.  (Incidentally that's
also observable with a really large buffer cache as one
does something along the lines of creating a very large
file -- the driver is faced with a queue in front of it).

        Sean.