Subject: Re: IO Congestion Control
To: None <tls@rek.tjls.com>
From: Steven M. Bellovin <smb@cs.columbia.edu>
List: tech-kern
Date: 09/12/2006 09:46:24
On Tue, 12 Sep 2006 01:49:23 -0400, Thor Lancelot Simon <tls@rek.tjls.com>
wrote:

> On Mon, Sep 11, 2006 at 08:26:37PM -0700, Matt Thomas wrote:
> > 
> > There's a problem with applying this to disks...  Unlike a network,
> > if I'm issuing sequential writes and someone issues a write to a far
> > LBA, that's going to screw my latency, since I have to seek back.
> > That wouldn't happen in a network where I'm communicating with a
> > local host on my LAN and then send a packet over the Internet.
> 
> I think what you want to do -- and I think this was Bill Sommerfeld's
> original idea for the right metric here, though it's been a while and
> my memory is fuzzy -- is penalize those who write when the average
> latency, across all writers, is increasing.  And you probably want to
> scale this to the number of writes, to obtain something like the
> "average write cost" per writer.  The processes with the highest
> average write cost are the ones to put to sleep if you want the
> fairest scheduling across processes.

It's not clear to me that taking the average is worthwhile -- I suspect
that at the times when it matters, the statistical variant I suggested
will work.  But basically, we all agree -- don't worry about the
underlying device or the file system layout properties; just look for
writes that are starting to take longer than they "should" when things
are busy.
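
As a rough illustration of what "longer than they should" could mean in
practice, here is a minimal sketch.  It is not the statistical variant
mentioned above (that isn't spelled out in this message); it just
assumes a running baseline of write latency and flags any write that
exceeds a multiple of it.  The class name, the smoothing weight and the
threshold factor are all hypothetical.

```python
class SlowWriteDetector:
    """Flags writes whose latency exceeds a multiple of a running baseline.

    Hypothetical illustration only: names and constants are assumptions.
    """

    def __init__(self, alpha=0.125, factor=2.0):
        self.alpha = alpha      # smoothing weight for the baseline (assumed)
        self.factor = factor    # multiple of baseline that counts as slow (assumed)
        self.baseline = None    # running estimate of "normal" write latency

    def observe(self, latency_ms):
        """Record one write latency; return True if it looks too slow."""
        if self.baseline is None:
            # First sample just seeds the baseline.
            self.baseline = latency_ms
            return False
        slow = latency_ms > self.factor * self.baseline
        # Fold only non-slow samples into the baseline, so a congested
        # period does not redefine what "normal" means.
        if not slow:
            self.baseline += self.alpha * (latency_ms - self.baseline)
        return slow
```

Note that nothing here looks at the device or the file system layout;
it only watches observed latencies.  A real version would also need
some way to let the baseline drift if the workload changes for good.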
> 
> Note that when the disk isn't busy, most processes' write cost will
> stay within a fairly narrow band.  But when it _is_ busy, the write
> costs will separate, because anyone who happens to always force a
> seek when he writes will be penalized accordingly -- and much more
> heavily than the other processes whose sequential write streams were
> seeked away-from and back-to, because _most_ (though not all) of their
> I/O will be sequential.
> 
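
The "average write cost" per writer described in the quoted text could
be sketched roughly as follows.  This assumes the cost is simply the
mean observed latency of each process's own writes, which is a
simplification; WriteCostTracker and its methods are hypothetical names,
not anything in the kernel.

```python
from collections import defaultdict


class WriteCostTracker:
    """Tracks mean write latency per process (hypothetical sketch)."""

    def __init__(self):
        self.total_latency = defaultdict(float)  # pid -> summed latency (ms)
        self.write_count = defaultdict(int)      # pid -> number of writes

    def record(self, pid, latency_ms):
        """Account one completed write to the process that issued it."""
        self.total_latency[pid] += latency_ms
        self.write_count[pid] += 1

    def average_cost(self, pid):
        """Mean latency per write for pid, or 0.0 if it has not written."""
        n = self.write_count.get(pid, 0)
        return self.total_latency.get(pid, 0.0) / n if n else 0.0

    def costliest(self):
        """Return the pid with the highest average write cost, or None."""
        if not self.write_count:
            return None
        return max(self.write_count, key=self.average_cost)


tracker = WriteCostTracker()
for latency in (1.0, 1.0, 1.0, 9.0):   # mostly sequential, one seek back
    tracker.record(1, latency)
for latency in (8.0, 9.0, 10.0):       # forces a seek on every write
    tracker.record(2, latency)
candidate = tracker.costliest()        # the writer to put to sleep
```

In this made-up example the mostly sequential writer pays for one seek
but keeps a low average, while the writer that always seeks stands out
as the candidate to put to sleep.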


		--Steven M. Bellovin, http://www.cs.columbia.edu/~smb