Subject: RE: Coalesce disk I/O
To: Chris Jepeway <jepeway@blasted-heath.com>
From: Gordon Waidhofer <gww@traakan.com>
List: tech-kern
Date: 01/28/2004 15:58:09
Nice post, Chris.

Capturing data at strategy() time, then
analyzing with simulators 'n' such sounds
pretty good. And reasonably cheap compared
to implementing a working prototype. Neat idea.
Do you still have any of those tools around?
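
If not, I'm picturing the capture side as nothing
fancier than a small hook called from strategy().
Sketch in C below -- the names are invented here,
just to show the shape of it; the records would get
drained to userland and replayed through a simulator:

    /*
     * Rough sketch only, not anybody's actual tools.
     * coalesce_log_append() is a made-up name for
     * whatever drains the records to userland.
     */
    #include <sys/param.h>
    #include <sys/buf.h>

    struct coalesce_rec {
            dev_t   dev;    /* which disk */
            daddr_t blkno;  /* starting disk block */
            long    bcount; /* transfer length in bytes */
            int     read;   /* 1 for reads, 0 for writes */
    };

    void coalesce_log_append(const struct coalesce_rec *); /* hypothetical */

    void
    coalesce_log_strategy(struct buf *bp)
    {
            struct coalesce_rec r;

            r.dev    = bp->b_dev;
            r.blkno  = bp->b_blkno;
            r.bcount = bp->b_bcount;
            r.read   = (bp->b_flags & B_READ) != 0;
            coalesce_log_append(&r);
    }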

I'm surprised by the 5% number. I'd thought
it would be higher. Much much higher. I've
seen coalescing typically turn 2.5 disk I/Os
per file op into an average of 1.2, about half.
Certain loads -- lots of writes -- were more like
6:1. That was on a proprietary O/S and, as you
say, NetBSD is already doing a lot before strategy().
In your and Stephan's earlier work, what increases
did you see?
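
(For anyone following along, the check behind those
ratios is just back-to-back adjacency on the queue.
Something like the sketch below -- written from
memory, not lifted from any real coalescer or from
Chris's patches:)

    #include <sys/param.h>
    #include <sys/buf.h>

    /*
     * bp2 can be folded into bp1 when both move the
     * same direction on the same device and bp2
     * starts exactly where bp1 ends.
     */
    static int
    bufs_adjacent(const struct buf *bp1, const struct buf *bp2)
    {
            if (bp1->b_dev != bp2->b_dev)
                    return 0;
            if ((bp1->b_flags & B_READ) != (bp2->b_flags & B_READ))
                    return 0;
            /* btodb() converts the byte count to 512-byte disk blocks */
            return bp1->b_blkno + btodb(bp1->b_bcount) == bp2->b_blkno;
    }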

Cheers,
	-gww

> -----Original Message-----
> From: tech-kern-owner@NetBSD.org [mailto:tech-kern-owner@NetBSD.org]On
> Behalf Of Chris Jepeway
> Sent: Wednesday, January 28, 2004 10:44 AM
> To: tech-kern@NetBSD.org
> Subject: Re: Coalesce disk I/O
> 
> 
> Well, lemme weigh in, here, if I may.
> 
> I think the VM tricks approach is simple enough that
> it makes sense to use it for exploratory purposes,
> to, as Stephan says, see if there's a dead rabbit
> or gold down at the end of the "how to coalesce
> I/O" hole.  If gold, fooling around with the
> next step of firming up the implementation
> makes sense.  If stinky dead rabbit, well, drop
> it in favor of other improvements that give
> more bang for the buck.
> 
> Anecdotal testing--compiling GENERIC on a SCSI
> disk manufactured in the past 5 yrs--showed that
> 5% of the buffers delivered to strategy were
> coalescable, and that coalescing them improved
> disk throughput by a corresponding 5%.  As for
> execution time, good news/bad news: no extra
> CPU time, neither user nor sys; but no better
> wall-clock time, either.  At least, that's
> what I recall.
> 
> So, what do I think you can tell from this?
> 
>     o  there are some I/O requests that can be
>        coalesced (at one point, it had been
>        put forth that there wouldn't be
>        that many, since UVM/UBC/UFS were
>        pretty smart about read-ahead and
>        write-behind); these are presumed
>        to be meta-data writes, but that
>        was never confirmed
> 
>     o  on an 800MHz AMD Duron, the overhead
>        of coalescing I/O via VM tricks isn't all
>        that much (since there's no change
>        in CPU times for the compiles)
> 
>     o  it'd be easy enough to test how much
>        coalescing I/O would help out w/
>        whatever's going on in -current;
>        if the meta-data theory is correct,
>        maybe the bolus of softdep data
>        could be softened up by coalescing
> 
> I'd jump in and do some tests w/ -current, but my devel/test environments
> are more-or-less toast.  I'm trying to resurrect them, but what with
> this, that, and the other thing, it'll be 2 weeks or so b/4 I
> could run any benchmarks.  Meanwhile, I'd be glad to work w/ anybody
> interested in using the patches at http://www.blasted-heath.com/nbsd/cluster/
> to get solid numbers from -current.  If the patches, for instance, no
> longer apply cleanly, I think I've got enough infrastructure left to
> cook up a set that would work.
> 
> Any takers?
> 
> Chris.
>