Subject: Re: nore on disk stats
To: Jason Thorpe <thorpej@nas.nasa.gov>
From: Chris G Demetriou <Chris_G_Demetriou@BALVENIE.PDL.CS.CMU.EDU>
List: tech-kern
Date: 11/09/1995 21:28:22
> With a suggestion from cgd, I have a mechanism that should pretty 
> accurately calculate the amount of time a disk is busy, using 
> timestamps.  My design has changed a little to adapt to implementation 
> details, but is more-or-less the same.  In particular, functions were added:
> 
> 	struct disk *disk_getfirst __P((void));
> 		Returns first disk in disklist.
> 
> 	struct disk *disk_getnext __P((struct disk *));
> 		Returns next disk in disklist.

"why bother"?  Why not just let the users of the code access the list
structures themselves?

If you Really Really Really want these, at least make them macros (not
functions)...
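
For illustration, a macro version might look roughly like the sketch below. The list head `disklist` comes from the description above; the link field `dk_next`, the `dk_name` field, and the `disk_count()` helper are hypothetical names made up for the example, not the actual structure layout.

```c
#include <stddef.h>

/* Hypothetical, cut-down disk record; the real struct disk has more in it. */
struct disk {
	struct disk *dk_next;	/* assumed name for the list link */
	const char *dk_name;	/* assumed name field, e.g. "sd0" */
};

/* Assumed head of the disk list. */
static struct disk *disklist;

/* The two accessors as macros rather than functions. */
#define disk_getfirst()		(disklist)
#define disk_getnext(dk)	((dk)->dk_next)

/* Example user (hypothetical helper): walk the list and count the disks. */
static int
disk_count(void)
{
	struct disk *dk;
	int n = 0;

	for (dk = disk_getfirst(); dk != NULL; dk = disk_getnext(dk))
		n++;
	return n;
}
```

Either way the caller ends up writing the same loop, which is why direct access to the list costs nothing extra.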


> Now, the question I have is, what units should data transfer rates be 
> in?  As Mike Hibler pointed out to me, counting bytes/words/longs or even 
> blocks could potentially wrap the counter very quickly on a busy 
> fileserver, so it seems as if attempting to do so over the long-haul is 
> dubious.

Quads are your friends.  I sincerely doubt that you'll wrap a quad
counter "very quickly" (or even "at all" 8-)...
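
A rough back-of-the-envelope check, assuming a sustained rate of 10,000,000 bytes/sec (a figure picked only for illustration): a 32-bit byte counter wraps in about 429 seconds, or roughly 7 minutes, while a 64-bit one takes on the order of 58,000 years. The helper name `seconds_to_wrap()` is made up for this sketch.

```c
/*
 * Seconds until an unsigned byte counter of the given width wraps,
 * assuming a constant transfer rate.  Hypothetical helper, for
 * estimation only.
 */
static double
seconds_to_wrap(unsigned bits, double bytes_per_sec)
{
	double capacity = 1.0;	/* becomes 2^bits */
	unsigned i;

	for (i = 0; i < bits; i++)
		capacity *= 2.0;
	return capacity / bytes_per_sec;
}
```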

I'd say: count in bytes.  it's the only thing you can count on.
failing that, count in blocks, where 'blocks' are the device block
size...

speaking of which, the disk structure should contain the device block
size.

oh yeah, and in the long term, assuming in other code that the 'device
block size' (i.e. DEV_BSIZE) is 512 bytes is annoying...


> My idea is to add another argument to disk_unbusy(), that being the 
> number of bytes that were transferred.  When the busy count drops to 
> zero, the average transfer rate can be calculated and merged with the 
> running-average.

It's not inconceivable that, for long periods of time, the busy count
won't drop to zero.  This is especially true of devices intended for
high-use situations, like 'ccd' units.  (That's a problem, actually,
with adding time the way we talked about it...)

One possible way to do it is to do something like:

	busy:
		increment busy count
		if busy count == 1
			store start time

	unbusy:
		decrement busy count
		get end time
		add diff between start time and end time to total time
		replace start time with end time.

(i'm pretty sure that that's an accurate calculation...  convince yourself.)
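
In C, the steps above might be sketched as follows. This is only a sketch: the `disk_stats` structure and its field names are hypothetical, `disk_busy()` is an assumed counterpart to `disk_unbusy()`, and the timestamp is passed in as microseconds to keep the example self-contained (a real implementation would read the clock itself). The byte-count argument follows the proposal quoted above.

```c
#include <stdint.h>

/* Hypothetical per-disk statistics; field names are made up here. */
struct disk_stats {
	int      busy;		/* number of outstanding transfers */
	uint64_t start_us;	/* start of the current busy slice */
	uint64_t total_us;	/* accumulated busy time */
	uint64_t bytes;		/* total bytes transferred */
};

static void
disk_busy(struct disk_stats *ds, uint64_t now_us)
{
	/* Only the idle -> busy transition starts a new slice. */
	if (ds->busy++ == 0)
		ds->start_us = now_us;
}

static void
disk_unbusy(struct disk_stats *ds, uint64_t now_us, uint64_t bcount)
{
	ds->busy--;
	/* Charge the time since the last slice boundary ... */
	ds->total_us += now_us - ds->start_us;
	/* ... and move the boundary, so overlaps are never counted twice. */
	ds->start_us = now_us;
	ds->bytes += bcount;
}
```

Because every unbusy closes off a slice, the total keeps advancing even if the busy count never gets back to zero, and idle time is skipped because only the 0 -> 1 transition resets the start time.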


> Anyhow, I'd like to hear any comments or suggestions, particularly, what 
> the most useful "size unit per time unit" combination would be.

"why bother"?  You're storing time, in some unit, and you're storing
some unit of "data transferred."

In my opinion, the only thing calculating averages should be user-land
programs...
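
As a sketch of what the user-land side could do with nothing but those two raw counters: take two snapshots and divide the differences. The `disk_sample` structure and `bytes_per_busy_sec()` are hypothetical names, and microseconds for the time unit is just an assumption for the example.

```c
#include <stdint.h>

/* Hypothetical snapshot of the two counters the kernel would export. */
struct disk_sample {
	uint64_t busy_us;	/* accumulated busy time, in microseconds */
	uint64_t bytes;		/* accumulated bytes transferred */
};

/*
 * Average transfer rate, in bytes per second of busy time, between
 * two snapshots.  Returns 0 if the disk was not busy in between.
 */
static double
bytes_per_busy_sec(const struct disk_sample *prev,
    const struct disk_sample *cur)
{
	uint64_t dt = cur->busy_us - prev->busy_us;
	uint64_t db = cur->bytes - prev->bytes;

	if (dt == 0)
		return 0.0;
	return (double)db * 1e6 / (double)dt;
}
```

The program picks its own sampling interval and its own units for display, so the kernel never has to settle on a "size unit per time unit".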



chris