Subject: delay() - some design notes
To: None <mcr@latour.sandelman.ocunix.on.ca>
From: Gordon W. Ross <gwr@mc.com>
List: port-sun3
Date: 12/05/1995 10:48:24
> Date: Mon, 04 Dec 1995 23:42:12 -0500
> From: Michael Richardson <mcr@latour.sandelman.ocunix.on.ca>
> 
>   The comments in the ncr5380 code say that we get screwed because
> delay takes a lot longer than it should.

See other mail about how this can easily be fixed.

>   Looking at the code, I can sort of see why, but not entirely.
>   sun3_startup.c:sun3_verify_hardware() sets cpuspeed=20. Right.
> 
>   At 20 MHz, a CPU cycle takes 50 ns. Working things through, I see
> that things work out properly. (I wrote three paragraphs and then
> noticed that the subql is taking 8 off each time.)
> 
>   If the overhead is 80 cycles (= 10 loops @ 400 ns = 4 us), doesn't a
> great deal of that overhead come from the multiplication? Alas, I don't
> think my good old 68000 timing charts will help me here.
>   The comment indicates that the minimum delay is about 5 us. By the
> time you do the multiplication, manipulate the stack, I can see why.
> How important is this to the scsi code?
> 
>   Instead of multiplying, why not put one loop in another? Hmm. Let's see.
>   e.g.: (please excuse my rusty Motorola 68k syntax)

The reason for the multiplication is so the loop count can be
computed as a rational multiple of the CPU speed, i.e.:

	loop_count = usecs * speed_factor

where speed_factor is a rational number.  The divisor part
of that rational number is arbitrary (I chose 8) but must
be large enough to allow reasonable accuracy when the above
formula is computed using integer math, with the factor split
into its numerator and denominator:

	loop_count = (usecs * speed_factor_numerator) /
			speed_factor_denominator
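
As a rough illustration of the formula above (a sketch only, not
the actual kernel code), here is the same computation in C.  The
function and parameter names are made up for this example, and
the values 20 and 8 assume a 20 MHz CPU with an 8-cycle loop.

```c
/*
 * Sketch only: loop count as a rational multiple of the delay.
 * speed_num and speed_den are hypothetical names; for a 20 MHz
 * CPU and an 8-cycle loop they would be 20 and 8.
 */
unsigned long
delay_loop_count(unsigned long usecs, unsigned long speed_num,
    unsigned long speed_den)
{
	/* Multiply first so the fraction is not truncated early. */
	return (usecs * speed_num) / speed_den;
}

/*
 * For comparison: reducing the factor to an integer before
 * multiplying throws away the fractional part (20/8 becomes 2).
 */
unsigned long
delay_loop_count_truncated(unsigned long usecs,
    unsigned long speed_num, unsigned long speed_den)
{
	return usecs * (speed_num / speed_den);
}
```

With usecs = 10 and a factor of 20/8, the first form gives 25
loops (25 * 400 ns = 10 us), while the truncated form gives only
20 loops, which would be 20% short.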

I didn't see any way to have both accurate longer delays and
low enough overhead to have accurate delays < 5 us.  That's
why I invented delay2us(), which seems good enough for most
places that need a really short delay (zs and ncr drivers).
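
To put numbers on the overhead problem, here is a simplified
model in C (not the real delay() code).  It assumes the figures
from the quoted message: a fixed cost of 80 cycles and 8 cycles
per loop iteration at 20 MHz.  The function name and the exact
rounding are assumptions made for this sketch.

```c
/*
 * Simplified model, not the real delay(): estimate the time
 * actually spent, in ns, for a requested delay in microseconds.
 * Assumes a fixed overhead in cycles plus a loop that burns
 * cycles_per_loop cycles per iteration.
 */
long
modeled_delay_ns(long usecs, long cpuspeed_mhz,
    long overhead_cycles, long cycles_per_loop)
{
	/* Cycles left for the loop after paying the overhead. */
	long budget = usecs * cpuspeed_mhz - overhead_cycles;
	long loops = 0;

	/* Round up to whole iterations when any budget remains. */
	if (budget > 0)
		loops = (budget + cycles_per_loop - 1) / cycles_per_loop;

	/* One cycle takes 1000 / cpuspeed_mhz nanoseconds. */
	return (overhead_cycles + loops * cycles_per_loop) * 1000 /
	    cpuspeed_mhz;
}
```

Under this model a 10 us request comes out at 10000 ns, but a
1 us request still costs 4000 ns, because the overhead alone
already exceeds the requested time.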

Gordon