Subject: Re: interrupting parallel port lossage
To: None <bde@zeta.org.au>
From: Terry Moore <tmm@databook.com>
List: port-i386
Date: 03/23/1995 13:00:45
> >			    A  B  C  D
> >			     ________                    _ _ _ _ _
> >	DATA    ------------<________>------S S---------<_ _ _ _ _
> >		_______________    _________   ____________
> >	*STROBE                \__/         S S            \_
> >				  __________   _________
> >	BUSY    _________________/          S S         \________
> >		____________________________   ____         ____
> >	*ACK                                S S    \_______/
> >						   E    F   G
> 
> >A->B == DATA out to *STROBE low time = 0.5 us min.
> >B->C == *STROBE low to *STROBE high time = 0.5 us min.
> >C->D == *STROBE high to DATA tri-state == 0.5 us min.
> 
> Do you think the software needs to delay for C->D?  I don't think so.

Databook makes parallel port i/f devices.  Some parallel ports are
really slow!  If there's lots of FCC filtering, C->D (as programmed by
the software) may need to be lots longer in order for C->D (as seen
by the printer) to be long enough.  

The rise and fall times are not equal.  The strobe fall time is usually
pretty fast.  The rise time is often very slow.  The strobe signal
is an open collector output (in the canonical printer i/f, as published
in the original PC Tech Ref); this means that it actively drives the wire
low, but depends on a pull-up resistor to pull the line high.  The
pull-up resistor is typically 4.7K.  One has an uncontrolled
amount of capacitance on the printer port cable.  It is not unusual to
see rise times from 0 to 1.5V (the typical TTL positive-going threshold)
of 0.5 us.  This means that software must allow at least a microsecond
from C to D: 0.5us for the rise time of strobe at the printer, and 0.5
us of hold time for the spec.  Note that 4.7K * 100pF gives a time
constant of about 470ns.  A time constant of 470ns means the signal
will move from 0->63% in 470ns (with an RC pull-up).  We can model everything
from 0->63% as a ramp; 63% of a 5V rail is about 3.15V, so reaching 1.5V
takes about 1.5/3.15 of 470ns.  With only 100pF of external capacitance,
we'll therefore need about 220ns just for RC delay time.  Furthermore,
100pF is for a pretty good interface and cable combination.
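
The arithmetic above can be sketched in C.  This is only an illustration
of the ramp approximation: the function names are made up for this note,
and the 5V supply is an assumption (the usual TTL rail), not a number
from a data sheet.

```c
#include <assert.h>

/* Time constant of an RC pull-up, in seconds. */
double rc_time_constant(double r_ohms, double c_farads)
{
    assert(r_ohms > 0.0 && c_farads > 0.0);
    return r_ohms * c_farads;
}

/*
 * Ramp approximation: treat the first time constant (0 -> 63% of the
 * supply) as a straight line and ask when that line crosses the
 * threshold.  v_supply is an assumed value, see the note above.
 */
double ramp_rise_time(double tau, double v_supply, double v_threshold)
{
    double v_at_tau = 0.63 * v_supply;

    /* The straight-line model only covers the first time constant. */
    assert(v_threshold > 0.0 && v_threshold <= v_at_tau);
    return tau * (v_threshold / v_at_tau);
}
```

With 4.7K, 100pF, a 5V supply and a 1.5V threshold this lands in the
neighbourhood of the 220ns quoted above.  More cable capacitance scales
the result linearly.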

On most ports, the data bus delays are also asymmetrical.  TTL outputs
drive "high" a lot less hard than they drive "low".  The original PC
used an LS374, which can only source 2.6mA driving high (when Vout = 3V).
TTL drivers, when shorted, saturate; they act like constant current sources.
The original PC used 2.2nF caps on each data line to ground, to slow down
the edge rates.  The rise time to 1.5V using dv = i/c dt is (worst
case) 1.2 us.  Of course, the saturation current is *usually* greater,
so *usually* you don't need to wait this long.
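
The constant-current estimate can be written out the same way.  This is
a rough sketch, not a model of a real LS374; the function name is
invented for this note, and the inputs are the figures from the
paragraph above (2.2nF, 1.5V, 2.6mA).

```c
#include <assert.h>

/*
 * A constant-current source charging a capacitor obeys
 * dv = (i / c) dt, so the time to move dv volts is dt = c * dv / i.
 * Returns seconds.
 */
double constant_current_rise_time(double c_farads, double dv_volts,
                                  double i_amps)
{
    assert(c_farads > 0.0 && i_amps > 0.0);
    return c_farads * dv_volts / i_amps;
}
```

Plugging in 2.2e-9, 1.5 and 2.6e-3 gives a bit over a microsecond,
in line with the worst-case figure above.  A driver that sources more
current shortens the time proportionally.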

Worse:  OEM printer ports frequently use even more capacitance.  Although
these ports may have CMOS output drivers (which makes the edges faster, 
because the PMOS channel is not a constant current source and can move
lots of current at the beginning of the edge), FCC considerations usually
lead them to really slow things down by making the caps bigger.

This time must be added to A->B, because the spec applies at the printer.

To give you an idea of how this works on the PC/XT:

	1)  Software outputs the data byte.
	2)  Software waits for BUSY to be low.
	3)  Software drives strobe low.
	4)  Software drives strobe high.
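
As a sketch of that ordering only (this is not the BIOS code): the
fake_port structure and every name below are hypothetical, and the
"port" just records the order of accesses so the sequence can be
checked without hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event log standing in for real port accesses. */
enum lp_event { EV_DATA, EV_POLL_BUSY, EV_STROBE_LOW, EV_STROBE_HIGH };

struct fake_port {
    enum lp_event log[16];
    size_t nlog;
    int busy_polls_left;    /* BUSY reads as set this many times */
    unsigned char data;     /* last byte written to the data lines */
};

static void record(struct fake_port *p, enum lp_event ev)
{
    assert(p->nlog < sizeof p->log / sizeof p->log[0]);
    p->log[p->nlog++] = ev;
}

static int port_busy(struct fake_port *p)
{
    record(p, EV_POLL_BUSY);
    if (p->busy_polls_left > 0) {
        p->busy_polls_left--;
        return 1;
    }
    return 0;
}

void xt_style_putc(struct fake_port *p, unsigned char c)
{
    p->data = c;                    /* 1) output the data byte */
    record(p, EV_DATA);
    while (port_busy(p))            /* 2) wait for BUSY to be low */
        ;                           /*    (a real driver would bound this) */
    record(p, EV_STROBE_LOW);       /* 3) drive strobe low */
    record(p, EV_STROBE_HIGH);      /* 4) drive strobe high */
}
```

The point is that the data byte goes out before the BUSY poll, so all
of the polling time counts as data setup time ahead of the strobe.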

I happen to have an XT tech ref and an AT tech ref, which include
BIOS listings.

I counted clock cycles.  Just to fetch the instructions for the path
from 1 to 3 on a 4.7MHz PC/XT takes 56 clock cycles (14 bytes of code
times 4 clocks per byte); the code starts right after a jump, so the
prefetch queue is not doing much for us.  That's 12 microseconds of
setup time from data valid to strobe, not counting execution clock
cycles.  This timing is, unfortunately, the canonical timing for
BIOSes; anybody writing a clean room spec is gonna say "make sure the
setup time is at least 10 microseconds".  I've checked the AT BIOS
listing, too; sure enough, the delay is at least as long (lots more 
instructions in the path).  Curiously, they disable interrupts during
the strobe-low time (they want to guarantee 1us <= strobe <= 5us).
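
For anyone who wants to redo the cycle count, here is the calculation
as a small C function.  The name is made up for this note; the clock
rate passed in below is the XT's 4.77MHz, which the text above rounds
to 4.7MHz.

```c
#include <assert.h>

/*
 * Time spent just fetching code_bytes of instructions, in
 * microseconds, given the bus clocks needed per byte fetched.
 * Execution clocks are deliberately not counted.
 */
double fetch_time_us(unsigned code_bytes, unsigned clocks_per_byte,
                     double cpu_hz)
{
    assert(cpu_hz > 0.0);
    return (double)(code_bytes * clocks_per_byte) / cpu_hz * 1e6;
}
```

fetch_time_us(14, 4, 4.77e6) comes out a little under 12 microseconds,
which is where the setup-time figure above comes from.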

Summary: to work with *any* printer port, you need a long data delay.
It may be that you can get this by changing the invariant and the
order in which things are done in the printer driver.  You can ALWAYS
pre-load the next character at the start of lptintr();  so the
changes might want to be as shown below.

Note that I'm starting from lpt.c as patched by the NetBSD 1.0 distribution
plus patches from the InfoMagic BSDisc.  I can also supply the diffs
from lpt.c.orig if needed, but I reckon most people have the 11/23
patch from Chris Demetriou.

--- lpt.c	Sat Feb  4 04:04:31 1995
+++ lpt.c.new	Thu Mar 23 13:06:15 1995
@@ -415,6 +415,9 @@
 		u_char control = sc->sc_control;
 
 		while (sc->sc_count > 0) {
+			/* preload */
+			outb(iobase + lpt_data, *sc->sc_cp);
+			
 			spin = 0;
 			while (NOT_READY() && spin++ < sc->sc_spinmax);
 			if (spin >= sc->sc_spinmax) {
@@ -433,8 +436,8 @@
 				}
 			}
 
-			outb(iobase + lpt_data, *sc->sc_cp++);
 			outb(iobase + lpt_control, control | LPC_STROBE);
+			sc->sc_cp++;
 			sc->sc_count--;
 			outb(iobase + lpt_control, control);
 
@@ -507,14 +510,17 @@
 #endif
 
 	/* is printer online and ready for output */
+	if (sc->sc_count)
+		outb(iobase + lpt_data, *sc->sc_cp);
+
 	if (NOT_READY())
 		return 0;
 
 	if (sc->sc_count) {
 		u_char control = sc->sc_control;
 		/* send char */
-		outb(iobase + lpt_data, *sc->sc_cp++);
 		outb(iobase + lpt_control, control | LPC_STROBE);
+		sc->sc_cp++;
 		sc->sc_count--;
 		outb(iobase + lpt_control, control);
 		sc->sc_state |= LPT_OBUSY;
------------

Note that the theory is that incrementing cp commits us to transmitting the
character, and so nothing different should happen even if we're aborted
at an awkward time. I've changed both the polled and the non-polled code.
I don't have the problem here, but maybe somebody else might like to test
this?
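
To make the invariant concrete, here is a toy model of the reordered
step.  It is not the driver: struct fake_lpt, its fields, and
fake_lpt_step() are all hypothetical stand-ins (not_ready plays the
part of NOT_READY(), and the printed[] array plays the printer).

```c
#include <assert.h>
#include <stddef.h>

struct fake_lpt {
    const unsigned char *cp;    /* next byte to send */
    size_t count;               /* bytes remaining */
    unsigned char data_latch;   /* what the data lines currently hold */
    int not_ready;              /* stand-in for NOT_READY() */
    unsigned char printed[8];   /* bytes the "printer" has latched */
    size_t nprinted;
};

/* Returns 1 if a byte was strobed out, 0 otherwise. */
int fake_lpt_step(struct fake_lpt *sc)
{
    if (sc->count)
        sc->data_latch = *sc->cp;   /* preload; cp is NOT advanced yet */

    if (sc->not_ready || sc->count == 0)
        return 0;                   /* bail out: nothing is committed */

    /* Strobe: the printer latches whatever is on the data lines. */
    assert(sc->nprinted < sizeof sc->printed);
    sc->printed[sc->nprinted++] = sc->data_latch;

    sc->cp++;                       /* only now commit to the byte */
    sc->count--;
    return 1;
}
```

If the step bails out because the printer is not ready, cp and count
are untouched, so the next call simply preloads the same byte again and
nothing is lost or duplicated.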

--Terry Moore
Databook Inc.