Subject: Re: com rumblings...
To: Charles M. Hannum <mycroft@MIT.EDU>
From: Garrett D'Amore <garrett_damore@tadpole.com>
List: tech-kern
Date: 06/15/2006 14:38:24
Charles M. Hannum wrote:
> On Thu, Jun 15, 2006 at 01:17:06PM -0700, Garrett D'Amore wrote:
>   
>> Wouldn't those 4- or 6- (or more!) port cards generally have working
>> FIFOs on them?
>>     
>
> Sure, but that only helps so much.  At 115200, you're still talking
> about 1000 interrupts/sec (you can't just divide by 16) per port, per
> direction.  And that's assuming no control signals are being toggled.
> On a 386, this added up very quickly.  That's why the "hard" interrupt
> routine had the absolute highest priority, even outside splhigh().  It
> worked, at the time, before a lot of crap was added in the path.  Also,
> even in the mid-90s, people were starting to "overclock" com-like parts
> to 230400 and 460800 bps.
>   
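
For a rough sense of the rates quoted above, here is a small sketch of
the arithmetic.  It assumes 8N1 framing (10 bits on the wire per
character) and that the receive interrupt fires at the FIFO trigger
level rather than at the full 16-byte depth.  A 16550-style part offers
receive trigger levels of 1, 4, 8, or 14 bytes, which is one reason
dividing by 16 is too optimistic.  The function names are made up for
illustration.

```c
#include <assert.h>

/*
 * Characters per second on an async serial line.  Each character costs
 * start + data + stop bits; 10 bits for 8N1 is an assumption here.
 */
static unsigned chars_per_sec(unsigned bps, unsigned bits_per_char)
{
    assert(bits_per_char > 0);
    return bps / bits_per_char;
}

/*
 * Receive interrupts per second if the UART raises one interrupt each
 * time `trigger` bytes have accumulated in its FIFO (rounded up).
 * Ignores timeout interrupts and modem-status changes.
 */
static unsigned rx_irqs_per_sec(unsigned bps, unsigned bits_per_char,
                                unsigned trigger)
{
    unsigned cps;

    assert(trigger > 0);
    cps = chars_per_sec(bps, bits_per_char);
    return (cps + trigger - 1) / trigger;
}
```

With these assumptions, 115200 bps is 11520 characters/sec, so
rx_irqs_per_sec(115200, 10, 14) gives 823 and
rx_irqs_per_sec(115200, 10, 8) gives 1440, per port and per direction.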

Yes, I know about some of that.  I've been working on a Solaris driver
for 16950s that uses a 128-byte FIFO to help out somewhat.  I've also
seen other chips that use DMA to minimize the time spent doing PIO.
(Alchemy Au15xx CPUs can use DMA to get to 4Mbps.)

Anyway, I don't think anything I've done fundamentally changes the
paths, other than possibly requiring an extra indirect memory
reference.  In theory, a good optimizer could even remove that, though
we might need to add some hints that the bus handles and tags do not
change and so can be safely cached.  (Though, again, a good optimizer
should be able to figure this out, since the routines at issue are leaf
routines.)

I'm not sure how good gcc's optimizer is at detecting that:

    struct {
       bus_space_handle_t h;
       bus_space_tag_t   t;
    } *s;   /* pointer, since it is dereferenced with -> below */

    bus_space_write_1(s->t, s->h, OFFSET1, VAL1);
    /* possibly insert code that doesn't modify s */
    bus_space_write_1(s->t, s->h, OFFSET2, VAL2);

the values of s->t and s->h are the same for both calls and could be
cached in registers without having to dereference through s for each
bus_space_write_1 call.

Whether the bus_space_xxx routines are macros or functions probably
also plays a role in this.
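
If the compiler cannot prove that nothing in between modifies *s (for
example, because the write goes through an out-of-line function or a
volatile store it must treat as possibly aliasing), the tag and handle
can be copied into locals by hand.  The following is only a sketch: the
types and mock_bus_space_write_1() are stand-ins invented here so the
example compiles on its own, not the real bus_space(9) definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real bus_space types. */
typedef uint8_t *bus_space_handle_t;
typedef int bus_space_tag_t;

struct regs {
    bus_space_handle_t h;
    bus_space_tag_t t;
};

/* Mock of bus_space_write_1(): stores one byte at handle + offset. */
static void mock_bus_space_write_1(bus_space_tag_t t, bus_space_handle_t h,
                                   size_t off, uint8_t val)
{
    (void)t;                              /* tag unused in this mock */
    ((volatile uint8_t *)h)[off] = val;
}

/* Write two registers, loading the tag and handle from *s only once. */
static void write_pair(const struct regs *s, size_t off1, uint8_t v1,
                       size_t off2, uint8_t v2)
{
    assert(s != NULL);

    /* Locals never have their address taken, so they can stay in
     * registers across both calls. */
    const bus_space_tag_t t = s->t;
    const bus_space_handle_t h = s->h;

    mock_bus_space_write_1(t, h, off1, v1);
    mock_bus_space_write_1(t, h, off2, v2);
}
```

Whether this makes any measurable difference depends on the compiler and
on how the port implements the bus_space routines.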

    -- Garrett
> Most of the architecture of the driver is based around just getting the
> bytes off the chip as fast as possible.  If you ever get overloaded
> with interrupts, you can end up (even with a FIFO) in a mode where you
> never get more than 16 contiguous bytes -- and at that point pretty
> much any protocol (PPP, SLIP, Zmodem, Kermit, whatever) running over
> the link is just dead in the water.  So the "hard" interrupt was kept
> absolutely minimal (and I came up with some semi-clever things, like
> testing all the control bits with a single test), and I added a whole
> flow control mechanism around the software FIFO to allow selectively
> turning off the interrupt and ensure that we got larger chunks of good
> data.  (AFAICT, I was the first person to specifically identify this
> livelock condition in serial drivers and design around it.  At the time
> we had better serial performance than FreeBSD or the classic SCO "FAS"
> driver.  Unfortunately nobody pays attention to these things any more.)
>   
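
As an illustration of the kind of mechanism described in the quoted
text, here is a minimal sketch of a software receive ring that masks the
receive interrupt at a high-water mark and unmasks it once the consumer
has drained down to a low-water mark.  The watermark scheme, the sizes,
and all of the names are assumptions made for this example; it is not
taken from the com driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 64    /* hypothetical ring size */
#define HIWAT     48    /* mask the RX interrupt at this fill level */
#define LOWAT     16    /* unmask it once drained to this level */

struct rx_ring {
    uint8_t buf[RING_SIZE];
    size_t head, tail, count;
    bool rx_irq_enabled;   /* stand-in for the UART's RX interrupt enable */
};

static void rx_ring_init(struct rx_ring *r)
{
    r->head = r->tail = r->count = 0;
    r->rx_irq_enabled = true;
}

/* "Hard" interrupt side: store one byte; returns false if it was dropped. */
static bool rx_ring_put(struct rx_ring *r, uint8_t c)
{
    if (r->count == RING_SIZE)
        return false;
    r->buf[r->head] = c;
    r->head = (r->head + 1) % RING_SIZE;
    if (++r->count >= HIWAT)
        r->rx_irq_enabled = false;   /* stop taking RX interrupts */
    return true;
}

/* "Soft" side: copy out up to n bytes; returns how many were copied. */
static size_t rx_ring_drain(struct rx_ring *r, uint8_t *out, size_t n)
{
    size_t got = 0;

    while (got < n && r->count > 0) {
        out[got++] = r->buf[r->tail];
        r->tail = (r->tail + 1) % RING_SIZE;
        r->count--;
    }
    if (r->count <= LOWAT)
        r->rx_irq_enabled = true;    /* safe to take RX interrupts again */
    return got;
}
```

The gap between the two marks is what lets the consumer pull data in
larger contiguous chunks instead of one FIFO-load at a time.  A real
driver would also need to synchronize the two sides and would typically
tie this to hardware flow control, which the sketch leaves out.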


-- 
Garrett D'Amore, Principal Software Engineer
Tadpole Computer / Computing Technologies Division,
General Dynamics C4 Systems
http://www.tadpolecomputer.com/
Phone: 951 325-2134  Fax: 951 325-2191