Subject: Re: HEADS UP: gdamore-uart branch
To: der Mouse <mouse@Rodents.Montreal.QC.CA>
From: Garrett D'Amore <garrett_damore@tadpole.com>
List: tech-kern
Date: 07/10/2006 15:29:52
der Mouse wrote:
>>> [testing gdamore-uart stuff]
>>>       
>> What you are testing is serial performance.
>>     
>
> Inbound?  Outbound?  Both?  Any particular performance metrics?  Max
> speed sustainable?  Interrupt CPU percentage for a given speed?  Error
> rate for a given speed?
>   

Both, and any metrics you're willing to test would be helpful.

> AIUI the code change in question would basically slow down all driver
> access to the chip, so any of the above would be worth checking - is
> this at least approximately correct?
>   

Correct.  Basically, it turns default code from

    bus_space_handle_t  h = sc->sc_ioh;
    bus_space_tag_t     t = sc->sc_iot;
    bus_space_read_1(t, h, addr);

into

    bus_space_read_1(sc->sc_iot, sc->sc_ioh, addr);

When there are several bus_space_read_1 (or write_1) calls in a row, this
means that the tag and handle may have to be fetched from the softc each
time (i.e. multiple memory dereferences) instead of being read once into
a cached local variable.
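
To make the difference concrete, here is a rough userland sketch of the two
styles.  The mock_* types and functions are hypothetical stand-ins, not the
real bus_space(9) interface; they exist only so that the example compiles
outside the kernel.  Whether the second form really costs extra loads depends
on the compiler and on how the access function is implemented on a given port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the bus_space(9) tag and handle types. */
typedef uint8_t *mock_tag_t;     /* base of a fake register window */
typedef size_t   mock_handle_t;  /* offset of the device within it */

struct mock_softc {
	mock_tag_t	sc_iot;
	mock_handle_t	sc_ioh;
};

/* Mock of a one-byte register read; volatile so every read is performed. */
static uint8_t
mock_read_1(mock_tag_t t, mock_handle_t h, size_t addr)
{
	assert(t != NULL);
	return *(volatile uint8_t *)(t + h + addr);
}

/* Old style: tag and handle are loaded once into locals. */
static unsigned
sum_cached(struct mock_softc *sc, size_t n)
{
	mock_tag_t t = sc->sc_iot;
	mock_handle_t h = sc->sc_ioh;
	unsigned sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += mock_read_1(t, h, i);
	return sum;
}

/*
 * New style: the softc fields are named at every call, so the compiler
 * may reload them each time if it cannot prove they are unchanged.
 */
static unsigned
sum_direct(struct mock_softc *sc, size_t n)
{
	unsigned sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += mock_read_1(sc->sc_iot, sc->sc_ioh, i);
	return sum;
}
```

Both functions return the same result; the only thing in question is how
many loads from the softc the generated code ends up doing.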

>   
>> If you have a way to measure that directly, it would be optimal.
>>     
>
> Well, if there's no output buffer to speak of, I could just pump data
> out (to nothing) and crank up the baud rate until interrupt CPU time
> hits 100%.
>   

That sounds like it might be interesting.  I hadn't thought of that.

>   
>> If you'd like, I can post a kernel on ftp.netbsd.org.
>>     
>
> That would be very helpful - but don't bother until I've verified that
> the board I have works.  (I'm in a time crunch right now between work
> running a bit late and a friend dropping by, so I can't check now.)
>   

Okay.  Let me know.  Worst case is that I'll go ahead and commit, and we
later find that it makes enough of a performance difference that I have to
take further corrective action.  (Charles has suggested one way to help
the optimizer out, which I might try.  But I'm not 100% certain it will
help, because it may involve converting several of these accesses into a
single function call.)

Does anyone know if it makes sense to mark a function both pure and
inline in gcc?
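
To be concrete about the question, this is the kind of declaration I mean.
It is only a sketch: demo_softc, demo_handle and demo_twice are made-up names
for illustration, not code from the branch.  gcc accepts the combination;
what I don't know is whether the attribute buys anything once the body is
already visible for inlining.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical softc, for illustration only. */
struct demo_softc {
	int	sc_ioh;
};

/*
 * pure: the result depends only on the arguments and on memory the
 * function reads; it has no side effects.
 */
static inline __attribute__((pure)) int
demo_handle(const struct demo_softc *sc)
{
	return sc->sc_ioh;
}

/* Two calls with no store in between: candidates for merging into one. */
static int
demo_twice(const struct demo_softc *sc)
{
	assert(sc != NULL);
	return demo_handle(sc) + demo_handle(sc);
}
```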

    -- Garrett
> /~\ The ASCII				der Mouse
> \ / Ribbon Campaign
>  X  Against HTML	       mouse@rodents.montreal.qc.ca
> / \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
>   


-- 
Garrett D'Amore, Principal Software Engineer
Tadpole Computer / Computing Technologies Division,
General Dynamics C4 Systems
http://www.tadpolecomputer.com/
Phone: 951 325-2134  Fax: 951 325-2191