port-sparc: Re: Rehash: XFree86 Compiled on NetBSD/Sparc

Subject: Re: Rehash: XFree86 Compiled on NetBSD/Sparc
To: NetBSD/sparc Discussion List <port-sparc@netbsd.org>
From: Greg A. Woods <woods@weird.com>
List: port-sparc
Date: 08/13/2002 19:40:50
[ On Tuesday, August 13, 2002 at 23:49:58 (+0200), der Mouse wrote: ]
> Subject: Re: Rehash: XFree86 Compiled on NetBSD/Sparc
>
> Indeed?  So what's the kernel-provided API to use the line-drawing
> hardware on a cg6?  Or the blitter?

Damned if I know.  "Not my department!"  :-)

(though I'm sure I could design a good one given the hardware
documentation and some information about how an Xserver might want to
interact with such hardware)

> When it comes to talking to the framebuffer, I disagree with you.  I do
> not believe it would be acceptable to pay a syscall price for every
> framebuffer operation.

I don't know which side of the board you're looking at here, but I
didn't say anything about requiring syscalls for all operations...  :-)

With a proper device driver interface there's no reason why the
high-volume I/O can't be done through mapped addresses, or why DMA
can't happen to/from that memory directly, without needing syscalls for
every operation.  The point of using a proper device driver instead of
just opening all/much/most of the hardware and memory up to the Xserver
is to provide proper control over what process(es) get access to the
device registers and whatever memory space is used for DMA operations.
It's ludicrous and unnecessary to have to risk _everything_ to the
Xserver, even in a relatively isolated environment on a well-protected
workstation.
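
To make the mapped-access idea concrete, here is a minimal user-side sketch
in C.  It is an illustration under stated assumptions, not any real NetBSD
driver interface: the names fb_map(), fb_fill() and fake_fb_open() are
hypothetical, and an unlinked temporary file stands in for the device node
so the example can run anywhere with POSIX mmap().  The point it shows is
that the kernel gets its say once, at open()/mmap() time, and every pixel
store after that is an ordinary memory write with no syscall.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `len` bytes of an already-opened descriptor.  With a real device
 * the driver's mmap entry point would decide here which pages (if any)
 * this process may see; nothing else is exposed.
 */
static uint8_t *fb_map(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Fill a w-by-h rectangle at (x, y) with plain stores: no syscalls. */
static void fb_fill(uint8_t *fb, size_t stride,
                    size_t x, size_t y, size_t w, size_t h, uint8_t colour)
{
    for (size_t row = y; row < y + h; row++)
        memset(fb + row * stride + x, colour, w);
}

/*
 * Hypothetical stand-in for opening the device: an unlinked, zero-filled
 * temporary file of `len` bytes.  Used only so the sketch is runnable.
 */
static int fake_fb_open(size_t len)
{
    char path[] = "/tmp/fakefbXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would do fake_fb_open(), then fb_map() once, then as many
fb_fill() calls as it likes, and finally munmap() and close().  Only the
setup and teardown cross into the kernel.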

It's all the better of course if some abstraction can be done to make
all similar graphics chips use the same device driver API so that one
Xserver implementation can drive as many different underlying bits of
hardware as possible.  I'm sure many modern graphics chipsets are
similar enough that at least some abstraction can be done with
absolutely no loss of performance, and perhaps even some operations can
be batched up and done all within the kernel, avoiding even some context
switches.

With the right API between the graphics engine and the application an
awful lot can be done without a whole lot of data moving around.  I
implemented drivers for an ISA-based Matrox card quite a long while
ago, and that card used only four 8-bit I/O registers for
everything.  However, it could be set up to do some pretty amazing things
and to do them at what were, at the time, some pretty amazing speeds.  If
I remember correctly, the major innovation in my new driver was taking an
entire arbitrarily sized buffer of operations from a write() call by the
application and then spitting them out to the card through the 4-byte
register window as fast as it could take them, and then doing the
opposite for reading stuff back from the card (i.e. buffering in the
driver as much data as was available from the card's reply to the recent
operations).  IIRC the original Matrox-supplied driver only allowed the
application to read or write four bytes at a time.  Interestingly I
didn't have to learn anything about computer graphics in order to
implement that much more efficient driver.  It was merely a matter of
using a more efficient communications protocol, and as it happened I was
the comms protocol expert on that project!  ;-)
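
The buffering scheme described above can be sketched roughly as follows.
This is a user-space simulation in C, not the original driver code: struct
fake_card, card_push(), drv_write() and calls_needed_unbuffered() are all
hypothetical names, and the "card" merely logs the bytes it is fed.  It
shows one driver entry draining a whole buffer through a 4-byte window,
versus the one-entry-per-4-bytes pattern of the older interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WINDOW 4    /* four 8-bit registers, as described in the message */

/* Hypothetical stand-in for the card: it records whatever it is fed. */
struct fake_card {
    uint8_t regs[WINDOW];   /* the register window itself */
    uint8_t seen[256];      /* log of every byte pushed so far */
    size_t  nseen;          /* bytes logged */
    size_t  bursts;         /* number of window loads */
};

/* Load up to WINDOW bytes into the registers and log them. */
static void card_push(struct fake_card *c, const uint8_t *p, size_t n)
{
    memcpy(c->regs, p, n);
    memcpy(c->seen + c->nseen, p, n);
    c->nseen += n;
    c->bursts++;
}

/*
 * One write()-style entry into the driver: take the caller's whole
 * buffer and feed it to the card a window at a time.  Returns the
 * number of bytes accepted (short if the simulated card's log fills).
 */
static size_t drv_write(struct fake_card *c, const void *buf, size_t len)
{
    const uint8_t *p = buf;
    size_t done = 0;

    while (done < len) {
        size_t n = len - done < WINDOW ? len - done : WINDOW;
        if (c->nseen + n > sizeof c->seen)
            break;
        card_push(c, p + done, n);
        done += n;
    }
    return done;
}

/* Driver entries needed if each call may carry only WINDOW bytes. */
static size_t calls_needed_unbuffered(size_t len)
{
    return (len + WINDOW - 1) / WINDOW;
}
```

For a 10-byte buffer the card still sees three window loads either way,
but the buffered version gets there with a single drv_write() call, where
the 4-bytes-at-a-time interface would need three trips into the driver.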

-- 
								Greg A. Woods

+1 416 218-0098;            <g.a.woods@ieee.org>;           <woods@robohack.ca>
Planix, Inc. <woods@planix.com>; VE3TCP; Secrets of the Weird <woods@weird.com>