Subject: Re: More on rcons performance woes: pmax profiling
To: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-kern
Date: 03/02/1999 15:41:51
In message <199903021615.QAA05844@orchard.arlington.ma.us>, Bill Sommerfeld writes:

>Note that, in particular, xterm does jump scroll by default (the
>option is selectable in the menu popped up by control-middle), which
>is likely one of the reasons why it's faster than rcons, and makes
>simple xterm-vs-rcons benchmarks .. suspect.

Sure. 

That's why I also posted results from Charles' `output' test, which
paints 80*23+79 chars, then has cursor-positioning sequences to home
the cursor, such that it never scrolls.

That *is* a fair apples-to-apples comparison, and rcons blitting is
still abysmal.


One issue that R.C. and I have discussed is the interaction between I/O
bus accesses and hw caches: either I/O bus caches or CPU write-buffer
caches.  Multias have a half-duplex merge buffer in front of the PCI
bus; doing single-word, read+modify+write blits means you have to turn
that bridge around for each word blitted.  MIPS CPUs have a
write buffer for all memory accesses, but doing an uncached read forces
a drain of that write buffer.

So (ignoring hw acceleration) both systems should get a significant
win from doing blits in 32-byte or 64-byte chunks, both for X and in
the kernel.
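
To make the access pattern concrete, here is a rough sketch in C.  It is
not code from rcons or from any X server: the function names, the XOR
operation and the 32-byte chunk size are all assumptions picked for
illustration.  The first routine does the single-word read+modify+write
described above; the second reads a whole chunk, modifies it locally,
then writes it back, so the bus changes direction once per chunk instead
of once per word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed chunk size: 8 x 4-byte words = 32 bytes. */
#define CHUNK_WORDS 8

/*
 * Word-at-a-time read-modify-write.  Each iteration does one read
 * followed by one write, so reads and writes alternate on the bus.
 */
void blit_xor_wordwise(volatile uint32_t *fb, uint32_t mask, size_t nwords)
{
    for (size_t i = 0; i < nwords; i++)
        fb[i] = fb[i] ^ mask;
}

/*
 * Chunked variant.  Reads are grouped, then writes are grouped.
 * For simplicity this sketch requires nwords to be a multiple of
 * CHUNK_WORDS; real code would need to handle a ragged tail.
 */
void blit_xor_chunked(volatile uint32_t *fb, uint32_t mask, size_t nwords)
{
    uint32_t tmp[CHUNK_WORDS];

    assert(nwords % CHUNK_WORDS == 0);

    for (size_t base = 0; base < nwords; base += CHUNK_WORDS) {
        for (size_t i = 0; i < CHUNK_WORDS; i++)  /* group of reads */
            tmp[i] = fb[base + i];
        for (size_t i = 0; i < CHUNK_WORDS; i++)  /* modify local copy */
            tmp[i] ^= mask;
        for (size_t i = 0; i < CHUNK_WORDS; i++)  /* group of writes */
            fb[base + i] = tmp[i];
    }
}
```

Both routines leave the same bits in the buffer; only the ordering of
the reads and writes differs.  Whether the grouped ordering actually
turns into bursts depends on the bridge and the CPU's write buffer, so
it would have to be measured on the hardware in question.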

I have no idea whether similar considerations apply to, say, sun4c
framebuffers. Anyone know?