Subject: Re: linear and non-cacheable mapping vs bwx
To: None <M.Drochner@fz-juelich.de>
From: Chris G. Demetriou <cgd@netbsd.org>
List: port-alpha
Date: 09/06/1999 01:01:09
Matthias Drochner <drochner@zel459.zel.kfa-juelich.de> writes:
> cgd@netbsd.org said:
> > ISTR that the bit's named "cacheable" because caching is actually an
> > option on the PC (and on device memory for (some, at least) TC
> > alphas). 
> 
> The BUS_SPACE_MAP_CACHEABLE - it is at least wrong that the mi PCI
> code translates the PCI_MAPREG_MEM_CACHEABLE bit (which should
> be better called _PREFETCHABLE) into a BUS_SPACE_MAP_CACHEABLE.

yeah, this always bugged me, but never enough to do much about it.

really, as far as PCI goes, to do it right you need at least "none,"
prefetchable, and cacheable, each of which is strictly less
tolerant of prefetching/caching than the previous.  however, just
about nothing wants "cacheable" (really, the only things i can think of
that would want it are real live memory cards), so it never bugged me
to overload it.

(note, of course, that there is nothing there about write buffering.
i'm assuming the driver controls that explicitly with barrier
calls... or should.)
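
To make the three levels concrete, here is a rough sketch, not real
bus_space(9) code. The enum and helper names are made up for
illustration. It assumes the levels are ordered from "no prefetching or
caching allowed" up to "full caching allowed", and that a mapping should
never be more permissive than what the device tolerates.

```c
#include <stdbool.h>

/*
 * Hypothetical names, not part of bus_space(9).
 * Ordered from most restrictive to most permissive.
 */
enum map_attr {
    MAP_ATTR_NONE = 0,         /* no prefetching, no caching */
    MAP_ATTR_PREFETCHABLE = 1, /* reads may be fetched ahead/merged */
    MAP_ATTR_CACHEABLE = 2     /* may live in the CPU cache */
};

/* May the CPU or bridge read more than the driver asked for? */
static bool
map_attr_may_prefetch(enum map_attr a)
{
    return a >= MAP_ATTR_PREFETCHABLE;
}

/* May the CPU keep the data in its cache? */
static bool
map_attr_may_cache(enum map_attr a)
{
    return a >= MAP_ATTR_CACHEABLE;
}

/* Clamp what a driver asks for to what the device tolerates. */
static enum map_attr
map_attr_clamp(enum map_attr wanted, enum map_attr device_max)
{
    return wanted < device_max ? wanted : device_max;
}
```

With this ordering, a driver asking for "cacheable" on a device that is
only prefetchable ends up with a prefetchable mapping, never a cached
one. Write buffering is deliberately not represented here.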


> Wouldn't make a difference on the alpha, but cause serious damage
> on PCs. Obviously there are not many PCI devices which set this bit.

No, especially not devices which the kernel cares much about.

In my experience, this is mostly restricted to frame buffers, which
the kernel doesn't do too much with (esp. on PCs), and cards with
bugs.  8-)


> > that driver should
> > be using the bus_space macros/fns and not linear mapping
> 
> OK, in the if_ti driver this might be acceptable. There is only
> one place where it might be significant: Some descriptors have
> to be placed into the "shared memory" in one chip version and
> into host memory in another. Perhaps some indirection could be
> done which still allows the linear mapping if available.
> Gigabit Ethernet goes quite to the limit of bus throughput, so
> every wasted cycle will degrade performance.

I think i'd buy this, but...


> (One could also argue "A box before bwx is too slow for Gigabit
> Ethernet anyway", but this might be a bit too extreme. I'm eg
> happily using FDDI in a pmax although it can't use its
> bandwidth by far...)

This is a problem.  you should at least try to provide a fallback
position.  how you do that without slowing down the code is a good
question.

Note, of course, that you still need to be using the barrier ops as
appropriate, even if you are using linear mapping.
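
A minimal userland sketch of what that means for a descriptor write
through a linear mapping. Everything here is a stand-in: the descriptor
layout and the names fake_desc, DESC_OWN, write_barrier and
post_descriptor are invented, and the C11 fence takes the place of what
would be a bus_space_barrier() call with BUS_SPACE_BARRIER_WRITE in a
real driver.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Invented descriptor layout, for illustration only. */
struct fake_desc {
    uint32_t addr;
    uint32_t len;
    uint32_t flags;
};

#define DESC_OWN 0x80000000u /* hypothetical "device owns this" bit */

/* Stand-in for a bus_space write barrier. */
static inline void
write_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Fill in a descriptor through a plain pointer (the "linear mapping"),
 * making sure the body is visible before ownership is handed over.
 */
static void
post_descriptor(volatile struct fake_desc *d, uint32_t addr, uint32_t len)
{
    d->addr = addr;
    d->len = len;
    write_barrier();     /* body must land before the OWN bit */
    d->flags = DESC_OWN;
    write_barrier();     /* push the OWN bit out before continuing */
}
```

The point is only that dereferencing a pointer does not remove the need
for ordering: the barriers stay exactly where they would be with the
bus_space accessors.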


> > As a
> > short-term hack, well, hack your local kernel to allow
> > non-prefetchable ("non-cacheable") mappings to use linear mapping for
> > BWX.
> 
> I'd say we should really introduce a BUS_SPACE_MAP_PREFETCHABLE
> flag and pass this from the PCI layer to the bus layer.
> The BUS_SPACE_MAP_CACHEABLE which might turn on a CPU cache
> is just too wrong.

I don't think I can argue with that... however, it means that
BUS_SPACE_MAP_CACHEABLE will just about never be used.  (I can't say
whether that's a good thing or a bad thing.)
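
Roughly, the PCI-layer side of that could look like the sketch below.
This is not the actual MI PCI code. The SK_* flag values and the
function name are made up; the only thing taken from the PCI spec is
that bit 3 of a memory BAR is the prefetchable bit. The assumption is
that the BAR bit turns into a separate "prefetchable" flag and never
into the "cacheable" one.

```c
#include <stdint.h>

/* Bit 3 of a PCI memory BAR: prefetchable. */
#define PCI_BAR_MEM_PREFETCHABLE 0x00000008u

/* Hypothetical flag values, for illustration only. */
#define SK_MAP_CACHEABLE    0x01
#define SK_MAP_LINEAR       0x02
#define SK_MAP_PREFETCHABLE 0x04

/*
 * Combine the flags a driver asked for with what the BAR says.
 * The prefetchable bit is passed down as its own flag; nothing here
 * ever sets SK_MAP_CACHEABLE on the device's behalf.
 */
static int
sk_pci_mem_flags(uint32_t bar, int driver_flags)
{
    int flags = driver_flags;

    if (bar & PCI_BAR_MEM_PREFETCHABLE)
        flags |= SK_MAP_PREFETCHABLE;
    return flags;
}
```

What the bus layer then does with "prefetchable" (for example, whether
it permits linear mapping) stays a per-platform decision.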


Looking at every single use of BUS_SPACE_MAP_CACHEABLE right now, i
don't find _any_ where it's being set that I think are correct.  It's
being set in a few frame buffer drivers (tga and a couple of atari
FBs), and that's it.  In the TGA case, I know it means prefetchable.
I'd suspect that it's wrong to make it cacheable in the atari case as
well, but i'm not familiar enough with the hardware to say for sure
what's necessary/intended to get reasonable performance.


cgd
-- 
Chris Demetriou - cgd@netbsd.org - http://www.netbsd.org/People/Pages/cgd.html
Disclaimer: Not speaking for NetBSD, just expressing my own opinion.