Subject: Re: personal impression of issues on netbsd/macppc
To: Michael <macallan18@earthlink.net>
From: Tim Kelly <hockey@dialectronics.com>
List: port-macppc
Date: 11/19/2004 11:56:42
Hi Michael,

(I'm posting this to the list, in case other people will find this
useful.)

As far as using I/O or memory space on PowerPC PCI, it's just easier to
use the memory space directly. Although I/O space will be tagged,
it is not used by Apple, even though their docs say that I/O space is
mapped. The second generation PCI Macs use a different arrangement for
space, as they use PCI-PCI Bridges, instead of PCI-Host Bridges. As a
result, to the best of my knowledge, each bridge will have two address
spaces. With the PCI-Host Bridge bandit and control chips, you'll find
not two but _three_ address spaces (which are given by the ranges property
in OF). Two are memory, one is I/O.

(I'm using Matt Thomas' ofdump utility, but I've had to clean up the
output a lot. The tuples change depending on whether it is ranges or
assigned-addresses, but the default output is in 16 byte intervals of
memory. I hope to patch it sometime and return the patch to Matt. His
utility is extremely useful.)

ff831568: /bandit@f2000000

<snip>
reg                     f2000000 02000000
#address-cells     00000003
#size-cells          00000002
<snip>
ranges          
          02000000 00000000 f3000000 f3000000 00000000 01000000
          01000000 00000000 00000000 f2000000 00000000 00800000
          02000000 00000000 80000000 80000000 00000000 10000000

There are a couple references that can be used to interpret this. There
is the PCI binding for OF:

http://playground.sun.com/1275/bindings/pci/pci2_1.pdf

And Apple's Technote on this:

http://developer.apple.com/technotes/tn/tn1062.html

The reg property shows bandit starts at 0xf2000000 and has size
0x02000000. This covers the Host side of things.

To understand the breakdown of devices on this PCI device, the keys are
the #address-cells (3) and #size-cells (2). This tells us how to
interpret the ranges property.

The basic breakdown is (left to right)
phys.hi phys.mid phys.lo of the child (3 address cells)
phys.lo of the parent (1 address cell)
size.hi size.lo of the child (2 size cells)
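
The breakdown above can be applied mechanically. Here is a minimal
Python sketch that splits a flat list of 32 bit cells into ranges
entries. The helper name split_ranges and its parameters are
hypothetical (this is not part of ofdump), the single parent address
cell is inferred from the six-cell rows in the dump, and the cell values
are the bandit ranges as I read them.

```python
def split_ranges(cells, child_addr_cells=3, parent_addr_cells=1, size_cells=2):
    """Split a flat list of 32-bit cells into (child, parent, size) tuples."""
    width = child_addr_cells + parent_addr_cells + size_cells
    if len(cells) % width:
        raise ValueError("cell count is not a multiple of the entry width")
    entries = []
    for i in range(0, len(cells), width):
        row = cells[i:i + width]
        child = tuple(row[:child_addr_cells])
        parent_end = child_addr_cells + parent_addr_cells
        parent = tuple(row[child_addr_cells:parent_end])
        size = tuple(row[parent_end:])
        entries.append((child, parent, size))
    return entries


# The bandit ranges property, one 32-bit cell per list element.
BANDIT_RANGES = [
    0x02000000, 0x00000000, 0xf3000000, 0xf3000000, 0x00000000, 0x01000000,
    0x01000000, 0x00000000, 0x00000000, 0xf2000000, 0x00000000, 0x00800000,
    0x02000000, 0x00000000, 0x80000000, 0x80000000, 0x00000000, 0x10000000,
]

for child, parent, size in split_ranges(BANDIT_RANGES):
    print("child", " ".join(f"{c:08x}" for c in child),
          "parent", " ".join(f"{c:08x}" for c in parent),
          "size", " ".join(f"{c:08x}" for c in size))
```

For a node with different #address-cells or #size-cells, only the
keyword arguments need to change.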

The thing to keep in mind is that there is not an exact correlation
between the PCI spec and the Open Firmware implementation. During the
discovery process, work with OF and compensate for its bugs. Then use that
information to deal with PCI.

The primary focus is the first cell (the far left 32 bit entry, also
known as the phys.hi cell of the child, which is us), and within that
cell we want the leftmost 8 bits (01 and 02). According to the docs, the
bitwise breakdown is

npt0 00ss

n, p, and t are clarifiers dependent on the value of ss. ss is a two bit
field: when it is 01 (leading byte 01), the space is I/O space, and when
it is 10 (leading byte 02), the space is 32 bit Memory space (page 5 of
the PCI 2.1 Binding).
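
To make the bit layout concrete, here is a minimal Python sketch that
pulls the fields out of a phys.hi cell. The field positions follow the
PCI binding's layout (npt000ss bbbbbbbb dddddfff rrrrrrrr); the function
and dictionary names are my own and purely illustrative.

```python
# ss values from the PCI binding: 00 config, 01 I/O, 10 32-bit memory,
# 11 64-bit memory.
SPACE_NAMES = {0: "config", 1: "io", 2: "mem32", 3: "mem64"}


def decode_phys_hi(phys_hi):
    """Break a phys.hi cell into its named fields."""
    return {
        "n": (phys_hi >> 31) & 1,          # non-relocatable
        "p": (phys_hi >> 30) & 1,          # prefetchable
        "t": (phys_hi >> 29) & 1,          # aliased / below 1M
        "space": SPACE_NAMES[(phys_hi >> 24) & 0x3],
        "bus": (phys_hi >> 16) & 0xff,
        "device": (phys_hi >> 11) & 0x1f,
        "function": (phys_hi >> 8) & 0x7,
        "register": phys_hi & 0xff,
    }


# The two distinct phys.hi values in the bandit ranges property.
for cell in (0x01000000, 0x02000000):
    print(f"{cell:08x}", decode_phys_hi(cell)["space"])
```

The same function works on the phys.hi cells of reg and
assigned-addresses, where the bus, device, function, and register fields
are also filled in.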

The rightmost two cells are the size.hi and size.lo values, which
together form a 64 bit size. Since the CPU is only 32 bit, we're only
interested in the size.lo value. You can see from the 02000000 that
0xf3000000 is memory space (16M), and so is 0x80000000, which is 256M
big, but 0xf2000000 is tagged I/O space (8M). However, I believe Apple
uses this for configuration.
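
As a quick check on the sizes quoted above, here is a small Python
sketch that combines size.hi and size.lo and prints the result in
familiar units. The helper names region_size and human are hypothetical.

```python
def region_size(size_hi, size_lo):
    """Combine the two 32-bit size cells into one 64-bit byte count."""
    return (size_hi << 32) | size_lo


def human(nbytes):
    """Format a byte count as G, M, or K when it divides evenly."""
    for unit, shift in (("G", 30), ("M", 20), ("K", 10)):
        if nbytes >= (1 << shift) and nbytes % (1 << shift) == 0:
            return f"{nbytes >> shift}{unit}"
    return f"{nbytes} bytes"


# (parent address, size.hi, size.lo) for the three bandit ranges.
for addr, hi, lo in ((0xf3000000, 0, 0x01000000),
                     (0xf2000000, 0, 0x00800000),
                     (0x80000000, 0, 0x10000000)):
    print(f"0x{addr:08x}: {human(region_size(hi, lo))}")
```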

Here is the I/O controller (gc) for the motherboard devices (7300, only
one bandit):

ff832758: /bandit@f2000000/gc@10

name                    676300.. ........ ........ ........   "gc"
device_type          6462646d 6100.... ........ ........   "dbdma"
<snip>
reg               00008000 00000000 00000000 00000000 00000000
                    02008010 00000000 f3000000 00000000 00100000 
assigned-addresses
                    82008010 00000000 f3000000 00000000 00100000

It is mapped to 0xf3000000, the higher memory space of bandit. The 8 in
the far left cell of assigned-addresses (the phys.hi cell) indicates
that the n bit was set, which per the PCI binding means the address is
non-relocatable (prefetchable is the p bit, which would show up as a 4).
The 2 next to it means 32 bit memory space (PCI allows for 64 bit
address space using offsets from another 32 bit reference). So there is
one configuration register and one memory space register in the base
registers, and only the memory space register was actually mapped.

PCI cards are mapped at 0x80000000 on Old World Macs with one bandit
chip. With two, I think the second bandit's PCI slots are mapped at
0x90000000.

ff83f5d0: /bandit@f2000000/pci128a,3@f

reg                     
                 00007800 00000000 00000000 00000000 00000000 
                 01007810 00000000 00000000 00000000 00000100
                 02007814 00000000 00000000 00000000 00001000 
                 02007830 00000000 00000000 00000000 00010000  

assigned-addresses                
                 81007810 00000000 00000400 00000000 00000100           
                 82007814 00000000 80810000 00000000 00001000
                 82007830 00000000 80800000 00000000 00010000

Four base registers: the first is configuration, the second is I/O,
and the remaining two are memory. There is a 256 byte (0x100) I/O space
on this card at 0x400, a 4K (0x1000) memory space at 0x80810000, and a
64K (0x10000) memory space at 0x80800000 (it's an ethernet card). Don't
use the I/O, use the memory space.
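
For reference, here is a minimal Python sketch that walks the
assigned-addresses entries of this card and reports, for each one, the
base register offset, the space type, the assigned address, and the
size. The entries are five cells wide here (phys.hi, phys.mid, phys.lo,
size.hi, size.lo); the describe helper is hypothetical.

```python
# assigned-addresses for /bandit@f2000000/pci128a,3@f, one tuple per entry.
ASSIGNED = [
    (0x81007810, 0x00000000, 0x00000400, 0x00000000, 0x00000100),
    (0x82007814, 0x00000000, 0x80810000, 0x00000000, 0x00001000),
    (0x82007830, 0x00000000, 0x80800000, 0x00000000, 0x00010000),
]

SPACES = {0: "config", 1: "io", 2: "mem32", 3: "mem64"}


def describe(entry):
    """Summarize one five-cell assigned-addresses entry."""
    phys_hi, phys_mid, phys_lo, size_hi, size_lo = entry
    return {
        "register": phys_hi & 0xff,             # config space offset
        "space": SPACES[(phys_hi >> 24) & 0x3],  # the ss bits
        "address": (phys_mid << 32) | phys_lo,
        "size": (size_hi << 32) | size_lo,
    }


for entry in ASSIGNED:
    d = describe(entry)
    print(f"reg 0x{d['register']:02x} {d['space']:6} "
          f"at 0x{d['address']:08x} size 0x{d['size']:x}")
```

A driver that prefers memory space, as suggested above, would simply
skip any entry whose space comes back as "io".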

/chaos, the on-board video controller for many Old World Macs,
has I/O space at 0xf0000000 and memory space at
0xf1000000 and 0x90000000. I believe the first space is used for the
video-in controller, which is not present on my 7300.

ff83dcd0: /chaos@f0000000

name                    6368616f 7300.... ........ ........   "chaos"
device_type             76636900 ........ ........ ........   "vci"
model                   4141504c 2c333433 53313135 3500....   "AAPL,343S1155"
reg                     f0000000 02000000
#address-cells     00000003
#size-cells           00000002
<snip>
ranges            
              02000000 00000000 f1000000 f1000000   00000000 01000000 
              01000000 00000000 00000000 f0000000  00000000 00800000
              02000000 00000000 90000000 90000000 00000000 10000000

bus-range              00000001 00000001 ........

The scheme is very similar to bandit. 16M memory space at 0xf1000000, 8M
I/O (configuration) space at 0xf0000000, and 256M memory space at
0x90000000.

The "vci" device_type will act much like a PCI-Host Bridge device, but I
included the bus-range property to show a problem with /chaos. Its PCI
tag is bus 1, instead of bus 0, but it gets really, really cranky when
you use that bus tag. It'll claim there is a device at every single PCI
location (or at least it did with the OpenBSD code). Instead, it's a
good idea to know exactly what location you want to probe and probe only
that one. On two bandit Macs, it'll get bus 2. One bug I found in
OpenBSD's code is that they weren't properly obtaining the bus tag and
just assumed everything was behind PCI-PCI bridges instead of PCI-Host
Bridges like chaos and bandit.

http://www.dialectronics.com/OldWorldMacs/code/vci_addr_fixup.c
http://www.dialectronics.com/OldWorldMacs/code/vci.c

Now, control is the video-out controller:

ff83f9d8: /chaos@f0000000/control@15800,0,0

reg          
          00015800 00000000 00000000 00000000 00000000 
          02015818 00000000 00000000 00000000 04000000 
          02015814 00000000 00000000 00000000 00001000

assigned-addresses      
          82015814 00000000 94000000 00000000 00001000 
          82015818 00000000 90000000 00000000 04000000

The assigned addresses (0x90000000 and 0x94000000) are the vga and mmio
addressing registers for /chaos/control. However, they are reversed in
location from ATI. Also, don't probe the first register. It'll hang the
chip. Also note that there are no I/O space registers. This is probably
an important aspect, especially if one is trying to do I/O mapped VGA.

http://www.dialectronics.com/OldWorldMacs/code/vgafb_control.c

The first for loop in vga_pci_probe adds 4 bytes to the base register,
so that probing begins at 0x14. The first register appears to be
used for configuring, and writing -1 to it blows that away. 
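
To illustrate the idea, here is a minimal Python sketch of the usual PCI
sizing arithmetic: write all ones to a base register, read it back, and
derive the size from the bits that stuck. This is not the code in
vgafb_control.c. The names are hypothetical, I am assuming the standard
six base registers at 0x10 through 0x24, and only memory registers are
handled.

```python
PCI_BAR_FIRST = 0x10
PCI_BAR_END = 0x28  # one past the last of the six standard base registers


def probe_offsets(skip_first=True):
    """Config-space offsets to probe; optionally skip the register at 0x10."""
    start = PCI_BAR_FIRST + 4 if skip_first else PCI_BAR_FIRST
    return list(range(start, PCI_BAR_END, 4))


def mem_bar_size(readback):
    """Size of a memory BAR from the value read back after writing -1."""
    mask = readback & 0xfffffff0  # drop the type and prefetch bits
    if mask == 0:
        return 0                  # register not implemented
    return (~mask + 1) & 0xffffffff


print([hex(o) for o in probe_offsets()])
print(hex(mem_bar_size(0xfc000000)))  # a 64M region
print(hex(mem_bar_size(0xfffff000)))  # a 4K region
```

On real hardware the original register value has to be saved before the
write and restored afterwards, which is exactly what cannot be done
safely with the first register on control.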

About halfway down you'll see the probing for size. The code is from OpenBSD,
but does not work as is in their tree. They made a mistake in the logic
and were assuming that the first address space would be smaller than the
second. Once I found this problem, I was able to enable OF based frame
buffering, although the horizontal sync was off. I was able to track
that into an assumption they made about what to do in case "width" was
not a valid word in OF.
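
One way to avoid depending on the order of the address spaces is to
compare their sizes directly. The Python sketch below shows that idea
only. It is not the fix in the linked code, the classify_regions name is
hypothetical, and it assumes the larger region is the frame buffer and
the smaller one holds the registers, which matches the
/chaos/control assigned-addresses shown earlier.

```python
def classify_regions(regions):
    """Given two (address, size) pairs, return (framebuffer, mmio).

    The larger region is taken to be the frame buffer, regardless of
    which one appears first.
    """
    if len(regions) != 2:
        raise ValueError("expected exactly two memory regions")
    if regions[0][1] == regions[1][1]:
        raise ValueError("regions are the same size, cannot tell them apart")
    framebuffer = max(regions, key=lambda r: r[1])
    mmio = min(regions, key=lambda r: r[1])
    return framebuffer, mmio


# /chaos/control assigned-addresses, in the order OF reports them.
CONTROL = [(0x94000000, 0x00001000), (0x90000000, 0x04000000)]

fb, mmio = classify_regions(CONTROL)
print(f"framebuffer at 0x{fb[0]:08x}, mmio at 0x{mmio[0]:08x}")
```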

You'll notice that few of the spaces are ever tagged I/O. So while
Apple says there is I/O space, it is smoke and mirrors when it comes to
their own chips. From what I have gathered in reading threads on mailing
lists, things work better on PCI cards if you skip the I/O portion and
go straight to memory. Much faster, plus better compatibility with the
architecture. There are some built in tags that should use memory space
instead of I/O, and it'd be a good idea to stay consistent with those
for the macppc port.

tim