Subject: Re: PPC assembly
To: None <port-macppc@netbsd.org, andy@softbook.com>
From: Wolfgang Solfrank <ws@tools.de>
List: port-macppc
Date: 05/08/2000 20:34:21
Hi,

> This snippet is from ofwboot: Locore.c
> Can somebody kindly explain what's going on here?
> I understand what the instructions do but <why> this is done ?
> Below is the code and  my attempt to decipher it - along with the questions

Ok, let's give it a try (some of this may be a repeat of someone
else's explanation):

> Sorry if some of them are clearly in RTFM category

Probably :-).

> asm("
>         .text  ; what does the dot here mean?

It has already been explained that the "." introduces pseudo-opcodes,
while real code doesn't have it.  The ".text" means that the following
should go into the text section of the object file.

>         .globl  _start ;what does .globl  mean ?

This means that "_start" should be a globally known symbol.

> _start:
>         li      8,0  	; load zero into register r8 . Why ?

Would you be surprised if that were used later?

>         li      9,0x100  ; load 0x100 into r9 Why  ?

See above.

>         mtctr   	; copy the contents of r9( 0x100 in this case) into
> the count register .Why

(Note that there is a "9" missing here.  I.e. the line actually reads
	mtctr	9
)  We want to have 256 in the count register for later.  Disturbing,
isn't it?

> 1:

Note the "1:" here which introduces a local label named "1".  Those
labels live only between real labels (like "_start" above) and can
be reused wherever you like.  For how they are referenced, see below.

>         dcbf    0,8
>  ; data cache block flush - the effective addr in this case = r0 | 0 + r8 -
> what's in r0? why is this done?

R0 doesn't matter.  The description "RA|0" in the ppc manual means that
the instruction takes the value in RA if A isn't 0 and a literal 0 otherwise.
In effect we are flushing the cache line r8 is pointing to.
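
To make the "RA|0" rule concrete, here is a minimal sketch in C of how
the effective address of an indexed instruction like "dcbf 0,8" is
formed.  The function name and the gpr[] array are made up for this
illustration; they model the register file and are not taken from
Locore.c.

```c
#include <stdint.h>

/*
 * Hypothetical model: gpr[] stands in for the 32 general purpose
 * registers.  "ra" and "rb" are the register *numbers* encoded in the
 * instruction, not register contents.
 */
uint32_t ea_indexed(const uint32_t gpr[32], unsigned ra, unsigned rb)
{
    /* RA|0: register number 0 means a literal 0, not the value of r0 */
    uint32_t base = (ra == 0) ? 0 : gpr[ra];

    return base + gpr[rb];
}
```

With ra = 0 and rb = 8 the result is just the value of r8, whatever
garbage happens to be in r0.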

> 
>         icbi    0,8 	;instruction cache block invalidate - the EA is
> calculated as above - again why?

See above.  Here we invalidate the same cache line in the instruction
cache.

>         addi    8,8,0x20 ;in this case :r8  = r8+0x20 = 0x20. Correct? Why?

Because a cache line happens to be 32 bytes long.  We want to flush
all of both caches with this code.

>         bdnz    1b	; -- decrement Count Register and branch to
> location 1b if CTR is not zero. why?? What's at location 1b?

Remember that we put 256 into the count register previously.  "1b" is
a notation for the assembler which means a local label "1" in the backward
direction.  I.e. the assembler looks for the last "1:" previous to this
place to find the location of the branch target.  All in all we loop
over the cache flushing instructions for 256 cache lines with 32 bytes
per cache line, and thus we flush 8k instruction and 8k data caches here.
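
If it helps, the loop can be modelled in C roughly like this.  It is
only a sketch of the arithmetic: the cache instructions themselves are
left as a comment, and the function and constant names are invented
for the example.

```c
#include <stdint.h>

enum {
    CACHE_LINE = 0x20,   /* addi 8,8,0x20: bytes per cache line */
    LINE_COUNT = 0x100   /* li 9,0x100 / mtctr 9 */
};

/* Returns the address one past the last cache line touched. */
uint32_t flush_span(void)
{
    uint32_t r8 = 0;            /* li 8,0 */
    uint32_t ctr = LINE_COUNT;  /* mtctr 9 */

    do {
        /* dcbf 0,8 and icbi 0,8 would act on the line at r8 here */
        r8 += CACHE_LINE;
    } while (--ctr != 0);       /* bdnz 1b: decrement, branch if != 0 */

    return r8;
}
```

The function returns 0x2000, i.e. 256 * 32 = 8192 bytes covered in
each of the two caches.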

>         sync ; synchronize

This will wait till all bus activity initiated before this place is
finished before initiating anything else.

>         isync ; instruction synchronize

This will flush the prefetch queue of the processor.

>         lis     1,stack@ha 	 ; r1 = stack<<16 -- what does @ha mean?

Actually, the operand is &stack >> 16, which lis places in the upper
half of r1.  I.e., we load the high 16 bits of the stack address
(actually not quite right, see below) into r1.
We cannot load all 32 bits of the address at once, since all instructions
on the ppc are 32 bits in length, and thus there is no room for
a 32 bit constant.

The "@ha" suffix to any value means that we want the high part (meaning
16 bits) of the value, adjusted so that we can later add the low
part (again 16 bits) to it.  Since the processor will sign extend
a 16 bit constant when adding it, the "@ha" part of a value might be
one larger than the actual high-order 16 bits of the value to be loaded.

>         addi    1,1,stack@l 	; r1 = r1 | 0 +stack - what does @l mean ?

The "@l" suffix means to take the low 16 bits of a value.

E.g. if we want to load the value 0x18000 into a register, we would
first load 0x18000@ha, which is 0x2 and which lis turns into 0x20000,
and then add 0x18000@l, which is 0x8000.  Since this is sign extended
to 32 bits, we actually add 0xffff8000, and that together with the
0x20000 gives 0x18000, as we originally wanted.
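
The same arithmetic written out in C, as a sketch.  The helper names
at_ha(), at_l() and lis_addi() are made up for this mail; the formulas
are the usual way of describing what the assembler computes for the
two suffixes.

```c
#include <stdint.h>

/* value@l: the low 16 bits */
uint16_t at_l(uint32_t v)
{
    return (uint16_t)(v & 0xffff);
}

/* value@ha: the high 16 bits, plus 1 if bit 15 of the value is set */
uint16_t at_ha(uint32_t v)
{
    return (uint16_t)((v + 0x8000u) >> 16);
}

/* Models "lis r,v@ha" followed by "addi r,r,v@l". */
uint32_t lis_addi(uint32_t v)
{
    uint32_t r = (uint32_t)at_ha(v) << 16;  /* lis shifts left by 16 */
    uint32_t lo = at_l(v);

    if (lo & 0x8000u)                       /* addi sign extends */
        lo |= 0xffff0000u;

    return r + lo;                          /* wraps modulo 2^32 */
}
```

For 0x18000, at_ha() gives 0x2, at_l() gives 0x8000, and lis_addi()
gives back 0x18000.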

>         addi    1,1,4096  	 ; r1  = r1 | 0 +4096 -- ditto
> 			what's the effect of this  ? why is it done?

Above, we had loaded the bottom address of the stack, and we are now
adding 4k to this to get at the top of it.

>         mfmsr   8 	;r8  = mfmsr (machine state reg)

Saving the MSR.

>         li      0,0	; r0 = 0 ?

Yes.

>         mtmsr   0	; msr = r0 ( or zero since we just moved 0 into r0)

Note that the 0 here actually means r0, but that doesn't matter.
We disable everything in the processor, i.e. no fpu, no mmu, no interrupts
etc.

>         isync 	' instruction sync

Again we have to wait till everything above is done, before we can go on.

>     'why all this manipulation with the instruction BAT upper registrers?
>         mtibatu 0,0
>         mtibatu 1,0
>         mtibatu 2,0
>         mtibatu 3,0
>         mtdbatu 0,0
>         mtdbatu 1,0
>         mtdbatu 2,0
>         mtdbatu 3,0

The BAT registers on the ppc are pretty strange things.  What matters
here is that you cannot have overlapping mappings from different BAT
registers, _even while the mmu is disabled_!  Since the valid bit of
the BAT registers is in the upper BAT register, we simply put the 0
from r0 into them, disabling all of them, so they are in a known
state.

>         li      9,0x12          /* BATL(0, BAT_M) */
>         mtibatl 0,9
>         mtdbatl 0,9
>         li      9,0x1ffe        /* BATU(0) */
>         mtibatu 0,9
>         mtdbatu 0,9
>         isync

Now we set up the first BAT to map the low 256 MB with a 1:1
virtual to physical address mapping in the kernel.
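
To see where the 256 MB and the 1:1 come from, here is a small C
sketch that picks the two constants apart.  The field positions are my
reading of the 32 bit ppc BAT register layout (block length in bits
2..12 of the upper register, the supervisor and user valid bits below
it, WIMG and PP in the low bits of the lower register), so please
check them against the processor manual.  The function names are
invented for the example.

```c
#include <stdint.h>

/* Upper BAT: block size in bytes; a block length field of 0 means 128 KB */
uint32_t batu_block_bytes(uint32_t batu)
{
    uint32_t bl = (batu >> 2) & 0x7ff;

    return (bl + 1) * (128u * 1024u);
}

/* Upper BAT: effective (virtual) base address of the block */
uint32_t batu_bepi(uint32_t batu) { return batu & 0xfffe0000u; }

/* Upper BAT: valid in supervisor mode / valid in user mode */
int batu_vs(uint32_t batu) { return (int)((batu >> 1) & 1); }
int batu_vp(uint32_t batu) { return (int)(batu & 1); }

/* Lower BAT: physical base address of the block */
uint32_t batl_brpn(uint32_t batl) { return batl & 0xfffe0000u; }

/* Lower BAT: WIMG bits (M is 0x2 in this 4 bit field) and protection */
unsigned batl_wimg(uint32_t batl) { return (batl >> 3) & 0xf; }
unsigned batl_pp(uint32_t batl) { return batl & 3; }
```

Under those assumptions 0x1ffe decodes to a 256 MB block at effective
address 0 that is valid in supervisor mode only, and 0x12 decodes to
physical address 0 with the M bit set and PP = 2 (read/write).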

>         mtmsr   8
>         isync

Here we put back the msr value we saved above.  Granted, we should
probably enter a known value into msr instead of relying on the
firmware to have put something reasonable into it.

>         b       startup ; well this one i understand:-))
> ");

Hope it helps.

Ciao,
Wolfgang
-- 
ws@TooLs.DE     Wolfgang Solfrank, TooLs GmbH 	+49-228-985800