Subject: Re: PPC assembly
To: Andy <andy@softbook.com>
From: Bill Studenmund <wrstuden@zembu.com>
List: port-macppc
Date: 05/08/2000 11:52:52
On Fri, 5 May 2000, Andy wrote:

> This snippet is from ofwboot: Locore.c
> Can somebody kindly explain what's going on here?
> I understand what the instructions do but <why> this is done ?

For some of the why's you're going to need to understand how the hardware
works. :-) You're looking at fairly low-level OS initialization code. :-)

> Below is the code and  my attempt to decipher it - along with the questions
> Sorry if some of them are clearly in RTFM category
> TIA Andy
> 
> 
> asm("
>         .text  ; what does the dot here mean?
>         .globl  _start ;what does .globl  mean ?

William and Dan covered this well.

> _start:
>         li      8,0  	; load zero into register r8 . Why ?
>         li      9,0x100  ; load 0x100 into r9 Why  ?
>         mtctr   9	; copy the contents of r9( 0x100 in this case) into
> the count register .Why

So that they aren't used uninitialized below. :-) Note also that the mt*
and mf* instructions only move between special-purpose registers and
general-purpose registers, so to put 0x100 in the ctr, it has to go
through a normal register (here r9).

> 1:
>         dcbf    0,8
>  ; data cache block flush - the effective addr in this case = r0 | 0 + r8 -
> what's in r0? why is this done?

Not quite. Remember how Dan was saying that sometimes a "0" in the
register field got you r0 and sometimes it got you 0? The "(rA|0)+(rB)"
nomenclature is saying that if there's a "0" in the rA field, you really
get zero rather than r0. So this is really just dcbf r8.
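
To make the "(rA|0)" rule concrete, here is a rough C model of it. This is
only a sketch for illustration, not anything from ofwboot; the function
name and the gpr[] array are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the "(rA|0) + (rB)" effective-address rule.
 * gpr[] stands in for the 32 general-purpose registers. */
static uint32_t ea_ra_or_zero(const uint32_t gpr[32], unsigned ra, unsigned rb)
{
    assert(ra < 32 && rb < 32);
    /* A 0 in the rA field means the literal value 0, not the contents of r0. */
    uint32_t base = (ra == 0) ? 0 : gpr[ra];
    return base + gpr[rb];
}
```

So with "dcbf 0,8" whatever happens to be in r0 never enters the sum; only
r8 decides which cache block gets flushed.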

>         icbi    0,8 	;instruction cache block invalidate - the EA is
> calculated as above - again why?

OF runs (is supposed to run) with PA == VA. We're about to load stuff (the
kernel) into physical memory, and don't want the cache to crap on us. So
we flush it out.

>         addi    8,8,0x20 ;in this case :r8  = r8+0x20 = 0x20. Correct? Why?

Cache lines are 32 bytes wide. This points r8 at the next one.
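
If it helps to see the whole loop at once, here is a small C model of the
address arithmetic. It's a sketch only: the names are invented, and the
real dcbf/icbi work is reduced to recording which line would be touched.

```c
#include <stdint.h>

enum { CACHE_LINE = 0x20, LINE_COUNT = 0x100 };

/* Hypothetical model of the flush loop. Stores the address of the last
 * cache line touched in *last_line and returns the final value of r8. */
static uint32_t flush_loop_model(uint32_t *last_line)
{
    uint32_t r8 = 0;             /* li 8,0 */
    uint32_t ctr = LINE_COUNT;   /* li 9,0x100 ; mtctr 9 */

    do {
        *last_line = r8;         /* dcbf 0,8 ; icbi 0,8 act on this line */
        r8 += CACHE_LINE;        /* addi 8,8,0x20 */
    } while (--ctr != 0);        /* bdnz 1b: decrement ctr, loop if nonzero */

    return r8;
}
```

0x100 trips of 0x20 bytes each means the loop walks addresses 0 through
0x1fff, i.e. 8 KB worth of cache lines.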

>         bdnz    1b	; -- decrement Count Register and branch to
> location 1b if CTR is not zero. why?? What's at location 1b?

That's gas syntax. Since it's a relative branch, there's nothing at
address 1 which matters. :-)

You can re-use numeric labels to your heart's content. So there could be a
label 1 behind you, and a label 1 in front of you. "1b" means the label
"1" behind you (and "1f" the one in front of you). This trick only works
with numeric labels. I'm not sure, but I bet it's so that the compiler
doesn't have to remember what labels it has used; it can re-use them in
canned code.

>         sync ; synchronize
>         isync ; instruction synchronize

Make sure everything has happened before proceeding (sync), and throw
away any instructions the processor has already fetched (isync).

>         lis     1,stack@ha 	 ; r1 = stack<<16 -- what does @ha mean?
>         addi    1,1,stack@l 	; r1 = r1 | 0 +stack - what does @l mean ?
>         addi    1,1,4096  	 ; r1  = r1 | 0 +4096 -- ditto
> 			what's the effect of this  ? why is it done?

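As William mentioned, RISC processors have all the instructions the same
size, and that size happens to be one register (32 bits) on powerpc. So a
full 32-bit address can't fit in one instruction as an immediate. Thus
the lis, addi command pair. I believe that the @ha and @l operators tell
the assembler to emit the upper and lower halfwords (16 bits each) of the
32-bit quantity "stack" (the address of the array "stack"). The "a" in
@ha means "adjusted": addi sign-extends its immediate, so the upper half
is bumped by one when the lower half has its top bit set.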

4096 is added to r1 as stacks grow down on powerpc. So we want the
address of the top, not the bottom, of the array. :-)
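
Here is a rough C model of what the lis/addi pair computes, in case the
arithmetic is easier to follow that way. The helper names (ha16, lo16,
lis_addi, stack_top) are made up for this sketch; they only mimic what
the assembler and the two instructions do.

```c
#include <stdint.h>

/* What stack@ha evaluates to: the upper 16 bits, adjusted by one when
 * the lower half will be treated as negative by addi. */
static uint16_t ha16(uint32_t x) { return (uint16_t)((x + 0x8000u) >> 16); }

/* What stack@l evaluates to: the lower 16 bits. */
static uint16_t lo16(uint32_t x) { return (uint16_t)(x & 0xffffu); }

/* Model of "lis rD,hi" followed by "addi rD,rD,lo". */
static uint32_t lis_addi(uint16_t hi, uint16_t lo)
{
    uint32_t r = (uint32_t)hi << 16;   /* lis: load immediate, shifted */
    r += lo;                           /* addi adds the immediate ... */
    if (lo & 0x8000u)
        r -= 0x10000u;                 /* ... after sign-extending it */
    return r;
}

/* Initial r1: the address just past a 4096-byte stack array. */
static uint32_t stack_top(uint32_t stack_addr)
{
    return lis_addi(ha16(stack_addr), lo16(stack_addr)) + 4096u;
}
```

For an address like 0x00019000 the low half is 0x9000, which addi treats
as negative, so @ha yields 2 rather than 1 and the sum still comes out as
0x00019000.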

>         mfmsr   8 	;r8  = mfmsr (machine state reg)
>         li      0,0	; r0 = 0 ?
>         mtmsr   0	; msr = r0 ( or zero since we just moved 0 into r0)
>         isync 	' instruction sync

Among other things, the msr contains the bits (IR and DR) which enable
MMU translation. Since we're about to play with the BAT's, we really
don't want to be using them.

OF is supposed to be running in real mode (no translation), but some
systems run with translation on but mapped such that VA == PA. So nothing
bad happens when we turn off the MMU. But we make sure that no
instructions that were fetched before that are used (isync).
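
As a sketch of the save/clear/restore dance, here is a toy C model. The
struct and function names are invented for illustration; the only real
facts in it are the positions of the IR and DR bits in the 32-bit MSR.

```c
#include <stdint.h>

#define MSR_IR 0x20u   /* instruction address translation enable */
#define MSR_DR 0x10u   /* data address translation enable */

/* Hypothetical toy CPU: just the MSR and the general-purpose registers. */
struct toy_cpu {
    uint32_t msr;
    uint32_t gpr[32];
};

static void toy_mfmsr(struct toy_cpu *c, unsigned rd) { c->gpr[rd] = c->msr; }
static void toy_mtmsr(struct toy_cpu *c, unsigned rs) { c->msr = c->gpr[rs]; }

static int translation_on(const struct toy_cpu *c)
{
    return (c->msr & (MSR_IR | MSR_DR)) != 0;
}

/* Mirrors: mfmsr 8 ; li 0,0 ; mtmsr 0 */
static void save_and_clear_msr(struct toy_cpu *c)
{
    toy_mfmsr(c, 8);   /* keep the old MSR in r8 for later */
    c->gpr[0] = 0;     /* li 0,0 */
    toy_mtmsr(c, 0);   /* MSR = 0, so IR and DR are both off */
}
```

Writing r8 back with mtmsr later puts the MSR exactly as it was found.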

>     'why all this manipulation with the instruction BAT upper registrers?

Do you really want to use them before you know what's in them? :-)

>         mtibatu 0,0
>         mtibatu 1,0
>         mtibatu 2,0
>         mtibatu 3,0
>         mtdbatu 0,0
>         mtdbatu 1,0
>         mtdbatu 2,0
>         mtdbatu 3,0
> 
>         li      9,0x12          /* BATL(0, BAT_M) */
>         mtibatl 0,9
>         mtdbatl 0,9
>         li      9,0x1ffe        /* BATU(0) */
>         mtibatu 0,9
>         mtdbatu 0,9
>         isync
> 
>         mtmsr   8
>         isync

Now that the BAT's are initialized, we can restore the MSR.
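
For the curious, here is a small C sketch that picks apart the two
constants loaded into BAT 0. The macro and function names are my own; the
field positions follow the 32-bit PowerPC BAT register layout as I
understand it, so check them against the processor manual.

```c
#include <stdint.h>

/* Upper BAT: block length mask plus supervisor/user valid bits. */
#define BATU_BL(u)   (((u) >> 2) & 0x7ffu)
#define BATU_VS(u)   (((u) >> 1) & 1u)
#define BATU_VP(u)   ((u) & 1u)

/* Lower BAT: WIMG storage attributes and PP protection bits. */
#define BATL_WIMG(l) (((l) >> 3) & 0xfu)
#define BATL_PP(l)   ((l) & 3u)

/* Block size in bytes: (BL + 1) units of 128 KB. */
static uint32_t bat_block_bytes(uint32_t batu)
{
    return (BATU_BL(batu) + 1u) << 17;
}

/* Effective (upper) or physical (lower) base address of the block. */
static uint32_t bat_base(uint32_t bat)
{
    return bat & 0xfffe0000u;
}
```

Fed 0x1ffe and 0x12, this says: a 256 MB block starting at address 0,
valid in supervisor mode, mapped to physical address 0, with only the M
(memory coherence) attribute set and read/write access. In other words,
the low 256 MB gets mapped VA == PA.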

>         b       startup ; well this one i understand:-))
> ");

Take care,

Bill