Subject: Re: upgrading bootblocks
To: Chris Tribo <ctribo@college.dtcc.edu>
From: Tim Kelly <hockey@dialectronics.com>
List: port-macppc
Date: 12/13/2004 08:08:22
On Mon, 13 Dec 2004 02:26:37 -0500
Chris Tribo <ctribo@college.dtcc.edu> wrote:

> I just swapped in a new hard drive that had XP on it previously and
> put 2.0 on it. It still won't boot. So it's even more unlikely that
> this is an HD hardware related problem.

Comparing your tracing efforts to Chaz', there are some differences.
Chaz' registers:

0 > boot scsi-int/sd@3:0 NETBSD.MACPPC DEFAULT CATCH!, code=FFF00300
  ok
0 > .registers dev /memory .properties
Client's Fix Pt Regs:
  00 DEADBEEF 00E0EB00 DEADBEEF 00E0C840 00E0A864 00E0CAD0 00000004 00000002
  08 00E0F090 FF8099B8 00000000 00000000 00000000 DEADBEEF DEADBEEF DEADBEEF
  10 DEADBEEF DEADBEEF DEADBEEF DEADBEEF 000042F4 00000200 FF8099B8 00000000
  18 00000000 FF8D7C80 00010000 007E0000 000043BC 00000078 000043A0 FFFFFFFF

Special Regs:
     %IV: DEADBEEF   %SRR0: 006000D8   %SRR1: 00003070
     %CR: 35ADBEEF     %LR: 006000D8    %CTR: FF8099B8    %XER: C000BE6F
    %DAR: 00001000  %DSISR: 42000000   %SDR1: 00FE0000

His bootxx jumped to 0x600000 but the ofwboot version expected to be
loaded at 0xe00000. Your registers:


Client's Fix Pt Regs:
  00 00000000 00E0EB40 DEADBEEF 00000000 00000000 FF809C78 DEADBEEF DEADBEEF
  08 00003070 00000002 DEADBEEF 00000000 00160400 DEADBEEF DEADBEEF DEADBEEF
  10 DEADBEEF DEADBEEF DEADBEEF DEADBEEF DEADBEEF DEADBEEF DEADBEEF 00E00000
  18 00000000 FF809C78 FF8D7980 00000000 00004424 00000B02 00E0E000 00000007

Special Regs:
     %IV: 00000700   %SRR0: 00E00040   %SRR1: 00080000
     %CR: 3EADBEEF     %LR: 0000436C    %CTR: 00E00000    %XER: C000B36F
    %DAR: 00001000  %DSISR: 42000000   %SDR1: 00FE0000
  ok


Your bootxx jumped to 0xe00000, and it looks like there are initially
valid PowerPC opcodes:

%SRR0 10 - dis

00E00030: 7C1683A6
00E00034: 7C1883A6
00E00038: 7C1A83A6
00E0003C: 7C1C83A6
00E00040: F1087064
00E00044: 0000A1A3
00E00048: 120E4657
00E0004C: 422C4A61
00E00050: 636B4861
00E00054: 6D6D6572
00E00058: 02011204
00E0005C: 73637369
00E00060: 011A1206
00E00064: 31303230
00E00068: 56310119
00E0006C: A5A50103
00E00070: 0113A501
00E00074: 11A50111
00E00078: 01120112

%SRR0 holds the address of the next instruction to be executed, and it
was always 0xE00040. I ran this through a couple of utilities on MacOS.
The code looked off-track from E00040 down (look in the last column for
the matching opcodes from your dis readout).

 Disassembling PowerPC code from 086791D0
  No procedure name
086791D0   mtibatu    0x03,r0       ; IBAT3U = 0x0216 | 7C1683A6
086791D4   mtspr      DBAT0U,r0  ; 0x0218   | 7C1883A6
086791D8   mtspr      DBAT1U,r0  ; 0x021A  | 7C1A83A6
086791DC   mtspr      DBAT2U,r0  ; 0x021C  | 7C1C83A6 <--- our last real opcode

The rest is junk:
086791E0   dc.l       0xF1087064                | F1087064
086791E4   dc.l       0x0000A1A3               | 0000A1A3
086791E8   dc.l       0x120E4657                | 120E4657
086791EC   bdnzl+  $+0x4A60  ;  0x0867DC4C | 422C4A61            
086791F0   ori        r11,r27,0x4861             | 636B4861
086791F4   xoris      r13,r11,0x6572            | 6D6D6572
086791F8   dc.l       0x02011204                 | 02011204
086791FC   andi.      r3,r27,0x7369             | 73637369
08679200   dc.l       0x011A1206                 | 011A1206
08679204   addic      r9,r16,0x3230             | 31303230
08679208   rlwinm.    r17,r17,0x00,0x04,0x0C  | 56310119
0867920C   lhzu       r13,0x0103(r5)             | A5A50103
08679210   dc.l       0x0113A501                 | 0113A501
08679214   dc.l       0x11A50111                 | 11A50111
08679218   dc.l       0x01120112                 | 01120112
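
One more way to look at the junk is as bytes rather than opcodes. The
sketch below is my own illustration, not one of the utilities mentioned
above: it walks the words from the dis readout in big-endian order and
collects runs of printable ASCII. The names junk and ascii_runs and the
minimum run length are assumptions picked for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Words copied from the dis readout, starting at 00E00040. */
static const uint32_t junk[] = {
    0xF1087064, 0x0000A1A3, 0x120E4657, 0x422C4A61, 0x636B4861,
    0x6D6D6572, 0x02011204, 0x73637369, 0x011A1206, 0x31303230,
    0x56310119, 0xA5A50103, 0x0113A501, 0x11A50111, 0x01120112,
};

/*
 * Hypothetical helper: write every run of at least min_len printable
 * bytes to 'out', one run per line, and return the number of runs.
 * Runs longer than 63 bytes are split, which is fine for a dump this small.
 */
static int ascii_runs(const uint32_t *words, size_t nwords, size_t min_len,
                      char *out, size_t out_size)
{
    char run[64];
    size_t len = 0, used = 0, i;
    int found = 0;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    /* the extra pass at i == nwords * 4 sees c == 0 and flushes the last run */
    for (i = 0; i <= nwords * 4; i++) {
        unsigned c = 0;

        if (i < nwords * 4)
            c = (words[i / 4] >> (24 - 8 * (i % 4))) & 0xFF;

        if (c >= 0x20 && c <= 0x7E && len < sizeof run - 1) {
            run[len++] = (char)c;
            continue;
        }
        if (len >= min_len && used + len + 2 <= out_size) {
            memcpy(out + used, run, len);
            used += len;
            out[used++] = '\n';
            out[used] = '\0';
            found++;
        }
        len = 0;
    }
    return found;
}
```

Called with a minimum length of 4, this turns up three strings in the
words above: "FWB,JackHammer", "scsi" and "1020V1". So the bytes after
E00040 look like data belonging to something else rather than random
noise. Where they came from is not something the dump alone settles.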

In bootxx.c, this code is responsible
for loading the secondary bootloader into memory:

for (j = 0; j < bbinfo.bbi_block_count; j++) {
        /* a zero entry terminates the block table */
        if ((blk = bbinfo.bbi_block_table[j]) == 0)
                break;
        putc('0' + j % 10);                     /* progress digit */
        OF_seek(fd, (u_quad_t)blk * 512);       /* raw byte offset on disk */
        OF_read(fd, addr, bbinfo.bbi_block_size);
        addr += bbinfo.bbi_block_size;
}

The problem is that OF's read is a raw disk reader. It has no knowledge
of the file system underneath. When the replacement copy of ofwboot was
moved into place, the file system either did not lay it out contiguously
or overwrote those blocks later, and OF just read the number of bytes it
was told to read.
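
To make the failure mode concrete, here is a small C sketch that mimics
the loop against an in-memory "disk". Everything in it is a stand-in:
load_by_table and demo_stale_table are hypothetical names, the disk is
just an array, and I assume a fixed 512-byte block instead of
bbi_block_size. It only shows that a loader driven by a recorded block
list returns whatever currently lives in those blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 512  /* assumption: one table entry covers 512 bytes */

/*
 * Hypothetical stand-in for the OF_seek/OF_read pair: copy the raw
 * blocks named in table[] into dst, stopping at a zero entry.
 */
static size_t load_by_table(const uint8_t *disk, size_t disk_blocks,
                            const uint32_t *table, size_t count,
                            uint8_t *dst)
{
    size_t loaded = 0, j;

    for (j = 0; j < count; j++) {
        uint32_t blk = table[j];

        if (blk == 0)
            break;                      /* same terminator as bootxx */
        assert(blk < disk_blocks);
        memcpy(dst + loaded, disk + (size_t)blk * BLOCK_SIZE, BLOCK_SIZE);
        loaded += BLOCK_SIZE;
    }
    return loaded;
}

/* Returns 1 if the stale table picked up foreign data in block 3. */
static int demo_stale_table(void)
{
    static uint8_t disk[8 * BLOCK_SIZE];
    uint8_t loaded[2 * BLOCK_SIZE];
    const uint32_t table[] = { 2, 3, 0 };   /* recorded at install time */
    size_t n;

    /* the old bootloader occupied blocks 2 and 3 */
    memset(disk + 2 * BLOCK_SIZE, 'A', 2 * BLOCK_SIZE);
    /* later the file system reuses block 3 for something else */
    memset(disk + 3 * BLOCK_SIZE, 'Z', BLOCK_SIZE);

    n = load_by_table(disk, 8, table, 3, loaded);
    return n == 2 * BLOCK_SIZE && loaded[0] == 'A' && loaded[BLOCK_SIZE] == 'Z';
}
```

Note that this only models breakage on a block boundary. It does not
explain a change after just 64 bytes.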

I've seen this before, when doing a lot of kernel testing, where either
the bootloader or the kernel would get discontiguous. However, the
problem appears a tad more subtle. The first 64 bytes of the file are
intact; only after that point is something else present. It seems odd
to me that a file would get broken up so early, and that it would use
such a small portion of the block fragment.

So if I understand Izumi's directions and the installboot man pages,
we'll want to verify the integrity of two files: /usr/mdec/ofwboot and
/ofwboot. We can do this with

hexdump /ofwboot > ofwboot.hexdump
hexdump /usr/mdec/ofwboot > ofwboot.mdec.hexdump

Since the problem is occurring within the first 40 instructions, just
send me the first 256 (0x100) bytes from each. Also, in order to check
the version, send me 256 bytes of the output from around 000c900.
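
For what it's worth, the comparison I have in mind boils down to
finding the first offset at which the two copies disagree. A minimal C
sketch of that, assuming the first 0x100 bytes of each file have
already been read into buffers (first_diff is a made-up name, and the
file reading is left out):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the offset of the first byte that differs
 * within the first n bytes of a and b, or -1 if they are identical.
 */
static long first_diff(const unsigned char *a, const unsigned char *b,
                       size_t n)
{
    size_t i;

    assert(a != NULL && b != NULL);
    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return (long)i;
    }
    return -1;
}
```

If the two copies only part ways at offset 0x40, that would line up
with what the registers show. If they match all the way through, the
damage is somewhere other than in the files themselves.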

thanks in advance,
tim