Subject: Re: amigappc: random hangs during startup
To: Frank Wille <frank@phoenix.owl.de>
From: Matt Thomas <matt@3am-software.com>
List: port-powerpc
Date: 01/19/2007 13:26:25
Frank Wille wrote:
> Hi!
> 
> After several days of debugging I would be grateful for a hint from one of
> the experts, otherwise I have run out of ideas.
> 
> The situation:
> 
> My kernel is loaded with a custom bootloader, Michael van Elst's gobsd,
> adapted for ELF and PPC by me and Gunther Nikl. The PPC's BATs and SRs may
> be in undefined state. MSR is PSL_IP and HID0 is zero, when the bootloader
> copies the kernel to its target address (0x08000000 in this case). It will
> also flush the data- and instruction-cache for the kernel area, although the
> caches should be off. It then jumps into locore.S (with rfi from reset vector).
> 
> locore.S clears the BATs and SRs, disables all interrupts and DMA,
> calculates startsym/endsym and calls initppc() and main(). The usual
> procedure. Caches will be invalidated and enabled before calling main().
> 
> Unfortunately the kernel sometimes hangs during initppc(). Often in
> ksyms_init() or pmap_bootstrap(), but other locations were also observed. It
> hangs nearly every time when using kernel symbols, but rarely also without.
> 
> I dumped all registers and important variables (like startsym, endsym,
> endkernel, etc.) to memory for postmortem analysis, and everything looks
> fine.
> 
> The frustrating part is that inserting debugging code sometimes moves the crash
> location, or lets the kernel boot normally. And as the crashes usually happen
> before main(), I have no console, which doesn't make it easy.
> 
> I'm running with an adapted trap_subr.S, which works with exception vectors
> at 0xfff00000 (PSL_IP). And I'm setting three IBATs and three DBATs at the
> beginning of initppc() for 128M at 0x08000000 (RAM), 512k at 0xfff00000
> (exception area) and 16M at 0x00000000 (slow Amiga "Chip-RAM" for DMA and
> memory-mapped I/O region). This is a 604e system.
> 
> As I'm quite new to kernel programming, do you have any ideas where I could
> look? Does this behaviour remind you of something you experienced
> yourself?
> 
> Grateful for any hints...

Are there any devices mapped in 0xd..._.... (KERNEL_SR), 0xe..._.... (KERNEL_SR2),
or 0xc..._.... (USER_SR)?  The BAT code doesn't really like sub-256M BATs.
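
To make the two checks above concrete, here is a minimal sketch in C. It assumes only what is stated in this message: the top four bits of a 32-bit effective address select the segment, and segments 0xc, 0xd and 0xe are the ones named USER_SR, KERNEL_SR and KERNEL_SR2. The helper names (segment_of, is_reserved_segment, covers_whole_segments) are hypothetical and are not taken from the kernel sources.

```c
#include <stdint.h>

/* Segment numbers as named in the message (top nibble of the address). */
#define USER_SR      0xcu
#define KERNEL_SR    0xdu
#define KERNEL_SR2   0xeu   /* spelling follows the message */

#define SEGMENT_SIZE 0x10000000u   /* one segment spans 256M */

/* Hypothetical helper: which of the 16 segments does addr fall into? */
unsigned segment_of(uint32_t addr)
{
    return addr >> 28;
}

/* Hypothetical helper: is addr inside one of the three segments asked about? */
int is_reserved_segment(uint32_t addr)
{
    unsigned sr = segment_of(addr);
    return sr == USER_SR || sr == KERNEL_SR || sr == KERNEL_SR2;
}

/*
 * Hypothetical helper: does a BAT region start on a 256M boundary and
 * cover a whole number of 256M segments? A "sub-256M BAT" fails this.
 */
int covers_whole_segments(uint32_t base, uint32_t size)
{
    return size >= SEGMENT_SIZE &&
           base % SEGMENT_SIZE == 0 &&
           size % SEGMENT_SIZE == 0;
}
```

Applied to the three regions in the question (128M at 0x08000000, 512k at 0xfff00000, 16M at 0x00000000), none falls in segments 0xc to 0xe, but all three are smaller than 256M, so covers_whole_segments() is false for each of them.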