Subject: Re: new toolchain problems
To: Rafal Boni <rafal@mediaone.net>
From: sgimips NetBSD list <sgimips@mrynet.com>
List: port-sgimips
Date: 01/28/2002 12:54:45
> In message <200201281442.g0SEgJK67524@mrynet.com>, you write: 
> 
> -> I've been struggling for quite a long time to get a snapshot built.
> -> 
> -> At first, I was struggling with this on an R5K INDY.  This machine
> -> would regularly hang--no pattern to when or where.  Only power
> -> cycling the machine would get it going again.  
> -> 
> -> So, this weekend I moved to an R4400, in case I had a hardware
> -> problem.  I should mention that the R5K works fine using one
> -> of my snapshots from late December.
> 
> Hmm, ENOR5k, so I can't help you here 8-/

Once I get a build completed, I'll install on that machine and
we'll know the answer here.

> -> On the R4400 now, the following race condition occurs:
> -> 
> -> install ===> libc
> [...]
> -> STRIP=/sgimips/src/tools/obj/tools.NetBSD-1.5Z-mipseb/bin/mipseb--netbsd-strip
> -> /sgimips/src/tools/obj/tools.NetBSD-1.5Z-mipseb/bin/nbinstall  -l s -r libc.so.12.82  /usr/lib/libc.so
> -> Segmentation fault - core dumped
> -> *** Error code 139
> -> 
> -> Once that libc.so link is created, the majority of dynamic executables
> -> on the machine are rendered inoperable, resulting in the same
> -> Segfault.
> -> 
> -> I'm at a loss as to what I should do at this point.  Any suggestions?
> -> Even more importantly, has anyone been able to build a snapshot this
> -> year?
> 
> MAKE SURE TO BUILD & INSTALL ld.elf_so FIRST!!! (sorry to yell, but it is
> *really* important).  Once that's done, you should be OK.
> 
> Probably the correct method to build is:
> 	build.sh -t
> 	(maybe do a make includes)
> 	(now with new tools) build & install ld.elf_so
> 	build.sh to build rest of system.

The build.sh -t step is new to me, and I hadn't run make includes
previously, so hopefully the build will succeed with this method.  I'll
post the results.
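
For my own notes, here is the suggested order written out as a dry-run
sketch.  It only prints the commands and executes nothing.  SRC is taken
from the tools path in the log above; MAKE and the libexec/ld.elf_so
location are assumptions on my part, so adjust them to your tree.

```shell
#!/bin/sh
# Dry-run sketch of the suggested build order: prints commands only.
# Assumptions: SRC is the source tree root (guessed from the tools path
# in the log), MAKE is whichever make the new tools provide, and
# ld.elf_so is built from libexec/ld.elf_so under SRC.
SRC=${SRC:-/sgimips/src}
MAKE=${MAKE:-make}

build_plan() {
    # 1. Build the new toolchain first.
    echo "cd $SRC && ./build.sh -t"
    # 2. Possibly refresh the installed headers.
    echo "cd $SRC && $MAKE includes"
    # 3. Build and install the dynamic linker before anything else.
    echo "cd $SRC/libexec/ld.elf_so && $MAKE && $MAKE install"
    # 4. Build the rest of the system.
    echo "cd $SRC && ./build.sh"
}

build_plan
```

The point of the ordering is that ld.elf_so is installed before the new
libc.so link appears in /usr/lib.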

> I built a complete snapshot shortly before the new year (either right after
> or right before Jason switched to NEW_TOOLCHAIN); I haven't had much time to
> play with my SGI gear since, but I did start looking at your serial console
> crash last week -- the good news is that it's reproducible here, the bad is
> that it's baffling (though I only had a little time to look at it last week).

Oh, man.  I finally feel like I'm not (excessively) insane... I thought I was 
the only person having console problems.

By the way, I still have one machine that I absolutely cannot get NetBSD
running on.  Its PROM revision is the only difference from another R4400:
	PROM Monitor SGI Version 5.1.2 Rev B4 R4X00 IP24 Dec  9, 1993 (BE)
The working machine's PROM is:
	PROM Monitor SGI Version 5.3 Rev B10 R4X00/R5000 IP24 Feb 12, 1996 (BE)

The kernel hangs loading at the following point:

        Mem block 1: type 0, base 0x0, size 0x1
        Mem block 2: type 1, base 0x1, size 0x1
        Mem block 3: type 5, base 0x8002, size 0xc
        Mem block 4: type 3, base 0x800e, size 0x732
        Mem block 5: type 6, base 0x8740, size 0xc0
        Mem block 6: type 3, base 0x8800, size 0x3800
        Loading cluster 2: 0x8002 / 0x800e
        Cluster 3 contains kernel
        Loading chunk before kernel: 0x800e / 0x8069
        Loading chunk after kernel: 0x81ee / 0x8740
        Loading cluster 5: 0x8800 / 0xc000

(This is before the NetBSD kernel banner appears.)
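
To make the listing easier to read, here is a small sketch that converts
the base and size values into byte addresses.  It assumes the numbers are
counts of 4 KB pages, which I have not verified against the PROM
documentation; page_to_addr is just a helper name made up for this note.

```shell
#!/bin/sh
# Assumption: base/size in the "Mem block" lines are 4 KB page counts.
PAGE_SIZE=4096

# page_to_addr PAGES: print PAGES * PAGE_SIZE as an 8-digit hex value.
page_to_addr() {
    printf '0x%08x\n' $(( $1 * $PAGE_SIZE ))
}

page_to_addr 0x8002            # base of Mem block 3
page_to_addr $(( 0x8800 + 0x3800 ))   # end of Mem block 6
```

Under that assumption, block 6 ends at page 0xc000, which matches the end
of the last "Loading cluster 5" line, so the hang appears to come after
the final cluster has been loaded.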

Any recommendation for what info I can provide or what I should
do to debug this?

Cheers,
-scott