Subject: Re: Bombing sun3 installs...revisited
To: BSD Bob <bsdbob@weedcon1.cropsci.ncsu.edu>
From: Manuel Bouyer <bouyer@antioche.lip6.fr>
List: port-sun3
Date: 07/31/1999 16:21:12
On Fri, Jul 30, 1999 at 02:23:55PM -0400, BSD Bob wrote:
> Well, after a lot of headscratching, I was able to load 1.3.2, just
> fine.  I have not had a chance to load test it for read/write to see
> if all those scsi bus timeout problems I had earlier are still there.
> But, 1.4 still blows up.   A little constructive criticism may
> be in order, not of anyone or anything in particular, but just to
> help others from not stumbling all over themselves, like I did.
> 
> 1.  Part of my problem was that the ftp on netbsd.org was breaking links
>     in the calls out of the sun3 tree, and my stoopid AIX ftp did not
>     return them correctly.  That is partially my problem, but curable
>     if I do a pwd in each and every tree level to make sure the link
>     is correct and not broken.  That is partially NetBSD's problem,
>     in that one should NEVER link across architectures, for any reason.
>     Always cp the common files into the target architecture and NEVER
>     USE LINKS.  BAD KARMA, BAD JUJU, SLEEPLESS NIGHTS ENSUE, IF LINKS
>     BREAK FOR ANY REASON (like a dumb ftp on my end).  For days, I have
>     been trying to boot M68K code, it seems....  Bad karma on sun3.

This is done because a lot of different archs share some m68k bits, and
linking them saves a lot of space on the ftp server.
I think your ftp client is definitely broken here; I've never had
a problem because of this.
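
If you want to check a downloaded tree for links that did not survive the
transfer, something like the following could help. It is only a sketch: the
function name list_broken_links is made up for illustration, and it assumes
a find(1) that supports -type l and -exec.

```sh
# list_broken_links DIR
# Print every symbolic link under DIR whose target does not exist.
# (Hypothetical helper, not part of any NetBSD install tool.)
list_broken_links() {
    # test -e follows the link, so it fails when the target is missing
    find "$1" -type l ! -exec test -e {} \; -print
}
```

Running it on the top of the local copy of the sun3 tree should print
nothing if every link resolves.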

> 
>     Also, the mget of the entire sun3 tree as a tarball should return
>     the files and not the broken links.   That is a problem on NetBSD's
>     end, I would expect.  It should always default to the files and not
>     the links, if possible.  There should be a switch in tar when it
>     is called, to do that, I would think.
> 
> 2.  In the 1.4 install script on the booted miniroot from sd(0,0,1) -s,
>     you need a small fix so that you don't have to invoke edlabel at
>     every boot, even if the disk is already edlabeled.  On my machine,
>     it hung anytime I did not pass through edlabel, again, first.
> 
> 3.  In the 1.4 install script welcome menu, and in the various and sundry
>     incantations of INSTALL(8),  it should be fsf 3 and not fsf 2.  IFF
>     you choose fsf 2, you inadvertently install the sun3x boot kernel
>     instead of the miniroot, and don't catch it until it tries to boot,
>     and have to reinstall from the ground up, again.  Bad juju.
> 
> 4.  In the miniroot boot, where it starts, it calls an error due to
>     a bombed ld.so file...../usr/lib/ld.so: warning: libc.so.12.20:
>     minor version >=40 expected, using it anyway.  It sounds like the
>     classic wrong ld.so file.  It causes further errors that lock up
>     the machine because it can't find or interpret a root disk query
>     call.
> 
> 5.  Next, an undefined symbol is found in /usr/libexec/ld.so
>     "___sigaction14" called from  sort: sort at 0x60b4.
>     I would expect it is trying to sort a bogus drive list, from
>     where ld.so bombed, above.

This is related to 4. You're using an older libc version, which lacks some
symbols.
1.4's libc is definitely 12.40; I don't know where 12.20 comes from.
You should be able to extract libc.so.12.40 from base.tgz.
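
A minimal sketch of pulling only libc out of the set, assuming a working
tar and gzip are available (for example from the 1.3.2 install). The
function name extract_libc and both of its arguments are made up for
illustration; the member path is looked up in the archive rather than
guessed.

```sh
# extract_libc SETFILE DESTDIR
# Extract only the libc.so.* members of SETFILE into DESTDIR.
# (Hypothetical helper; SETFILE would be your copy of base.tgz.)
extract_libc() {
    set_file=$1
    dest=$2
    # List the archive first so we use the member names exactly as stored
    members=$(tar -tzf "$set_file" | grep 'usr/lib/libc\.so\.') || return 1
    # $members is left unquoted on purpose: one argument per member
    tar -xzpf "$set_file" -C "$dest" $members
}
```

With DESTDIR set to / this would drop the library into /usr/lib; extracting
to a scratch directory first lets you look at what you got.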

> 
> 6.  Finally, where it asks for available drives as root, the code
>     gets lost, and anything that is entered is improperly assessed,
>     and it locks up calling for the root disk, the root disk, the
>     root disk..... time to hit the little white reset button on the
>     back of the cpu card......

Maybe because it couldn't get the list of disks before ...

> 
> Although it appears to be a scrambled ld.so problem, it nukes my
> install at that point.  So, something is still not quite right,
> in  1.4 on the 3/260 box.  It died similarly on the 3/110 box.
> How are you folks getting it beyond that on  3/60's or 3/80's?
> 
> Any ideas as to how to get around it?

Try to find a libc.so.12.40 (I'm pretty sure there is one in the
base.tgz file).

If you have 1.3.2 installed, you can 'upgrade' to 1.4 this way:
1) mv /netbsd /netbsd.old, extract kern.tgz, and reboot single user
2) once the 1.4 kernel has booted, you can start extracting the 1.4 userland
   bits this way: first remove /usr/include with 'rm -rf /usr/include', then
   cd / && tar --unlink -xvzf /..../base.tgz
   and do the same for the other sets.
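
The two steps above can be sketched as a small loop over the sets. This is
an illustration only: the function name extract_sets, the /mnt/sets
directory and the choice of sets in the usage comments are assumptions, and
the tar flags are the ones from the command above (--unlink may behave
differently with other tar implementations).

```sh
# extract_sets ROOT SETDIR SET...
# Unpack each named set (SETDIR/SET.tgz) over ROOT, stopping on the
# first failure. SETDIR should be an absolute path.
# (Hypothetical helper, not part of the NetBSD install scripts.)
extract_sets() {
    root=$1
    setdir=$2
    shift 2
    for s in "$@"; do
        f="$setdir/$s.tgz"
        [ -f "$f" ] || { echo "missing set: $f" >&2; return 1; }
        # Same tar invocation as in step 2, run from the target root
        (cd "$root" && tar --unlink -xvzf "$f") || return 1
    done
}

# Hypothetical usage, as root (paths and set list are assumptions):
#   mv /netbsd /netbsd.old
#   extract_sets / /mnt/sets kern      # then reboot single user
#   rm -rf /usr/include
#   extract_sets / /mnt/sets base comp
```

Stopping at the first missing or unreadable set keeps a half-fetched
download from leaving the system with a mix of old and new bits without
any warning.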


--
Manuel Bouyer <bouyer@antioche.eu.org>
--