NetBSD-Users archive


Re: Prepping to install: a digression



On 07/11/15 11:06, William A. Mahaffey III wrote:
On 07/06/15 15:52, William A. Mahaffey III wrote:
On 07/04/15 09:51, William A. Mahaffey III wrote:
On 07/01/15 19:38, Robert Elz wrote:
     Date:        Wed, 01 Jul 2015 10:40:24 -0453
     From:        "William A. Mahaffey III" <wam%hiwaay.net@localhost>
     Message-ID:  <55940871.60702%hiwaay.net@localhost>

   | However, this time
   | I can boot back into the boot media when I plug it in & reboot, I think
   | because I *didn't* do the 'raidctl -A root raid0' command during my
   | shell excursion.

That would be why - and you really do NOT want to do that until you are
certain that everything is correctly set up and working.

Boot back to the state you showed at the end of LIST.setup2.txt (the
output from setup0 and setup1 was not useful - that's just stuff working
normally, we do not need to see that).

That is, boot with root on sd0a and the (later intended) root on /altroot with /altroot/usr also mounted (/altroot/home should make no difference one
way or the other).

Next
    chroot /altroot

At that point run a bunch of commands and make sure everything is working (and check that /sbin/init exists and is executable - you won't be able to run that one). Check that /dev is sane (entries for the raids you need, the wd devices you have, console, null, ptys, ...) or completely empty.

Check, there are many entries in /dev, notably for all wd's, raid's, console, null, ptys, etc. No man pages, but a few things in /bin & /sbin. I didn't try them all, but the commands I did try worked sanely. If you need more specific info, don't hesitate to ask for it.
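
A minimal sketch of the checks described above, assuming the future root is mounted on /altroot. check_altroot is a hypothetical helper, not a NetBSD tool, and the list of device nodes it looks for (console, null, wd0a, raid0a) is only an assumption based on the devices named in this thread. It tests paths from outside the chroot, which is equivalent for existence checks, and it accepts a completely empty /dev, as noted above.

```shell
# Hypothetical helper: sanity-check a not-yet-booted root tree.
# Returns 0 if everything looks sane, 1 otherwise.
check_altroot() {
    root=${1:-/altroot}
    status=0

    # /sbin/init must exist and be executable.
    if [ ! -x "$root/sbin/init" ]; then
        echo "missing or not executable: $root/sbin/init"
        status=1
    fi

    # /dev must be either completely empty or contain the expected nodes.
    if [ -z "$(ls -A "$root/dev" 2>/dev/null)" ]; then
        echo "note: $root/dev is empty"
    else
        # Assumed node names; adjust to the drives and raids actually in use.
        for node in console null wd0a raid0a; do
            if [ ! -e "$root/dev/$node" ]; then
                echo "missing device node: $root/dev/$node"
                status=1
            fi
        done
    fi

    return $status
}
```

From the install shell this would be run as "check_altroot /altroot"; any line it prints other than the empty-/dev note points at something to fix before going further.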


Run fdisk on wd0 (or whichever drive you intend to actually boot from)
while you are still chrooted to /altroot.

See attached. Note that the attached was created a few days ago, *not* from the chrooted environment, however, I wrote down most of what I thought was the critical info from the chroot'ed output, & it is identical to the attached file. Fdisk info for wd1 is identical, w/ only the partition referenced different. I also attach disklabel info for wd0, & again, wd1 is identical except for referenced disk.


Make sure it is correctly set up: it should have an MBR, or PMBR, and
should be marked as bootable, with a bootable partition on it, and boot code correctly installed. Make sure you understand how that code is going to locate /boot (if you want it to use the one that is in /altroot, then the offsets of the partitions all need to be just right). You will need to get someone who has set up actually booting from a raid1 to verify
your setup; I don't run my systems that way, I prefer a separate boot
partition on wd0 (duplicated on wd1 or wd2 or whatever is appropriate).

See attached fdisk info, PBR is *not* bootable, so I guess I start there .... What next ?


Also check that the bios is set to actually boot from the drive you think, which can be tricky if you have a whole bunch of basically identical drives.
What the bios thinks of as the boot drive might not be the one you are
expecting.

BIOS boots from USB 1st, then HDD, w/ HDD order from 1 to 6, for 6 identical drives, possibly an issue as you allude to, but that is down the road for now.


For problems at that stage, what is important to see is not the raid setup, but the drive layouts, labels (fdisk, gpt, disklabel - whatever is actually
in use) of the boot drive, and the boot raid partition.

Once you have all that right, as best you believe it can be, boot without sd0 (the thumb drive, I assume) connected - in that state, if you get to
the state where the system looks to be booting, but cannot find a root
filesystem (that is, if the kernel boots, lists the hardware, etc, and then
fails to find root) then you're in a good situation.

If it is still unable to boot, you don't have the boot setup correct yet, and you will need to work on that part - making sure the MBR or PMBR is correct, that installboot has been done correctly, and that the boot code can locate
/boot at one of the (very few) places it looks.

Once booting is right to the state of not finding root, and if you have done the chroot part above, and are fairly sure that the system is correctly installed and all the important parts are there, then you should reboot from sd0, and do the "raidctl -A root raid0" bit so that raidframe will make raid0a the root filesystem - then reboot again without sd0 and all should
be OK.
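
A minimal sketch of that last step, assuming the set is raid0 as in this thread. make_raid_root and the RUN variable are hypothetical names used only for illustration; the two raidctl invocations (-s to show status, -A root to mark the set auto-configurable and eligible as root) are the standard ones. By default the commands are only printed; set RUN to the empty string to actually execute them.

```shell
# Dry-run by default: print the commands instead of running them.
# Use RUN= (empty) in the environment to execute for real.
RUN=${RUN-echo}

# Hypothetical helper: show the set's status, then mark it as root.
make_raid_root() {
    raid=${1:-raid0}
    $RUN raidctl -s "$raid"        # inspect component state first
    $RUN raidctl -A root "$raid"   # auto-configure and make eligible as root
}

make_raid_root raid0
```

Looking at the status output before marking the set as root fits the advice above to do this only once everything else is known to be working.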

Finally, if you need to (almost) start all of this again (which you easily might) - skip everything related to /home. You don't need all that space just to get booted, and initing that 3.5TB raid takes a long time. Everything else should be fairly fast - so it is less painful to do it again, and again, until it all works. Once the system is properly up and running, you can easily configure that raid array using the running system, mount it on /mnt, copy whatever you might have added to /home in the interim to it, fix fstab to mount it on /home, and then reboot. But only after you can boot, and
shutdown and reboot, successfully, and with no hassles, without it.

kre

Anything on this, anyone ? I am thinking of booting back into the install shell, verifying the FS type (FFSv[1,2]) of raid0 (the intended root drive) & manually using installboot to install the bootxx_ffsv<whatever-raid0-is> onto the 2 underlying drives (rwd[0,1]a) & seeing if that changes FDISK's opinion of whether my PBR is bootable. Would that make any difference ? Please advise & have a good one.
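
A minimal sketch of the installboot plan described above, assuming the primary bootstrap lives in /usr/mdec and the raid components are the "a" partitions of wd0 and wd1. install_bootblocks and the RUN variable are hypothetical names; the filesystem type (ffsv1 or ffsv2) has to be supplied by whoever runs it, and the ffsv1 in the example call is only a placeholder. Commands are printed rather than executed unless RUN is set to the empty string.

```shell
# Dry-run by default: print the commands instead of running them.
# Use RUN= (empty) in the environment to execute for real.
RUN=${RUN-echo}

# Hypothetical helper: install the primary bootstrap matching the given
# filesystem type onto the raw "a" partition of each listed disk.
install_bootblocks() {
    fstype=$1      # e.g. ffsv1 or ffsv2, as determined beforehand
    shift
    for disk in "$@"; do
        $RUN installboot -v "/dev/r${disk}a" "/usr/mdec/bootxx_${fstype}"
    done
}

install_bootblocks ffsv1 wd0 wd1
```

Running fdisk on each drive afterwards shows whether the complaint about the PBR not being bootable has gone away.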


OK, I did as I threatened (& as recommended on the installboot online man page), booted back into the install shell, found that the FS type of /dev/raid0a was FFSv1, installboot'ed /usr/mdec/bootxx_ffsv1 onto /dev/rwd[0,1]a (the 2 underlying devices of the (RAID1) raid0 device), fdisk'ed the 2 wd's & it did *NOT* have the line about 'active PBR not bootable'. That sounded promising, so I poweroff'ed, removed the install USB stick & powered up. It came up w/ the NetBSD boot screen & started to boot. It got past recognizing the 3 RAID devices, all properly sized, then got to the following, done by writing it down from the screen since I can't figure out how to capture the output any other way (clues :-) ?):

boot device: wd0
root on wd0a dumps on wd0b
vfs_mountroot: can't open root device
cannot mount root, error = 16
root device (default wd0a):
dump device (default wd0b):


& the process seems to be hung right there. I did hit CR after the root device prompt, it didn't seem to take it, timed out to the next prompt, where it has been for several min. now. All text is green on a black console screen if that matters. *Any* help appreciated :-) !!!!
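
For reading the "error = 16" in the output above: in the BSD errno numbering, 16 is EBUSY ("Device busy"). The sketch below is a tiny lookup covering only a few values; errno_name is a hypothetical helper, not a system utility. One possible reading, which is an interpretation and not something confirmed in this thread, is that wd0a is busy because RAIDframe already holds it open as a component of raid0. That would fit the earlier advice that raid0a only becomes the root filesystem after "raidctl -A root raid0" has been done.

```shell
# Hypothetical helper: map a few errno numbers to their symbolic names.
# Only a handful of values are covered; anything else prints "unknown".
errno_name() {
    case $1 in
        2)  echo ENOENT ;;   # No such file or directory
        6)  echo ENXIO ;;    # Device not configured
        16) echo EBUSY ;;    # Device busy
        19) echo ENODEV ;;   # Operation not supported by device
        *)  echo unknown ;;
    esac
}

errno_name 16
```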


Anything on this, anyone ? I am (re-)reading the online raidctl & installboot man pages, but I am out of ideas for now .... Any more info needed, please ask, & *any* help appreciated, I am stuck :-( ....

--

	William A. Mahaffey III

 ----------------------------------------------------------------------

	"The M1 Garand is without doubt the finest implement of war
	 ever devised by man."
                           -- Gen. George S. Patton Jr.


