Current-Users archive

Re: Heads up! amd64 secondary bootstrap broken?



On Nov 27,  6:00am, Paul Goyette wrote:
} On Sat, 11 Aug 2012, John Nemeth wrote:
} > On Nov 26, 10:09pm, Paul Goyette wrote:
} > }
} > } I just started an update of one of my machines, and things are now quite
} > } broken!
} > }
} > } I updated my bootblocks as well as installing a new 6.99.10 kernel.  I
} > } used the following commands:
} > }
} > }   # cp $DESTDIR/amd64/usr/mdec/boot /boot
} > }   # installboot -v /dev/wd0a $DESTDIR/amd64/usr/mdec/bootxx_ffsv1
} > }
} > } The machine is now dead.  :(
} > }
} > } It boots the primary bootstrap and presents the menu.  It allows me to
} > } boot memtestplus (from an old install!).  However, when trying to boot
} > } either the old or new NetBSD kernel, it gets as far as trying to mount
} > } the root file system, and stops with
} > }
} > }   boot device wd0
} > }   root on wd0a dumps on wd0b
} > }   cannot mount root, error = 79
} > }   root device (default wd0a):
} >
} >     This has very little to do with the boot code.  The primary
} > purpose of the boot code is to load the kernel and execute it.  The
} > first message from the kernel is the copyright.  At this point, the
} > kernel has completed autoconf (i.e. it has been running for a while).
} > Did you look up the error code?
} >
} > P4-3679GHz: {257} grep 79 /usr/include/sys/errno.h
} > #define EFTYPE          79              /* Inappropriate file type or format */
} >
} > Did your dmesg show the correct device for wd0?  What does it show for
} > drives?  Does your kernel have the correct filesystem for your root
} > filesystem?  Note: one relatively recent change to /boot is that it no
} > longer attempts to load ffs.kmod.
} >
} > } Of course, I'm using a USB keyboard which isn't working at this point,
} > } so I can't try much of anything else.
} > }
} > } I'm going to pull the drive out of the machine and use another box to
} > } reinstall a working bootxx_ffsv1, but I want to give folks a heads-up
} > } that there might be some breakage here.
} >
} >     bootxx_ffsv1 is there strictly to load /boot.  It obviously did
} > that, so it is working fine.  So far, you haven't provided any evidence
} > of breakage in the boot code.
} 
} Well, duh, yeah, you're right, of course - the boot code by this time 
} has finished all of its work.
} 
} HOWEVER,
} 
} I did as I had threatened - removed the hard drive and installed it in 
} one of my other machines which still had original boot code.  I 
} installed the earlier /boot and mdec/bootxx_ffsv1 onto that disk, and 
} then returned it to the original machine.
} 
} That machine now boots just fine all the way to multi-user.
} 
} So, while the boot code _should_ have long done its job and been gone, 
} restoring the original boot code does fix the problem.
} 
} Perhaps there's something wrong with some of the parameters that get 
} passed into the kernel from the boot code?

     That is possible.  I originally was going to say,
"This has nothing to do with the boot code."  I s/nothing/very little/
when it dawned on me that there was a small probability of it being bad
parameters.  However, you haven't answered the other questions.  Does
the dmesg show your drive(s)?  That would answer the question of bad
parameters.  You also didn't answer the question of whether your kernel
has the appropriate filesystem (most likely ffs).  As mentioned, a
relatively recent change to /boot is that it will no longer attempt to
load ffs.kmod.  If your kernel doesn't have ffs built in, then you will
definitely see the error above.  One way to test this is to put
back the new /boot, stop the countdown, type "load ffs" at the boot
prompt, then type "boot".
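
     As an aside on the errno lookup quoted above: a bare "grep 79"
also matches any line that merely contains those digits (179, 790, and
so on).  Below is a minimal sketch of a stricter lookup.  The function
name errno_name and its optional header argument are made up for
illustration; the default path is simply the header used above.

```shell
# errno_name NUM [HEADER]
# Print the #define lines in HEADER whose value is exactly NUM.
# HEADER defaults to the path grepped earlier in this thread
# (an assumption; other systems keep errno values elsewhere).
errno_name() {
    num=$1
    hdr=${2:-/usr/include/sys/errno.h}
    # Field 1 is "#define", field 2 the symbol, field 3 the value.
    awk -v n="$num" '$1 == "#define" && $3 == n { print }' "$hdr"
}
```

Running "errno_name 79" on the machine in question should then show
only the line for the code the kernel reported.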

}-- End of excerpt from Paul Goyette

