Subject: Re: MBR not working (1.5)
To: None <port-i386@netbsd.org>
From: Anne Bennett <anne@alcor.concordia.ca>
List: port-i386
Date: 11/21/2001 13:52:24

Wolfgang Solfrank <ws@tools.de> answered my query asking why my system
was failing to boot, after I had installed the following MBR and NetBSD
disklabel:

>> | Partition table:
>> | 0: <UNUSED>
>> | 1: <UNUSED>
>> | 2: <UNUSED>
>> | 3: sysid 169 (NetBSD)
>> |     start 0, size 16 (0 MB), flag 0x80
>> |         beg: cylinder    0, head   0, sector  1
>> |         end: cylinder    0, head   0, sector 16

>> #        size   offset  fstype [fsize bsize   cpg]
>>   a:    64512       63  4.2BSD   1024  8192    16   # (Cyl.  0*- 48*)
>>   b:  1038177    64575    swap                      # (Cyl.  48*- 820*)
>>   c:  8887137       63  unused      0     0         # (Cyl.  0*- 6612*)
>>   d:  8887200        0  unused      0     0         # (Cyl.  0 - 6612*)
[...]
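
The sector arithmetic in the disklabel quoted above can be checked mechanically. The following is a minimal Python sketch; the dictionary and the helper name `end` are illustrative only and do not come from any NetBSD tool. The sizes and offsets are copied from the label as quoted.

```python
# (size, offset) in sectors, copied from the disklabel quoted above.
label = {
    "a": (64512, 63),
    "b": (1038177, 64575),
    "c": (8887137, 63),
    "d": (8887200, 0),
}

def end(name):
    """Return the first sector past the end of the named partition."""
    size, offset = label[name]
    return offset + size

# Partition a ends exactly where b begins.
print(end("a") == label["b"][1])
# Partition c (the NetBSD part, starting at 63) ends where d (whole disk) ends.
print(end("c") == end("d"))
```

Both comparisons hold: a ends at sector 64575, where b starts, and c, which starts at sector 63, ends at sector 8887200, the size of the whole disk as given by d.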

Wolfgang, that was the clearest explanation of the i386 boot process
I have ever read!  Thank you!  Perhaps something like your explanation
should end up in the boot(8) manpage -- I had certainly read that page,
as well as installboot(8), fdisk(8), disklabel(8), and a few others,
at least twice each, without managing to piece together what you have
so clearly explained.

I now understand that since my active MBR partition effectively points
to the MBR itself, and the MBR bootcode there is the "standard" code and
not the NetBSD first stage boot code, the machine gets into an infinite
loop loading the bootcode from the MBR.  I believe I also understand that
my two alternatives are to move my NetBSD partitions a and c to start
at sector zero so that the MBR "standard" boot code is replaced by the
NetBSD first-stage boot code, *or* to change my MBR partition
definitions so that the NetBSD part of the disk (partition 3) starts
at sector 63 instead of 0, so that the standard MBR boot code will
correctly load the NetBSD first-stage boot code.  Whew.
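
The loop described above can be illustrated with a toy model. This is only a sketch of the behaviour as described in this thread, not the real MBR boot code: the `MbrEntry` type and the function names are hypothetical, and the size used for the shifted partition (8887137 sectors, borrowed from disklabel partition c) is an assumption.

```python
from dataclasses import dataclass

ACTIVE = 0x80  # "active" flag as shown in the fdisk output (flag 0x80)

@dataclass
class MbrEntry:
    sysid: int
    start: int   # absolute start sector
    size: int    # size in sectors
    flag: int

def chain_load_target(table):
    """Sector the generic MBR code would load next, or None if no entry is active."""
    for entry in table:
        if entry.flag & ACTIVE:
            return entry.start
    return None

def boots_into_itself(table):
    """True if the active partition starts at sector 0, i.e. at the MBR itself."""
    return chain_load_target(table) == 0

unused = [MbrEntry(0, 0, 0, 0) for _ in range(3)]

# Table as quoted: partition 3 is active and starts at sector 0.
current = unused + [MbrEntry(169, 0, 16, ACTIVE)]

# Second alternative: partition 3 starts at sector 63 (size is an assumption).
shifted = unused + [MbrEntry(169, 63, 8887137, ACTIVE)]

print(boots_into_itself(current))
print(boots_into_itself(shifted))
```

With the table as quoted, the active entry sends the generic boot code back to sector 0, which is the MBR again; with the start moved to sector 63, it would instead load whatever sits at the start of the NetBSD part of the disk.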

> When shifting the MBR partition to start at offset 63 [...] you'll also
> shift the NetBSD partition table.  Therefore you'll have to re-disklabel
> the partition after doing the fdisk thing with the same data it has now.

This part I don't understand; isn't my disklabel at the start of partition
a/c already?  Hmm, maybe not, since, after loading the first-stage boot
from floppy, I was able to specify "boot sd0a:netbsd", which suggests
that the MBR was pointing correctly to the NetBSD disklabel -- or does
it?  How *does* the first-stage boot program find the rest of the NetBSD
bootcode and the NetBSD partition table when I have pointed it at a
different device than that from which the first-stage boot was loaded?

But now I really don't get it --  I was initially able to label that disk
and put data on it even while it had a completely invalid MBR, and playing
with the MBR (with fdisk) did not appear to damage my NetBSD disklabel.
How does "disklabel" decide where to put the label?  How does it find
"the NetBSD part of the disk", especially if the MBR partition table is
not valid?


Anne.
-- 
Ms. Anne Bennett, Senior Analyst, IITS, Concordia University, Montreal H3G 1M8
anne@alcor.concordia.ca                                        +1 514 848-7606