Subject: Re: disklabel won't! (WAS: Re: Bootability eludes me once again)
To: None <port-i386@netbsd.org>
From: Anne Bennett <anne@porcupine.montreal.qc.ca>
List: port-i386
Date: 02/23/2003 13:11:54
["disklabel -r" saw my NetBSD disk label but "disklabel" did not]

Dancing in the streets to the music of choirs of angels!
I have a working disklabel.  :-D

What seemed to fix it was to zero out the first part of the disk,
including the first three sectors of my desired partition "a:" (which
has data on it), then reinstall the MBR and the disklabel.  My data
came through intact.

Thanks to Frederick Bruckman <fredb@immanent.net>, whose suggestion
to use "dd | strings" to find out what was going on led me in the
right direction.  It turns out that this problem was *not* caused
by a CHS translation issue at all; I put my NetBSD partition where
I wanted it in the first place (sector 63), and even though that
is not on (BIOS's) track zero of the disk, it works.

I do believe that there may be a bug in the DIOCWDINFO ioctl, though.
I looked a bit at disklabel.c, and, as per the comment in it, the
disklabel is read differently based on the flags.  If "-r" or "-I" is
given, the disk label is read from the raw device using an offset
calculated in disklabel.c; that worked properly all along.  If neither
of those flags is specified, then the DIOCWDINFO ioctl is used to
fetch the label, and that failed until I zeroed out part of my disk
and reinstalled the MBR and disklabel from scratch.  Therefore, it
seems as though DIOCWDINFO is confused by the presence of additional
label or MBR-like material on the disk.

I started having problems when I used "fdisk" on "sd0a" instead of
"sd0" (d), last May.  I may have compounded those problems recently
by trying to move my NetBSD partition's start sector (from 64 to 63,
then to 3).  In its "messed up state", the beginning of sd0 contained:

  - MBR claimed that NetBSD started at sector 3
  - disklabel -r showed a disklabel that had:
      - c: at sector 3
      - a: at sector 63
  - my disklabel "name string" was found on these sectors:
      - 1   (perhaps at some point I used the whole disk for NetBSD?)
      - 17  (I can only surmise that at some point in my experiments or
             in a previous O/S version, I started NetBSD at sector 16...)
      - 65  (from when my NetBSD partition started at sector 64)
  - the strings "No NetBSD part", "Boot fail" were found on sectors:
      - 16
      - 64
  - the strings "netbsd.old.gz" and other kernel names were found on
    sector 5.
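
The label sightings listed above fit a simple pattern: on i386 the
NetBSD disklabel sits in sector 1 of the NetBSD partition, so a
partition starting at sector S leaves its label on absolute sector
S + 1.  A minimal sketch of that arithmetic follows; the helper name
"label_sector" is made up for illustration, and the start sectors are
the ones mentioned or guessed at in this message.

```sh
#!/bin/sh
# Hypothetical helper: given the start sector of a NetBSD MBR partition,
# print the absolute sector where its disklabel would be expected on
# i386 (assumption: label in sector 1 of the partition).
label_sector() {
  echo $(( $1 + 1 ))
}

# Start sectors taken from the message: whole disk (0), the guessed 16,
# the final 63, and the earlier 64.
for start in 0 16 63 64
do
  echo "partition at sector $start -> label at sector $(label_sector "$start")"
done
```

This gives 1, 17, 64 and 65, which matches the stale labels found on
sectors 1, 17 and 65 and the fresh one that ended up on sector 64.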

I did not think of it before I zeroed things out, but I should have
relabelled the disk with -r and changed the name string, to figure out
where "disklabel -r" was putting that label. Sorry about that.
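
The check described above (relabel with a distinctive name string,
then see which sector it lands on) could be sketched roughly as
follows.  The function name "find_marker" and the example marker are
assumptions for illustration only; the method is the same
"dd | strings" approach used in the script at the end of this message,
with a grep added.

```sh
#!/bin/sh
# Hypothetical helper: print the numbers of the sectors (0..last) of a
# device or image file whose printable strings contain a marker.
# Note: strings(1) only reports runs of 4 or more printable characters,
# and a marker split across a sector boundary will be missed.
# Usage: find_marker /dev/rsd0d "my-test-label" 70
find_marker() {
  src=$1; marker=$2; last=$3
  i=0
  while [ "$i" -le "$last" ]
  do
    if dd if="$src" bs=512 count=1 skip="$i" 2>/dev/null |
       strings | grep -F -- "$marker" >/dev/null
    then
      echo "$i"
    fi
    i=$((i + 1))
  done
}
```

It only reads from the disk, so it should be safe to run before and
after relabelling to compare the results.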

What I did:

  dd of=/dev/rsd0d count=66 if=/dev/zero
  fdisk -aui sd0   (put NetBSD at sector 63)
  fdisk -B sd0
(at this point, only sector 0 showed strings)
  disklabel -I -e sd0
(at this point, sector 64 contained the label)
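
Since the dd above overwrites sectors that hold live data, anyone
repeating these steps may want to save those sectors first.  A minimal
sketch, assuming 512-byte sectors; the function name "save_head" and
the backup path are hypothetical and not part of the procedure
actually used here.

```sh
#!/bin/sh
# Hypothetical helper: copy the first N sectors of a device or image
# to a file so they can be put back if zeroing goes wrong.
# Usage: save_head /dev/rsd0d /root/sd0-head.bin 66
save_head() {
  dd if="$1" of="$2" bs=512 count="$3" 2>/dev/null
}

# Restoring would be the reverse (assumption: same disk, untouched
# since the backup was taken):
#   dd if=/root/sd0-head.bin of=/dev/rsd0d bs=512
```

The backup file should live on a different disk from the one being
zeroed.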

I now have a sane system again.  Whew.


Anne.
-- 
Ms. Anne Bennett, as a private citizen:  anne@porcupine.montreal.qc.ca
Also reachable more officially at work:  anne@alcor.concordia.ca
-----------------------------------------------------------------------
#!/bin/sh
# Report on first few sectors of the disk using "strings"

disk="sd0"

# Walk sectors 0 through 70 of the raw whole-disk device.
i=0
while [ "$i" -le 70 ]
do
  echo "========== sector $i"
  dd if="/dev/r${disk}d" count=1 skip="$i" 2>/dev/null | strings
  i=$((i + 1))
done
-----------------------------------------------------------------------