NetBSD-Users archive


Re: GPT missing after controller swap (wd to ld)



On Oct 10,  3:39pm, Louis Guillaume wrote:
} On 10/10/16 12:26 PM, Christos Zoulas wrote:
} > On Oct 10, 11:35am, louis%zabrico.com@localhost (Louis Guillaume) wrote:
} > | On 10/10/16 8:14 AM, Christos Zoulas wrote:
} > | > In article <4c31b2ab-3161-9dc1-2d2b-74278c891065%zabrico.com@localhost>,
} > | > Louis Guillaume  <louis%zabrico.com@localhost> wrote:
} > | >>
} > | >> On NetBSD 7.0_STABLE, i386, (from earlier this year) I had configured a
} > | >> couple of disks with GPTs and then switched controllers (wd1 now shows
} > | >> up as ld1) and now the GPT appears to have gone missing. However my
} > | >> wedges are still there...
} > | >>
} > | >> # gpt show ld1
} > | >> gpt: error: map entry doesn't fit media
} > | >> gpt: unable to open device 'rld1d': No such file or directory
} > | >>
} > | >> # dkctl ld1 listwedges
} > | >> /dev/rld1d: 2 wedges:
} > | >> dk0: boot0, 524288 blocks at 128, type: ffs
} > | >> dk1: disk0, 3906504704 blocks at 524416, type: raidframe
} > | >>
} > | >> The first wedge is intended to just hold a kernel and emergency root
} > | >> file system for booting. The root file system is on a raid array built
} > | >> with dk1 (and others). I'm currently booted off a separate disk because
} > | >> the bootmenu did not include the boot.cfg "installboot"-ed to ld1.
} > | >>
} > | >> Did something in the BIOS overwrite the GPT? How to recover?
} > | >
} > | > Can you try gpt show /dev/rld1d? Also show your dmesg?
} > |
} > | The gpt output is similar but not identical...
} > |
} > | # gpt show /dev/rld1d
} > | gpt: error: map entry doesn't fit media
} > | gpt: unable to open device 'rld1d': Undefined error: 0
} > |
} > | Also /dev/dk3 will not reconstruct on the raid (raidctl -R /dev/dk3
} > | raid0), leaving these messages...
} > |
} > | raid0: initiating in-place reconstruction on column 1
} > | raid0: IO failed after 5 retries.
} > | raid0: IO failed after 5 retries.
} > | raid0: Recon read failed: 22
} > | raid0: reconstruction failed.
} > |
} > | But no low-level (ld or twa) IO messages.
} > |
} > | At this point I'm thinking of this procedure for recovery...
} > |
} > |     o Wipe ld2 and re-partition
} > |     o Rebuild a new raid on ld2 (which has dk3)
} > |     o Copy everything over
} > |     o Boot from the new raid disk
} > |     o Wipe ld1
} > |     o Rebuild the raid with the new ld1
} > |
} > | Relevant parts of the dmesg.boot are below. Thanks for looking!
} >
} > Before you do anything, can you please rebuild the gpt binary from
} > HEAD and see what that prints?
} 
} Pretty sure I did this right. My src/ tree is from the netbsd-7 branch. 
} I did the following. Hopefully this is enough to get what you're looking 
} for...
} 
} $ cd /usr/src/sbin/gpt
} $ cvs up -A -dP
} cvs update: Updating .
} P Makefile
} [snip]
} 
} $ TOOLDIR=/usr/obj/TOOLDIR.i386 make
} 
} .....
} 
} # ./gpt show /dev/rld1d
} gpt: /dev/rld1d: map entry doesn't fit media
} 
} # ./gpt show ld1
} gpt: /dev/rld1d: map entry doesn't fit media
} 
} # ./gpt -vvvv show ld1
} /dev/rld1d: mediasize=1999988850688; sectorsize=512; blocks=3906228224
} /dev/rld1d: MBR not found at sector 0
} /dev/rld1d: Pri GPT at sector 1
} /dev/rld1d: GPT partition: type=ffs, start=128, size=524288
} /dev/rld1d: GPT partition: type=raid, start=524416, size=3906504704
} gpt: /dev/rld1d: map entry doesn't fit media

     Yes, the disk changing size is the problem.  Do the math on
the second partition:  start + size > blocks (524416 + 3906504704
= 3907029120, but the disk now reports only 3906228224 blocks).
That disk is now seriously corrupted, as you have lost approximately
400MB from the end of the second partition (not to mention the
backup GPT).  This is not something that can be automatically
recovered.  We likely need an option to force gpt(8) to ignore
invalid entries, so that you can manipulate the GPT and turn it
back into a valid one.  Better would be to display them marked as
invalid, but that would require significant restructuring.*
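
The arithmetic can be checked with a short sketch.  The numbers are
taken from the gpt -vvvv output quoted above; the function name
overshoot_blocks is hypothetical and is not part of gpt(8).

```python
# Figures reported by "gpt -vvvv show ld1" in the quoted output.
SECTOR_SIZE = 512
MEDIA_BLOCKS = 3906228224

# (type, start, size) for each partition found in the primary GPT.
PARTITIONS = [
    ("ffs", 128, 524288),
    ("raid", 524416, 3906504704),
]


def overshoot_blocks(start, size, media_blocks):
    """Return how many blocks of the partition lie past the end of the media.

    Hypothetical helper for illustration only; 0 means the partition fits.
    """
    end = start + size
    return max(0, end - media_blocks)


for ptype, start, size in PARTITIONS:
    over = overshoot_blocks(start, size, MEDIA_BLOCKS)
    if over:
        lost_bytes = over * SECTOR_SIZE
        print(f"{ptype}: ends at block {start + size}, "
              f"{over} blocks ({lost_bytes} bytes) past the end of the disk")
    else:
        print(f"{ptype}: ends at block {start + size}, fits")
```

The raid partition overshoots by 800896 blocks, which at 512 bytes
per sector is 410058752 bytes, i.e. the "approximately 400MB".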

* gpt(8) is an exercise in linked list manipulation where the nodes
just happen to represent various on-disk structures.  (start +
size) is used extensively in the routines for searching the list
and inserting new nodes.  My project list has included getting rid
of the stupid linked list for some time.  "map entry" refers to a
node in the linked list.  Essentially the problem is that adding
a node for the second partition to the linked list would violate
the invariant that the linked list represents the disk, since the
list would then be bigger than the disk.  Any issue in initialising
the linked list causes gpt(8) to abort.
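
A rough sketch of that invariant follows.  This is not gpt(8)'s
actual code: the class DiskMap, the exception MapError and the
overlap message are hypothetical, and a sorted Python list stands
in for the linked list.  It only illustrates why an entry whose
(start + size) exceeds the media cannot be added to the map.

```python
class MapError(Exception):
    """Raised when an entry would break the map's invariants (hypothetical)."""


class DiskMap:
    """Toy stand-in for the map: entries must lie within the media
    and must not overlap each other."""

    def __init__(self, media_blocks):
        self.media_blocks = media_blocks
        self.entries = []  # (start, size, label) tuples, kept sorted by start

    def add(self, start, size, label):
        end = start + size
        # The check that trips for the second partition in this thread.
        if end > self.media_blocks:
            raise MapError("map entry doesn't fit media")
        # Reject anything that overlaps an entry already in the map.
        for other_start, other_size, _ in self.entries:
            if start < other_start + other_size and other_start < end:
                raise MapError("map entry overlaps existing entry")
        self.entries.append((start, size, label))
        self.entries.sort()


# Values from the quoted "gpt -vvvv show ld1" output.
disk = DiskMap(3906228224)
disk.add(128, 524288, "ffs")
try:
    disk.add(524416, 3906504704, "raid")
except MapError as err:
    print(f"gpt: /dev/rld1d: {err}")
```

In this sketch the caller could catch the error and carry on; the
point made above is that gpt(8) instead aborts on any failure
while initialising the list.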

}-- End of excerpt from Louis Guillaume

