Current-Users archive


why does dk(4) take precedence in boot device selection???



I had occasion to reboot one of my shiny new Xen servers today for the
first time in a month, and it failed to boot because a new dk(4)
attachment, created for a GPT partition on another drive, had appeared
since the previous successful boot.

	boot device: dk0
	root on dk0
	Supported file systems: union umap tmpfs smbfs puffs ptyfs procfs overlay null ntfs nfs msdos mfs lfs kern cd9660
	no file system for dk0 (dev 0xa800)
	cannot mount root, error = 79
	root device (default dk0): 
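
For reference, the device number in the "no file system" line can be
split into its major and minor parts.  The sketch below assumes the
bit layout used by the major() and minor() macros in NetBSD's
<sys/types.h> (major in bits 8-19, minor in the remaining bits); the
function name is only illustrative.

```python
def split_dev(dev):
    """Split a NetBSD dev_t into (major, minor).

    Assumed layout, following the major()/minor() macros in
    <sys/types.h>: major is bits 8-19, minor is bits 20-31
    (shifted down by 12) combined with bits 0-7.
    """
    major = (dev & 0x000FFF00) >> 8
    minor = ((dev & 0xFFF00000) >> 12) | (dev & 0x000000FF)
    return major, minor


if __name__ == "__main__":
    # 0xa800 is the value printed in the boot messages above.
    print(split_dev(0xA800))
```

Under that assumption 0xa800 is major 168, minor 0, i.e. unit 0 of
whatever driver owns that major number, which matches the "dk0" the
kernel names in the same message.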

The problem here is that the system boots from sd0 and root is on sd0a!!!

Worse yet, dk0 is not even on sd0, it's a wedge on sd1:

	sd1 at scsibus1 target 1 lun 0: <DELL, PERC 6/i, 1.11> disk fixed
	sd1: fabricating a geometry
	sd1: 1861 GB, 1905664 cyl, 64 head, 32 sec, 512 bytes/sect x 3902799872 sectors
	sd1: fabricating a geometry
	sd1: GPT GUID: e171fce5-0937-49de-ab2a-399ac308a695
	dk0 at sd1: percraid0
	dk0: 3902795776 blocks at 2048, type: 

The server is running a recent-ish NetBSD 7.99.5 XEN3_DOM0 kernel
(from Feb. 20), under Xen-4.5.

I used the following commands to put a GPT label on sd1 and make a wedge
there for the dk0 device that I then use for LVM:

	dd if=/dev/zero of=/dev/rsd1d bs=8k count=1
	gpt create sd1
	gpt add -a 512k -l percraid0 sd1
	dkctl sd1 makewedges

As far as I know this should not make the wedge appear bootable, and I
would not expect the kernel to treat this wedge as special in any way --
i.e. especially not to override the boot device specified by the loader.

# dkctl sd1 listwedges
/dev/rsd1d: 1 wedge:
dk0: percraid0, 3902795776 blocks at 2048, type: 

Note the wedge "type" is blank.  The manual doesn't seem to list a wedge
type that would be valid for LVM use, though maybe ccd or swap or unused
would suffice, but except for this boot problem it works with no type.
I didn't do anything special to not select a type -- just the "makewedges".
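
In case it helps anyone checking their own machines, here is a rough
sketch of a script that reads "dkctl <disk> listwedges" output and
reports wedges whose type is blank.  The line format is assumed from
the single example above, so it may need adjusting for other output;
the function names are just illustrative.

```python
import re

# One wedge line, as in the listing above:
#   dk0: percraid0, 3902795776 blocks at 2048, type:
WEDGE_RE = re.compile(
    r"^(?P<dev>dk\d+): (?P<name>.*), (?P<blocks>\d+) blocks"
    r" at (?P<offset>\d+), type:\s*(?P<type>\S*)\s*$"
)


def parse_listwedges(text):
    """Return one dict per wedge line found in listwedges output."""
    wedges = []
    for line in text.splitlines():
        m = WEDGE_RE.match(line.strip())
        if m:
            wedges.append({
                "dev": m.group("dev"),
                "name": m.group("name"),
                "blocks": int(m.group("blocks")),
                "offset": int(m.group("offset")),
                "type": m.group("type"),
            })
    return wedges


def untyped_wedges(text):
    """Return the device names of wedges with an empty type field."""
    return [w["dev"] for w in parse_listwedges(text) if not w["type"]]


if __name__ == "__main__":
    sample = (
        "/dev/rsd1d: 1 wedge:\n"
        "dk0: percraid0, 3902795776 blocks at 2048, type: \n"
    )
    for dev in untyped_wedges(sample):
        print(dev, "has no wedge type")
```

Feeding it the listing above reports dk0 as the only untyped wedge.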

I'm able to work around this with a "bootdev=sd0" in /boot.cfg, but that
doesn't seem like the right way, and I don't think it should be necessary.

Google searches suggest I'm not the only person who has been tripped up
by this issue.

Am I missing something here that I could do to change the wedge
configuration to avoid this issue?  Is it still so difficult to discover
which device the boot loader booted the kernel from on such a
semi-modern amd64 machine that the kernel can make such mistakes as
this?  If dk(4) is auto-configuring can it not at least look to see if
there's a valid filesystem on the device before it shoves itself in the
front of the line as the supposed "boot device"?  Should there be a
wedge "type" for LVM?

-- 
						Greg A. Woods
						Planix, Inc.

<woods%planix.com@localhost>       +1 250 762-7675        http://www.planix.com/



