NetBSD-Users archive


Re: Xen boot strangeness (Was: Re: [SOLVED] Re: Xen 4.18.5_20250521nb0 not ELF binary (Was: Re: EFI and Xen))



On 5/30/2025 6:29 AM, Chuck Zmudzinski wrote:
> On 5/29/2025 4:02 PM, Greg A. Woods wrote:
>> At Thu, 29 May 2025 15:01:50 -0400, Chuck Zmudzinski <frchuckz%gmail.com@localhost> wrote:
>> ... 
>> 
>>> When I pass bootdev=dk12 in boot.cfg, the bootloader strangely tries dk1 as root
>>> (which is wrong) and correctly detects dk11 as the dump device. But it never
>>> gives me the chance to enter the correct root device and instead tries to load
>>> init, which of course it cannot find because dk1 is not the correct NetBSD
>>> root device. In fact, on this box a Linux distro is installed on dk1, as
>>> evidenced by the filesystem type detected on dk1: ext2fs.
>> 
>> Ah, I think that's a bug related to some bizarre/old hacks to find the
>> "booted_partition" for non-GPT disks:
>> 
>> 		if (strncmp(xcp.xcp_bootdev, devname, strlen(devname)))
>> 			continue;
>> 
>> 		if (is_disk && strlen(xcp.xcp_bootdev) > strlen(devname)) {
>> 			/* XXX check device_cfdata as in x86_autoconf.c? */
>> 			booted_partition = toupper(
>> 				xcp.xcp_bootdev[strlen(devname)]) - 'A';
>> 			DPRINTF(("%s: booted_partition: %d\n", __func__, booted_partition));
>> 		}
>> 
> ...
> A very simple sanity check that might fix this case would be to reject
> the match if the extra character in the bootdev string is a numerical
> digit instead of a letter of the alphabet between a and p, because only
> such a letter would indicate that the two devices are related as a full
> disk device and a device that is a partition on the full disk.
> 
> Essentially, it looks like we are getting a false positive when searching
> for the device on which the root partition resides, and I think that if
> we add that extra sanity check of making sure the extra character is a
> letter between a and p instead of a numerical digit like 2, we would
> correctly detect dk12 as the root device on my system instead of getting
> dk1 as a false positive.

So, as long as there is no funny business with byte order ("endianness")
in "booted_partition", I think the additional sanity check could be just
a two-line addition between the statement that computes "booted_partition"
and the DPRINTF statement that logs the value of "booted_partition":

 			if (booted_partition & 0xfffffff0)
 				continue;

This should work because we would expect the booted partition's value
to be an unsigned integer no greater than 0xf (15). If any of the
higher-order 28 bits of "booted_partition" is set, then we know we do not
have a valid partition index between 0 and 15, so we should bail out and
try the next devname. Of course this also assumes "booted_partition" is
declared as a uint32_t, or equivalent. I could not find its declaration
in arch/xen/xen/xen_machdep.c; presumably it is declared in a header
somewhere.
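
To make the arithmetic concrete, here is a minimal standalone sketch of
the check, not the actual kernel code. The helper name
partition_from_bootdev() and its return convention (-1 for "no match",
0 for an exact device-name match) are assumptions made up for this
illustration. The cast to uint32_t is there so the mask behaves the same
way even if the variable turns out to be a plain signed int: for "dk12"
against "dk1" the computed value is '2' - 'A' = -15, which converts to
0xfffffff1 and so has high bits set.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of xen_machdep.c. Returns the partition
 * index (0..15) encoded by the character that follows devname in
 * bootdev, or -1 if devname does not match or the suffix is not a
 * letter in the range a..p.
 */
static int
partition_from_bootdev(const char *bootdev, const char *devname)
{
	size_t len;
	int part;

	assert(bootdev != NULL && devname != NULL);
	len = strlen(devname);

	/* Same prefix test as the quoted kernel code. */
	if (strncmp(bootdev, devname, len) != 0)
		return -1;

	/* Exact match: no partition suffix (assumed to mean index 0 here). */
	if (strlen(bootdev) == len)
		return 0;

	part = toupper((unsigned char)bootdev[len]) - 'A';

	/* Proposed check: any bit above the low four means "not a..p". */
	if ((uint32_t)part & 0xfffffff0u)
		return -1;

	return part;
}
```

With this sketch, bootdev "dk12" is rejected for devname "dk1" but
accepted for devname "dk12", while "wd0a" through "wd0p" still map to
partitions 0 through 15 on "wd0". One thing the sketch sidesteps: in the
two-line kernel version, "booted_partition" has already been assigned by
the time the check runs, so it may need to be reset before the continue.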

Chuck Zmudzinski

