Port-sparc64 archive


Re: Bug in loadfile_elf32.c?



The OpenBoot firmware (IEEE 1275) on SPARCs & UltraSPARCs only initializes as much RAM as the NVRAM parameter variables specify, not all RAM that is detected. It tends to be a tradeoff - zeroing all RAM when you’ve got gigabytes “takes a long time” and some places want fast (re)booting, so “do just enough for the kernel” has long been common practice.

In principle, once the OS kernel is booted (whatever it is), all parity/ECC RAM should be initialized (if there are random bits in a RAM location, reading it might raise an exception), but that can be done by “lazy evaluation” (as needed) rather than all at once in a blocking fashion.

	Erik <fair%netbsd.org@localhost>


> On Aug 31, 2016, at 00:47, Mark Cave-Ayland <mark.cave-ayland%ilande.co.uk@localhost> wrote:
> 
> Hi all,
> 
> I've recently been working on a patchset that changes the way in which
> the OpenBIOS client contexts are constructed, and was quite surprised to
> see a very minor change to the stack address was causing my NetBSD 6
> test image to regress under qemu-system-sparc64.
> 
> Digging in further: what I could see with this code change was that the
> text segment of the kernel was no longer being mapped at boot, and so
> when jump_to_kernel() in arch/sparc/stand/ofwboot/boot.c tried to pass
> control over to the loaded kernel, it would fault straight away.
> 
> After several hours with the debugger I eventually found out that, with
> the change applied to OpenBIOS, marks[MARK_DATA] was 0x8 rather than
> 0x1800000. This caused sparc64_finalize_tlb_sun4u() to skip the kernel
> mapping: (dtlb_store[i].te_va >= data_va) was always true, so every
> iteration of the loop hit the continue and the text segment of the
> kernel was never mapped.
> 
> Eventually I traced the source back to arch/sparc/stand/ofwboot/boot.c
> and figured out what was happening. In start_kernel() the marks array is
> defined on the stack, as a local variable, like this:
> 
> u_long marks[MARK_MAX];
> 
> When the patch to OpenBIOS was applied, the stack address was changed to
> point to an area of memory that had already been used to build a
> previous client context, and already contained junk data. It so happened
> that marks[MARK_DATA] was not set to 0 by default which meant we were
> never triggering the logic below in loadfile_elf32.c to update it,
> leaving it set to a random value:
> 
> loadseg:
> 	if (marks[MARK_DATA] == 0 && IS_DATA(phdr[i]))
> 		marks[MARK_DATA] = LOADADDR(phdr[i].p_vaddr);
> 
> I believe the bug here is that loadfile_elf32.c should set
> marks[MARK_DATA] = 0 before the main segment loading loop. Fortunately
> I'm fairly sure that this isn't an issue on real SPARC hardware since
> OBP sets all physical RAM to zero on boot (except retained segments),
> however I could see that this could catch out other archs after a reboot
> where RAM contents may not necessarily be zero.
> 
> In the meantime I'll see if I can work around this in my OpenBIOS
> patches by making sure that the stack is set to zero when executing the
> client image.
> 
> 
> ATB,
> 
> Mark.
> 
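
For anyone following along, the failure mode described in the quoted message can be sketched in a few lines of standalone C. This is only an illustration, not the NetBSD source: struct seg, load_segments(), data_mark_after_load() and text_entries_mapped_for() are hypothetical names, the MARK_* values and the 0x1000000 text address are assumed for the example, and the stale 0x8 and expected 0x1800000 are the values reported above. The reset_first flag stands in for the proposed fix of clearing marks[MARK_DATA] before the segment loop.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values only; the real definitions live in the loader headers. */
enum { MARK_DATA = 2, MARK_MAX = 6 };

/* Simplified, hypothetical stand-in for an ELF program header. */
struct seg {
    unsigned long vaddr;
    int is_data;
};

/*
 * Mirrors the quoted check: the first data segment is recorded only if
 * the slot is still 0.  With reset_first set, the slot is cleared before
 * the loop, which is the fix proposed in the quoted message.
 */
static void load_segments(unsigned long *marks, const struct seg *segs,
                          size_t n, int reset_first)
{
    assert(marks != NULL && segs != NULL);
    if (reset_first)
        marks[MARK_DATA] = 0;
    for (size_t i = 0; i < n; i++) {
        if (marks[MARK_DATA] == 0 && segs[i].is_data)
            marks[MARK_DATA] = segs[i].vaddr;
    }
}

/* Fill a local marks array with `junk` (simulating a dirty stack),
 * run the loader loop, and return the resulting data mark. */
static unsigned long data_mark_after_load(unsigned long junk, int reset_first)
{
    static const struct seg segs[] = {
        { 0x1000000UL, 0 },   /* text (address assumed for the example) */
        { 0x1800000UL, 1 },   /* data (address from the report) */
    };
    unsigned long marks[MARK_MAX];

    for (size_t i = 0; i < MARK_MAX; i++)
        marks[i] = junk;
    load_segments(marks, segs, sizeof(segs) / sizeof(segs[0]), reset_first);
    return marks[MARK_DATA];
}

/* Simplified version of the described TLB loop: entries at or above
 * data_va are skipped via continue, the rest count as mapped text. */
static size_t text_entries_mapped_for(unsigned long data_va)
{
    static const unsigned long te_va[] = { 0x1000000UL, 0x1800000UL };
    size_t mapped = 0;

    for (size_t i = 0; i < sizeof(te_va) / sizeof(te_va[0]); i++) {
        if (te_va[i] >= data_va)
            continue;
        mapped++;
    }
    return mapped;
}
```

With a stale 0x8 in the slot and no reset, data_mark_after_load(0x8, 0) leaves the mark untouched, and text_entries_mapped_for(0x8) then maps nothing. With the reset in place the mark becomes 0x1800000 and the text entry is mapped.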



