Subject: Re: recent dom0 kernels reboot on loading?
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Daniel Carosone <dan@geek.com.au>
List: port-xen
Date: 08/22/2007 17:37:11

On Tue, Aug 21, 2007 at 09:52:46AM +0200, Manuel Bouyer wrote:
> debug will probably give useful hints, but you may need a serial console
> for this as I'm not sure it'll print anything useful on VGA (I only use
> serial consoles so I don't know :)

So, I tried a bunch of boot-option variations in an attempt to
provoke something different or narrow down something that might
be causing it.

No variations with xen boot options had any effect: nosmp, noapic,
nolapic, hap, etc.  Turning on extra loglvl, or using the xendebug.gz
VMM didn't reveal anything new either.

For all of these options, the dom0 kernel is loaded and the memory map
is printed, the debug log levels for xen and guest are printed, and
that's it: a few seconds' pause, then back to the BIOS splash screen.
If I boot xen with "noreboot", there is no reboot - but there is no
further information or messages either. It just seems to stop at the
same point.  Nothing is ever printed that I can see from the NetBSD
kernel - no Copyright or anything more.

This suggests that there's some kind of error when starting the dom0
kernel, which Xen detects and responds to by triggering the reboot.
However, so far I've been unable to provoke any further diagnostics
about what that error might be.

If the problem is with the kernel, it's not my kernel config: I built
a XEN3_DOM0 kernel and it behaves identically.  I also tried changing
compiler settings (optimisation, -march=, etc.), again with no result.

I'm travelling at the moment, with limited bandwidth and very limited
hardware, so I can't test other permutations to narrow this down
further: problems with my source, my tools, or something specific to
the hardware or BIOS on the host.

I've uploaded this XEN3_DOM0 kernel to ftp:/pub/NetBSD/misc/dan -
either you'll be able to reproduce the problem with this kernel, and
we can look at problems with source or toolchain, or you can't and
there's something odd about my machine or memory layout or bios or
... *shrug*.

That being said, I'm pretty confident the problem is kernel-related,
in that my netbsd_dom0.old still works fine for dom0 and HVM guests.
Thankfully, it also seems that -current userland works well enough on
this old kernel.

--
Dan.