Subject: Re: System hangs during daily jobs
To: Jeff Rizzo <riz@tastylime.net>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-xen
Date: 06/09/2006 19:49:42
On Fri, Jun 09, 2006 at 07:24:23AM -0700, Jeff Rizzo wrote:
> I started switching a system which had been primarily a low-volume
> nameserver/mail relay to being a xen host a couple days ago, and it's
> now hung overnight two nights in a row.
>
> The setup is a dom0 (still performing the dns/mail relay, since I
> haven't migrated it yet) running a -current kernel of a couple days ago
> on a 3.0_STABLE userland. The domU is still doing little (I'm setting
> it up as a web server, but no real traffic yet). What happened last
> night is that a "top" running in the domU hung at about 03:19, but
> there's evidence that the dom0 was doing _some_ processing until 07:00,
> although at that time it wouldn't respond on console. (As an aside, I'm
> unable to get dom0 into ddb - the machine has a serial console, if that
> matters)
Please note (in case you didn't know) that the magic string to enter ddb
is +++++ on a Xen kernel, not a break (the serial console is managed by
Xen, so the dom0 kernel never sees the break).
>
> After a few minutes of poking around, I did the ^A^A^A thing to switch
> to the Xen console, and rebooted.
>
> A few other details:
>
> - the nightly cron jobs on the two "machines" run at the same time - I
> haven't tweaked that yet.
> - the domU's disk is provided by two files in the filesystem of the dom0
> - the dom0 has 192M of RAM, the domU 128M. The physical machine has two
> CPUs (PIII-1GHz) and 1G RAM
Could you try running in UP mode (I think it's 'nosmp' on the xen command
line, or something like that) and see if it helps? The next thing to try
is to run an SMP Xen, but with both domains forced onto cpu 0.
Also, you could try using 'q' after ^A^A^A to see the state of the
domains and other useful information (the NetBSD dom0 kernel should print
a few things too, which can indicate how hard it's hung).
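
A minimal sketch of the two experiments, assuming Xen is booted from
GRUB and the domains are managed with xm. The angle-bracket parts and
the domU name "www" are placeholders, not taken from this setup, and
the xm syntax should be checked against the installed Xen version.

```
# Experiment 1: uniprocessor Xen. In GRUB's menu.lst, append 'nosmp'
# to the Xen (hypervisor) line, not to the NetBSD dom0 module line.
kernel <path-to>/xen.gz <existing xen options> nosmp
module <path-to>/<dom0 kernel> <existing dom0 options>

# Experiment 2: SMP Xen, both domains pinned to physical cpu 0.
# Assumed syntax: xm vcpu-pin <domain> <vcpu> <cpu>
xm vcpu-pin 0 0 0      # dom0, vcpu 0 -> cpu 0
xm vcpu-pin www 0 0    # domU (hypothetical name), vcpu 0 -> cpu 0
xm vcpu-list           # check where each vcpu ended up
```

If a domain has more than one vcpu, the pin has to be repeated for
each of them.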
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 years of experience will always make the difference