Subject: Re: xennetback_ifstart: no mcl_pages crashes XEN2-dom0
To: Damian Lubosch <dl@xiqit.de>
From: Manuel Bouyer <bouyer@antioche.eu.org>
List: port-xen
Date: 06/03/2007 19:00:16
On Wed, May 30, 2007 at 07:39:34PM +0200, Damian Lubosch wrote:
> Hello!
> 
> I am using NetBSD4.0Beta as XEN2-Dom0 (from 10/4/06) with six 
> NetBSD3.0.1 DomUs on an AMD 3700+ server, VIA Chipset with 1GB of RAM 
> and Hardware Raid.
> 
> From time to time I get the following error that forces me to reboot my 
> Dom0
> 
> 
> May 30 10:07:42 pless /netbsd: xennetback: got only 29 new mcl pages
> May 30 10:07:47 pless /netbsd: xennetback: got only 9 new mcl pages
> May 30 10:07:48 pless /netbsd: xennetback: got only 3 new mcl pages
> May 30 10:07:48 pless /netbsd: xennetback: got only 1 new mcl pages
> May 30 10:07:48 pless /netbsd: xennetback: can't get new mcl pages (0)
> May 30 10:07:48 pless /netbsd: xennetback_ifstart: no mcl_pages
> And the last two lines repeat 1000+ times....
> 
> What are those mcl pages?

Memory that dom0 reclaims from the hypervisor to replace the pages that
we gave to the remote domains when sending packets.

> 
> On the domU machines I run a lot of rsync cron jobs copying files 
> from another sibling-PC with very similar configuration. Some of the 
> DomUs share directories with nfs. In fact this machine is my 
> hot-standby-backup-server. The master server does not complain about 
> this problem. (It only complains about ..."/netbsd: xbdback: domain 6 
> sending excessively fragmented I/O" but it's not that bad I think)
> 
> 
> First I thought that it could have to do with a buggy network driver for 
> the onboard Realtek NIC, but now there is an Intel NIC installed and the 
> problem still persists.
> 
> My system memory is divided as follows:
> 
> pless# xm list
> Name              Id  Mem(MB)  CPU  State  Time(s)  Console
> Domain-0           0       63    0  r----    499.4
> ap-php4            3      255    0  -b---     26.0    9603
> ap-php5            4      127    0  -b---     21.8    9604
> dbserver           5      127    0  -b---    101.3    9605
> mailserver         7      169    0  -b---     60.5    9607
> ns1                2      127    0  -b---     20.4    9602
> router             1      127    0  -b---    257.4    9601
> 
> 
> pless# xm info
> system                 : NetBSD
> host                   : pless
> release                : 4.0_BETA
> version                : NetBSD 4.0_BETA (XEN2_DOM0) #0: Wed Apr 18 
> 15:14:53 CEST 2007  root@pless:/xen/usr/src/sys/arch/i386/compile/XEN2_DOM0
> machine                : i386
> cores                  : 1
> hyperthreads_per_core  : 1
> cpu_mhz                : 2199
> memory                 : 1022
> free_memory            : 2
> 
> 
> 
> I use a self-built kernel for this NetBSD version with more SHM pages 
> (8192 instead of 2048). I hoped that could solve the problem but it did not.
> 
> options         SHMMAXPGS=8192  # 2048 pages is the default

No, that's completely unrelated: SHMMAXPGS only sizes System V shared memory.

> 
> 
> I read on the list something about XEN-DOM0 needing 32MB for itself. 
> (Date of posting: 4/15/2005 6pm from Manuel Bouyer) Does it mean that I 
> have to increase my 2MB of free memory to 32? That's actually what I'd 
> try next...

I strongly suspect that there's not enough free RAM for the hypervisor.
I think the 32MB is not counted as part of "xm info", but it's something
that can't ever be allocated to domains anyway; it's internal to the
hypervisor. You'll also need some free pages for the communications between
domains.
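
To make the accounting concrete, here is a rough sketch (not something xm
provides) that adds up the Mem(MB) column of the "xm list" output quoted
above and compares it with the "memory" and "free_memory" values from
"xm info". The helper names and the 32MB target are assumptions for
illustration only; the target is just the figure mentioned in this thread,
not a confirmed requirement, and whether it is counted inside "memory" is
unclear.

```python
# Text copied from the `xm list` output quoted earlier in this thread.
XM_LIST = """\
Name              Id  Mem(MB)  CPU  State  Time(s)  Console
Domain-0           0       63    0  r----    499.4
ap-php4            3      255    0  -b---     26.0    9603
ap-php5            4      127    0  -b---     21.8    9604
dbserver           5      127    0  -b---    101.3    9605
mailserver         7      169    0  -b---     60.5    9607
ns1                2      127    0  -b---     20.4    9602
router             1      127    0  -b---    257.4    9601
"""

TOTAL_MB = 1022        # "memory" from xm info
FREE_MB = 2            # "free_memory" from xm info
TARGET_FREE_MB = 32    # assumption: the 32MB figure discussed in the thread


def domain_memory(xm_list_text):
    """Return {domain name: Mem(MB)} parsed from `xm list` output."""
    mem = {}
    for line in xm_list_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3:
            mem[fields[0]] = int(fields[2])     # third column is Mem(MB)
    return mem


def mb_to_free(free_mb, target_free_mb):
    """MB that would have to be taken back from domains to reach the target."""
    return max(0, target_free_mb - free_mb)


if __name__ == "__main__":
    mem = domain_memory(XM_LIST)
    allocated = sum(mem.values())
    print("allocated to domains:", allocated, "MB")
    print("not allocated to any domain:", TOTAL_MB - allocated, "MB")
    print("reported free:", FREE_MB, "MB")
    print("to reach target:", mb_to_free(FREE_MB, TARGET_FREE_MB), "MB")
```

With the numbers from the post, the domains add up to 995MB of the 1022MB
reported, so shrinking one or two domUs is the obvious way to leave more
memory unallocated.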

> 
> Another idea which will be quite time consuming and I would like to avoid:
> Is it worth it to update to a newer Beta-version of NetBSD4 with a newer 
> Kernel?

I don't think it will help.

-- 
Manuel Bouyer <bouyer@antioche.eu.org>
     NetBSD: 26 ans d'experience feront toujours la difference