Subject: Xen3 + VLANs + multiple DOM0s cause loss of connectivity?
To: port-xen <port-xen@netbsd.org>
From: Johan Ihren <johani@johani.org>
List: port-xen
Date: 01/24/2008 12:26:52

Hi,

I have a complex Xen setup that utilizes VLANs to instantiate a virtual
network topology on top of a physical infrastructure with several DOM0s
connected through a single physical switch.

This has worked just fine with Xen2 for a long time. But now I've upgraded
to Xen3 and unfortunately I've started having severe problems.

This doesn't seem to work:

* Two DOMUs running on separate DOM0s, communicating over IPv6 inside a VLAN.
   I.e. a setup like the following doesn't work for me:

         domu1# ifconfig vlan0 create
         domu1# ifconfig vlan0 vlan 10 vlanif xennet0
         domu1# ifconfig vlan0 inet6 2001:1:1::1 prefixlen 32
         domu2# ifconfig vlan0 create
         domu2# ifconfig vlan0 vlan 10 vlanif xennet0
         domu2# ifconfig vlan0 inet6 2001:1:1::2 prefixlen 32
         domu2# ping6 2001:1:1::1                **** Doesn't work

   Note that if the DOMUs are on the *same* DOM0 then everything is ok.

* Two DOMUs running on separate DOM0s, communicating over IPv4 inside a VLAN
   where at least one of the DOMUs is using *several* IPv4 addresses (i.e.
   IP aliases) on the same interface.

         domu1# ifconfig vlan0 create
         domu1# ifconfig vlan0 vlan 10 vlanif xennet0
         domu1# ifconfig vlan0 inet 10.1.0.1/24
         domu1# ifconfig vlan0 inet 10.1.0.2/24 alias
         domu2# ifconfig vlan0 create
         domu2# ifconfig vlan0 vlan 10 vlanif xennet0
         domu2# ifconfig vlan0 inet 10.1.0.10/24
         domu2# ping -n 10.1.0.1                 **** May work
         domu2# ping -n 10.1.0.2                 **** Usually doesn't work
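
For anyone trying to reproduce this, the per-address check above could be
scripted roughly as follows. This is only a sketch: check_addrs and the PING
override are hypothetical names made up for illustration, and "ok" only
means the ping command exited successfully, so an address with intermittent
connectivity may still be reported as ok.

```shell
#!/bin/sh
# check_addrs COUNT ADDR...
# Runs "ping -n -c COUNT" against each address and prints one line per
# address: "ADDR ok" if ping exits with status 0, "ADDR FAIL" otherwise.
# Set PING to use another command (e.g. PING=ping6 for the IPv6 case).
check_addrs() {
    count=$1
    shift
    for addr in "$@"; do
        if ${PING:-ping} -n -c "$count" "$addr" >/dev/null 2>&1; then
            echo "$addr ok"
        else
            echo "$addr FAIL"
        fi
    done
}
```

On domu2 this would be called as "check_addrs 5 10.1.0.1 10.1.0.2", i.e.
once with every address configured on the other DOMU's vlan0.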

I realize that "usually doesn't work" is a vague description. Sorry.
But if I leave a ping running then I really see spots of connectivity
where a bunch of pings get through (5-10 perhaps) and then there is
complete silence again for a noticeable time. In other cases there's
just no connectivity at all.
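
To put numbers on those "spots of connectivity", a timestamped log is
probably more useful than watching ping output by eye. A minimal sketch,
assuming one probe per round is enough; log_reachability, PING and INTERVAL
are hypothetical names, not part of any existing tool:

```shell
#!/bin/sh
# log_reachability ROUNDS ADDR
# Sends one echo request per round and prints "HH:MM:SS ADDR up" or
# "HH:MM:SS ADDR down" depending on the exit status of ping.
# INTERVAL (seconds to sleep between rounds) defaults to 1.
log_reachability() {
    rounds=$1
    addr=$2
    i=0
    while [ "$i" -lt "$rounds" ]; do
        if ${PING:-ping} -n -c 1 "$addr" >/dev/null 2>&1; then
            state=up
        else
            state=down
        fi
        echo "$(date '+%H:%M:%S') $addr $state"
        i=$((i + 1))
        sleep "${INTERVAL:-1}"
    done
}
```

Running "log_reachability 600 10.1.0.2" on domu2 and keeping the output
should show how long the working and silent periods actually are.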

In the "several IP aliases" case I can sometimes "kick things into working"
by sending a ping in the opposite direction, but then only the IP address
that was used as the source of the ping starts to work, not the others.

Outside of VLANs (i.e. when configuring IPv4 and IPv6 addresses
directly on the xennetN) everything works just fine. VLANs
configured on the DOM0 also work fine. It is just the combination of
Xen3 + DOMU + VLANs that causes problems.

This is an automated system and all configs, etc., are automatically
generated with unique MAC addresses etc. I really don't think the
problems are due to config errors on my part (FWIW).
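
MAC uniqueness is at least easy to double-check mechanically. A sketch,
assuming the generated addresses can be dumped as one "NAME MAC" pair per
line; that input format and the name dup_macs are assumptions made for
illustration only:

```shell
#!/bin/sh
# dup_macs: reads "NAME MAC" lines on stdin and prints every MAC address
# (lowercased) that occurs more than once. No output means no duplicates.
dup_macs() {
    awk '{ print tolower($2) }' | sort | uniq -d
}
```

For example "dup_macs < macs.txt", where macs.txt is a hypothetical file
listing the MAC address of every DOMU on every DOM0.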

Has anyone else noticed anything like this? If someone could please verify
the problem (to remove the possibility of this being me losing my mind)
it would be much appreciated.

Regards,

Johan