Subject: Re: Xen3 + VLANs + multiple DOM0s cause loss of connectivity?
To: Manuel Bouyer <bouyer@antioche.eu.org>
From: Johan Ihren <johani@johani.org>
List: port-xen
Date: 01/28/2008 16:08:19

Hi Manuel,

>>> Basically, 802.1q packets in dom0 are not routed to the bridge
>>> interface but to the vlan interfaces, so these packets can't make
>>> it up to the domUs.
>>
>> Umm. There is confusion here, probably mine. I have lots of 802.1q
>> packets that go just fine across the bridge interface between DOMUs
>> in the same DOM0, and they most certainly make it up to the DOMUs. What
>> the packets don't do is go across the physical switch (between DOM0s)
>> that the DOM0 bridge device is connected to. So I have to challenge
>> the assertion that the packets are not routed to the DOM0 bridge
>> interface.
>
> It may depend on which interfaces the vlan(4)s are attached to.

Umm, in my case all vlan(4) interfaces are attached to the single  
xennet(4) that each DOMU has. No vlan(4)s on the DOM0, never more than  
one xennet(4) per DOMU. But usually there are several vlan(4) per DOMU.

Also, I've now tested your suggestion of removing "pseudo-device vlan"  
from the DOM0 config. Unfortunately it didn't work. No change.

>>> The way to do this is to have the vlan interfaces in dom0 only,
>>> connect one bridge to each vlan and have in the domU one vif per
>>> vlan you need to connect to.
>>
>> Doesn't work for me as I need to be able to dynamically affect
>> topology from inside the DOMUs. I.e. I implement nomadic behaviour by
>> having DOMUs change their VLAN tag. And on occasion I have several
>> dozen VLANs. There's no way I can do that with bridges and bunches of
>> xennets.
>
> Note that you can dynamically create/delete xennet from the dom0  
> with Xen3.

How?

> But it may not help your problem.
> I have domUs attached to more than 30 vlans, and it works just fine
> with one bridge and one xennet per vlan.

The reason I cannot do something like that is that I need to
dynamically affect the topology FROM THE DOMU, so doing what you do
in the DOM0 doesn't help. What I do is basically simulate a bunch of
mobile devices that move around and connect to different networks by
changing their VLAN tag. Without VLANs I just cannot figure out how to
do this with dozens of DOMUs. At one point I was looking into USB-based
NICs to get enough interfaces for one per DOMU (plus the horrible mess
of dealing with all the physical cables), but I gave up.

As I've on occasion used more than 50 DOMUs at the same time,
complexity management is definitely a concern, which is yet another
reason why VLANs would be a perfect match (if only they worked).
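
To make the "change the VLAN tag from inside the DOMU" idea concrete, here is a minimal sketch that only builds the ifconfig(8) command lines for moving a vlan(4) interface to a new tag. The interface names (vlan0, xennet0), the tag value and the helper name retag_commands are assumptions for illustration, not my actual scripts; address configuration is left out, and nothing is executed.

```python
def retag_commands(vlan_if, parent_if, new_vid):
    """Return the ifconfig commands that would move vlan_if to new_vid.

    Hypothetical helper: it only builds strings and runs nothing.
    """
    # 802.1Q VLAN IDs are 12 bits; 0 and 4095 are reserved.
    if not 1 <= new_vid <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    return [
        f"ifconfig {vlan_if} destroy",  # drop the old tag association
        f"ifconfig {vlan_if} create",   # recreate the cloned interface
        f"ifconfig {vlan_if} vlan {new_vid} vlanif {parent_if}",
        f"ifconfig {vlan_if} up",
    ]


if __name__ == "__main__":
    # Example: a simulated mobile device "moves" to VLAN 20.
    for cmd in retag_commands("vlan0", "xennet0", 20):
        print(cmd)
```

The point is that the whole move happens on the single xennet(4) inside the DOMU; nothing in the DOM0 needs to change.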

>> I remember discussing this with you at a previous occasion when I was
>> trying to have communication between the DOM0(s) and the DOMUs over
>> VLANs (with very limited success). You explained that the DOM0
>> couldn't do the right thing wrt both dealing with bridges and vlan
>> interfaces and therefore VLANs on the DOM0 would not see the traffic
>> arriving on the same VLAN from a DOMU (i.e. the bridge gets the
>> packet, not the DOM0 vlan interface). As a consequence of that I
>> stopped using VLANs entirely on the DOM0s and moved all services into
>> yet another DOMU and that has worked just fine for a long time.
>>
>> But now, if I understand correctly, you're saying that in the
>> conflict between sending the packet to the VLAN or to the bridge the
>> VLAN gets the packet. That sounds completely contrary to what you said
>> before and not at all in line with my experience.
>
> It's been a while since I looked at this code in detail.

Perhaps since last time I had trouble with this ;-)

> When I first set up these domains with lots of network interfaces, my
> first idea was to extend xvif/xennet to properly support 802.1q
> tagging (i.e. allow packets 4 bytes larger than the ethernet MTU). I
> looked at the vlan and bridge code and came to the conclusion that it
> couldn't work, but I don't remember the details. In particular, I
> don't remember if the vlan would preempt packets from the bridge, or
> the opposite, or if it would be more random. Also the vlan vs bridge
> behavior may have changed between netbsd-3 and netbsd-4; I didn't
> check this either.
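
For reference, the 4 extra bytes mentioned above are the 802.1Q tag itself (a 2-byte TPID of 0x8100 plus a 2-byte TCI carrying priority and VLAN ID), inserted between the source MAC and the original EtherType. The sketch below is only an illustration of that arithmetic; the MAC addresses, VLAN ID and function names are made up, and it says nothing about how xvif/xennet actually handle such frames.

```python
import struct

ETH_MTU = 1500        # standard Ethernet payload limit
TPID_8021Q = 0x8100   # EtherType value that marks an 802.1Q tag


def untagged_frame(dst, src, ethertype, payload):
    """Plain Ethernet II frame (no FCS): dst, src, EtherType, payload."""
    return dst + src + struct.pack("!H", ethertype) + payload


def tagged_frame(dst, src, vlan_id, ethertype, payload, priority=0):
    """Same frame with a 4-byte 802.1Q tag after the source MAC."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID is a 12-bit field")
    if not 0 <= priority <= 7:
        raise ValueError("priority is a 3-bit field")
    tci = (priority << 13) | vlan_id  # DEI bit left at 0
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return dst + src + tag + struct.pack("!H", ethertype) + payload


if __name__ == "__main__":
    dst = bytes.fromhex("ffffffffffff")  # illustrative addresses
    src = bytes.fromhex("00163e000001")
    payload = bytes(ETH_MTU)             # a full-MTU payload
    plain = untagged_frame(dst, src, 0x0800, payload)
    tagged = tagged_frame(dst, src, 42, 0x0800, payload)
    print(len(plain), len(tagged))       # 1514 1518
```

So a full-size tagged frame is 1518 bytes before the FCS instead of 1514, and any interface in the path has to accept that.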

Ok. My conclusion is that unless this gets resolved somehow, vlan(4)
simply doesn't work for DOMUs, which means DOMUs are not completely
functional as NetBSD hosts (from a networking POV). While I certainly
defer to your analysis of why it cannot work, it is still a shame. On
the other hand, as I have no idea whether this works for any other OS,
it may be that this is more a limitation of virtual machines in general
than of NetBSD in particular.

Regards,

Johan