Subject: Link aggregation does not work???
To: None <tech-net@NetBSD.org>
From: =?iso-8859-1?Q?EilerTam=E1s?= <eiler_tamas@msn.com>
List: tech-net
Date: 11/20/2007 13:24:38
Hi there,

I'm not sure if this is the proper place to ask this question, but I
don't know a better one, so I'll ask here.

I'm new to NetBSD, but I need to use it to build fault-tolerant
networks where link aggregation is a must.
I'm using NetBSD 3.0.1 for this task.
When simulating the network in VMware (together with simulated Linux
machines), the Linux machines work fine, while the simulated machines
running NetBSD always fail to complete their tasks.

This means that sending ICMP echo requests to the aggregated
interfaces of the NetBSD machines always fails. If no aggregated
interface is configured on a NetBSD machine, it is always reachable
via its physical interfaces.
The Linux machines on the same network are always reachable via their
aggregated interfaces.

My questions are as follows:
- How mature is the link aggregation driver in NetBSD 3.0.1?
- Has NetBSD been tested in such a virtual environment? (That is,
could it be that it will never work in VMware but performs well on
real hardware?)
- Could this behaviour be the result of a misconfiguration?

I hope someone here can show me the way out of this situation... :-)

Thank you very much in advance!
Best Regards,
Tamás



P.S.: I attach some extra information that might be useful.

-------------------------------------------------------------

I tested link aggregation in NetBSD using the following virtual
network:

-----+------------------+----------------+--------------+------------
     |                  |                |              |
     | pcn2             |pcn2            |eth1          |eth7
     |                  |                |              |
 +---+------+     +-----+----+    +------+-----+    +---+--------+
 |  NetBSD1 |     | NetBSD2  |    |   Linux1   |    |   Linux2   |
 +-+------+-+     +-+------+-+    +-+--------+-+    +-+--------+-+
   |      |         |      |        |        |        |        |
   |pcn0  |pcn1     |pcn0  |pcn1    |eth2    |eth3    |eth5    |eth6
   +--+---+         +--+---+        +----+---+        +---+----+
      |                |                 |                |
  agr1|127         agr0|128         bond0|129        bond1|130
      |                |                 |                |
------+----------------+-----------------+----------------+-----------
                      vmnet 3: 172.16.8.0/24

This network is realized inside the VMware system. NetBSD1, NetBSD2,
SUSE Linux 1 and SUSE Linux 2 are virtual machines executed by VMware.
The following virtual networks are configured in VMware:

Virtual network name   Device name    Subnet address    Connected interfaces
vmnet 3                /dev/vmnet3    172.16.8.0/24     pcn0 (NetBSD1 and NetBSD2), eth3, eth6
vmnet 4                /dev/vmnet4    192.168.73.0/24   pcn1 (NetBSD1 and NetBSD2), eth2, eth5
vmnet 5                /dev/vmnet5    not configured    pcn2 (NetBSD1 and NetBSD2), eth1, eth7

Each operating system has three virtual LAN cards, identified by
interface names. I configured each operating system with an aggregated
interface as well. The interface names are as follows:

Operating system   Interfaces          Aggregated interface   Aggregated address
NetBSD 1           pcn0, pcn1, pcn2    agr1 = pcn0 + pcn1     172.16.8.127
NetBSD 2           pcn0, pcn1, pcn2    agr0 = pcn0 + pcn1     172.16.8.128
SUSE Linux 1       eth1, eth2, eth3    bond0 = eth2 + eth3    172.16.8.129
SUSE Linux 2       eth5, eth6, eth7    bond1 = eth5 + eth6    172.16.8.130



Aggregation testing between SUSE Linux 1 and NetBSD 2




* Configuration of SUSE Linux 1

I configured the aggregated link bond0 on SUSE Linux 1 as follows.
I created the file /etc/sysconfig/network/ifcfg-bond0:

BOOTPROTO='static'
BROADCAST='172.16.8.255'
IPADDR='172.16.8.129'
MTU=''
NETMASK='255.255.255.0'
NETWORK='172.16.8.0'
REMOTE_IPADDR=''
STARTMODE='onboot'
BONDING_MASTER='yes'
BONDING_MODULE_OPTS='miimon=100'
BONDING_SLAVE0='eth3'
BONDING_SLAVE1='eth2'

bond0 is the aggregation of eth2 and eth3, and its address is
172.16.8.129/24.
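Note that the ifcfg file above sets only miimon; with no "mode="
option, the Linux bonding driver defaults to balance-rr (mode 0),
which needs no cooperation from the link partner. Purely as an
untested suggestion: if a like-for-like comparison with NetBSD's
802.3ad-based agr(4) is wanted, the bonding mode could be pinned
explicitly:

```shell
# Hypothetical change to /etc/sysconfig/network/ifcfg-bond0: select
# 802.3ad (LACP) mode instead of the balance-rr default, so the Linux
# side also negotiates aggregation rather than blindly round-robining.
BONDING_MODULE_OPTS='mode=802.3ad miimon=100'
```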
To activate the configuration, the network service must be restarted
with the following command:
/etc/init.d/network restart
Running ifconfig -a now shows that the network configuration has
changed:

bond0     Link encap:Ethernet  HWaddr 00:0C:29:06:6A:05
          inet addr:172.16.8.129  Bcast:172.16.8.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:16236 errors:0 dropped:0 overruns:0 frame:0
          TX packets:21 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1493858 (1.4 Mb)  TX bytes:1897 (1.8 Kb)

eth1      Link encap:Ethernet  HWaddr 00:0C:29:06:6A:FB
          inet addr:192.168.132.129  Bcast:192.168.132.255  Mask:255.255.255.0
          UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:15318 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7275 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2286026 (2.1 Mb)  TX bytes:622176 (607.5 Kb)
          Interrupt:177 Base address:0x1080

eth2      Link encap:Ethernet  HWaddr 00:0C:29:06:6A:05
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:8115 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:746704 (729.2 Kb)  TX bytes:948 (948.0 b)
          Interrupt:193 Base address:0x1480

eth3      Link encap:Ethernet  HWaddr 00:0C:29:06:6A:05
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:8121 errors:0 dropped:0 overruns:0 frame:0
          TX packets:11 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:747154 (729.6 Kb)  TX bytes:949 (949.0 b)
          Interrupt:185 Base address:0x1400

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:852 errors:0 dropped:0 overruns:0 frame:0
          TX packets:852 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:48044 (46.9 Kb)  TX bytes:48044 (46.9 Kb)




* Configuration of NetBSD2

First I configured the pcn0 interface:
ifconfig pcn0 172.16.8.128 netmask 0xffffff00
ifconfig -a

pcn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:ab:a8:2c
        media: Ethernet autoselect (autoselect)
        inet 172.16.8.128 netmask 0xffffff00 broadcast 172.16.8.255
        inet6 fe80::20c:29ff:feab:a82c%pcn0 prefixlen 64 scopeid 0x1
pcn1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:ab:a8:2c
        media: Ethernet autoselect (autoselect)
pcn2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:ab:a8:40
        media: Ethernet autoselect (autoselect)
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33192
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
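One small difference between the two systems' output is worth noting:
NetBSD's ifconfig takes and prints the netmask in hexadecimal, while
the Linux tools use dotted-quad notation; both describe the same mask.
A quick shell check of the equivalence:

```shell
# 0xffffff00 (the mask used on pcn0 above) rendered as the dotted quad
# that Linux prints; shift-and-mask each byte out of the 32-bit value.
mask=0xffffff00
printf '%d.%d.%d.%d\n' \
    $(( mask >> 24 & 255 )) $(( mask >> 16 & 255 )) \
    $(( mask >>  8 & 255 )) $(( mask        & 255 ))
# prints 255.255.255.0
```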

Now I can reach the aggregated interface of SUSE Linux 1 from
NetBSD2:
ping -n 172.16.8.129
And I can reach the pcn0 interface of NetBSD2 from SUSE Linux 1:

ping -n 172.16.8.128


To set up the aggregation, I first have to remove the address from
pcn0:
ifconfig pcn0 delete
Then I configure agr0 as the aggregation of pcn0 and pcn1, with
address 172.16.8.128/24:
ifconfig agr0 create
ifconfig agr0 agrport pcn0
ifconfig agr0 agrport pcn1
ifconfig agr0 172.16.8.128 netmask 0xffffff00
The agr(4) manual page describes how to set up link aggregation:
http://netbsd.gw.com/cgi-bin/man-cgi?agr++NetBSD-current
man agr
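The commands above configure agr0 only for the running system. As an
untested sketch (assuming the ifconfig.if(5) semantics described in
the manual also apply to 3.0.1), the setup could be made persistent
with an /etc/ifconfig.agr0 file, where "create" clones the interface
and lines starting with "!" are run as commands at boot:

```shell
# Hypothetical /etc/ifconfig.agr0 -- processed at boot by the network
# rc scripts; see ifconfig.if(5).
create
!ifconfig agr0 agrport pcn0
!ifconfig agr0 agrport pcn1
172.16.8.128 netmask 0xffffff00
up
```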

ifconfig -a
pcn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:ab:a8:2c
        media: Ethernet autoselect (autoselect)
pcn1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:ab:a8:2c
        media: Ethernet autoselect (autoselect)
pcn2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:ab:a8:40
        media: Ethernet autoselect (autoselect)
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33192
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
agr0: flags=8842<BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        agrport: pcn0, flags=0x3<COLLECTING,DISTRIBUTING>
        agrport: pcn1, flags=0x3<COLLECTING,DISTRIBUTING>
        address: 00:0c:29:ab:a8:2c
        inet 172.16.8.128 netmask 0xffffff00 broadcast 172.16.8.255
        inet6 fe80::20c:29ff:feab:a82c%pcn0 prefixlen 64 scopeid 0x5

The agr0 interface of NetBSD2 can be reached from NetBSD2 itself:
ping -n 172.16.8.128
But I cannot reach SUSE Linux 1 from NetBSD2:
ping -n 172.16.8.129

And the agr0 interface of NetBSD2 cannot be reached from SUSE Linux 1:
ping -n 172.16.8.128

If I remove the agr0 interface and reconfigure pcn0, then the ping
works again both from NetBSD2 and from SUSE Linux 1:
ifconfig agr0 -agrport pcn0
ifconfig agr0 -agrport pcn1
ifconfig agr0 destroy
ifconfig pcn0 172.16.8.128 netmask 0xffffff00

The aggregated interface of NetBSD2 did not work when connected to
SUSE Linux 1. Next I test what happens between two SUSE Linux
machines:




Aggregation testing between SUSE Linux 1 and SUSE Linux 2

The configuration of SUSE Linux 2 is similar to what is described in
the chapter "Configuration of SUSE Linux 1"; only the IP address and
the interface names differ.

ifconfig -a

bond1     Link encap:Ethernet  HWaddr 00:0C:29:DB:FD:28
          inet addr:172.16.8.130  Bcast:172.16.8.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:15902 errors:0 dropped:0 overruns:0 frame:0
          TX packets:44 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1462827 (1.3 Mb)  TX bytes:4140 (4.0 Kb)

eth5      Link encap:Ethernet  HWaddr 00:0C:29:DB:FD:28
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:7948 errors:0 dropped:0 overruns:0 frame:0
          TX packets:22 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:731207 (714.0 Kb)  TX bytes:2122 (2.0 Kb)
          Interrupt:193 Base address:0x1480

eth6      Link encap:Ethernet  HWaddr 00:0C:29:DB:FD:28
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:7954 errors:0 dropped:0 overruns:0 frame:0
          TX packets:22 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:731620 (714.4 Kb)  TX bytes:2018 (1.9 Kb)
          Interrupt:185 Base address:0x1400

eth7      Link encap:Ethernet  HWaddr 00:0C:29:DB:FD:1E
          inet addr:192.168.132.130  Bcast:192.168.132.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:73 errors:0 dropped:0 overruns:0 frame:0
          TX packets:59 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:6162 (6.0 Kb)  TX bytes:5387 (5.2 Kb)
          Interrupt:177 Base address:0x1080

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:17848 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17848 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1115560 (1.0 Mb)  TX bytes:1115560 (1.0 Mb)



The bond0 interface of SUSE Linux 1 can be reached from SUSE Linux 2:
ping -n 172.16.8.129
The bond1 interface of SUSE Linux 2 can be reached from SUSE Linux 1:
ping -n 172.16.8.130

This shows that link aggregation works fine between two SUSE Linux
machines. Now I test link aggregation between two NetBSD machines:




Aggregation testing between NetBSD1 and NetBSD2

The configuration of NetBSD1 is similar to what is described in the
chapter "Configuration of NetBSD2"; only the IP address and the
interface names differ.

I can reach the pcn0 interface of NetBSD2 from NetBSD1:
ping -n 172.16.8.128
I can reach the pcn0 interface of NetBSD1 from NetBSD2:
ping -n 172.16.8.127

I remove the pcn0 address:
ifconfig pcn0 delete
I configure agr1 as the aggregation of pcn0 and pcn1, with address
172.16.8.127/24:
ifconfig agr1 create
ifconfig agr1 agrport pcn0
ifconfig agr1 agrport pcn1
ifconfig agr1 172.16.8.127 netmask 0xffffff00

ifconfig -a

pcn0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:d1:bb:76
        media: Ethernet autoselect (autoselect)
pcn1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:d1:bb:76
        media: Ethernet autoselect (autoselect)
pcn2: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        address: 00:0c:29:d1:bb:8a
        media: Ethernet autoselect (autoselect)
        inet 192.168.132.127 netmask 0xffffff00 broadcast 192.168.132.255
        inet6 fe80::20c:29ff:fed1:bb8a%pcn2 prefixlen 64 scopeid 0x3
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33192
        inet 127.0.0.1 netmask 0xff000000
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x4
agr1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        agrport: pcn0, flags=0x3<COLLECTING,DISTRIBUTING>
        agrport: pcn1, flags=0x3<COLLECTING,DISTRIBUTING>
        address: 00:0c:29:d1:bb:76
        inet 172.16.8.127 netmask 0xffffff00 broadcast 172.16.8.255
        inet6 fe80::20c:29ff:fed1:bb76%agr1 prefixlen 64 scopeid 0x5


I can ping the agr1 interface of NetBSD1 from NetBSD1 itself:
ping -n 172.16.8.127

But I cannot reach NetBSD2 from NetBSD1:
ping -n 172.16.8.128
And I cannot reach NetBSD1 from NetBSD2:
ping -n 172.16.8.127
If I remove the agr1 interface and reconfigure pcn0, then the ping
works again both from NetBSD2 and from NetBSD1:
ifconfig agr1 -agrport pcn0
ifconfig agr1 -agrport pcn1
ifconfig agr1 destroy
ifconfig pcn0 172.16.8.127 netmask 0xffffff00



Conclusion

Source                            Destination                       Ping
NetBSD 1 without aggregation      NetBSD 2 without aggregation      OK
localhost                         NetBSD 1 with aggregation         OK
NetBSD 1 without aggregation      NetBSD 2 with aggregation         FAIL
NetBSD 1 with aggregation         NetBSD 2 with aggregation         FAIL
NetBSD 1 without aggregation      SUSE Linux 1 without aggregation  OK
NetBSD 1 with aggregation         SUSE Linux 1 without aggregation  FAIL
NetBSD 1 with aggregation         SUSE Linux 1 with aggregation     FAIL
NetBSD 1 without aggregation      SUSE Linux 1 with aggregation     OK
SUSE Linux 1 without aggregation  SUSE Linux 2 without aggregation  OK
SUSE Linux 1 without aggregation  SUSE Linux 2 with aggregation     OK
SUSE Linux 1 with aggregation     SUSE Linux 2 with aggregation     OK

Link aggregation in NetBSD did not work, but it worked well in Linux,
so it is unlikely that the problem is caused by VMware itself. To be
completely sure, it would be useful to test with real hardware: two
PCs, each equipped with two LAN cards.
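One hedged observation: if I understand the agr(4) man page correctly,
it implements IEEE 802.3ad, so the aggregate negotiates LACP with its
link partner, whereas the Linux bonding setup above (with no "mode="
option) defaults to balance-rr, which needs no negotiation at all.
That could explain why the Linux bonds work across VMware's simple
virtual hubs while agr(4) does not. One way to check would be to watch
for LACP frames on a physical port of the aggregate (the Slow
Protocols multicast address is the one LACP uses; pcn0 is the
interface name from the setup above):

```shell
# Watch a member port of the aggregate for LACP frames; if nothing
# ever answers on the wire, the agr(4) aggregate has no 802.3ad
# partner to negotiate with.
tcpdump -e -n -i pcn0 ether dst 01:80:c2:00:00:02
```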