tech-net archive


Re: carp and routing



On Fri, 19 May 2017, Ryota Ozaki wrote:
I noticed that pinging after failback to machine 1 only failed if I had
pinged on machine 2. So I reasoned that the problem was because switches,
etc.  hadn't noticed the change. I proved this by using arping to send a
gratuitous arp reply:
arping -c 1 -A -I carp0 192.168.1.88
arping -c 1 -A -I carp1 80.x.y.20

If the above commands are run after the interface becomes a master, then it
works. Are we missing out on sending a gratuitous arp after becoming master? I
notice OpenBSD sends two, the second after a small delay:
http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/netinet/ip_carp.c.diff?r1=1.127&r2=1.128&f=h
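
For anyone wanting to repeat the test on several carp interfaces, here is a
minimal sketch that only prints the arping commands shown above. The function
name garp_cmds and the interface=address argument format are made up for
illustration; the arping flags are exactly the ones used in this message.

```sh
#!/bin/sh
# Sketch: build the gratuitous ARP commands for a list of CARP addresses.
# garp_cmds is a hypothetical helper; it prints commands and runs nothing.
# Each argument is an interface=address pair.
garp_cmds() {
    for pair in "$@"; do
        ifname=${pair%%=*}    # text before the first '='
        addr=${pair#*=}       # text after the first '='
        printf 'arping -c 1 -A -I %s %s\n' "$ifname" "$addr"
    done
}

# The two pairs from this thread:
garp_cmds carp0=192.168.1.88 carp1=80.x.y.20
```

Piping the output to sh after an interface has become master would run the
commands for real.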

Thank you for the investigation.

I've written a patch to fix the issue in a different way from OpenBSD.
Can you try the patch?
 http://www.netbsd.org/~ozaki-r/fix-carp-garp.diff

Yay, that works perfectly.

This is my guess at how the regression was introduced:
(1) DAD for IPv4 and the IN_IFF_DETACHED flag were introduced (pre -7)
   - If the IN_IFF_DETACHED flag is set on an IP address, no packets
     will be sent via that IP address (including GARP packets)
   - The flag is cleared by DAD, which is kicked by, say, a link
     state change event
(2) The link state change handler was changed to run in softint
   (after -7)
(3) CARP was changed to use the handler (after -7)
   - This allows CARP to kick DAD and clear the IN_IFF_DETACHED flag
     *eventually*
   - OTOH, with this change, some operations are now executed in
     reverse order
   - For example, CARP tries to send a GARP packet before the handler
     is executed and fails to send it

And my patch allows CARP to execute the handler directly
(not via softint) before sending a GARP packet.
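
The ordering problem described above can be illustrated with a toy model.
This is not NetBSD kernel code: a shell variable stands in for the
IN_IFF_DETACHED flag, and all function names are invented for the
illustration.

```sh
#!/bin/sh
# Toy model only: shell state stands in for kernel state.
detached=1                  # models IN_IFF_DETACHED being set on the address

link_state_handler() {      # models the handler that kicks DAD and clears the flag
    detached=0
}

send_garp() {               # models CARP trying to send a gratuitous ARP
    if [ "$detached" -eq 1 ]; then
        echo "GARP dropped"
    else
        echo "GARP sent"
    fi
}

become_master_deferred() {  # handler is deferred (as via softint): GARP goes first
    detached=1
    send_garp
    link_state_handler
}

become_master_direct() {    # handler is called directly before the GARP
    detached=1
    link_state_handler
    send_garp
}

become_master_deferred      # prints "GARP dropped"
become_master_direct        # prints "GARP sent"
```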

OK.

The patch applies to -current and is not needed on -7, because it fixes
a regression that was introduced recently.

I don't think it is as simple as that. My email below (and elaborated on
here:  http://mail-index.netbsd.org/tech-net/2017/05/15/msg006331.html )
describes a problem with -7 where you cannot route via the default gateway
without doing a "route change default" after becoming master. It appears
that with the correct gratuitous arps, -current works OK.

I guess pulling up the commit ip_carp.c,v 1.88 to -7 would fix the issue.
The commit is (3) in the above list and (2) isn't in -7 so it just fixes
the issue that CARP doesn't send GARP packets, without the regression
introduced by (3) in -current.

1.88 has already been pulled up to -7 (ticket #1420). It appears to make things neither better nor worse.

The routing problem still persists on -7: you need to run the following every time you become the master (even on first boot):

route change default `cat /etc/mygate`
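
A rough sketch of how that workaround could be scripted follows. It assumes
that ifconfig prints a "carp: MASTER ..." line as in the output quoted
further down, and that /etc/mygate holds only the gateway address. The
function names and the RUN variable are hypothetical, and nothing here
detects the state change by itself; something else would have to call it.

```sh
#!/bin/sh
# Sketch of the -7 workaround; all helper names are hypothetical.

# is_master: reads ifconfig output on stdin, succeeds if the state is MASTER.
is_master() {
    grep -q 'carp: MASTER'
}

# route_cmd GATEWAY: print the route command used in this thread.
route_cmd() {
    printf 'route change default %s\n' "$1"
}

# fix_default_route IFNAME: if IFNAME is master, print the route command,
# or run it when RUN=1 is set in the environment.
fix_default_route() {
    if ifconfig "$1" 2>/dev/null | is_master; then
        gw=$(cat /etc/mygate)
        if [ "${RUN:-0}" = 1 ]; then
            route change default "$gw"
        else
            route_cmd "$gw"
        fi
    fi
}
```

For example, "fix_default_route carp1" prints the command when carp1 is
master, and "RUN=1 fix_default_route carp1" would execute it.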

On -current the routing problem has been fixed and your patch fixes the missing GARPs.

On Wed, Mar 15, 2017 at 4:15 AM, Stephen Borrill
<netbsd%precedence.co.uk@localhost> wrote:


I'm trying to set up redundant firewalls using carp(4) as detailed in
section 28.5 here:
https://www.netbsd.org/docs/guide/en/chap-carp.html

The examples ignore routing, especially setting a default gateway.

Machine 1:
carp0:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
        enabled=0
        carp: MASTER carpdev xennet0 vhid 1 advbase 1 advskew 0
        address: 00:00:5e:00:01:01
        inet 192.168.1.88 netmask 0xffffff00 broadcast 192.168.1.255
carp1:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
        enabled=0
        carp: MASTER carpdev xennet1 vhid 2 advbase 1 advskew 0
        address: 00:00:5e:00:01:02
        inet 80.x.y.20 netmask 0xffffffc0 broadcast 80.71.28.63

Machine 2:
carp0:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
        enabled=0
        carp: BACKUP carpdev xennet0 vhid 1 advbase 1 advskew 100
        address: 00:00:5e:00:01:01
        inet 192.168.1.88 netmask 0xffffff00 broadcast 192.168.1.255
carp1:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
        enabled=0
        carp: BACKUP carpdev xennet1 vhid 2 advbase 1 advskew 100
        address: 00:00:5e:00:01:02
        inet 80.x.y.20 netmask 0xffffffc0 broadcast 80.71.28.63

My first attempt just set the default gateway in /etc/mygate with just
machine 1 running.

The routes looked OK on the face of it:

Internet:
Destination        Gateway            Flags    Refs      Use    Mtu Interface
default            80.x.y.62          UGS         -        -      - carp1
80.x.y.0/26        link#5             UC          -        -      - carp1
80.x.y.62          c4:71:fe:65:53:61  UHLc        -        -      - carp1
127/8              127.0.0.1          UGRS        -        -  33648 lo0
127.0.0.1          127.0.0.1          UH          -        -  33648 lo0
192.168.1/24       link#4             UC          -        -      - carp0

But it didn't work:
# ping -n 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: sendto: No route to host
ping: sendto: No route to host
^C
----8.8.8.8 PING Statistics----
2 packets transmitted, 0 packets received, 100.0% packet loss

Guessing at some sort of race condition between setting up carp and the
route, I added the "route add default" command to /etc/rc.local after a
sleep 5. This fixes it with a single machine. The routing table in both
cases looks identical.

I then started up the second machine and looked at its routing table:
Internet:
Destination        Gateway            Flags    Refs      Use    Mtu Interface
default            80.x.y.62          UGS         -        -      - carp1
80.x.y.0/26        80.x.y.20          U           -        -      - carp1
127/8              127.0.0.1          UGRS        -        -  33648 lo0
127.0.0.1          127.0.0.1          UH          -        -  33648 lo0
192.168.1/24       192.168.1.88       U           -        -      - carp0

If I forced machine 1 down (ifconfig carp0 down; ifconfig carp1 down),
machine 2 shows its interfaces as MASTER, but again, no route to hosts,
even though the MAC address of the router does appear in the routing
table after a while:

Internet:
Destination        Gateway            Flags    Refs      Use    Mtu Interface
default            80.x.y.62          UGS         -        -      - carp1
80.x.y.0/26        link#5             UC          -        -      - carp1
80.x.y.62          c4:71:fe:65:53:61  UHLc        -        -      - carp1
127/8              127.0.0.1          UGRS        -        -  33648 lo0
127.0.0.1          127.0.0.1          UH          -        -  33648 lo0
192.168.1/24       link#4             UC          -        -      - carp0
# ping -c1 80.x.y.62
PING 80.x.y.62 (80.x.y.62): 56 data bytes
64 bytes from 80.x.y.62: icmp_seq=0 ttl=255 time=0.875988 ms

----80.x.y.62 PING Statistics----
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.875988/0.875988/0.875988/0.000000 ms
# ping -c1 8.8.8.8
PING google-public-dns-a.google.com (8.8.8.8): 56 data bytes
ping: sendto: No route to host
^C
----google-public-dns-a.google.com PING Statistics----
1 packets transmitted, 0 packets received, 100.0% packet loss

A similar problem happens at failback to the master. FreeBSD and OpenBSD
have similar problems reported too, but with no clear answers.

--
Stephen







