tech-net archive


Re: carp and routing



On Fri, May 19, 2017 at 6:01 PM, Ryota Ozaki <ozaki-r%netbsd.org@localhost> wrote:
> On Fri, May 19, 2017 at 5:21 PM, Stephen Borrill
> <netbsd%precedence.co.uk@localhost> wrote:
>> On Fri, 19 May 2017, Ryota Ozaki wrote:
>>>>
>>>> I noticed that pinging after failback to machine 1 only failed if I had
>>>> pinged on machine 2. So I reasoned that the problem was because switches,
>>>> etc.  hadn't noticed the change. I proved this by using arping to send a
>>>> gratuitous arp reply:
>>>> arping -c 1 -A -I carp0 192.168.1.88
>>>> arping -c 1 -A -I carp1 80.x.y.20
>>>>
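The manual workaround quoted above could be wrapped in a small helper that only sends the gratuitous ARP when the interface reports MASTER. This is a minimal sketch, not the patch discussed in this thread: the function names and the IFCONFIG/ARPING override variables are hypothetical, and the arping flags are simply the ones from the commands above.

```sh
#!/bin/sh
# Print the CARP state word (e.g. MASTER or BACKUP) found in the
# ifconfig output supplied on stdin. Matches the "carp: STATE ..." line.
carp_state() {
    awk '$1 == "carp:" { print $2; exit }'
}

# Send one gratuitous ARP reply for address $2 on interface $1, but only
# if that interface currently reports MASTER.
# IFCONFIG and ARPING are hypothetical overrides, useful for a dry run.
send_garp_if_master() {
    ifname=$1 addr=$2
    state=$(${IFCONFIG:-ifconfig} "$ifname" | carp_state)
    if [ "$state" = "MASTER" ]; then
        ${ARPING:-arping} -c 1 -A -I "$ifname" "$addr"
    fi
}
```

For example, `send_garp_if_master carp0 192.168.1.88` would run the first arping command above only while carp0 is master; setting ARPING=echo prints the command arguments instead of sending anything.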
>>>> If the above commands are run after the interface becomes master, then it
>>>> works. Are we missing out on sending a gratuitous ARP after becoming
>>>> master? I notice OpenBSD sends two, the second after a small delay:
>>>>
>>>> http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/netinet/ip_carp.c.diff?r1=1.127&r2=1.128&f=h
>>>
>>>
>>> Thank you for the investigation.
>>>
>>> I've written a patch to fix the issue in a different way from OpenBSD.
>>> Can you try the patch?
>>>  http://www.netbsd.org/~ozaki-r/fix-carp-garp.diff
>>
>>
>> Yay, that works perfectly.
>
> Good :)
>
>>
>>> This is my guess at how the regression was introduced:
>>> (1) DAD for IPv4 and IN_IFF_DETACHED flag were introduced (pre -7)

Oops, this was wrong. DAD for IPv4 was implemented after -7 was branched,
so Roy is perhaps not related to the issue.

  ozaki-r

>>>    - If the IN_IFF_DETACHED flag is set on an IP address, no packets
>>>      are sent via that address (including GARP packets)
>>>    - The flag is cleared by DAD, which is kicked off by, say, a link
>>>      state change event
>>> (2) The link state change handler was changed to run in softint
>>>    (after -7)
>>> (3) CARP was changed to use the handler (after -7)
>>>    - This allows CARP to kick DAD and clear IN_IFF_DETACHED flag
>>>      *eventually*
>>>    - OTOH, with this change, some operations are executed in reverse order
>>>    - For example, CARP tries to send a GARP packet before the handler
>>>      has run, and so fails to send it
>>>
>>> And my patch allows CARP to execute the handler directly
>>> (not via softint) before sending a GARP packet.
>>
>>
>> OK.
>>
>>>>> The patch applies to -current and is not needed on -7,
>>>>> because it fixes a regression introduced recently.
>>>>
>>>>
>>>> I don't think it is as simple as that. My email below (and elaborated on
>>>> here:  http://mail-index.netbsd.org/tech-net/2017/05/15/msg006331.html )
>>>> describes a problem with -7 where you cannot route via the default gateway
>>>> without doing a "route change default" after becoming master. It appears
>>>> that with the correct gratuitous arps, -current works OK.
>>>
>>>
>>> I guess pulling up the commit ip_carp.c,v 1.88 to -7 would fix the issue.
>>> The commit is (3) in the above list and (2) isn't in -7 so it just fixes
>>> the issue that CARP doesn't send GARP packets, without the regression
>>> introduced by (3) in -current.
>>
>>
>> 1.88 has already been pulled up to -7 (ticket #1420). It appears to make
>> things neither better nor worse.
>>
>> The routing problem still persists on -7; you need to run the following
>> every time you become the master (even on first boot):
>>
>> route change default `cat /etc/mygate`
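The workaround above could be packaged as a function to call each time the machine becomes master. This is only a sketch under assumptions: the function name and the ROUTE override variable are hypothetical, and skipping blank lines and lines starting with # in the gateway file is a precaution, not something this thread states about /etc/mygate.

```sh
#!/bin/sh
# Re-apply the default route using the gateway stored in a mygate-style
# file (default: /etc/mygate). The first word of the first line that is
# neither blank nor starts with '#' is taken as the gateway address.
# ROUTE is a hypothetical override, useful for a dry run.
reapply_default_route() {
    gatefile=${1:-/etc/mygate}
    gw=$(awk '!/^#/ && NF { print $1; exit }' "$gatefile") || return 1
    [ -n "$gw" ] || return 1
    ${ROUTE:-route} change default "$gw"
}
```

Running `ROUTE=echo reapply_default_route` prints the arguments that would be passed to route(8) without touching the routing table.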
>>
>> On -current the routing problem has been fixed and your patch fixes the
>> missing GARPs.
>
> Hmm, I have no ideas about -7 for now. Roy, do you have any ideas on the issue?
>
>   ozaki-r
>
>>
>>
>>>>>>> On Wed, Mar 15, 2017 at 4:15 AM, Stephen Borrill
>>>>>>> <netbsd%precedence.co.uk@localhost> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm trying to set up redundant firewalls using carp(4) as detailed in
>>>>>>>> section 28.5 here:
>>>>>>>> https://www.netbsd.org/docs/guide/en/chap-carp.html
>>>>>>>>
>>>>>>>> The examples ignore routing, especially setting a default gateway.
>>>>>>>>
>>>>>>>> Machine 1:
>>>>>>>> carp0:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>>>>>         capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
>>>>>>>>         enabled=0
>>>>>>>>         carp: MASTER carpdev xennet0 vhid 1 advbase 1 advskew 0
>>>>>>>>         address: 00:00:5e:00:01:01
>>>>>>>>         inet 192.168.1.88 netmask 0xffffff00 broadcast 192.168.1.255
>>>>>>>> carp1:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>>>>>         capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
>>>>>>>>         enabled=0
>>>>>>>>         carp: MASTER carpdev xennet1 vhid 2 advbase 1 advskew 0
>>>>>>>>         address: 00:00:5e:00:01:02
>>>>>>>>         inet 80.x.y.20 netmask 0xffffffc0 broadcast 80.71.28.63
>>>>>>>>
>>>>>>>> Machine 2:
>>>>>>>> carp0:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>>>>>         capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
>>>>>>>>         enabled=0
>>>>>>>>         carp: BACKUP carpdev xennet0 vhid 1 advbase 1 advskew 100
>>>>>>>>         address: 00:00:5e:00:01:01
>>>>>>>>         inet 192.168.1.88 netmask 0xffffff00 broadcast 192.168.1.255
>>>>>>>> carp1:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>>>>>         capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
>>>>>>>>         enabled=0
>>>>>>>>         carp: BACKUP carpdev xennet1 vhid 2 advbase 1 advskew 100
>>>>>>>>         address: 00:00:5e:00:01:02
>>>>>>>>         inet 80.x.y.20 netmask 0xffffffc0 broadcast 80.71.28.63
>>>>>>>>
>>>>>>>> My first attempt just set the default gateway in /etc/mygate with just
>>>>>>>> machine 1 running.
>>>>>>>>
>>>>>>>> The routes looked OK on the face of it:
>>>>>>>>
>>>>>>>> Internet:
>>>>>>>> Destination        Gateway            Flags    Refs      Use    Mtu Interface
>>>>>>>> default            80.x.y.62          UGS         -        -      - carp1
>>>>>>>> 80.x.y.0/26        link#5             UC          -        -      - carp1
>>>>>>>> 80.x.y.62          c4:71:fe:65:53:61  UHLc        -        -      - carp1
>>>>>>>> 127/8              127.0.0.1          UGRS        -        -  33648 lo0
>>>>>>>> 127.0.0.1          127.0.0.1          UH          -        -  33648 lo0
>>>>>>>> 192.168.1/24       link#4             UC          -        -      - carp0
>>>>>>>>
>>>>>>>> But it didn't work:
>>>>>>>> # ping -n 8.8.8.8
>>>>>>>> PING 8.8.8.8 (8.8.8.8): 56 data bytes
>>>>>>>> ping: sendto: No route to host
>>>>>>>> ping: sendto: No route to host
>>>>>>>> ^C
>>>>>>>> ----8.8.8.8 PING Statistics----
>>>>>>>> 2 packets transmitted, 0 packets received, 100.0% packet loss
>>>>>>>>
>>>>>>>> Guessing at some sort of race condition between setting up carp and the
>>>>>>>> route, I added the "route add default" command to /etc/rc.local after a
>>>>>>>> sleep 5. This fixes it with a single machine. The routing table in both
>>>>>>>> cases looks identical.
>>>>>>>>
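Instead of a fixed sleep, the rc.local step described above could poll until the carp interface reports MASTER before adding the route. This is a rough sketch only: the function name and the IFCONFIG and POLL_INTERVAL variables are hypothetical, and nothing in this thread establishes that waiting for MASTER avoids the underlying problem.

```sh
#!/bin/sh
# Poll interface $1 until its ifconfig output contains "carp: MASTER".
# Gives up after $2 attempts (default 10), pausing POLL_INTERVAL seconds
# (default 1) between attempts. Returns 0 on MASTER, 1 on timeout.
# IFCONFIG and POLL_INTERVAL are hypothetical overrides for testing.
wait_for_master() {
    ifname=$1 tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        if ${IFCONFIG:-ifconfig} "$ifname" | grep -q 'carp: MASTER'; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_INTERVAL:-1}"
    done
    return 1
}
```

In /etc/rc.local this might be used as `wait_for_master carp1 10 && route add default "$(cat /etc/mygate)"`, which on a backup machine simply times out and adds nothing.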
>>>>>>>> I then started up the second machine and looked at its routing table:
>>>>>>>> Internet:
>>>>>>>> Destination        Gateway            Flags    Refs      Use    Mtu Interface
>>>>>>>> default            80.x.y.62          UGS         -        -      - carp1
>>>>>>>> 80.x.y.0/26        80.x.y.20          U           -        -      - carp1
>>>>>>>> 127/8              127.0.0.1          UGRS        -        -  33648 lo0
>>>>>>>> 127.0.0.1          127.0.0.1          UH          -        -  33648 lo0
>>>>>>>> 192.168.1/24       192.168.1.88       U           -        -      - carp0
>>>>>>>>
>>>>>>>> If I forced machine 1 down (ifconfig carp0 down; ifconfig carp1 down),
>>>>>>>> machine 2 shows its interfaces as MASTER, but again, no route to hosts,
>>>>>>>> even though the MAC address of the router does appear in the routing table
>>>>>>>> after a while:
>>>>>>>>
>>>>>>>> Internet:
>>>>>>>> Destination        Gateway            Flags    Refs      Use    Mtu Interface
>>>>>>>> default            80.x.y.62          UGS         -        -      - carp1
>>>>>>>> 80.x.y.0/26        link#5             UC          -        -      - carp1
>>>>>>>> 80.x.y.62          c4:71:fe:65:53:61  UHLc        -        -      - carp1
>>>>>>>> 127/8              127.0.0.1          UGRS        -        -  33648 lo0
>>>>>>>> 127.0.0.1          127.0.0.1          UH          -        -  33648 lo0
>>>>>>>> 192.168.1/24       link#4             UC          -        -      - carp0
>>>>>>>> # ping -c1 80.x.y.62
>>>>>>>> PING 80.x.y.62 (80.x.y.62): 56 data bytes
>>>>>>>> 64 bytes from 80.x.y.62: icmp_seq=0 ttl=255 time=0.875988 ms
>>>>>>>>
>>>>>>>> ----80.x.y.62 PING Statistics----
>>>>>>>> 1 packets transmitted, 1 packets received, 0.0% packet loss
>>>>>>>> round-trip min/avg/max/stddev = 0.875988/0.875988/0.875988/0.000000 ms
>>>>>>>> # ping -c1 8.8.8.8
>>>>>>>> PING google-public-dns-a.google.com (8.8.8.8): 56 data bytes
>>>>>>>> ping: sendto: No route to host
>>>>>>>> ^C
>>>>>>>> ----google-public-dns-a.google.com PING Statistics----
>>>>>>>> 1 packets transmitted, 0 packets received, 100.0% packet loss
>>>>>>>>
>>>>>>>> A similar problem happens at failback to the master. FreeBSD and OpenBSD
>>>>>>>> have similar problems reported too, but with no clear answers.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Stephen
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>

