tech-net archive


Re: carp and routing



On Fri, May 19, 2017 at 12:35 AM, Stephen Borrill
<netbsd%precedence.co.uk@localhost> wrote:
> On Thu, 18 May 2017, Ryota Ozaki wrote:
>>
>> On Thu, May 18, 2017 at 8:44 PM, Stephen Borrill
>> <netbsd%precedence.co.uk@localhost> wrote:
>>>
>>> On Thu, 18 May 2017, Ryota Ozaki wrote:
>>>>
>>>>
>>>> I fixed an issue of CARP in -current, which is a regression between
>>>> -7 and -current, but I'm not sure the fix solves your problem
>>>> completely. Could you try the latest source code and report how the
>>>> fix changes the situation (or not).
>>>
>>>
>>>
>>> Prior to your changes I found that with -current, I could not send any
>>> packets at all even on the LAN (a regression from -7). With your changes
>>> I find:
>>> 1) When first booted, machine 1 (Master) can send packets on LAN and via
>>> default route (regression fixed)
>>> 2) Upon failover to machine 2, machine 2 can now send packets via default
>>> route without having to do a "route change default" (fixed problem)
>>
>>
>> Good.
>>
>>> 3) Upon failback to machine 1, machine 1 can no longer send packets
>>> (regression still present)
>
>
> I noticed that pinging after failback to machine 1 only failed if I had
> pinged on machine 2. So I reasoned that the problem was because switches,
> etc.  hadn't noticed the change. I proved this by using arping to send a
> gratuitous arp reply:
> arping -c 1 -A -I carp0 192.168.1.88
> arping -c 1 -A -I carp1 80.x.y.20
>
> If the above commands are run after the interface becomes a master, then it
> works. Are we missing out on sending a gratuitous arp after becoming master? I
> notice OpenBSD sends two, the second after a small delay:
> http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/netinet/ip_carp.c.diff?r1=1.127&r2=1.128&f=h
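
For readers unfamiliar with what the arping -A commands above put on the
wire, here is a minimal sketch of a gratuitous ARP reply frame for the
carp0 address. The function name build_garp_reply is hypothetical, and
setting the target hardware address to broadcast is an assumption
(implementations differ on that field). It only builds the bytes; it
does not send anything.

```python
import socket
import struct


def build_garp_reply(mac: bytes, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP reply.

    Illustrative only: the helper name and the broadcast target
    hardware address are assumptions, not taken from the thread.
    """
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    ip_bytes = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth = broadcast + mac + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 2 (reply)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    # Gratuitous: sender IP and target IP are the same address
    arp += mac + ip_bytes + broadcast + ip_bytes
    return eth + arp


# The carp0 virtual MAC and address from the ifconfig output in this thread
frame = build_garp_reply(bytes.fromhex("00005e000101"), "192.168.1.88")
```

Switches learn the port for 00:00:5e:00:01:01 from the Ethernet source
field, and neighbours refresh their ARP caches from the sender fields,
which is why one such frame after becoming master is enough to redirect
traffic.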

Thank you for the investigation.

I've written a patch to fix the issue in a different way from OpenBSD.
Can you try the patch?
  http://www.netbsd.org/~ozaki-r/fix-carp-garp.diff

This is my guess at how the regression was introduced:
(1) DAD for IPv4 and the IN_IFF_DETACHED flag were introduced (pre -7)
    - If the IN_IFF_DETACHED flag is set on an IP address, no packets
      will be sent via that IP address (including GARP packets)
    - The flag is cleared by DAD, which is kicked by, say, a link
      state change event
(2) The link state change handler was changed to run in softint
    (after -7)
(3) CARP was changed to use the handler (after -7)
    - This allows CARP to kick DAD and clear the IN_IFF_DETACHED flag
      *eventually*
    - OTOH, with this change, some operations are executed in reverse
      order
    - For example, CARP tries to send a GARP packet before the handler
      is executed and fails to send it

My patch allows CARP to execute the handler directly
(not via softint) before sending a GARP packet.
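
The ordering problem can be illustrated with a toy model. This is not
kernel code: every name below is hypothetical, and DAD is collapsed into
an immediate flag clear, which is a simplification. It only shows why
sending before the deferred handler runs fails, and why running the
handler directly first succeeds.

```python
from collections import deque


class Address:
    """Toy stand-in for an IPv4 address carrying IN_IFF_DETACHED."""

    def __init__(self):
        self.detached = True  # flag set: nothing may be sent yet


def link_state_handler(addr):
    # Models the handler kicking DAD, which clears the flag
    # (assumed to complete immediately in this sketch).
    addr.detached = False


def send_garp(addr):
    # Returns True if the GARP packet would actually go out.
    return not addr.detached


def become_master_deferred():
    """Handler is queued (softint-like); GARP is attempted first."""
    addr = Address()
    pending = deque([link_state_handler])
    sent = send_garp(addr)        # runs before the queued handler
    while pending:
        pending.popleft()(addr)   # flag is cleared too late
    return sent


def become_master_direct():
    """Handler is run directly before the GARP is attempted."""
    addr = Address()
    link_state_handler(addr)
    return send_garp(addr)


deferred_sent = become_master_deferred()
direct_sent = become_master_direct()
```

In the deferred case the address does become usable eventually, but the
one GARP attempt has already been dropped, which matches the failback
symptom described above.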

>
>> Could you show me the results of the following command on machine 1?
>> - ifconfig carp0
>> - ifconfig xennet0
>> - netstat -nr -f inet
>> - arp -na
>
>
> I'll skip these because of the above arp tests.
>
>>> I tried to apply your if_arp.c patch to -7. but the code is just too
>>> different.
>>
>>
>> The patch is applicable to -current and is unnecessary for -7
>> because it fixes a regression introduced recently.
>
>
> I don't think it is as simple as that. My email below (and elaborated on
> here:  http://mail-index.netbsd.org/tech-net/2017/05/15/msg006331.html )
> describes a problem with -7 where you cannot route via the default gateway
> without doing a "route change default" after becoming master. It appears
> that with the correct gratuitous arps, -current works OK.

I guess pulling up the commit ip_carp.c,v 1.88 to -7 would fix the issue.
The commit is (3) in the above list and (2) isn't in -7 so it just fixes
the issue that CARP doesn't send GARP packets, without the regression
introduced by (3) in -current.

  ozaki-r

>
>
>>>> On Wed, Mar 15, 2017 at 4:15 AM, Stephen Borrill
>>>> <netbsd%precedence.co.uk@localhost> wrote:
>>>>>
>>>>>
>>>>> I'm trying to set up redundant firewalls using carp(4) as detailed in
>>>>> section 28.5 here:
>>>>> https://www.netbsd.org/docs/guide/en/chap-carp.html
>>>>>
>>>>> The examples ignore routing, especially setting a default gateway.
>>>>>
>>>>> Machine 1:
>>>>> carp0:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>>         capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
>>>>>         enabled=0
>>>>>         carp: MASTER carpdev xennet0 vhid 1 advbase 1 advskew 0
>>>>>         address: 00:00:5e:00:01:01
>>>>>         inet 192.168.1.88 netmask 0xffffff00 broadcast 192.168.1.255
>>>>> carp1:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>>         capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
>>>>>         enabled=0
>>>>>         carp: MASTER carpdev xennet1 vhid 2 advbase 1 advskew 0
>>>>>         address: 00:00:5e:00:01:02
>>>>>         inet 80.x.y.20 netmask 0xffffffc0 broadcast 80.71.28.63
>>>>>
>>>>> Machine 2:
>>>>> carp0:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>>         capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
>>>>>         enabled=0
>>>>>         carp: BACKUP carpdev xennet0 vhid 1 advbase 1 advskew 100
>>>>>         address: 00:00:5e:00:01:01
>>>>>         inet 192.168.1.88 netmask 0xffffff00 broadcast 192.168.1.255
>>>>> carp1:  flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
>>>>>         capabilities=2800<TCP4CSUM_Tx,UDP4CSUM_Tx>
>>>>>         enabled=0
>>>>>         carp: BACKUP carpdev xennet1 vhid 2 advbase 1 advskew 100
>>>>>         address: 00:00:5e:00:01:02
>>>>>         inet 80.x.y.20 netmask 0xffffffc0 broadcast 80.71.28.63
>>>>>
>>>>> My first attempt just set the default gateway in /etc/mygate with just
>>>>> machine 1 running.
>>>>>
>>>>> The routes looked OK on the face of it:
>>>>>
>>>>> Internet:
>>>>> Destination        Gateway            Flags    Refs      Use    Mtu Interface
>>>>> default            80.x.y.62          UGS         -        -      -  carp1
>>>>> 80.x.y.0/26        link#5             UC          -        -      -  carp1
>>>>> 80.x.y.62          c4:71:fe:65:53:61  UHLc        -        -      -  carp1
>>>>> 127/8              127.0.0.1          UGRS        -        -  33648  lo0
>>>>> 127.0.0.1          127.0.0.1          UH          -        -  33648  lo0
>>>>> 192.168.1/24       link#4             UC          -        -      -  carp0
>>>>>
>>>>> But it didn't work:
>>>>> # ping -n 8.8.8.8
>>>>> PING 8.8.8.8 (8.8.8.8): 56 data bytes
>>>>> ping: sendto: No route to host
>>>>> ping: sendto: No route to host
>>>>> ^C
>>>>> ----8.8.8.8 PING Statistics----
>>>>> 2 packets transmitted, 0 packets received, 100.0% packet loss
>>>>>
>>>>> Guessing at some sort of race condition between setting up carp and
>>>>> the route, I added the "route add default" command to /etc/rc.local
>>>>> after a sleep 5. This fixes it with a single machine. The routing
>>>>> table in both cases looks identical.
>>>>>
>>>>> I then started up the second machine and looked at its routing table:
>>>>> Internet:
>>>>> Destination        Gateway            Flags    Refs      Use    Mtu Interface
>>>>> default            80.x.y.62          UGS         -        -      -  carp1
>>>>> 80.x.y.0/26        80.x.y.20          U           -        -      -  carp1
>>>>> 127/8              127.0.0.1          UGRS        -        -  33648  lo0
>>>>> 127.0.0.1          127.0.0.1          UH          -        -  33648  lo0
>>>>> 192.168.1/24       192.168.1.88       U           -        -      -  carp0
>>>>>
>>>>> If I forced machine 1 down (ifconfig carp0 down;ifconfig carp1 down),
>>>>> machine 2 shows its interfaces as MASTER, but again, no route to hosts
>>>>> even though the MAC address of the router does appear in the routing
>>>>> table after a while:
>>>>>
>>>>> Internet:
>>>>> Destination        Gateway            Flags    Refs      Use    Mtu Interface
>>>>> default            80.x.y.62          UGS         -        -      -  carp1
>>>>> 80.x.y.0/26        link#5             UC          -        -      -  carp1
>>>>> 80.x.y.62          c4:71:fe:65:53:61  UHLc        -        -      -  carp1
>>>>> 127/8              127.0.0.1          UGRS        -        -  33648  lo0
>>>>> 127.0.0.1          127.0.0.1          UH          -        -  33648  lo0
>>>>> 192.168.1/24       link#4             UC          -        -      -  carp0
>>>>> # ping -c1 80.x.y.62
>>>>> PING 80.x.y.62 (80.x.y.62): 56 data bytes
>>>>> 64 bytes from 80.x.y.62: icmp_seq=0 ttl=255 time=0.875988 ms
>>>>>
>>>>> ----80.x.y.62 PING Statistics----
>>>>> 1 packets transmitted, 1 packets received, 0.0% packet loss
>>>>> round-trip min/avg/max/stddev = 0.875988/0.875988/0.875988/0.000000 ms
>>>>> # ping -c1 8.8.8.8
>>>>> PING google-public-dns-a.google.com (8.8.8.8): 56 data bytes
>>>>> ping: sendto: No route to host
>>>>> ^C
>>>>> ----google-public-dns-a.google.com PING Statistics----
>>>>> 1 packets transmitted, 0 packets received, 100.0% packet loss
>>>>>
>>>>> A similar problem happens at failback to the master. FreeBSD and
>>>>> OpenBSD have similar problems reported too, but with no clear answers.
>>>>>
>>>>> --
>>>>> Stephen
>>>>>
>>>>
>>>
>>
>

