tech-net archive


Fun with carp and IPv4+IPv6



Hi, I am seeing unexpected lockups with carp.

Given two boxes connected to a router from the ISP:

    The router has the transit net a.b.c.65/30 and <IPv6-Prefix>::1/64

    the network we host is statically routed to

        a.b.c.66 and <IPv6-Prefix>::8/64

So far things are fine.

The respective wm3 interface is configured like this:

Box A: wm3: flags=0x8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
ec_capabilities=17<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,EEE>
        ec_enabled=2<VLAN_HWTAGGING>
        description: "WAN-LINK"
        address: x--x
media: Ethernet autoselect (1000baseT full-duplex,flowcontrol,rxpause,txpause)
        status: active
        inet6 fe80::xxx%wm3/64 flags 0x0 scopeid 0x4
        inet6 <IPv6-Prefix>::192/64 flags 0x0

Box B: wm3: flags=0x8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
enabled=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
enabled=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
ec_capabilities=17<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,EEE>
        ec_enabled=2<VLAN_HWTAGGING>
        description: "WAN-LINK"
        address: x--x
media: Ethernet autoselect (1000baseT full-duplex,flowcontrol,rxpause,txpause)
        status: active
        inet6 fe80::xxx%wm3/64 flags 0x0 scopeid 0x4
        inet6 <IPv6-Prefix>::193/64 flags 0x0

The respective carp interfaces look like this:

Box A: carp3: flags=0x9843<UP,BROADCAST,RUNNING,SIMPLEX,LINK0,MULTICAST> metric 8 mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
        enabled=0
        carp: MASTER carpdev wm3 vhid 4 advbase 1 advskew 5
        description: "WAN-LINK-DEFAULT"
        address: 00:00:5e:00:01:04
        status: active
        inet a.b.c.66/30 broadcast a.b.c.67 flags 0x0
        inet6 <IPv6-Prefix>::8/128 flags 0x0

Box B: carp3: flags=0x9843<UP,BROADCAST,RUNNING,SIMPLEX,LINK0,MULTICAST> metric 8 mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
        enabled=0
        carp: BACKUP carpdev wm3 vhid 4 advbase 1 advskew 10
        description: "WAN-LINK-DEFAULT"
        address: 00:00:5e:00:01:04
        status: no network
        inet a.b.c.66/30 broadcast a.b.c.67 flags 0x4<DETACHED>
        inet6 <IPv6-Prefix>::8/128 flags 0x8<DETACHED>

So far so good - Box A is active and providing the link to the ISP.
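
For readers less familiar with carp, the values in the output above can be reproduced with a minimal sketch. It assumes the usual CARP conventions: the virtual MAC is 00:00:5e:00:01:<vhid>, and a host advertises every advbase + advskew/256 seconds, with the more frequent advertiser preferred as master. The function names are hypothetical and are not taken from ip_carp.c.

```c
#include <stdint.h>

/* Hypothetical helper: build the CARP virtual MAC for a given vhid,
 * assuming the VRRP-style prefix 00:00:5e:00:01. */
static void carp_vmac(uint8_t vhid, uint8_t mac[6])
{
    mac[0] = 0x00;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = 0x00;
    mac[4] = 0x01;
    mac[5] = vhid;
}

/* Hypothetical helper: advertisement interval in microseconds,
 * assuming interval = advbase seconds + advskew/256 seconds. */
static uint64_t carp_adv_interval_us(unsigned advbase, unsigned advskew)
{
    return (uint64_t)advbase * 1000000u +
           (uint64_t)advskew * 1000000u / 256u;
}
```

With vhid 4 this gives the address 00:00:5e:00:01:04 shown on both boxes. Under the stated assumption, Box A (advskew 5) advertises slightly more often than Box B (advskew 10), which matches A being MASTER.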

Now, after an "ifconfig carp3 down", the roles are expected to switch.

In reality (NetBSD 9 and 9.99.74), Box B locks up: the processes get stuck in tstile waiting for KERNEL_LOCK.

The lock situation (thank god this box still has classic RS232 ports):

db{0}> show locks
[Locks tracked through LWPs]
Locks held by an LWP (carp6_wqinput/0):
Lock 0 (initialized at soinit)
lock address : 0xffffa3d6ab3b4080 type     :     sleep/adaptive
initialized  : 0xffffffff80a921b5
shared holds :                  0 exclusive:                  1
shares wanted:                  0 exclusive:                 10
current cpu  :                  0 last held:                  0
current lwp  : 0xffffa3d6ab22b040 last held: 0xffffa3d675e634a0
last locked* : 0xffffffff80b56604 unlocked : 0xffffffff80b5664f
owner field  : 0xffffa3d675e634a0 wait/spin:                1/0

Turnstile chain at 0xffffffff816a0940.
=> Turnstile at 0xffffa3d67ed86f38 (wrq=0xffffa3d67ed86f58, rdq=0xffffa3d67ed86f68).
=> 0 waiting readers:
=> 10 waiting writers: 0xffffa3d6ab22b480 0xffffa3d675de4040 0xffffa3d675818540 0xffffa3d6ab210060 0xffffa3d67586e160 0xffffa3d6a3686b40 0xffffa3d675d986e0 0xffffa3d6a36862c0 0xffffa3d68cf401c0 0xffffa3d6a5616280


[Locks tracked through CPUs]
Locks held on CPU 0:
Lock 0 (initialized at com_attach_subr)
lock address : 0xffffa3d675ccc988 type     :               spin
initialized  : 0xffffffff8063d333
shared holds :                  0 exclusive:                  1
shares wanted:                  0 exclusive:                  0
current cpu  :                  0 last held:                  0
current lwp  : 0xffffa3d6ab22b040 last held: 0xffffa3d6ab22b040
last locked* : 0xffffffff8063c4b3 unlocked : 0xffffffff8063c37f
owner field  : 0x0000000000010800 wait/spin:                0/1

And the stack trace of the kernel lock owner is:

db{0}> bt/a ffffa3d675e634a0
trace: pid 0 lid 82 at 0xffffd40042d66b10
sleepq_block() at netbsd:sleepq_block+0x1b3
cv_wait() at netbsd:cv_wait+0x137
rt_wait_refcnt.isra.7() at netbsd:rt_wait_refcnt.isra.7+0x3c
_rt_free() at netbsd:_rt_free+0x23
rtrequest1() at netbsd:rtrequest1+0x816
rtrequest() at netbsd:rtrequest+0x3e
carp_setroute() at netbsd:carp_setroute+0x36c
carp_master_down() at netbsd:carp_master_down+0x2be
carp_proto_input_c() at netbsd:carp_proto_input_c+0x507
_carp6_proto_input() at netbsd:_carp6_proto_input+0x277
wqinput_work() at netbsd:wqinput_work+0xb2
workqueue_worker() at netbsd:workqueue_worker+0xea
db{0}>


_carp6_proto_input() is the IPv6 input path, and
carp_master_down() processes the DOWN event for a master.
carp_setroute() is AF agnostic: it processes both
address families and thus both configured addresses.
rtrequest() is called from carp_setroute() for IPv4 addresses
in the RTM_ADD case:

In sys/netinet/ip_carp.c:carp_setroute()

                        case RTM_ADD:
                                if (hr_otherif) {
                                        ifa->ifa_rtrequest = NULL;
                                        ifa->ifa_flags &= ~RTF_CONNECTED;

                                        rtrequest(RTM_ADD, ifa->ifa_addr,
                                            ifa->ifa_addr, ifa->ifa_netmask,
                                            RTF_UP | RTF_HOST, NULL);
                                }
                                if (!hr_otherif || nr_ourif || !rt) {
                                        if (nr_ourif &&
                                            (rt->rt_flags & RTF_CONNECTED) == 0)
                                                /* ----> this gets stuck */
                                                rtrequest(RTM_DELETE,
                                                    ifa->ifa_addr,
                                                    ifa->ifa_addr,
                                                    ifa->ifa_netmask, 0, NULL);

                                        ifa->ifa_rtrequest = arp_rtrequest;
                                        ifa->ifa_flags |= RTF_CONNECTED;

                                        if (rtrequest(RTM_ADD, ifa->ifa_addr,
                                            ifa->ifa_addr, ifa->ifa_netmask, 0,
                                            NULL) == 0)
                                                ifa->ifa_flags |= IFA_ROUTE;
                                }

The delete code gets stuck waiting for the reference count to drop to 0, which apparently does not happen in this case. Compiling with NET_MPSAFE does not make a difference either.
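
To make the failure mode concrete, here is a userland analogy of a "wait until the reference count drains" pattern, written with POSIX threads. It is only a sketch: the struct and function names are hypothetical, it is not the kernel's rt_wait_refcnt() code, and it uses a timeout so the demonstration cannot hang, whereas the cv_wait() in the trace above has none. It shows one way such a wait never completes: a reference that is still held and never released. Whether that is what actually happens in carp_setroute() is not established here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a reference-counted route entry. */
struct entry {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    unsigned        refcnt;
};

static void entry_init(struct entry *e, unsigned refs)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cv, NULL);
    e->refcnt = refs;
}

/* Drop one reference and wake any waiter once the count reaches zero. */
static void entry_release(struct entry *e)
{
    pthread_mutex_lock(&e->lock);
    if (e->refcnt > 0 && --e->refcnt == 0)
        pthread_cond_broadcast(&e->cv);
    pthread_mutex_unlock(&e->lock);
}

/* Wait until refcnt is zero or timeout_ms elapses.
 * Returns true if the entry drained, false if a reference is still held. */
static bool entry_wait_drained(struct entry *e, long timeout_ms)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&e->lock);
    int rc = 0;
    while (e->refcnt > 0 && rc != ETIMEDOUT)
        rc = pthread_cond_timedwait(&e->cv, &e->lock, &deadline);
    bool drained = (e->refcnt == 0);
    pthread_mutex_unlock(&e->lock);
    return drained;
}
```

If the caller itself still holds a reference while calling entry_wait_drained(), the wait can only end by timeout. Without a timeout it would block forever, and everything queued behind a lock held by that caller would stall with it.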

Things work when no additional IPv6 addresses are configured.

The routing table state for IPv4 before the switch is:

Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use    Mtu Interface
...
a.b.c.64/30        a.b.c.66           U           -        -      - carp3
...
No interface route, as expected.

Any ideas why we get stuck on waiting for the reference count to drop to 0?

Frank


