Subject: kern/25416: parent interface packets being sent out via vlanif (-> network outage)
To: None <gnats-bugs@gnats.NetBSD.org>
From: Frank Kardel <kardel@pip.acrys.com>
List: netbsd-bugs
Date: 05/01/2004 10:22:11
>Number:         25416
>Category:       kern
>Synopsis:       parent interface packets incorrectly being sent out via vlanif (-> network outage)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat May 01 08:23:00 UTC 2004
>Closed-Date:
>Last-Modified:
>Originator:     Frank Kardel
>Release:        NetBSD 2.0C
>Organization:
	
>Environment:
	
	
System: NetBSD Orcus 2.0C NetBSD 2.0C (ORCUS32) #0: Fri Apr 30 16:16:37 CEST 2004  kardel@Orcus:/usr/src/sys/arch/i386/compile/ORCUS32 i386
Architecture: i386
Machine: i386
>Description:
	Packets sent to an interface that has a vlan interface attached to it are replied to via the vlan interface. This leads to
	802.1Q encapsulation of the response packets, which are then sent to the wrong VLAN.
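	For illustration, here is a minimal sketch of what 802.1Q encapsulation does to a frame. The tag layout (TPID 0x8100
	followed by a 16-bit TCI holding priority, DEI and a 12-bit VLAN ID) is the standard one; the function names are
	hypothetical and this is not the NetBSD vlan(4) code.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q-tagged frame


def vlan_tag(vid, pcp=0, dei=0):
    """Return the 4-byte 802.1Q tag for the given VLAN ID (hypothetical helper)."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)


def encapsulate(frame, vid):
    """Insert the tag after the destination and source MAC addresses (12 bytes)."""
    return frame[:12] + vlan_tag(vid) + frame[12:]


def vlan_of(frame):
    """Return the VLAN ID of a tagged frame, or None for a plain frame."""
    if len(frame) < 16:
        return None
    tpid, tci = struct.unpack("!HH", frame[12:16])
    return tci & 0x0FFF if tpid == TPID_8021Q else None
```

	A reply tagged this way with VLAN ID 203 is handled by the switch as VLAN 203 traffic, not as traffic for the
	port's default VLAN 1.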
>How-To-Repeat:
	Switch with a port accepting/transmitting plain and 802.1Q frames. Default VLANID for this port (plain packets) is 1.
	NetBSD machine with an ex interface and the following configuration:
	Orcus: {8} ifconfig ex0
	ex0: flags=8863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST> mtu 1500
		capabilities=7<IP4CSUM,TCP4CSUM,UDP4CSUM>
		enabled=7<IP4CSUM,TCP4CSUM,UDP4CSUM>
		address: 00:0a:5e:06:2c:62
		media: Ethernet autoselect (100baseTX full-duplex)
		status: active
		inet 10.0.2.14 netmask 0xffffff00 broadcast 10.0.2.255
		inet6 fe80::20a:5eff:fe06:2c62%ex0 prefixlen 64 scopeid 0x1
	Orcus: {9} ifconfig vlan1
	vlan1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
		vlan: 203 parent: ex0
		address: 00:0a:5e:06:2c:62
		inet 10.1.203.10 netmask 0xffffff00 broadcast 10.1.203.255
		inet6 fe80::20a:5eff:fe06:2c62%vlan1 prefixlen 64 scopeid 0xb
	Frames from the switch via VLANID 1 arrive plain on ex0.
	Frames from the switch via VLANID 203 arrive as 802.1Q frames on ex0 and are passed on to the vlan1 interface.
	This works fine.
	Now to the problem:
	A TCP connection setup on VLANID 1 goes wrong. The SYN packet arrives correctly on ex0, but the SYN/ACK is sent as
	an 802.1Q frame via the vlan1 interface. According to the ifconfig output for vlan1, I wouldn't expect a packet matching
	the interface configuration of ex0 to be transmitted via the vlan1 interface.
	See the following trace:

	tcpdump -s 512 -v -n -i ex0 host 10.0.3.9 or \( vlan and host 10.0.3.9 \)
	tcpdump: listening on ex0
	09:51:59.870167 10.0.3.9.65520 > 10.0.2.14.22: P [tcp sum ok] 3427045988:3427046036(48) ack 3769364932 win 33580 <nop,nop,timestamp 135 17> (DF) [tos 0x10]  (ttl 62, id 23113, len 100)
	09:51:59.870360 802.1Q vlan#203 P0 10.0.2.14.22 > 10.0.3.9.65520: P [tcp sum ok] 1:49(48) ack 48 win 33580 <nop,nop,timestamp 136 135> (DF) [tos 0x10]  (ttl 64, id 421, len 100)
	09:51:59.870395 802.1Q vlan#203 P0 10.0.2.14.22 > 10.0.3.9.65520: P [tcp sum ok] 49:97(48) ack 48 win 33580 <nop,nop,timestamp 136 135> (DF) [tos 0x10]  (ttl 64, id 16033, len 100)
	09:51:59.978784 10.0.3.9.65520 > 10.0.2.14.22: . [tcp sum ok] ack 97 win 33532 <nop,nop,timestamp 136 136> (DF) [tos 0x10]  (ttl 62, id 23115, len 52)
	09:53:16.686514 10.0.3.9.65520 > 10.0.2.14.22: P [tcp sum ok] 48:96(48) ack 97 win 33580 <nop,nop,timestamp 289 136> (DF) [tos 0x10]  (ttl 62, id 23314, len 100)
	09:53:16.686659 802.1Q vlan#203 P0 10.0.2.14.22 > 10.0.3.9.65520: P [tcp sum ok] 97:145(48) ack 96 win 33580 <nop,nop,timestamp 289 289> (DF) [tos 0x10]  (ttl 64, id 43384, len 100)
	09:53:16.982932 10.0.3.9.65520 > 10.0.2.14.22: . [tcp sum ok] ack 145 win 33580 <nop,nop,timestamp 290 289> (DF) [tos 0x10]  (ttl 62, id 23315, len 52)
	^C

	tcpdump shows that the reply packets actually pass through the vlan1 interface.

	Orcus: {12} tcpdump -s 512 -v -n -i vlan1 host 10.0.3.9 
	tcpdump: listening on vlan1
	10:12:12.510453 10.0.2.14.22 > 10.0.3.9.65520: P [tcp sum ok] 3769365124:3769365172(48) ack 3427046132 win 33580 <nop,nop,timestamp 2561 2560> (DF) [tos 0x10]  (ttl 64, id 50722, len 100)
	10:12:14.506415 10.0.2.14.22 > 10.0.3.9.65520: P [tcp sum ok] 4294967248:48(96) ack 1 win 33580 <nop,nop,timestamp 2565 2560> (DF) [tos 0x10]  (ttl 64, id 28473, len 148)
	10:12:18.051561 10.0.2.14.22 > 10.0.3.9.65520: P [tcp sum ok] 48:96(48) ack 49 win 33580 <nop,nop,timestamp 2572 2571> (DF) [tos 0x10]  (ttl 64, id 24405, len 100)
	10:12:18.051599 10.0.2.14.22 > 10.0.3.9.65520: P [tcp sum ok] 96:144(48) ack 49 win 33580 <nop,nop,timestamp 2572 2571> (DF) [tos 0x10]  (ttl 64, id 20297, len 100)
	10:12:18.234233 10.0.2.14.22 > 10.0.3.9.65520: P [tcp sum ok] 144:192(48) ack 97 win 33580 <nop,nop,timestamp 2572 2572> (DF) [tos 0x10]  (ttl 64, id 39200, len 100)
	10:12:18.234267 10.0.2.14.22 > 10.0.3.9.65520: P [tcp sum ok] 192:240(48) ack 97 win 33580 <nop,nop,timestamp 2572 2572> (DF) [tos 0x10]  (ttl 64, id 52004, len 100)

	Because the packets are sent via the wrong interface, they seldom reach the intended interface and cause some head scratching.

>Fix:
	Check the outgoing interface selection code with respect to vlan interfaces?
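	As a starting point for that check, here is a minimal sketch that assumes outgoing interface selection is a
	longest-prefix match on the destination address. The network table is taken from the ifconfig output above; the
	function name is hypothetical and this is not the NetBSD routing code.

```python
import ipaddress

# Directly connected networks, taken from the ifconfig output in this report.
CONNECTED = {
    "ex0": ipaddress.ip_network("10.0.2.0/24"),
    "vlan1": ipaddress.ip_network("10.1.203.0/24"),
}


def connected_interface(dst):
    """Return the interface whose connected network contains dst, or None.

    Hypothetical helper: picks the longest matching prefix among the
    directly connected networks only; gateway routes are not modelled.
    """
    addr = ipaddress.ip_address(dst)
    best = None
    for ifname, net in CONNECTED.items():
        if addr in net and (best is None or net.prefixlen > best[1].prefixlen):
            best = (ifname, net)
    return best[0] if best else None
```

	Under this assumption the peer 10.0.3.9 lies on neither connected network, so the interface used for the replies
	depends on a gateway route that is not shown in this report. Comparing the output of "route -n get 10.0.3.9" with
	the trace may help narrow this down.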
>Release-Note:
>Audit-Trail:
>Unformatted: