Subject: kern/32638: ath(4) driver mis-interprets received packets
To: None <kern-bug-people@netbsd.org, gnats-admin@netbsd.org,>
From: None <he@uninett.no>
List: netbsd-bugs
Date: 01/26/2006 11:25:00
>Number:         32638
>Category:       kern
>Synopsis:       ath(4) driver mis-interprets received packets
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Jan 26 11:25:00 +0000 2006
>Originator:     Havard Eidnes
>Release:        NetBSD 3.99.15
>Organization:
	UNINETT AS
>Environment:
System: NetBSD vestlia.uninett.no 3.99.15 NetBSD 3.99.15 (VESTLIA) #5: Fri Jan 13 15:54:20 CET 2006 he@vestlia.uninett.no:/usr/obj/sys/arch/i386/compile/VESTLIA i386
Architecture: i386
Machine: i386
>Description:
	At work we use a non-broadcast SSID and WEP.

	On the first association after a reboot of my laptop,
	the ath(4) driver works fine: it gets an association,
	it gets an IP address via DHCP, and the network driver
	works fine, both for IPv4 and IPv6.

	However, after bringing down the ath0 interface and
	using the laptop on a wired network for a while (the
	exact conditions that trigger the following problem are
	not fully characterized), a subsequent attempt to use
	the wireless network interface fails.

	The observed symptom is as follows:

	 o The network interface manages to associate with
	   a base station on our wireless network; "ifconfig ath0"
	   says "status: active", and e.g.
	   "bssid 00:12:44:b5:b9:a1 chan 108".

	 o Attempts to get an IP address with DHCP fail.

	I have looked at what tcpdump has to say about the traffic
	it sees on the wireless network interface, both when it
	succeeds in getting an IP address and when it fails.

	The success pattern looks (unsurprisingly) like this (yes,
	there is some other "chatter" included here):

11:51:24.012388 00:05:4e:4a:d7:8f > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:05:4e:4a:d7:8f, length: 300
11:51:24.015093 00:09:b7:5c:e3:ff > 00:05:4e:4a:d7:8f, ethertype IPv4 (0x0800), length 62: IP 158.38.60.10 > 158.38.61.15: icmp 28: echo request seq 0
11:51:24.229502 00:0b:5f:f6:70:c6 > 01:80:c2:00:00:00, 802.3, length 52: LLC, dsap STP (0x42), ssap STP (0x42), cmd 0x03, 802.1d config 8000.00:0b:5f:f6:70:c0.8012 root 8000.00:04:c1:c8:d0:c0 pathcost 39 age 4 max 20 hello 2 f delay 15 
11:51:25.013131 00:09:b7:5c:e3:ff > 00:05:4e:4a:d7:8f, ethertype IPv4 (0x0800), length 369: IP 158.38.61.1.67 > 158.38.61.15.68: BOOTP/DHCP, Reply, length: 327
11:51:25.013765 00:05:4e:4a:d7:8f > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:05:4e:4a:d7:8f, length: 300
11:51:25.052781 00:09:b7:5c:e3:ff > 00:05:4e:4a:d7:8f, ethertype IPv4 (0x0800), length 369: IP 158.38.61.1.67 > 158.38.61.15.68: BOOTP/DHCP, Reply, length: 327
11:51:25.062241 00:05:4e:4a:d7:8f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: arp who-has 158.38.61.15 tell 158.38.61.15
11:51:25.208035 00:0b:5f:f6:70:c6 > 01:00:0c:cc:cc:cd, 802.3, length 64: LLC, dsap SNAP (0xaa), ssap SNAP (0xaa), cmd 0x03, sap aa ui/C len=39
11:51:25.229850 00:0b:5f:f6:70:c6 > 01:80:c2:00:00:00, 802.3, length 52: LLC, dsap STP (0x42), ssap STP (0x42), cmd 0x03, 802.1d config 8000.00:0b:5f:f6:70:c0.8012 root 8000.00:04:c1:c8:d0:c0 pathcost 39 age 4 max 20 hello 2 f delay 15 

	The failure pattern looks like this:

11:42:18.017784 00:05:4e:4a:d7:8f > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:05:4e:4a:d7:8f, length: 300
11:42:18.033161 00:09:b7:5c:e3:ff > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 60: arp who-has 158.38.61.15 tell 158.38.61.1
11:42:18.466009 00:05:3c:06:ed:11 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 92: IP 158.38.61.78.137 > 158.38.61.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
11:42:18.466922 00:05:3c:06:ed:11 > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 92: IP 158.38.61.78.137 > 158.38.61.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
11:42:18.647358 00:05:4e:42:08:ec > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 92: IP 158.38.61.150.137 > 158.38.61.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
11:42:18.647434 00:05:4e:42:08:ec > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 92: IP 158.38.61.150.137 > 158.38.61.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
11:42:18.843503 00:0b:5f:f6:70:c6 > 01:80:c2:00:00:00, 802.3, length 52: LLC, dsap STP (0x42), ssap STP (0x42), cmd 0x03, 802.1d config 8000.00:0b:5f:f6:70:c0.8012 root 8000.00:04:c1:c8:d0:c0 pathcost 39 age 4 max 20 hello 2 f delay 15
11:42:19.024169 00:09:b7:5c:e3:ff > 00:05:4e:4a:d7:8f, 802.3, length 377: LLC, dsap Unknown (0xbc), ssap Unknown (0xfa), cmd 0xb1, sap fa > sap bc rr (r=24,C) len=359
11:42:19.242281 00:09:b7:5c:e3:ff > 00:0e:9b:98:ea:8b, 802.3, length 70: LLC, dsap Unknown (0x88), ssap SNAP (0xaa), cmd 0x81, sap aa > sap 88 rr (r=52,P) len=52
...
11:42:22.017545 00:05:4e:4a:d7:8f > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 342: IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:05:4e:4a:d7:8f, length: 300
11:42:22.021209 00:09:b7:5c:e3:ff > 00:05:4e:4a:d7:8f, 802.3, length 377: LLC, dsap Unknown (0xe6), ssap Unknown (0xbf), cmd 0xc2, sap be > sap e6 I (s=97,r=38,R) len=359
11:42:22.025462 00:05:4e:41:b8:de > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 92: IP 158.38.61.149.137 > 158.38.61.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
11:42:22.660603 00:09:b7:5c:e3:ff > 00:05:4e:46:31:7b, 802.3, length 74: LLC, dsap Unknown (0x57), ssap Unknown (0x21), cmd 0x60, sap 20 > sap 57 I (s=48,r=34,R) len=56
11:42:22.775479 00:05:4e:41:b8:de > ff:ff:ff:ff:ff:ff, ethertype IPv4 (0x0800), length 92: IP 158.38.61.149.137 > 158.38.61.255.137: NBT UDP PACKET(137): QUERY; REQUEST; BROADCAST
11:42:22.881988 00:0b:5f:f6:70:c6 > 01:80:c2:00:00:00, 802.3, length 52: LLC, dsap STP (0x42), ssap STP (0x42), cmd 0x03, 802.1d config 8000.00:0b:5f:f6:70:c0.8012 root 8000.00:04:c1:c8:d0:c0 pathcost 39 age 5 max 20 hello 2 f delay 15 

	The IEEE address of ath0 on this laptop is 00:05:4e:4a:d7:8f.

	A couple of observations:

	 o The unicast reply from the DHCP relay agent is
	   mis-interpreted as using 802.3 encapsulation, ref.

11:42:22.021209 00:09:b7:5c:e3:ff > 00:05:4e:4a:d7:8f, 802.3, length 377: LLC, dsap Unknown (0xe6), ssap Unknown (0xbf), cmd 0xc2, sap be > sap e6 I (s=97,r=38,R) len=359

	   versus

11:51:25.013131 00:09:b7:5c:e3:ff > 00:05:4e:4a:d7:8f, ethertype IPv4 (0x0800), length 369: IP 158.38.61.1.67 > 158.38.61.15.68: BOOTP/DHCP, Reply, length: 327

	   in the "working" case.

	 o Apparently not all traffic is mis-interpreted as having
	   802.3 encapsulation.  However, it is evident that most
	   unicast traffic, whether addressed to the laptop under
	   test or to other stations on the network, is
	   mis-interpreted.  Compare e.g.

11:51:33.073095 00:09:b7:5c:e3:ff > 00:0e:9b:98:ea:8b, ethertype IPv4 (0x0800), length 1314: IP 194.63.248.54.587 > 158.38.61.24.3855: . 1610:2870(1260) ack 120 win 5840
11:51:33.082851 00:09:b7:5c:e3:ff > 00:0e:9b:98:ea:8b, ethertype IPv4 (0x0800), length 1155: IP 194.63.248.54.587 > 158.38.61.24.3855: P 2870:3971(1101) ack 120 win 5840
11:51:33.109208 00:09:b7:5c:e3:ff > 00:0e:9b:98:ea:8b, ethertype IPv4 (0x0800), length 113: IP 194.63.248.54.587 > 158.38.61.24.3855: P 3971:4030(59) ack 254 win 5840

	     with

11:42:11.541342 00:09:b7:5c:e3:ff > 00:0e:9b:98:ea:8b, 802.3, length 1322: LLC, dsap Unknown (0xc0), ssap Unknown (0x1d), cmd 0x58, sap 1c > sap c0 I (s=44,r=41,R) len=1304
11:42:11.552995 00:09:b7:5c:e3:ff > 00:0e:9b:98:ea:8b, 802.3, length 1322: LLC, dsap Unknown (0xb3), ssap Unknown (0x96), cmd 0x5e, sap 96 > sap b3 I (s=47,r=118,P) len=1304
11:42:11.553836 00:09:b7:5c:e3:ff > 00:0e:9b:98:ea:8b, 802.3, length 1322: LLC, dsap Unknown (0x88), ssap Unknown (0x5a), cmd 0x76, sap 5a > sap 88 I (s=59,r=119,P) len=1304

	 o When the interface is mis-interpreting received packets,
	   I also see occasional occurrences of

ath0: discarding oversize frame (len=1522)

	   lines logged on the console.

	 o A reboot and re-configuration of ath0 with the exact same
	   ssid and wep key as in the "doesn't work" case brings the
	   network interface back into working order.

	   ...and all this time I had believed that "reboot to fix"
	   was a joy reserved for the users of products from Redmond,
	   Washington, USA...  Sigh!
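
	To make the "802.3" versus "ethertype" distinction in the
	traces above concrete, here is a minimal sketch in Python.
	It is a simplified model and not the actual ath(4) or 802.11
	infrastructure code; decap() and describe() are names made
	up for this illustration.  It assumes that the receive path
	strips an RFC 1042 LLC/SNAP header when it finds one and
	otherwise stores the payload length in the Ethernet
	type/length field, and that values up to 1500 in that field
	are printed as "802.3".  The 355-byte packet size is derived
	from the 369-byte frame in the working trace minus the
	14-byte Ethernet header.

```python
import struct

ETHERMTU = 1500  # type/length values up to this are treated as 802.3 lengths
SNAP_HDR = bytes.fromhex("aaaa03000000")  # DSAP, SSAP, UI control, OUI 00:00:00


def mac(text):
    """Convert 'aa:bb:cc:dd:ee:ff' to 6 raw bytes."""
    return bytes.fromhex(text.replace(":", ""))


def decap(dst, src, payload):
    """Hypothetical model of 802.11 -> Ethernet decapsulation.

    If the payload starts with an LLC/SNAP header, strip it and use the
    encapsulated EtherType.  Otherwise keep the payload untouched and put
    its length in the type/length field (802.3 style).
    """
    if len(payload) >= 8 and payload[:6] == SNAP_HDR:
        (ethertype,) = struct.unpack("!H", payload[6:8])
        return dst + src + struct.pack("!H", ethertype) + payload[8:]
    return dst + src + struct.pack("!H", len(payload)) + payload


def describe(frame):
    """Label a frame by its type/length field, as in the traces above."""
    (type_or_len,) = struct.unpack("!H", frame[12:14])
    if type_or_len <= ETHERMTU:
        return "802.3, length %d" % len(frame)
    return "ethertype 0x%04x, length %d" % (type_or_len, len(frame))


ap = mac("00:09:b7:5c:e3:ff")   # sender of the DHCP reply in the traces
sta = mac("00:05:4e:4a:d7:8f")  # ath0 on the laptop

ip_packet = bytes(355)  # stand-in for the DHCP reply (369 - 14 bytes)

# Intact LLC/SNAP header followed by EtherType 0x0800 (IPv4).
good = SNAP_HDR + b"\x08\x00" + ip_packet

# Same number of bytes, but the first 8 are not a valid SNAP header.
# 0xe6/0xbf are the DSAP/SSAP printed in the failing trace; the rest
# is filler.
bad = bytes([0xE6, 0xBF]) + bytes(361)

print(describe(decap(sta, ap, good)))
print(describe(decap(sta, ap, bad)))
```

	Under these assumptions the first frame is labelled as
	ethertype 0x0800 with length 369, and the second as 802.3
	with length 377.  The 8-byte difference equals the size of
	the LLC/SNAP header and matches the two DHCP replies quoted
	above.  This is only consistent with the header bytes being
	wrong by the time they are examined; it says nothing about
	why they would be wrong.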

>How-To-Repeat:
	Try to repeat the usage pattern as described above.
	Watch received packets being mis-interpreted by either the
	driver or the 802.11 infrastructure code on second-or-third
	attempt to use the ath(4) interface.
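
	To spot the failure quickly when repeating this, a small
	filter over the tcpdump text output can help.  The sketch
	below is only an aid written for this report, not an
	existing tool; looks_misinterpreted() and count_suspect()
	are hypothetical names.  It assumes output in the format
	shown above (link-level headers printed, as with
	"tcpdump -e"), and it flags frames that are printed as 802.3
	with an unknown DSAP.  Legitimate 802.3 traffic such as the
	STP frames is not flagged.

```python
import re

# One line of tcpdump output with link-level headers, as in the traces above.
LINE_RE = re.compile(
    r"^(?P<time>\S+) (?P<src>[0-9a-f:]{17}) > (?P<dst>[0-9a-f:]{17}), "
    r"(?P<encap>802\.3|ethertype \S+ \(0x[0-9a-f]{4}\)), "
    r"length (?P<len>\d+): (?P<rest>.*)$"
)


def parse(line):
    """Return the fields of one trace line as a dict, or None if no match."""
    m = LINE_RE.match(line.strip())
    return m.groupdict() if m else None


def looks_misinterpreted(line):
    """True for frames printed as 802.3 whose DSAP tcpdump does not know."""
    rec = parse(line)
    return bool(rec and rec["encap"] == "802.3"
                and "dsap Unknown" in rec["rest"])


def count_suspect(lines):
    """Count the suspicious frames in an iterable of trace lines."""
    return sum(1 for line in lines if looks_misinterpreted(line))


# Three lines taken from the traces in this report.
failing = ("11:42:22.021209 00:09:b7:5c:e3:ff > 00:05:4e:4a:d7:8f, 802.3, "
           "length 377: LLC, dsap Unknown (0xe6), ssap Unknown (0xbf), "
           "cmd 0xc2, sap be > sap e6 I (s=97,r=38,R) len=359")
working = ("11:51:25.013131 00:09:b7:5c:e3:ff > 00:05:4e:4a:d7:8f, "
           "ethertype IPv4 (0x0800), length 369: IP 158.38.61.1.67 > "
           "158.38.61.15.68: BOOTP/DHCP, Reply, length: 327")
stp = ("11:42:22.881988 00:0b:5f:f6:70:c6 > 01:80:c2:00:00:00, 802.3, "
       "length 52: LLC, dsap STP (0x42), ssap STP (0x42), cmd 0x03, "
       "802.1d config 8000.00:0b:5f:f6:70:c0.8012 root "
       "8000.00:04:c1:c8:d0:c0 pathcost 39 age 5 max 20 hello 2 f delay 15")

print(count_suspect([failing, working, stp]))
```

	Of the three sample lines only the first is counted.  A
	non-zero count on a capture from ath0 suggests the interface
	is in the failing state described above.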

>Fix:
	Sorry, I do not know.
	I am receptive to hints on how to diagnose this problem more closely.
	As a wild guess, this smells of an un-initialized variable
	somewhere, but what do I know; the bug may be significantly more
	complex...