Subject: Re: gsip sends byte-swapped vlan tags
To: None <tech-net@netbsd.org>
From: Pavel Cahyna <pavel.cahyna@st.mff.cuni.cz>
List: tech-net
Date: 01/25/2006 01:45:53
On Tue, Jan 24, 2006 at 10:02:51AM -0800, Jason Thorpe wrote:
> 
> On Jan 24, 2006, at 9:15 AM, Pavel Cahyna wrote:
> 
> >I can confirm that using bswap16 makes it send correct tag from a BE
> >machine (sgimips). To correctly receive tags, it is necessary to change
> >another already existing ntohs to bswap16. Then the tagged VLAN work
> >correctly. This final patch is below.
> 
> Please add some comments explaining why an unconditional swap is being used.

OK. My explanation is (review carefully, I have zero knowledge of this
hardware):

The hardware expects the value in the network byte order. So on a
little-endian machine we need to swap. But the value stored in the Tx
descriptor (which has the VLAN tag embedded in some of its bits) is
converted with htole32, which would spoil the byte ordering on big-endian
machines. So, on big endian machines, we also need to swap the value, and
htole32() will swap it again, resulting in a correct value. This
effectively means that we need to swap always. Same on input. Updated
patch follows. (diff #1.)

BTW I have completely untested diff (also attached) for other drivers
(vge, re)  which seem to have the same bug (swapping with htons/ntohs
values which are also swapped wiht htole32/le32toh). Also I made cosmetic
diffs to if_bge.c and i82557.c to use the common macros.

BTW it_stge, if_ti, if_bge and if_txp don't swap descriptors at all. Is
the VLAN handling in those drivers correct? 

And those lines in if_txp.c look strange:


		if (rxd->rx_stat & htole32(RX_STAT_VLAN)) {
			VLAN_INPUT_TAG(ifp, m, htons(rxd->rx_vlan >> 16),
			    continue);
		}

why htons instead of ntohs? (it does not matter but still, ntohs
looks more correct.)

Pavel

diff#1:


--- if_sip.c.~1.101.~	Sun Feb 27 01:27:33 2005
+++ if_sip.c	Wed Jan 25 00:51:11 2006
@@ -1354,10 +1354,20 @@
 		 * This apparently has to be on the last descriptor of
 		 * the packet.
 		 */
+
+		/*
+		 * Byte swapping is tricky. We need to provide the tag
+		 * in a network byte order. On a big-endian machine,
+		 * the byteorder is correct, but we need to swap it
+		 * anyway, because this will be undone by the outside
+		 * htole32(). That's why there must be an
+		 * unconditional swap instead of htons() inside.
+		 */
 		if ((mtag = VLAN_OUTPUT_TAG(&sc->sc_ethercom, m0)) != NULL) {
 			sc->sc_txdescs[lasttx].sipd_extsts |=
-			    htole32(EXTSTS_VPKT |
-				    (VLAN_TAG_VALUE(mtag) & EXTSTS_VTCI));
+			    htole32(EXTSTS_VPKT | 
+					(bswap16(VLAN_TAG_VALUE(mtag)) &
+					 EXTSTS_VTCI));
 		}
 
 		/*
@@ -1969,8 +1979,19 @@
 		 * If VLANs are enabled, VLAN packets have been unwrapped
 		 * for us.  Associate the tag with the packet.
 		 */
+
+		/*
+		 * Again, byte swapping is tricky. Hardware provided
+		 * the tag in the network byte order, but extsts was
+		 * passed through le32toh() in the meantime. On a
+		 * big-endian machine, we need to swap it again. On a
+		 * little-endian machine, we need to convert from the
+		 * network to host byte order. This means that we must
+		 * swap it in any case, so unconditional swap instead
+		 * of htons() is used.
+		 */
 		if ((extsts & EXTSTS_VPKT) != 0) {
-			VLAN_INPUT_TAG(ifp, m, ntohs(extsts & EXTSTS_VTCI),
+			VLAN_INPUT_TAG(ifp, m, bswap16(extsts & EXTSTS_VTCI),
 			    continue);
 		}
 

diff#2:


Index: ic/i82557.c
===================================================================
RCS file: /home/pavel/cvs/src/sys/dev/ic/i82557.c,v
retrieving revision 1.96
diff -u -u -r1.96 i82557.c
--- ic/i82557.c	24 Dec 2005 20:27:30 -0000	1.96
+++ ic/i82557.c	24 Jan 2006 23:09:02 -0000
@@ -1041,7 +1041,7 @@
 			vtag = VLAN_OUTPUT_TAG(&sc->sc_ethercom, m0);
 			if (vtag) {
 				ipcb->ipcb_vlan_id =
-				    htobe16(*(u_int *)(vtag + 1));
+				    htobe16(VLAN_TAG_VALUE(vtag));
 				ipcb->ipcb_ip_activation_high |=
 				    FXP_IPCB_INSERTVLAN_ENABLE;
 			}
Index: ic/rtl8169.c
===================================================================
RCS file: /home/pavel/cvs/src/sys/dev/ic/rtl8169.c,v
retrieving revision 1.22
diff -u -u -r1.22 rtl8169.c
--- ic/rtl8169.c	11 Dec 2005 12:21:28 -0000	1.22
+++ ic/rtl8169.c	24 Jan 2006 23:08:21 -0000
@@ -1278,7 +1278,7 @@
 
 		if (rxvlan & RTK_RDESC_VLANCTL_TAG) {
 			VLAN_INPUT_TAG(ifp, m,
-			     be16toh(rxvlan & RTK_RDESC_VLANCTL_DATA),
+			     bswap16(rxvlan & RTK_RDESC_VLANCTL_DATA),
 			     continue);
 		}
 #if NBPFILTER > 0
@@ -1642,7 +1642,7 @@
 
 	if ((mtag = VLAN_OUTPUT_TAG(&sc->ethercom, m)) != NULL) {
 		sc->rtk_ldata.rtk_tx_list[startidx].rtk_vlanctl =
-		    htole32(htons(VLAN_TAG_VALUE(mtag)) |
+		    htole32(bswap16(VLAN_TAG_VALUE(mtag)) |
 		    RTK_TDESC_VLANCTL_TAG);
 	}
 
Index: pci/if_bge.c
===================================================================
RCS file: /home/pavel/cvs/src/sys/dev/pci/if_bge.c,v
retrieving revision 1.102
diff -u -u -r1.102 if_bge.c
--- pci/if_bge.c	24 Dec 2005 20:27:42 -0000	1.102
+++ pci/if_bge.c	24 Jan 2006 23:19:30 -0000
@@ -3710,8 +3710,7 @@
 		return(ENOBUFS);
 	}
 
-	mtag = sc->ethercom.ec_nvlans ?
-	    m_tag_find(m_head, PACKET_TAG_VLAN, NULL) : NULL;
+	mtag = VLAN_OUTPUT_TAG(&sc->ethercom, m_head);
 
 
 	/* Iterate over dmap-map fragments. */
Index: pci/if_vge.c
===================================================================
RCS file: /home/pavel/cvs/src/sys/dev/pci/if_vge.c,v
retrieving revision 1.8
diff -u -u -r1.8 if_vge.c
--- pci/if_vge.c	24 Dec 2005 20:27:42 -0000	1.8
+++ pci/if_vge.c	24 Jan 2006 23:08:39 -0000
@@ -1337,7 +1337,7 @@
 
 		if (rxstat & VGE_RDSTS_VTAG)
 			VLAN_INPUT_TAG(ifp, m,
-			    ntohs((rxctl & VGE_RDCTL_VLANID)), continue);
+			    bswap16((rxctl & VGE_RDCTL_VLANID)), continue);
 
 #if NBPFILTER > 0
 		/*
@@ -1668,6 +1668,6 @@
 	mtag = VLAN_OUTPUT_TAG(&sc->sc_ethercom, m_head);
 	if (mtag != NULL)
 		sc->vge_ldata.vge_tx_list[idx].vge_ctl |=
-		    htole32(htons(VLAN_TAG_VALUE(mtag)) | VGE_TDCTL_VTAG);
+		    htole32(bswap16(VLAN_TAG_VALUE(mtag)) | VGE_TDCTL_VTAG);
 
 	sc->vge_ldata.vge_tx_list[idx].vge_sts |= htole32(VGE_TDSTS_OWN);