Subject: proposed kernel change for IP_HDRINCL
To: None <freebsd-arch@freebsd.org, tech-net@NetBSD.ORG>
From: Kurt J. Lidl <lidl@va.pubnix.com>
List: tech-net
Date: 10/18/1996 16:22:30

I'm trying to coordinate this "fixing the interface" issue with
three groups:

	BSDi
	NetBSD
	FreeBSD

So far, BSDi (which is the system where I first noticed this issue)
has committed to fixing this interface if both the FreeBSD and NetBSD
groups will commit to fixing it also.

One line description:
---------------------

There is a byte-order dependency problem with the IP_HDRINCL option
to setsockopt().

Full Description:
-----------------

Currently, when using setsockopt() with the IP_HDRINCL flag, most
fields of the IP header are passed into the kernel in network byte
order, except for the ip_off and ip_len fields, which are passed in
host byte order.  Any IP checksum that is placed on the packet in
user mode will be incorrect on little-endian machines, because those
two fields are not in network byte order.  Internally, the kernel
carries around mbufs with the ip_off and ip_len fields in host byte
order, and switches those two fields to network order just before
queuing an mbuf for an interface.  To make the kernel's network API
consistent, the ip_off and ip_len fields should be passed in network
byte order, not host byte order.  The kernel should then convert
ip_off and ip_len to host byte order, since that is what the rest of
the internal networking code expects.
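
To illustrate the checksum problem described above, here is a minimal
userland sketch.  It is not taken from any of the kernel trees.  The
helper names in_cksum_bytes(), put16() and hdrincl_cksum_differs() are
made up for this example, and the header is modelled as a bare 20-byte
buffer with ip_len at byte offset 2.  It computes the RFC 1071
checksum twice, once with ip_len stored in host order and once in
network order:

```c
#include <arpa/inet.h>   /* htons */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* RFC 1071 Internet checksum over a header, read in wire (big-endian) order. */
uint16_t
in_cksum_bytes(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(len % 2 == 0);	/* IP header length is a multiple of 4 */
	for (i = 0; i < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	while (sum >> 16)	/* fold the carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Store a 16-bit value exactly as it sits in memory (hypothetical helper). */
void
put16(uint8_t *hdr, size_t off, uint16_t v)
{
	memcpy(hdr + off, &v, sizeof(v));
}

/*
 * Returns nonzero when the checksum over a header with host-order
 * ip_len differs from the checksum over the network-order header.
 */
int
hdrincl_cksum_differs(uint16_t total_len)
{
	uint8_t host_hdr[20] = { 0x45, 0 };	/* version 4, header length 5 */
	uint8_t net_hdr[20] = { 0x45, 0 };

	put16(host_hdr, 2, total_len);		/* current convention */
	put16(net_hdr, 2, htons(total_len));	/* proposed convention */
	return in_cksum_bytes(host_hdr, sizeof(host_hdr)) !=
	    in_cksum_bytes(net_hdr, sizeof(net_hdr));
}
```

On a little-endian machine hdrincl_cksum_differs(40) is nonzero; on a
big-endian machine the two layouts are identical and it returns 0.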

How to fix the problem:
-----------------------

Here's the diff for BSDi's kernel tree, but I expect it will be
similar or identical in both the NetBSD and FreeBSD kernel trees.

*** raw_ip.c    1996/08/28 19:30:23     1.2
--- raw_ip.c    1996/10/07 18:38:06     1.3
***************
*** 153,158 ****
--- 153,161 ----
                opts = inp->inp_options;
        } else {
                ip = mtod(m, struct ip *);
+               /* ip_output expects these in host byte order */
+               NTOHS(ip->ip_len);
+               NTOHS(ip->ip_off);
                if (ip->ip_len > m->m_pkthdr.len) {
                        m_freem(m);
                        return (EMSGSIZE);
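
For readers without a kernel tree at hand, the effect of the patch can
be modelled in userland roughly as follows.  This is only a sketch:
hdrincl_to_host() is a hypothetical name, the header is treated as a
plain byte buffer (ip_len at offset 2, ip_off at offset 6), and
ntohs() stands in for the in-place NTOHS() macro used in the diff.

```c
#include <arpa/inet.h>   /* ntohs */
#include <assert.h>
#include <errno.h>       /* EMSGSIZE */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical userland model of the patched check; not kernel code.
 * hdr points at an IP header whose ip_len and ip_off arrive in
 * network byte order.  Both fields are rewritten in host byte order,
 * then ip_len is compared against the packet length.
 */
int
hdrincl_to_host(uint8_t *hdr, size_t pkt_len)
{
	uint16_t len, off;

	assert(hdr != NULL);
	memcpy(&len, hdr + 2, sizeof(len));
	memcpy(&off, hdr + 6, sizeof(off));
	len = ntohs(len);	/* NTOHS(ip->ip_len) in the diff */
	off = ntohs(off);	/* NTOHS(ip->ip_off) in the diff */
	memcpy(hdr + 2, &len, sizeof(len));
	memcpy(hdr + 6, &off, sizeof(off));
	if (len > pkt_len)
		return EMSGSIZE;
	return 0;
}
```

The length comparison only makes sense after the conversion, which is
why the two NTOHS() lines sit ahead of the existing test.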

As well, the utilities that use the IP_HDRINCL setsockopt() flag need
to be changed.  On BSD/OS (v2.1), these are only traceroute and
mrouted.  I've included patches for those two programs at the
end of this message.

Any other programs that use the IP_HDRINCL flag would also need to
be changed.
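
As a rough guide for other IP_HDRINCL users, a sender written for the
proposed convention might look something like the sketch below.  The
function names open_hdrincl_socket() and fill_ip_header() are
invented for this example, and the _DEFAULT_SOURCE define is an
assumption that only matters for glibc builds in strict ISO C mode.

```c
#define _DEFAULT_SOURCE 1	/* assumption: exposes struct ip on glibc */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Open a raw socket with IP_HDRINCL set; returns -1 on failure. */
int
open_hdrincl_socket(void)
{
	int s, on = 1;

	s = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);	/* needs privilege */
	if (s < 0)
		return -1;
	if (setsockopt(s, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on)) < 0) {
		close(s);
		return -1;
	}
	return s;
}

/* Fill the fixed header fields, with every 16-bit field in network order. */
void
fill_ip_header(struct ip *ip, uint16_t total_len, uint8_t ttl, uint8_t proto)
{
	assert(total_len >= sizeof(*ip));
	memset(ip, 0, sizeof(*ip));
	ip->ip_v = IPVERSION;
	ip->ip_hl = sizeof(*ip) >> 2;
	ip->ip_len = htons(total_len);	/* was: ip->ip_len = total_len */
	ip->ip_off = htons(0);		/* was: ip->ip_off = 0 */
	ip->ip_ttl = ttl;
	ip->ip_p = proto;
}
```

The caller still has to set ip_id, ip_src and ip_dst, which were
already in network order before this change.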

NOTE:
-----
This change will break backwards binary compatibility with programs
that use IP_HDRINCL and don't htons() the ip_len and ip_off fields
on little-endian machines.

Please reply and let me know if you are willing to make this change.

Thanks for your consideration,

-Kurt

Contact information:

Email: lidl@uu.net
       lidl@va.pubnix.com

Kurt J. Lidl
UUNET Technologies
3060 Williams Drive
Fairfax, VA 22031

+1 703 206 5836 (voice)
+1 703 206 5601 (fax)


*** traceroute.c        1996/10/09 14:45:19     1.1
--- traceroute.c        1996/10/09 14:59:00
***************
*** 582,591 ****
        struct udphdr *up = &op->udp;
        int i;
  
!       ip->ip_off = 0;
        ip->ip_hl = sizeof(*ip) >> 2;
        ip->ip_p = IPPROTO_UDP;
!       ip->ip_len = datalen;
        ip->ip_ttl = ttl;
        ip->ip_v = IPVERSION;
        ip->ip_id = htons(ident+seq);
--- 582,591 ----
        struct udphdr *up = &op->udp;
        int i;
  
!       ip->ip_off = htons(0);
        ip->ip_hl = sizeof(*ip) >> 2;
        ip->ip_p = IPPROTO_UDP;
!       ip->ip_len = htons(datalen);
        ip->ip_ttl = ttl;
        ip->ip_v = IPVERSION;
        ip->ip_id = htons(ident+seq);


*** igmp.c	1996/10/18 20:12:38	1.1
--- igmp.c	1996/10/18 20:13:57
***************
*** 56,62 ****
      ip->ip_hl  = sizeof(struct ip) >> 2;
      ip->ip_v   = IPVERSION;
      ip->ip_tos = 0;
!     ip->ip_off = 0;
      ip->ip_p   = IPPROTO_IGMP;
      ip->ip_ttl = MAXTTL;	/* applies to unicasts only */
  
--- 56,62 ----
      ip->ip_hl  = sizeof(struct ip) >> 2;
      ip->ip_v   = IPVERSION;
      ip->ip_tos = 0;
!     ip->ip_off = htons(0);
      ip->ip_p   = IPPROTO_IGMP;
      ip->ip_ttl = MAXTTL;	/* applies to unicasts only */
  
***************
*** 315,321 ****
      ip                      = (struct ip *)send_buf;
      ip->ip_src.s_addr       = src;
      ip->ip_dst.s_addr       = dst;
!     ip->ip_len              = MIN_IP_HEADER_LEN + IGMP_MINLEN + datalen;
  
      igmp                    = (struct igmp *)(send_buf + MIN_IP_HEADER_LEN);
      igmp->igmp_type         = type;
--- 315,321 ----
      ip                      = (struct ip *)send_buf;
      ip->ip_src.s_addr       = src;
      ip->ip_dst.s_addr       = dst;
!     ip->ip_len              = htons(MIN_IP_HEADER_LEN + IGMP_MINLEN + datalen);
  
      igmp                    = (struct igmp *)(send_buf + MIN_IP_HEADER_LEN);
      igmp->igmp_type         = type;