Subject: proposed kernel change for IP_HDRINCL
To: None <freebsd-arch@freebsd.org, tech-net@NetBSD.ORG>
From: Kurt J. Lidl <lidl@va.pubnix.com>
List: tech-net
Date: 10/18/1996 16:22:30
I'm trying to coordinate this "fixing the interface" issue with
three groups:
BSDi
NetBSD
OpenBSD
So far, BSDi (which is the system where I first noticed this issue)
has committed to fixing this interface if both the FreeBSD and NetBSD
groups will commit to fixing it also.
One line description:
---------------------
There is a byte order dependency problem with the IP_HDRINCL option
to setsockopt()
Full Description:
-----------------
Currently, when using setsockopt() with the IP_HDRINCL flag, most
fields of the IP header are passed into the kernel in network byte
order, except for the ip_off and ip_len fields. Any IP checksum
that is placed on the packet in user-mode will be incorrect on
little-endian machines, because those two fields are not in network
byte order. Internally, the kernel carries around mbufs with the
ip_off and ip_len fields host byte order. The kernel then switches
those two fields to network order just before queuing a mbuf for
an interface. To make the kernel's network API consistant, the
ip_off and ip_len fields should be changed to be passed in network
byte order, not host byte order. The kernel should then change
the fields to the host byte order for ip_off and ip_len, since that
is what the rest of the internal networking code are expecting.
How to fix the problem:
-----------------------
Here's the diff for BSDi's kernel tree, but I expect it will be
similar or identical in both the NetBSD and FreeBSD kernel trees.
*** raw_ip.c 1996/08/28 19:30:23 1.2
--- raw_ip.c 1996/10/07 18:38:06 1.3
***************
*** 153,158 ****
--- 153,161 ----
opts = inp->inp_options;
} else {
ip = mtod(m, struct ip *);
+ /* ip_output expects these in host byte order */
+ NTOHS(ip->ip_len);
+ NTOHS(ip->ip_off);
if (ip->ip_len > m->m_pkthdr.len) {
m_freem(m);
return (EMSGSIZE);
As well, the utilites that use IP_HDRINCL setsockopt() flag need
to be changed. On BSD/OS (v2.1), this is only traceroute and
mrouted. I've included patches for those two programs at the
end of this message.
Any other programs that use the IP_HDRINCL flag would also need to
be changed.
NOTE:
-----
This change will break backwards binary compatibility with programs
that use IP_HDRINCL and don't ntohs() the arguments for little
endian machines.
Please reply and let me know if you are willing to make this change.
Thanks for your consideration,
-Kurt
Contact information:
Email: lidl@uu.net
lidl@va.pubnix.com
Kurt J. Lidl
UUNET Technologies
3060 Williams Drive
Fairfax, VA 20031
+1 703 206 5836 (voice)
+1 703 206 5601 (fax)
*** traceroute.c 1996/10/09 14:45:19 1.1
--- traceroute.c 1996/10/09 14:59:00
***************
*** 582,591 ****
struct udphdr *up = &op->udp;
int i;
! ip->ip_off = 0;
ip->ip_hl = sizeof(*ip) >> 2;
ip->ip_p = IPPROTO_UDP;
! ip->ip_len = datalen;
ip->ip_ttl = ttl;
ip->ip_v = IPVERSION;
ip->ip_id = htons(ident+seq);
--- 582,591 ----
struct udphdr *up = &op->udp;
int i;
! ip->ip_off = htons(0);
ip->ip_hl = sizeof(*ip) >> 2;
ip->ip_p = IPPROTO_UDP;
! ip->ip_len = htons(datalen);
ip->ip_ttl = ttl;
ip->ip_v = IPVERSION;
ip->ip_id = htons(ident+seq);
*** igmp.c 1996/10/18 20:12:38 1.1
--- igmp.c 1996/10/18 20:13:57
***************
*** 56,62 ****
ip->ip_hl = sizeof(struct ip) >> 2;
ip->ip_v = IPVERSION;
ip->ip_tos = 0;
! ip->ip_off = 0;
ip->ip_p = IPPROTO_IGMP;
ip->ip_ttl = MAXTTL; /* applies to unicasts only */
--- 56,62 ----
ip->ip_hl = sizeof(struct ip) >> 2;
ip->ip_v = IPVERSION;
ip->ip_tos = 0;
! ip->ip_off = htons(0);
ip->ip_p = IPPROTO_IGMP;
ip->ip_ttl = MAXTTL; /* applies to unicasts only */
***************
*** 315,321 ****
ip = (struct ip *)send_buf;
ip->ip_src.s_addr = src;
ip->ip_dst.s_addr = dst;
! ip->ip_len = MIN_IP_HEADER_LEN + IGMP_MINLEN + datalen;
igmp = (struct igmp *)(send_buf + MIN_IP_HEADER_LEN);
igmp->igmp_type = type;
--- 315,321 ----
ip = (struct ip *)send_buf;
ip->ip_src.s_addr = src;
ip->ip_dst.s_addr = dst;
! ip->ip_len = htons(MIN_IP_HEADER_LEN + IGMP_MINLEN + datalen);
igmp = (struct igmp *)(send_buf + MIN_IP_HEADER_LEN);
igmp->igmp_type = type;