Subject: Re: IP_HDRINCL send on little-endian machine causes kernel panic
To: Charles M. Hannum <mycroft@MIT.EDU>
From: Jonathan Stone <jonathan@dsg.stanford.edu>
List: tech-kern
Date: 07/27/1996 16:58:59
 mycroft@mit.edu (Charles M. Hannum) writes:

>> If that's the intended user/kernel interface, [...]

>Certainly not.  It should accept the fields in network byte order.


I agree.  But since raw sockets are supposed to offer a close
simulacrum of an in-kernel environment, and the in-kernel environment
was changed by CSRG, it's possible someone from CSRG might have
deliberately chosen the current interface.


>It
>also seems like the ip_output() interface should be changed to use
>network order for consistency.

Consistency with *what*?

The 4.4bsd networking stack deliberately keeps the ip_len field
in host byteorder above ip_output() and ipintr(). ip_intr() byteswaps
the ip_len and ip_off early on; ip_output() similarly byteswaps them.
IIRC, this was a deliberate change made for efficiency and clarity
reasons. (it makes comparison with the received-packet length in a
pkthdr marginally simpler.)  I no longer have access to earlier 4bsd
release tapes, so I can't say for sure, but I beleive it was done
approximately around Reno.

Are you simply unaware of this (deliberate) change, or do you disagree
with the change?

>It's an IP header.  The header format is well-defined.

You don't seem to understand that the in-kernel ADT _does_not_use
an IP header. It uses a header with certain fields byteswapped.

I agree that the user-level raw-socket interface should specify
network-order IP headers.  That's not what the current implementation
does.  It seems not-implausible to argue, for instance, that the raw
socket abstraction exists in part to aid user-level prototyping of
protocols, before they are implemented inside the kernel.  For such
cases, faithfully exporting the in-kernel interface, including the
host-endian-ness of ip_len and ip_off, isn't totally indefensible.


>> and (b)
>> shouldn't panic if the user injects a bad packet with a "bad" IP
>> length.

>That's a bug.

Here's the fix I've been using.   EMSGSIZE seems to fit semantically,
thought the text string associated with it says "message too long"...


*** raw_ip.c.DIST	Sun May 26 04:42:43 1996
--- raw_ip.c	Mon Jul 22 19:58:33 1996
***************
*** 186,191 ****
--- 186,207 ----
  		if (ip->ip_id == 0)
  			ip->ip_id = htons(ip_id++);
  		opts = NULL;
+ #ifdef IPHDRINCL_IN_NETWORK_BYTE_ORDER
+ 		/* XXX ip_output assumes ip_len and ip_off in host byteorder */
+ 		HTONS(ip->ip_len);
+ 		HTONS(ip->ip_off);
+ #endif
+ 		if (m->m_pkthdr.len < ip->ip_len ||
+ 		    (ip->ip_hl << 2) > ip->ip_len) {
+ #if 1 /* DIAGNOSTIC */
+ 		      printf(
+ 			     "rip_output: ip_len %d, pkthdrlen %d, ip_hl %d\n",
+ 			     ip->ip_len, m->m_pkthdr.len, ip->ip_hl);
+ #endif
+ 		      m_freem(m);
+ 			return(EMSGSIZE);
+ 		}
+ 
  		/* XXX prevent ip_output from overwriting header fields */
  		flags |= IP_RAWOUTPUT;
  		ipstat.ips_rawout++;