Subject: kern/5560: ppp with vj compression + ipflow == smashed packets
To: None <gnats-bugs@gnats.netbsd.org>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: netbsd-bugs
Date: 06/09/1998 23:28:49
>Number:         5560
>Category:       kern
>Synopsis:       ppp with vj compression + ipflow == smashed packets
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jun  9 16:35:02 1998
>Last-Modified:
>Originator:     Bill Sommerfeld
>Organization:
	
>Release:        19980609
>Environment:
	
System: NetBSD orchard.arlington.ma.us 1.3F NetBSD 1.3F (ORCHARDII) #19: Tue Jun 9 15:48:49 EDT 1998 sommerfeld@orchard.arlington.ma.us:/d3/NetBSD-current/src/sys/arch/i386/compile/ORCHARDII i386


>Description:

I noticed poor tcp performance.  After some digging, I discovered
that it appeared to be due to packets being improperly reconstituted
by vj compression.

(I now have hacks to tcpdump to verify the tcp checksum and print out
any discrepancies found.)

On one side, I see:

19:06:47.614887 praxis.epilogue.com.1022 > 128.224.138.130.ssh: . ack 20116 win 4096 (ttl 59, id 54023)
                         4500 0028 d307 0000 3b06 1ed9 80e0 01ad
                         80e0 8a82 03fe 0016 3abd 1e50 5306 b548
                         5010 1000 ac74 0000 0000 0006 0000

while on the other side, I see:

19:06:47.621174 praxis.epilogue.com.1022 > 128.224.138.130.ssh: [bad tcp cksum f3ff!] . 592:598(6) ack 20116 win 4096 (ttl 59, id 54023)
                         4500 002e d307 0000 3b06 1ed3 80e0 01ad
                         80e0 8a82 03fe 0016 3abd 1e50 5306 b548
                         5010 1000 ac74 0000 0000 0006 0000

The fields which changed:
	-> ip header checksum changed from 0x1ed9 to 0x1ed3 (duh)
	-> ip length went from 0x28 to 0x2e, i.e. 40 to 46 bytes (!)
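
As a cross-check on the two dumps above, here is a minimal sketch that recomputes the checksums from the bytes shown. The array and function names (pkt_sent, pkt_rcvd, ip_hdr_residual, tcp_residual) are made up for illustration and are not taken from tcpdump or the kernel. A residual of 0 means the stored checksum agrees with the data.

```c
#include <stddef.h>
#include <stdint.h>

/* First 40 bytes (IP + TCP headers) of the packet as seen on the first side. */
const uint8_t pkt_sent[] = {
    0x45,0x00,0x00,0x28,0xd3,0x07,0x00,0x00,0x3b,0x06,0x1e,0xd9,0x80,0xe0,0x01,0xad,
    0x80,0xe0,0x8a,0x82,0x03,0xfe,0x00,0x16,0x3a,0xbd,0x1e,0x50,0x53,0x06,0xb5,0x48,
    0x50,0x10,0x10,0x00,0xac,0x74,0x00,0x00
};

/* All 46 bytes of the packet as seen on the other side. */
const uint8_t pkt_rcvd[] = {
    0x45,0x00,0x00,0x2e,0xd3,0x07,0x00,0x00,0x3b,0x06,0x1e,0xd3,0x80,0xe0,0x01,0xad,
    0x80,0xe0,0x8a,0x82,0x03,0xfe,0x00,0x16,0x3a,0xbd,0x1e,0x50,0x53,0x06,0xb5,0x48,
    0x50,0x10,0x10,0x00,0xac,0x74,0x00,0x00,0x00,0x00,0x00,0x06,0x00,0x00
};

/* Add len bytes (assumed even in this sketch) as big-endian 16-bit words. */
uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum)
{
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    return sum;
}

/* Fold the carries back in to get a 16-bit one's-complement sum. */
uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Residual over the IP header only. */
uint16_t ip_hdr_residual(const uint8_t *pkt)
{
    size_t hlen = (size_t)(pkt[0] & 0x0f) * 4;
    return (uint16_t)~fold(sum16(pkt, hlen, 0));
}

/* Residual over pseudo-header plus TCP segment; the segment length is
 * derived from the ip length field, as a receiver would do. */
uint16_t tcp_residual(const uint8_t *pkt)
{
    size_t hlen = (size_t)(pkt[0] & 0x0f) * 4;
    size_t iplen = ((size_t)pkt[2] << 8) | pkt[3];
    size_t tcplen = iplen - hlen;
    uint32_t sum = sum16(pkt + 12, 8, 0);   /* source and destination address */
    sum += 6;                               /* protocol number for TCP */
    sum += (uint32_t)tcplen;                /* TCP length from the pseudo-header */
    sum = sum16(pkt + hlen, tcplen, sum);
    return (uint16_t)~fold(sum);
}
```

With these bytes, both IP headers give a residual of 0, so the header checksum was recomputed consistently for the new length. The TCP residual is 0 for the 40-byte packet and 0xfff3 for the 46-byte one. That looks like the f3ff reported above with the bytes swapped, though reading it as a host-order print on a little-endian machine is a guess.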

--

This stopped happening when I did either of the following:
	a) turn off vj compression (`novj' in .ppprc), or
	b) turn off flow-based routing (sysctl -w net.inet.ip.maxflows=0)

The reconstituted ip length of 0x2e (46 bytes) is perhaps not
coincidentally just 18 bytes short of a minimum-sized ethernet packet
of 64 bytes (18 bytes being 2 ethernet addresses, an ether type, and a
4-byte CRC); my *guess* is that something along the way isn't trimming
gunk off the end of an mbuf.
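
The arithmetic behind that guess, as a small sketch. The constant and function names here are hypothetical and are not the macros from the kernel headers; the sizes are the standard ethernet ones.

```c
#include <stddef.h>

enum {
    ETH_ADDR_BYTES = 6,    /* one ethernet address */
    ETH_TYPE_BYTES = 2,    /* ether type */
    ETH_CRC_BYTES  = 4,    /* trailing CRC */
    ETH_MIN_FRAME  = 64    /* minimum frame size, CRC included */
};

/* Bytes of framing around the payload: two addresses, type, CRC. */
size_t eth_overhead(void)
{
    return 2 * ETH_ADDR_BYTES + ETH_TYPE_BYTES + ETH_CRC_BYTES;
}

/* Smallest payload that fills a minimum-sized frame. */
size_t eth_min_payload(void)
{
    return ETH_MIN_FRAME - eth_overhead();
}

/* Pad bytes appended after an IP packet of ip_len bytes. */
size_t eth_pad_bytes(size_t ip_len)
{
    size_t min = eth_min_payload();
    return ip_len < min ? min - ip_len : 0;
}
```

For the 40-byte packet above, eth_pad_bytes(40) is 6, which matches the 6 extra bytes that show up as tcp data on the far side.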

In particular, the following code in ip_input is bypassed by the
fast-forward case:

	len = ip->ip_len;

	/*
	 * ...
	 * Trim mbufs if longer than we expect.
	 * ...
	 */
	...
	if (m->m_pkthdr.len > len) {
		if (m->m_len == m->m_pkthdr.len) {
			m->m_len = len;
			m->m_pkthdr.len = len;
		} else
			m_adj(m, len - m->m_pkthdr.len);
	}

Hmmm....

>How-To-Repeat:
	Set up a ppp link on a -current system built with options GATEWAY.
	Make sure vj compression is turned on.
	Watch tcp suck badly.

>Fix:
	Add code akin to the above (trimming the mbuf to ip_len) to the
	fast-forward path in ip_flow.c.
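
A userland sketch of the trimming logic, to show what the check would need to do. This is not kernel code and not a tested patch: struct toy_mbuf and toy_trim_to_ip_len are hypothetical stand-ins, and only the single-buffer case is modelled. The kernel would use m_adj for a chain, as in the ip_input fragment quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a single mbuf carrying a packet header. */
struct toy_mbuf {
    size_t  m_len;        /* bytes held in this buffer */
    size_t  pkthdr_len;   /* total bytes in the packet */
    uint8_t data[64];     /* packet contents, IP header first */
};

/*
 * Trim the packet down to the length in the IP header.
 * Returns the number of bytes trimmed, or -1 if the packet is shorter
 * than ip_len or spans more than one buffer (not modelled here).
 */
int toy_trim_to_ip_len(struct toy_mbuf *m)
{
    /* ip_len is the big-endian 16-bit field at offset 2. */
    size_t len = ((size_t)m->data[2] << 8) | m->data[3];

    if (m->pkthdr_len < len)
        return -1;                    /* truncated packet */
    if (m->pkthdr_len == len)
        return 0;                     /* nothing to trim */
    if (m->m_len != m->pkthdr_len)
        return -1;                    /* chain: kernel would call m_adj */

    int trimmed = (int)(m->pkthdr_len - len);
    m->m_len = len;
    m->pkthdr_len = len;
    return trimmed;
}
```

Fed a 46-byte buffer whose header says 0x28, this trims 6 bytes and leaves a 40-byte packet, which is what the ppp side needs to see before vj compression gets at it.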


>Audit-Trail:
>Unformatted: