Subject: kern/13131: there's no way for raw IPv4 socket applications to use PMTUD
To: None <gnats-bugs@gnats.netbsd.org>
From: None <itojun@itojun.rog>
List: netbsd-bugs
Date: 06/07/2001 12:12:30
>Number:         13131
>Category:       kern
>Synopsis:       there's no way for raw IPv4 socket applications to use PMTUD
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jun 06 20:11:01 PDT 2001
>Closed-Date:
>Last-Modified:
>Originator:     Jun-ichiro itojun Hagino
>Release:        1.5W
>Organization:
	itojun.org
>Environment:
System: NetBSD starfruit.itojun.org 1.5W NetBSD 1.5W (STARFRUIT) #497: Tue Jun 5 10:52:25 JST 2001 itojun@starfruit.itojun.org:/usr/home/itojun/NetBSD/src/sys/arch/i386/compile/STARFRUIT i386
Architecture: i386
Machine: i386
>Description:
	we have new path MTU discovery validation code in the TCP layer.  the
	ICMPv4 layer does not act on ICMPv4 need fragment messages by default;
	L4 code (foo_ctlinput) has to tell the ICMPv4 layer to act on a need
	fragment message by calling icmp_mtudisc().
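
	the control flow above can be modelled in userland roughly as follows.
	this is only a sketch, not kernel code: struct pcb_model,
	icmp_mtudisc_model() and foo_ctlinput_model() are hypothetical names,
	and the "match against an open socket" check is an assumption about
	what validation means here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a protocol control block (one open socket). */
struct pcb_model {
    uint32_t laddr;   /* local IPv4 address */
    uint32_t faddr;   /* foreign IPv4 address */
};

/* Counts notifications; stands in for the effect of icmp_mtudisc(). */
static int mtudisc_calls;

/* Hypothetical model of icmp_mtudisc(): just record that it was called. */
void icmp_mtudisc_model(uint32_t faddr)
{
    (void)faddr;
    mtudisc_calls++;
}

/*
 * Model of an L4 foo_ctlinput(): pass a need fragment report on to the
 * ICMP layer only when the offending packet's (src, dst) pair matches a
 * socket that is really open.  Returns 1 if passed on, 0 if dropped.
 */
int foo_ctlinput_model(const struct pcb_model *tbl, size_t n,
                       uint32_t pkt_src, uint32_t pkt_dst)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].laddr == pkt_src && tbl[i].faddr == pkt_dst) {
            icmp_mtudisc_model(pkt_dst);
            return 1;
        }
    }
    /* No matching socket: drop the report, so no state is created. */
    return 0;
}
```

	in this model a report about a packet nobody sent is dropped before
	it can create any state, which is the point of routing the decision
	through L4.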

	with a raw IPv4 socket (IP_HDRINCL case), we can generate IPv4 packets
	with the DF bit set.  however, raw IPv4 has no foo_ctlinput logic and
	makes no calls to icmp_mtudisc(), so PMTUD won't happen.  userland code
	cannot run PMTUD by itself either, as ICMPv4 need fragment messages
	are not passed up to userland programs.

	4.4BSD code accepts any ICMPv4 need fragment message, so a raw IPv4
	socket can assume that the kernel will do PMTUD automatically.
	(however, 4.4BSD code is vulnerable to DoS attempts that use an
	ICMPv4 need fragment storm - the resulting routing table entries
	can exhaust kernel memory)


	other L4:

	TCP layer does call icmp_mtudisc(), so it can run PMTUD.

	UDP layer has no problem, since we cannot send any UDP packet with
	the DF bit set using the current API.
>How-To-Repeat:
	code inspection
>Fix:
	- the issue is very minor, so leave it as is.
	- add rip_ctlinput() and have it call icmp_mtudisc() as necessary - i
	  plan to do this anyway, to solve issues with stale inp->inp_route
	  pointers.
	- just like ICMPv6 sockets, pass ICMPv4 messages up to listening
	  raw sockets as necessary, and ask IP_HDRINCL userland apps to handle
	  ICMPv4 need fragment messages themselves. (too much kernel API
	  change)
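
	for the last option, a userland app would have to pick the next-hop
	MTU out of each ICMPv4 need fragment message itself.  a rough sketch,
	assuming the kernel hands the app the whole datagram including the
	IPv4 header; needfrag_mtu() is a hypothetical helper, and the field
	offsets follow the RFC 1191 message layout:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the next-hop MTU carried in an ICMPv4 "fragmentation needed
 * and DF set" message (type 3, code 4), or -1 if pkt is not such a
 * message.  pkt is assumed to start at the IPv4 header.  A return
 * value of 0 means the router did not fill in the MTU field.
 */
int needfrag_mtu(const uint8_t *pkt, size_t len)
{
    size_t hl;

    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;                    /* too short, or not IPv4 */
    hl = (size_t)(pkt[0] & 0x0f) * 4; /* IPv4 header length in bytes */
    if (hl < 20 || len < hl + 8)
        return -1;                    /* no room for the ICMP header */
    if (pkt[hl] != 3 || pkt[hl + 1] != 4)
        return -1;                    /* not type 3 / code 4 */
    /* Bytes 6..7 of the ICMP header hold the next-hop MTU. */
    return (pkt[hl + 6] << 8) | pkt[hl + 7];
}
```

	the app would then also need to match the quoted inner header against
	packets it actually sent before trusting the value; that part is left
	out here.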
>Release-Note:
>Audit-Trail:
>Unformatted: