Subject: unconnected inpcb and redirects
To: None <tech-net@netbsd.org>
From: Jun-ichiro itojun Hagino <itojun@iijlab.net>
List: tech-net
Date: 12/05/2000 00:52:26
	it looks like in_pcbnotify() needs to flush inpcb.inp_route more
	frequently for unconnected sockets, in the ICMP redirect case.
	the scenario is like this:
	- from an unconnected inpcb, a packet is sent to final destination A,
	  using gateway B.  the inpcb caches the routing entry in inpcb.inp_route.
	- B was not the best gateway, so B sends an ICMP redirect.
	- because we have used an unconnected inpcb, the inpcb will not be
	  notified of the ICMP redirect (see in_pcbnotify), and will keep an
	  obsolete cache entry in inpcb.inp_route.
	- B will issue an ICMP redirect every time we send a packet from
	  the unconnected inpcb to A.
	is my understanding correct?

	the shortest workaround is to have something like below (from in6_pcb.c
	on netbsd-current):

>	for (/* all inpcb */) {
>		if (do_rtchange) {
>			/*
>			 * Since a non-connected PCB might have a cached route,
>			 * we always call in6_rtchange without matching
>			 * the PCB to the src/dst pair.
>			 *
>			 * XXX: we assume in6_rtchange does not free the PCB.
>			 */
>			if (IN6_ARE_ADDR_EQUAL(&in6p->in6p_route.ro_dst.sin6_addr,
>					       &faddr6))
>				in6_rtchange(in6p, errno);
>
>			if (notify == in6_rtchange)
>				continue; /* there's nothing to do any more */
>		}
>	}

	however, i'm not really sure if this is the right thing to do
	(it solves the problem, but i'm not sure it is the best solution).
	if you look closer, you will notice that there's a difference between
	netbsd/openbsd and freebsd/bsdi.
	in short, the difference is like this:
	- freebsd/bsdi creates cloned routes (RTF_HOST) very frequently.
	  for example, in_pcbconnect will create a cloned route.
	  netbsd/openbsd does not do this.

	the freebsd/bsdi approach has good things and bad things.
	good things:
	- ensures that inpcb.inp_route is a host route, and makes it easier
	  for us to handle redirects (there's no need to flush inpcb.inp_route
	  on ICMP redirect).
	- we may be able to validate ICMP redirects/too bigs by using the
	  cloned route entries.  it will help us suppress possible remote DoS
	  attacks using ICMP.
	bad things:
	- (this is, IMHO, really bad) local DoS is very simple, like this:
		while (1) {
			sin.sin_addr.s_addr = random();
			sendto(s, buf, len, 0, (struct sockaddr *)&sin, sin.sin_len);
		}
	  it will overflow the routing table and kill network activity.
	  a normal user can launch it easily.

	i believe the netbsd/openbsd approach is better, since ICMP redirects
	are a rare case (good thing 1 is bogus), and the DoS issue (bad thing 1)
	is really bad.  but netbsd/openbsd needs some improvements too, in the
	following areas:
	1. ICMP too big validation in sendto(2) cases.  this is
	   lacking at the moment and is an obstacle to IPv6 operation.
	2. ICMP redirect validation.
	3. inpcb.inp_route refresh issue.

	what do people think?  there are too many possible solutions, so i first
	need to clarify the problem...


	here are other possible ways to improve the behavior:
	- about (1) and (2), have a lowat/hiwat for the # of host route entries
	  created by ICMP redirects/too bigs (no validation, just make sure
	  there's no memory overflow).  i'm not sure what the best
	  value for lowat/hiwat is.  also i'm not sure about how to
	  pick a victim, and how it will behave under starvation cases.
	- about (3), don't use inpcb.inp_route for unconnected inpcb.

itojun