Subject: kern/7521: need arp_drain() routine to recover space from incomplete arp entries
To: None <gnats-bugs@gnats.netbsd.org>
From: Bill Sommerfeld <sommerfeld@orchard.arlington.ma.us>
List: netbsd-bugs
Date: 05/05/1999 17:05:51
>Number:         7521
>Category:       kern
>Synopsis:       need arp_drain() routine to recover space from incomplete arp entries
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people (Kernel Bug People)
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed May  5 17:05:00 1999
>Last-Modified:
>Originator:     Bill Sommerfeld
>Organization:
	
>Release:        19990501
>Environment:
	
System: NetBSD orchard.arlington.ma.us 1.4A NetBSD 1.4A (ORCHARDII) #60: Tue May 4 11:31:35 EDT 1999 sommerfeld@orchard.arlington.ma.us:/usr/src/sys/arch/i386/compile/ORCHARDII i386


>Description:
	A network management station was connected to an Ethernet with
	a /21-sized subnet (2048 addresses).

	It was configured to ping all addresses on that network, in roughly
	class-C-sized chunks (about 256 addresses at a time).

	It ran out of mbufs and was sad.

>How-To-Repeat:
	Use multiping to try to ping all addresses on a large empty subnet.
	(I haven't tried this yet, but..)
>Fix:
	Incomplete arp table entries hang on to one packet for the
	destination until the incomplete entry gets GC'ed.

	Held packets appear to sit around for up to arpt_prune seconds
	(5 minutes), or 2.5 minutes on average; this is at least
	theoretically bad for TCP, because (among other things) it's
	larger than 2*MSL...

	Add arp_drain() routine to recover the la_hold packets when
	memory is short.

	Prune la_hold packets out of incomplete arp entries more
	aggressively.

	Provide sysctl interfaces to allow the user to tune arp
	parameters.
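
	A rough sketch of what the arp_drain() proposed above might do,
	written as a user-space model rather than kernel code.  Only
	la_hold comes from the existing code; struct pkt, struct
	arp_entry, the resolved flag and the list layout are
	hypothetical stand-ins, and free() takes the place of the
	kernel's m_freem():

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mbuf. */
struct pkt {
	size_t len;
};

/* Hypothetical stand-in for the per-entry ARP state. */
struct arp_entry {
	struct arp_entry *next;
	int resolved;          /* nonzero once the link-layer address is known */
	struct pkt *la_hold;   /* last packet waiting for resolution, or NULL */
};

/*
 * Walk the table and release the packet held by every incomplete
 * entry.  The entries themselves stay in place, so resolution can
 * still complete later.  Returns the number of packets released.
 */
static int
arp_drain(struct arp_entry *head)
{
	int freed = 0;

	for (struct arp_entry *la = head; la != NULL; la = la->next) {
		if (!la->resolved && la->la_hold != NULL) {
			free(la->la_hold);   /* kernel code would call m_freem() */
			la->la_hold = NULL;
			freed++;
		}
	}
	return freed;
}
```

	In the kernel this would presumably be hooked into the
	protocol drain path that runs when mbufs are scarce; that
	wiring is not shown here.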

standards issues:

Note that RFC826 (ARP) is extremely vague about timeouts.
RFC894 (IP over Ethernet) only mentions arp as a good thing which
should be used.

RFC1122 is also vague about exact timeouts for arp; it suggests that
"on the order of a minute" makes sense, but the timeouts should be
adjustable (and possibly set larger on larger networks); it also
suggests unicast verification of arp entries which might expire.

It says that arp SHOULD hang onto the single newest packet sent to
the destination of an incomplete entry; my reading is that this gives
us license to GC them when memory is short or after ~30s to a minute
(whichever comes first).
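
The policy in the previous paragraph (drop the held packet when memory
is short or once it has been held for some tens of seconds, whichever
comes first) could be modelled roughly as follows.  This is a
user-space sketch, not a patch: struct held_entry, hold_since,
arp_hold_timeout and arp_hold_expire() are all hypothetical names, and
the 30 second default is just the low end of the range suggested above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical per-entry state; only la_hold exists in the real code. */
struct held_entry {
	void *la_hold;       /* held packet, or NULL */
	time_t hold_since;   /* when la_hold was queued */
};

/* Hypothetical tunable, in seconds; a sysctl could expose it. */
static time_t arp_hold_timeout = 30;

/*
 * Release the held packet if memory is short or the packet has been
 * waiting for at least arp_hold_timeout seconds.
 * Returns 1 if a packet was released, 0 otherwise.
 */
static int
arp_hold_expire(struct held_entry *e, time_t now, int memory_short)
{
	if (e->la_hold == NULL)
		return 0;
	if (!memory_short && now - e->hold_since < arp_hold_timeout)
		return 0;
	free(e->la_hold);    /* kernel code would call m_freem() */
	e->la_hold = NULL;
	return 1;
}
```

A periodic timer could call this for each incomplete entry with
memory_short set to 0, and a drain routine could call it with
memory_short set to 1.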
>Audit-Trail:
>Unformatted: