tech-net archive


Re: ICMP_UNREACH_NEEDFRAG returns iface MTU instead of route?



Dave Huang <khym%azeotrope.org@localhost> writes:

> RFC 1191 says, "When a router is unable to forward a datagram because
> it exceeds the MTU of the next-hop network and its Don't Fragment bit
> is set, the router is required to return an ICMP Destination
> Unreachable message to the source of the datagram, with the Code
> indicating "fragmentation needed and DF set". To support the Path MTU
> Discovery technique specified in this memo, the router MUST include
> the MTU of that next-hop network in the low-order 16 bits of the ICMP
> header field that is labelled "unused" in the ICMP specification [7]."
>
> It seems reasonable to me to interpret the route's MTU as specifying
> "MTU of the next-hop network".

I disagree; I see no notion in the standards that MTUs discovered for
routes are to be propagated to other hosts.  The entire notion of
"route MTU" is just an implementation detail for storing PMTU-D
information.

A system of nodes that report discovered MTUs seems more complicated and
perhaps more fragile than one in which each host discovers MTUs itself.
But I can't prove the fragile part.

> I checked a Debian Linux system (running kernel 2.6.32-5-686), and it
> returns the route MTU. E.g., repeating the same type of test as I did
> earlier: ip route add 149.20.53.86 dev eth0 mtu 1200
>
> Then from another machine, pinged 149.20.53.86 with a 1300-byte packet
> and DF set. The MTU returned in the ICMP fragmentation needed packet
> was 1200.

That's an interesting datapoint about practice, but it strikes me as
non-compliant with standards (see below).
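
For reference, the value observed in that test travels in the second
32-bit word of the ICMP header, as the RFC 1191 text quoted above
describes.  A minimal Python sketch of packing and reading that field
follows.  The function names are hypothetical, and the checksum is left
at zero for brevity, so this illustrates the layout only and is not a
usable packet builder.

```python
import struct

ICMP_UNREACH = 3            # ICMP type: destination unreachable
ICMP_UNREACH_NEEDFRAG = 4   # ICMP code: fragmentation needed and DF set


def build_needfrag(mtu: int, original: bytes = b"") -> bytes:
    """Build the 8-byte ICMP header plus the quoted original datagram.

    Layout: type (8 bits), code (8 bits), checksum (16 bits),
    unused (16 bits, zero), next-hop MTU (low-order 16 bits).
    The checksum is NOT computed here (simplifying assumption).
    """
    header = struct.pack("!BBHHH", ICMP_UNREACH, ICMP_UNREACH_NEEDFRAG,
                         0, 0, mtu)
    return header + original


def next_hop_mtu(icmp: bytes):
    """Return the advertised MTU, or None if this is not a needfrag message."""
    if len(icmp) < 8:
        return None
    icmp_type, code, _cksum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if icmp_type != ICMP_UNREACH or code != ICMP_UNREACH_NEEDFRAG:
        return None
    return mtu
```

With the numbers from the test above, `next_hop_mtu(build_needfrag(1200))`
yields 1200.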

> NetBSD's current behavior would seem to break PMTU discovery... it
> won't forward DF packets larger than the route MTU, but then it tells
> the sender that larger packets are OK.

I agree that the combination of declining to forward a packet via a
route and returning an interface MTU greater than the route MTU is
broken.
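
To see why that combination stalls a sender, here is a toy Python model.
It is a sketch under stated assumptions, not NetBSD's actual code: the
function names and the "report" switch are hypothetical, the 1500-byte
interface MTU is an assumed Ethernet value, and the 1200/1300 figures
are taken from the test quoted above.

```python
def forward(packet_len, df, iface_mtu, route_mtu, report="iface"):
    """Toy router: enforce the route MTU, then report either MTU.

    Returns (verdict, advertised_mtu); advertised_mtu is None unless
    the verdict is "needfrag".
    """
    limit = min(iface_mtu, route_mtu)
    if packet_len <= limit:
        return ("forwarded", None)
    if not df:
        return ("fragmented", None)
    # The behavior under discussion: which MTU goes into the ICMP error?
    advertised = iface_mtu if report == "iface" else route_mtu
    return ("needfrag", advertised)


def sender_probe(start_len, iface_mtu, route_mtu, report, max_tries=5):
    """Toy sender with DF set: shrink to the advertised MTU and retry.

    Returns the size that got through, or None if it could not converge.
    """
    size = start_len
    for _ in range(max_tries):
        verdict, advertised = forward(size, True, iface_mtu, route_mtu, report)
        if verdict == "forwarded":
            return size
        if advertised >= size:
            # The error advertises an MTU no smaller than the packet
            # that was just refused, so the sender has nothing to act on.
            return None
        size = advertised
    return None
```

In this model `sender_probe(1300, 1500, 1200, "iface")` returns None
(the sender is told 1500 is fine while 1300 is refused), whereas
`sender_probe(1300, 1500, 1200, "route")` settles on 1200.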

The real question is:

  Why is it ok to decline to forward packets because they are bigger
  than the route MTU, when the route MTU is about PMTU-D to be used for
  locally-sourced packets?


If it is ok to decline, then we need to return route MTU when declining
to forward because of route MTU.  If it's not, we need to fix the
forwarding behavior.  I just skimmed RFC 4821 and found zero discussion
of interaction with forwarding packets.  So I believe that route MTUs
(and really the entire routing table entry resulting from PMTU-D) should
be ignored when forwarding packets.  The RFC talks about storing PMTU-D
state using flow ids, essentially pointing out that PMTU-D values may
not necessarily be valid for dissimilar packets with the same
destination address.
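
The separation argued for above can be sketched in Python as follows.
This is only an illustration of the idea: the class, the function names,
and the (destination, flow id) key shape are all hypothetical, and
nothing here is drawn from an existing implementation.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key: (destination address, flow id).
FlowKey = Tuple[str, int]


class PmtuCache:
    """PMTU-D state kept per flow rather than per destination route."""

    def __init__(self) -> None:
        self._by_flow: Dict[FlowKey, int] = {}

    def record(self, dst: str, flow_id: int, mtu: int) -> None:
        """Remember the smallest MTU learned so far for this flow."""
        key = (dst, flow_id)
        self._by_flow[key] = min(mtu, self._by_flow.get(key, mtu))

    def lookup(self, dst: str, flow_id: int) -> Optional[int]:
        return self._by_flow.get((dst, flow_id))


def local_send_limit(cache: PmtuCache, dst: str, flow_id: int,
                     iface_mtu: int) -> int:
    """Locally-sourced packets: honor whatever PMTU-D learned for the flow."""
    learned = cache.lookup(dst, flow_id)
    return iface_mtu if learned is None else min(iface_mtu, learned)


def forward_limit(iface_mtu: int) -> int:
    """Forwarded packets: only the outgoing interface MTU applies.

    PMTU-D state is deliberately not consulted here.
    """
    return iface_mtu
```

A value learned for one flow then affects neither a different flow to
the same destination nor any forwarded traffic.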

That leaves routes with explicit MTUs that aren't from PMTU-D as an odd
case.  One can view those either as buggy or as test cases for PMTU-D,
on the theory that the route MTU mechanism was added for PMTU-D.


