Subject: Re: pppoe problem
To: None <tech-net@netbsd.org>
From: David Laight <David.Laight@btinternet.com>
List: tech-net
Date: 12/17/2001 10:26:53
----- Original Message -----
From: der Mouse <mouse@Rodents.Montreal.QC.CA>
To: <tech-net@netbsd.org>
Sent: Monday, December 17, 2001 1:45 AM
Subject: Re: pppoe problem


> >>> [...PMTU-D black holes...]
> >> I know of at least one person who's had trouble sending me large
> >> emails because the relevant outgoing mailhost has the won't-frag
> >> disease, [...]
> > I wonder if it is possible to locate the 'black hole' by sending large
> > ICMP echo packets with 'don't fragment' and a small 'hop count',
> > comparing the result to the usual 'traceroute' sequence.
>
> I doubt it.

Yes - I clearly didn't think for quite long enough....
You can find GMTU by probing from the 'server' end (actually from the end
sending the big packets - since the mess is symmetric). But finding GICMP
is, as you say, harder.

Interestingly the rulesets 'published' for setting up ipf will block
icmp 'need to fragment' messages...

    David
>
> For ease of writing, I'll use the following names:
> Server: the machine (webserver, mailhost, etc.) that wants to
> send large packets.
> Client: the machine that wants to receive what Server sends.
> GMTU: the gateway in the Server->Client direction that has to
> drop packets when DF is set.
> GICMP: the machine responsible for dropping need-to-frag ICMP
> unreachables in the Client->Server direction.
>
> The problem arises only by (unwitting) collaboration of all these
> machines.  The real problem, of course, lies with GICMP (which in one
> case I know of proved to be the same host as Server - it had broken
> filtering rules installed).  But it doesn't show up unless GMTU is
> present, which is why it's a problem at all - GMTU doesn't exist for
> most users; for them, the MTU of Server's outgoing link is no larger
> than the MTU the rest of the way to Client.  (Server and Client are
> involved because the problem doesn't arise until Server wants to send
> bulk data to someone like Client on the far side of GMTU.)
>
> Now, from Server, you can locate GMTU with traceroute -P or moral
> equivalent.  But that won't tell you anything about where GICMP, the
> real problem, is - and if GICMP is present, traceroute -P will not
> actually "work", in that it won't be able to work out the real path
> MTU; it will just star out ("* * *") when it gets to GMTU.
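
[Illustration, not part of the quoted message.]  The interaction between
GMTU and GICMP described above can be modelled in a few lines.  This is
only a sketch under simplifying assumptions: the path is treated as
symmetric, each hop is reduced to the MTU of its outgoing link, and the
names send_df_packet, discover_path_mtu and gicmp_hop are made up for
the example.  It does no real network I/O.

```python
from typing import List, Optional, Tuple


def send_df_packet(size: int, link_mtus: List[int],
                   gicmp_hop: Optional[int] = None) -> Tuple[str, Optional[int]]:
    """Model one DF-set packet travelling Server -> Client.

    link_mtus[i] is the MTU of the link leaving hop i (hop 0 is Server).
    gicmp_hop, if given, is the index of a hop that drops need-to-frag
    unreachables on their way back to Server (assumed symmetric path).
    """
    for hop, mtu in enumerate(link_mtus):
        if size > mtu:
            # This hop plays the role of GMTU: it must drop the packet
            # and report the next-hop MTU in an ICMP unreachable.
            if gicmp_hop is not None and gicmp_hop <= hop:
                # The ICMP has to pass hops hop..0; GICMP eats it.
                return ("timeout", None)
            return ("need-frag", mtu)
    return ("delivered", None)


def discover_path_mtu(link_mtus: List[int],
                      gicmp_hop: Optional[int] = None) -> Optional[int]:
    """Shrink the packet size as need-frag reports arrive.

    Returns the path MTU, or None when the probe just times out,
    which is the black-hole case.
    """
    size = link_mtus[0]
    while True:
        status, next_mtu = send_df_packet(size, link_mtus, gicmp_hop)
        if status == "delivered":
            return size
        if status == "timeout":
            return None
        size = next_mtu  # strictly smaller than size, so this terminates
```

With link_mtus = [1500, 1500, 1492, 1500], hop 2 is GMTU.  Without a
filter the discovery settles on 1492.  With gicmp_hop=1 Server only ever
sees timeouts, and nothing in that result says which hop between Server
and GMTU ate the ICMP, which is the point made above.
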
>
> And from Client, your suggestion may locate the place on the
> Client->Server path (if there is one) where MTU is lowered.  This may
> or may not be the reverse direction of the link GMTU is at the
> Server-side end of (in the common PPPoE case it will be, but the path
> does not have to be symmetric).  And it does not necessarily have
> anything to do with where GICMP is - in the common case it won't.
>
> If GICMP is simply dropping _all_ ICMPs, you can locate it by doing a
> traceroute in ICMP mode, or doing a normal traceroute and then pinging
> successive hosts on it.  This will be defeated by a gateway closer to
> Client that drops pings but not need-frag unreachables, and if GICMP
> lets pings through it won't work either.
>
> And you can't find GICMP by forging need-to-frag ICMPs with low TTLs,
> because the routers "MUST NOT" (RFC 1122) send an ICMP in response to
> an ICMP unreachable (or certain other types of ICMPs).  (The reason
> traceroute -I works is that echo-request is not one of those "certain
> other" types.)
>
> Thus, I think there is no way to find GICMP, even with collaboration of
> Server's admins, unless it happens to be part of Server's organization
> (and thus Server's admins are also GICMP's admins).  Sniffing traffic
> on the various hops will, of course, locate it, but that requires the
> ability to so sniff.  Fortunately, in the few cases where I've been
> able to identify GICMP, it's always been part of Server's organization,
> and despite having a very small sample size, I would hazard a guess
> that it is so for the majority of PMTU-D black holes in today's
> Internet.
>
> The reason it helps to have GMTU (or more often the host on Client's end
> of the link to GMTU, which I'll call GCMTU) do MSS clamping is that it
> makes Server start out using packets small enough that GMTU doesn't have
> to fragment-or-drop them; it doesn't actually fix the real problem,
> though from a naïve Client's perspective it may seem like it.  It does,
> however, depend on GCMTU knowing what GMTU's outgoing MTU is; I suspect
> existing MSS clamping implementations simply assume that it matches
> GCMTU's outgoing MTU over that link.  (It also depends on Server paying
> attention to the MSS option, and won't work in the presence of IPSEC
> AH, never mind ESP....)
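
[Illustration, not part of the quoted message.]  The arithmetic behind
MSS clamping is small enough to write down.  The sketch below assumes
IPv4 and TCP headers without options (20 bytes each) and uses the usual
PPPoE MTU of 1492 (1500 minus 8 bytes of PPPoE and PPP header) as the
example.  The function name clamp_mss is made up; it is not taken from
any particular implementation.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def clamp_mss(advertised_mss: int, link_mtu: int) -> int:
    """Return the MSS to write into a forwarded SYN.

    The clamping host lowers the advertised MSS to what fits in one
    unfragmented packet on its own link; it never raises it.
    """
    return min(advertised_mss, link_mtu - IPV4_HEADER - TCP_HEADER)


# An Ethernet client behind a PPPoE link advertises 1460; the clamping
# host rewrites that to 1452 so Server never builds a 1500-byte packet.
print(clamp_mss(1460, 1492))
```

Note that link_mtu here is the clamping host's own outgoing MTU, which
is the assumption about GCMTU and GMTU questioned above.
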
>
> /~\ The ASCII der Mouse
> \ / Ribbon Campaign
>  X  Against HTML        mouse@rodents.montreal.qc.ca
> / \ Email!      7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B