David Young wrote:
MPLS decap/encap appears to be intricately entwined with ether_output, ip_output, ip_input, et cetera. That doesn't seem right. Instead, I think that there should be a pseudo-interface, mpls0, whose input/output routines do decap/encap, respectively. This de-clutters the IPv4/IPv6 stacks and the ethernet code, and it provides a location for tapping packets with tcpdump before encapsulation and after decapsulation.
Hi,

At one point I had the idea of creating a pseudo-interface for every neighbor and routing packets through those pseudo-interfaces. A single interface was also an option I considered earlier. But I don't think this would be very intuitive, and I haven't seen any vendor report doing something similar, so I assume no one does it.
This also raises another problem: I could do this if NetBSD had a clear separation between control and forwarding, but that is not the case, and I don't want to change the ifa/p for a route without a very strong reason.
Btw, tcpdump decapsulates the MPLS frames and reports the inner IP packets, generating output like this:
13:32:20.880484 MPLS (label 20, exp 0, [S], ttl 64)
	IP (tos 0x0, ttl 64, id 17838, offset 0, flags [none], proto UDP (17), length 71)
    193.28.151.120.50013 > 193.28.151.5.53: [udp sum ok] 29453+ PTR? 2.116.208.68.in-addr.arpa. (43)
Are you interested in catching IP packets before the shim push/pop for some other reason?
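For reference, the MPLS part of that tcpdump output comes from a single 4-byte label stack entry. The following is a minimal sketch of decoding and encoding one entry, assuming the RFC 3032 layout (20-bit label, 3-bit EXP, 1-bit bottom-of-stack flag, 8-bit TTL, network byte order). The names shim_fields, shim_decode and shim_encode are hypothetical, made up for illustration, and are not taken from the NetBSD tree.

```c
#include <stdint.h>

/* Fields of one MPLS label stack entry (hypothetical helper type). */
struct shim_fields {
	uint32_t label;	/* 20-bit label value */
	uint8_t  exp;	/* 3-bit experimental / traffic class bits */
	uint8_t  bos;	/* 1 if this is the bottom of the label stack ([S]) */
	uint8_t  ttl;	/* 8-bit time to live */
};

/* Decode 4 bytes in network byte order into the individual fields. */
static struct shim_fields
shim_decode(const uint8_t b[4])
{
	uint32_t w = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	    ((uint32_t)b[2] << 8) | (uint32_t)b[3];
	struct shim_fields f;

	f.label = w >> 12;
	f.exp = (uint8_t)((w >> 9) & 0x7);
	f.bos = (uint8_t)((w >> 8) & 0x1);
	f.ttl = (uint8_t)(w & 0xff);
	return f;
}

/* Encode the fields back into 4 bytes in network byte order. */
static void
shim_encode(const struct shim_fields *f, uint8_t b[4])
{
	uint32_t w = ((f->label & 0xfffffU) << 12) |
	    ((uint32_t)(f->exp & 0x7) << 9) |
	    ((uint32_t)(f->bos & 0x1) << 8) |
	    (uint32_t)f->ttl;

	b[0] = (uint8_t)(w >> 24);
	b[1] = (uint8_t)(w >> 16);
	b[2] = (uint8_t)(w >> 8);
	b[3] = (uint8_t)w;
}
```

With this layout, the bytes 00 01 41 40 decode to label 20, exp 0, S set, ttl 64, which is the entry shown in the tcpdump line above.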
There are several fragments of code like this,

	if (m->m_len < sizeof (struct ip) &&
	    (m = m_pullup(m, sizeof(struct ip))) == NULL)
		return ENOBUFS;

that should be written like this,

	if (M_UNWRITABLE(m, sizeof(struct ip)) &&
	    (m = m_pullup(m, sizeof(struct ip))) == NULL)
		return ENOBUFS;

instead.
I'm not sure about that. True, checking M_READONLY is mandatory wherever I actually want to write data into that mbuf. But should I also check whether it's read-only in cases like m_push_inet{6}(), where I prepend and modify only the prepended data?
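To make the distinction concrete, here is a small userland model of the two checks. It is only a sketch: toy_mbuf and both predicates are hypothetical stand-ins, not kernel code, and it assumes that M_UNWRITABLE(m, len) amounts to "fewer than len contiguous bytes, or read-only storage".

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the two mbuf properties under discussion. */
struct toy_mbuf {
	size_t len;	/* contiguous bytes in the first buffer (like m_len) */
	bool   readonly;	/* storage is shared or otherwise not writable */
};

/* Header is only read: contiguity alone decides (the m_len < len test). */
static bool
needs_fixup_for_read(const struct toy_mbuf *m, size_t len)
{
	return m->len < len;
}

/* Header is modified in place: contiguity and writability both matter. */
static bool
needs_fixup_for_write(const struct toy_mbuf *m, size_t len)
{
	return m->len < len || m->readonly;
}
```

In this model, a read-only buffer that already holds the full header passes the first check but fails the second, which is the case the two code fragments above treat differently.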
Dave
-- Thanks, Mihai