tech-net archive


Re: merging forwarding & packet filtering?



On 11 Mar 2011, at 00:34 , David Young wrote:
> On Thu, Mar 10, 2011 at 12:52:26PM +0800, Dennis Ferguson wrote:
>> Finally, though, there is the issue of what useful purpose this might
>> serve and whether there are other ways to get to the same place.  I'm
>> not sure what the purpose of the example might be, but let me just assume
>> that it is a method for doing something useful when you have two
>> working default routes and want to split traffic between them.
> 
> It's a method for achieving the best possible Internet reliability at a
> site that connects to two or more Internet providers on consumer-class
> subscriber lines---i.e., BGP is not available---and the computers at
> the site connect to the Internet through a NAT router.  When the link
> to provider A goes down, you don't know ahead of time for how long.
> It is helpful to direct new flows to provider B during an outage of
> provider A, however, redirecting existing flows to provider B during an
> outage is unhelpful at best.  At worst, it kills the flows[1].  If the
> outage lasts just 10 seconds, and switching providers kills flows, then
> reliability may be worse than if you did not fail over to B at all.  The
> best possible thing to do is to hold existing flows on provider A and
> to let new flows start on provider B.  I haven't found a way to do that
> without keeping some flow state.

Ah, I had a feeling this would end up having something to do with NAT.
What you are doing is sort of like a special case of NAT, call it NAT
Ultra-Lite, where you aren't doing the NAT operation itself but need
to behave in a way which mimics the behaviour and constraints of the
downstream routers which are.  All things related to NAT are (necessary)
evil, and inherently require one to keep flow state.
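
As a minimal sketch of the flow state the quoted failover scheme needs,
assuming a flow is identified by its 5-tuple: the class name
UplinkSelector, the uplink labels "A" and "B", and the link-state calls
are all hypothetical, chosen only for illustration.

```python
from typing import Dict, Optional, Set, Tuple

# (src addr, dst addr, protocol, src port, dst port)
FlowKey = Tuple[str, str, int, int, int]


class UplinkSelector:
    """Pins each flow to the uplink chosen when the flow was first seen."""

    def __init__(self, uplinks):
        self.uplinks = list(uplinks)      # preference order, e.g. ["A", "B"]
        self.up: Set[str] = set(uplinks)  # uplinks currently believed usable
        self.flows: Dict[FlowKey, str] = {}

    def set_link_state(self, uplink: str, is_up: bool) -> None:
        if is_up:
            self.up.add(uplink)
        else:
            self.up.discard(uplink)

    def select(self, key: FlowKey) -> Optional[str]:
        pinned = self.flows.get(key)
        if pinned is not None:
            # Existing flow: hold it on its original uplink even while that
            # uplink is down, since moving it may kill the flow.
            return pinned
        for uplink in self.uplinks:
            if uplink in self.up:
                # New flow: pin it to the first uplink that is up right now.
                self.flows[key] = uplink
                return uplink
        return None  # no usable uplink, so nothing is pinned
```

The sketch never expires entries; a real table would also need timeouts
or connection tracking to drop flows that have ended.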

What I would object to isn't the need to do it, but rather how and where
you want to do it.  My personal indicator that a function is being
implemented in the wrong spot, or is being thought of the wrong way,
is when it seems to require unnatural acts to get it to work correctly
in all cases.  In this case those "wrong spot" alarms are going off
all over.  A correctly implemented flow state table inherently requires
a packet reassembly stage in front of it, so that fragmented packets are
made whole before the flow lookup.  You can't do a flow lookup on a
non-first fragment, since it doesn't carry the transport header, and it
is only by getting those non-first fragments attached to their first
fragment before you look up the flow that you end up with everything
going in the right direction.  It may be that fragmented packets are
uncommon (or never happen, even) in many situations, but not dealing
with them isn't "right" even if it might work in a lot of cases.  I just
can't see how this can be made "right" the way you want to do it.
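
To illustrate why reassembly has to come first, here is a rough sketch
in which fragments are collected by (source, destination, protocol, IP
identification) and the ports are read only from the reassembled
payload.  The Fragment and Reassembler names are hypothetical, and the
sketch assumes no overlapping or duplicate fragments and no timeouts.

```python
import struct
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Fragment:
    src: str
    dst: str
    proto: int
    ident: int      # IP identification field
    offset: int     # byte offset of this piece within the original payload
    more: bool      # "more fragments" flag
    payload: bytes


class Reassembler:
    def __init__(self):
        self.pending: Dict[Tuple[str, str, int, int], List[Fragment]] = {}

    def add(self, frag: Fragment) -> Optional[bytes]:
        """Return the whole L4 payload once every piece has arrived."""
        if frag.offset == 0 and not frag.more:
            return frag.payload  # not fragmented: reassembly is a no-op
        key = (frag.src, frag.dst, frag.proto, frag.ident)
        pieces = self.pending.setdefault(key, [])
        pieces.append(frag)
        pieces.sort(key=lambda f: f.offset)
        expected = 0
        for piece in pieces:
            if piece.offset != expected:
                return None  # a hole remains, keep waiting
            expected += len(piece.payload)
        if pieces[-1].more:
            return None  # the last piece has not arrived yet
        del self.pending[key]
        return b"".join(p.payload for p in pieces)


def ports(l4_payload: bytes) -> Tuple[int, int]:
    """TCP and UDP both begin with source port, then destination port."""
    return struct.unpack("!HH", l4_payload[:4])
```

Only the piece at offset 0 contains the bytes that ports() reads, so a
non-first fragment on its own gives the flow lookup nothing to key on.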

I'll just point out that if you had to repair this, either for your NAT
Ultra-Lite or for full NAT, you would probably end up with something that
looks like this:

    <forwarding/policy>--><reassembly>--><flow lookup>--><create state/do stuff/send packet>

That is, you would use a (stateless) forwarding/policy lookup to
identify those packets that need to be processed through the flow table,
then reassemble the fragments (a null function for not-fragmented packets)
and only then do the flow state lookup.  This suggests that <forwarding/policy>
and <flow lookup> probably need to be separate.
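
The staging above might be sketched as follows.  This is an illustration
under assumptions, not existing kernel code: make_pipeline and its four
callback parameters are hypothetical names, and the flow table is a
plain dictionary.

```python
def make_pipeline(needs_flow_state, reassemble, flow_key, on_new_flow):
    """Build a packet handler with the four stages kept separate."""
    flows = {}  # flow key -> per-flow state

    def handle(packet):
        # Stage 1: stateless forwarding/policy lookup decides whether this
        # packet needs the flow table at all.
        if not needs_flow_state(packet):
            return ("forward", None)
        # Stage 2: reassembly; returns None while fragments are missing
        # and the packet itself when it was never fragmented.
        whole = reassemble(packet)
        if whole is None:
            return ("held", None)
        # Stage 3: flow lookup on the complete packet.
        key = flow_key(whole)
        state = flows.get(key)
        if state is None:
            # Stage 4: no state yet, so create it for this new flow.
            state = flows[key] = on_new_flow(key)
        return ("flow", state)

    return handle, flows
```

Because the policy stage and the flow table are separate arguments, the
same skeleton can be given a different right-hand side without touching
the stateless lookup.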

Now let me cut-and-paste the same thing with a slightly different set of
operations on the right:

    <forwarding/policy>--><reassembly>--><flow lookup>--><L4(e.g. TCP)>--><socket buffer>

What that is describing is the function of the input side of a host
networking stack of the kind that needs to exist in all kernels.  In
this case (stateless) <forwarding/policy> picks out the arriving packets
which need the flow lookup by observing that the destination address in
the packet is a "local" one, the packets are reassembled if necessary,
and then a flow lookup is done to find the right transport/raw protocol
machinery and connection state to process the packet data out to the
application's socket.  The <flow lookup> data structure required here
is a slightly more general one than you maybe need, since it needs to
do partial, as well as full, matches on the 5-tuple (think, say, a
service socket with only a protocol or protocol and port number binding),
but would otherwise provide the exact service that NAT, and NAT Ultra-Lite,
need to do their work too.
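
A lookup that does partial as well as full matches on the 5-tuple could
look roughly like the sketch below.  The FlowLookup class, its bind and
lookup methods, and the fixed most-specific-first probe order are
assumptions made for illustration; they are not taken from any existing
stack.

```python
from typing import Dict, Optional, Tuple

# (protocol, local addr, local port, remote addr, remote port);
# None in a position means "matches anything".
Key = Tuple[int, Optional[str], Optional[int], Optional[str], Optional[int]]


class FlowLookup:
    def __init__(self):
        self.entries: Dict[Key, object] = {}

    def bind(self, proto, laddr=None, lport=None, raddr=None, rport=None,
             *, owner):
        """Register an owner (e.g. a socket) with a full or partial key."""
        self.entries[(proto, laddr, lport, raddr, rport)] = owner

    def lookup(self, proto, laddr, lport, raddr, rport):
        """Find the owner for a packet, preferring the most specific key."""
        candidates = (
            (proto, laddr, lport, raddr, rport),  # full 5-tuple match
            (proto, laddr, lport, None, None),    # local address and port
            (proto, None, lport, None, None),     # protocol and port only
            (proto, None, None, None, None),      # protocol only
        )
        for key in candidates:
            owner = self.entries.get(key)
            if owner is not None:
                return owner
        return None
```

A NAT-style use would only ever insert full 5-tuple keys, so it would
hit on the first probe or miss entirely.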

Of course the current host stack implementation doesn't really have
anything that fits cleanly into the <flow lookup> spot; while it eventually
does the equivalent operation it does it on sort of an ad hoc, special
case basis that spreads the code around.  Like the destination address
matching done 6 ways that I ranted about before, however, this represents
a problem for making the code SMP-safe and lockless for readers since it
requires chasing through all the special cases to make sure they work.
A better arrangement would consolidate all of this into one, single lookup
structure that would sit in the <flow lookup> spot, with entries roughly
corresponding to open sockets of any description, since then you only have
one data structure (a "flow state" lookup) to make SMP-safe and fast.

What this means is that if you have to have a "flow state" lookup to do
a different function, like NAT or NAT Ultra-Lite, the best way to do that
would be to just reuse the same code, and a different instance of the same
data structure, that the host stack is going to require anyway.  All flow state
problems should use the same solution, since then you only need one solution.

Or, to put it the other way around, any solution to a "flow state" problem
which doesn't get you closer to the integrated connection lookup the host
stack needs is just adding more special-case-of-the-same-thing code that will
need to be dragged around forever.  I prefer an approach that tries to get more
done with less code, even if that makes the immediate special-case problem
a little harder to get done.

Dennis Ferguson

