Subject: Re: Patch for Fast-IPsec over loopback
To: Sam Leffler <sam@errno.com>
From: Jonathan Stone <jonathan@DSG.Stanford.EDU>
List: tech-net
Date: 08/18/2003 12:40:21
Jason Thorpe asked for a root-cause analysis of the issues with
(static-keyed, non-IKE) IPsec over lo0.  Here's my analysis of what
the kernel, without the patch I suggested, is doing:

1.  /sbin/ping generates an ICMP echo request, and sends it to a socket.

2.  Kernel handles outbound packet, applies IPsec processing.
    Fast-IPsec places tags on the packet: first one indicating that it needs IPsec,
    then (after the output callback completes) a PACKET_TAG_IPSEC_OUT_DONE tag,
    showing that output-side IPsec has been done.

3.  ip_output() enqueues packet on lo0, which immediately routes the packet back up
    the networking stack, complete with tags.

4.  ICMP echo request packet arrives back in ip_input(). Fast-IPsec
    looks for packet tags indicating that ipsec_input processing has been
    done. Those tags are not present, but the packet matches an existing
    policy, so the packet is re-routed through the next protocol (AH or ESP or what-have-you).

5. Input-side IPsec processing is performed under that policy, and the packet is
   (eventually) tagged with PACKET_TAG_IPSEC_IN_DONE.

6. The packet is re-dispatched through the protocol stack and arrives at icmp_input().
   icmp_input() processes the ICMP echo request, changes the ICMP type to echo reply,
   and passes the received mbuf chain to icmp_reflect().

7.  icmp_reflect() reverses the IP addresses, updates options, and passes
    the re-written mbuf chain to ip_output().

8. At this point the outbound packet chain is an ICMP echo reply, still bearing
   packet tags saying that both IPsec output processing *and* IPsec input
   processing have been completed.
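
The trace in steps 1-8 can be modelled in a few lines of userspace C. This is
only a sketch, not kernel code: struct pkt, the TAG_* bit values and every
model_* function are hypothetical stand-ins for the mbuf chain, the packet tags
and the kernel routines named above. It also assumes that the output path treats
an existing "output done" tag as meaning the IPsec work has already happened.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two packet tags named in the trace. */
#define TAG_IPSEC_OUT_DONE 0x1u
#define TAG_IPSEC_IN_DONE  0x2u

/* Hypothetical model of an mbuf chain: just its tag set and ICMP direction. */
struct pkt {
    unsigned tags;      /* tags currently attached to the packet */
    bool     is_reply;  /* false = echo request, true = echo reply */
};

/* Step 2: output-side IPsec finishes and marks the packet. */
void model_ipsec_output(struct pkt *p) { p->tags |= TAG_IPSEC_OUT_DONE; }

/* Steps 4-5: input-side IPsec finishes and marks the packet. */
void model_ipsec_input(struct pkt *p) { p->tags |= TAG_IPSEC_IN_DONE; }

/* Steps 6-7: the same packet is turned into the reply; tags are untouched. */
void model_icmp_reflect(struct pkt *p) { p->is_reply = true; }

/* Walk one echo request through steps 1-8 and return the packet as it
 * stands when it is handed back to the output path. */
struct pkt model_ping_over_lo0(void)
{
    struct pkt p = { 0u, false };   /* step 1: fresh packet from the socket */
    assert(p.tags == 0u);           /* nothing is tagged yet */

    model_ipsec_output(&p);         /* step 2 */
    /* step 3: lo0 hands the very same packet back up, tags included */
    model_ipsec_input(&p);          /* steps 4-5 */
    model_icmp_reflect(&p);         /* steps 6-7 */
    return p;                       /* step 8 */
}

/* Assumption: output-side IPsec is skipped when the "done" tag is present. */
bool model_output_skips_ipsec(const struct pkt *p)
{
    return (p->tags & TAG_IPSEC_OUT_DONE) != 0u;
}
```

Calling model_ping_over_lo0() yields a reply that carries both tags, so under
the stated assumption model_output_skips_ipsec() reports that the reply would
bypass output-side IPsec.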

Conclusion: I should commit the change to icmp_reflect() to strip all
packet tags, thus giving reflected ICMP packets the same semantics as
they would get from a userspace ICMP-reflection daemon.  The other
packet semantics over lo0 should be re-tested and addressed separately.
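
As a rough illustration of what "strip all packet tags" means, here is a
userspace sketch with a toy tag chain. It is not the actual patch: struct tag,
struct pkt, pkt_tag_add(), pkt_tag_find(), pkt_tag_strip_all() and
model_icmp_reflect_patched() are all hypothetical names invented for this
example. In the kernel the equivalent would presumably be a single call from
the mbuf tag API (something along the lines of m_tag_delete_chain()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tag types, standing in for the kernel's packet tags. */
enum tag_type { TAG_IPSEC_OUT_DONE = 1, TAG_IPSEC_IN_DONE = 2 };

/* Hypothetical singly linked tag chain hanging off a packet. */
struct tag {
    enum tag_type type;
    struct tag   *next;
};

struct pkt {
    struct tag *tags;   /* head of the tag chain, NULL when untagged */
};

/* Prepend a tag; returns 0 on success, -1 if allocation fails. */
int pkt_tag_add(struct pkt *p, enum tag_type type)
{
    struct tag *t = malloc(sizeof(*t));
    if (t == NULL)
        return -1;
    t->type = type;
    t->next = p->tags;
    p->tags = t;
    return 0;
}

/* Return the first tag of the given type, or NULL if there is none. */
const struct tag *pkt_tag_find(const struct pkt *p, enum tag_type type)
{
    for (const struct tag *t = p->tags; t != NULL; t = t->next)
        if (t->type == type)
            return t;
    return NULL;
}

/* Free every tag on the packet and leave it untagged. */
void pkt_tag_strip_all(struct pkt *p)
{
    assert(p != NULL);
    struct tag *t = p->tags;
    while (t != NULL) {
        struct tag *next = t->next;
        free(t);
        t = next;
    }
    p->tags = NULL;
}

/* Model of the proposed icmp_reflect() behaviour: the reply starts with a
 * clean tag chain, as a packet from a userspace daemon would. The address
 * swap and option handling are omitted from this sketch. */
void model_icmp_reflect_patched(struct pkt *p)
{
    pkt_tag_strip_all(p);
}
```

After model_icmp_reflect_patched() runs, pkt_tag_find() returns NULL for both
tag types, so the reply goes back into the output path looking like a freshly
generated packet and gets a normal policy lookup.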