Thursday, February 28, 2013

IPSec Racoon2

IPSec: NETKEY and Racoon2

Introduction


IPSec was standardized by the IETF in RFCs 2401-2412 (1998). It creates a secure control tunnel with a handshake protocol called Internet Key Exchange (IKE), which authenticates the end points of the tunnel to each other. IKE in turn creates data tunnels that use symmetric encryption between the end points.

An IPSec implementation has two parts: the control protocol (IKEv2, implemented here by Racoon2 in user space) and the data path (NETKEY, in kernel space), which uses the keys exchanged by the control protocol. Both parts are described in this article.

Racoon2 uses the OpenSSL library [15] for its cryptographic operations.

The iproute2 package provides CLI commands “ip xfrm state” and “ip xfrm policy” to request detailed information about the IPsec SAs and policies installed in the kernel. The -s option displays additional information like the number of transmitted or invalid messages. The setkey command from the ipsec-tools package also provides similar information.


 

NETKEY

The NETKEY Kernel stack is based on XFRM and “Stackable Destination”. An important structure in the IPSec processing in the kernel is dst_entry.

The data structure dst_entry is used to store the protocol-independent information of cached routes. The IP route table rtable is a wrapper for dst_entry, and it stores protocol dependent info and the result of a routing lookup. The sk_buff struct points to the dst_entry structure.

dst_entry is used to assign a destination to a packet, which could be an external host or an internal packet handler. For an external host, the dst_entry structure includes the interface to the neighbour struct, IPSec transformers, etc. A neighbour struct represents a computer that is reachable via Layer-2.
· dst->neighbour->hh_output() is invoked for packets to a destination present in the layer-2 header cache
· dst->neighbour->output() is used for network devices without layer-2 header cache
If the ARP entry is valid, these pointers normally point to dev_queue_xmit(); otherwise they point to neigh_resolve_output().

In the usual case, there is only one dst_entry for every skb. With IPSec, there is a linked list of dst_entries chained through the “child” pointer. Only the last one is for routing; all other dst_entries are for IPSec transformers and have the DST_NOHASH flag set to indicate that they are not part of the routing cache. Each dst_entry also has a “path” pointer that points to the last dst_entry.

Some basic objects used by the XFRM architecture are (Ref: include/net/xfrm.h)
· policy rule, struct xfrm_policy (SPD entry)
· bundle of transformations (dst_entry), struct xfrm_dst (SA bundle)
· instance of a transformer, struct xfrm_state (SA)
· template to clone xfrm_state, struct xfrm_tmpl

struct dst_entry
· struct xfrm_state *xfrm; // represents IPSec Security Association (IPSec SA)
    o struct props – defines parameters for this SA (mode, aalgo, ealgo, calgo etc)
    o struct dst_ops – allows higher-layer protocols to run protocol-specific functions that manipulate the entries. This provides the interface between the protocol-independent cache and the L3 protocols that use the routing cache.
        § *neigh_lookup()->output
    o xfrm_mode – defines xfrm_tunnel_mode, xfrm_transport_mode etc. xfrm_state_afinfo is a struct added by a patch [6] to provide a convenient pointer from xfrm_state to address-family-specific functions, such as the output function for a family. For example, esp_type gets registered during esp_init at this point.
        § struct xfrm_state_afinfo *afinfo;
· struct xfrm_type *type_map[IPPROTO_MAX]
    o xfrm_type – defines ah_type, esp_type, ipcomp_type, ipip_type
· (*input)(struct sk_buff *);
· (*output)(struct sk_buff *);
· struct dst_entry *child;
· struct net_device *dev;

Routing Cache and dst_entry relationships [11]
All routes are kept in the routing tables, called the FIB (Forwarding Information Base). The FIB is built by routing daemons and by administrators using the CLI, and is implemented using a trie data structure (LC-trie). Routing decisions can be based on several criteria, e.g. source address, TOS, firewall marking, etc. The routing table itself holds routes keyed on destination address.

To enable faster lookup, a route cache is maintained. Its contents can be seen with the CLI command “ip -s route show cache”. The hash table used for the route cache is keyed on DA, SA, TOS, firewall mark and interface pointer. Entries are created on demand as packets flow into and out of the router. The hash function is the Jenkins hash, with a random number as salt that is recomputed every 10 minutes [13, 14].
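
As a rough illustration of how a salted cache key of this kind can be hashed, the sketch below runs Bob Jenkins' one-at-a-time hash over the five key fields. The struct rt_key, the function names and the table size are assumptions made for this example only; the kernel used its own jhash helpers and its own key layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache key holding the five fields listed in the text. */
struct rt_key {
    uint32_t daddr;   /* destination address */
    uint32_t saddr;   /* source address */
    uint32_t mark;    /* firewall mark */
    uint32_t ifindex; /* interface, stored as an index instead of a pointer */
    uint8_t  tos;     /* type of service */
};

#define RT_HASH_BUCKETS 1024u /* assumed table size, must be a power of two */

/* Jenkins one-at-a-time hash, seeded with the salt. */
static uint32_t jenkins_oaat(const uint8_t *data, size_t len, uint32_t salt)
{
    uint32_t h = salt;

    for (size_t i = 0; i < len; i++) {
        h += data[i];
        h += h << 10;
        h ^= h >> 6;
    }
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* Map a key to a bucket index; changing the salt reshuffles all buckets. */
static uint32_t rt_hash_bucket(const struct rt_key *k, uint32_t salt)
{
    uint8_t buf[17];
    size_t n = 0;

    assert(k != NULL);
    /* Serialize field by field so struct padding never enters the hash. */
    const uint32_t words[4] = { k->daddr, k->saddr, k->mark, k->ifindex };
    for (size_t w = 0; w < 4; w++)
        for (int shift = 0; shift < 32; shift += 8)
            buf[n++] = (uint8_t)(words[w] >> shift);
    buf[n++] = k->tos;
    return jenkins_oaat(buf, n, salt) & (RT_HASH_BUCKETS - 1);
}
```

Calling rt_hash_bucket() with the same key and salt always yields the same bucket. Recomputing the salt periodically, as the text describes, makes it harder for an attacker to craft packets that all collide in one bucket.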

The IPv4 route cache was removed in Linux 3.6, so it is no longer present in the latest 3.x kernels.



struct xfrm_dst
xfrm_dst is a data structure that contains a bundle of transformations cached by the xfrm_policy structure [5]. The skb->refdst pointer can be cast to xfrm_dst, since the first member of xfrm_dst is of type dst_entry.
· struct dst_entry dst;
· struct xfrm_policy *policy
    o xfrm_tmpl xfrm_vec[]
    o u8 type, action
· int num_pols; num_xfrms
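
The cast works because C guarantees that a pointer to a struct also points to its first member. The sketch below shows the idiom with simplified stand-in types; dst_entry_model, xfrm_dst_model and to_xfrm_dst() are hypothetical names for this illustration, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the layout matters. */
struct dst_entry_model {
    struct dst_entry_model *child;
    int flags;
};

struct xfrm_dst_model {
    struct dst_entry_model dst; /* must stay the first member */
    int num_pols;
    int num_xfrms;
};

/* Recover the enclosing bundle from a pointer to its embedded dst. */
static struct xfrm_dst_model *to_xfrm_dst(struct dst_entry_model *dst)
{
    /* Valid only because dst sits at offset 0 of xfrm_dst_model. */
    assert(offsetof(struct xfrm_dst_model, dst) == 0);
    return (struct xfrm_dst_model *)dst;
}
```

Code that only knows about the generic dst part can pass the pointer around, and XFRM code can get back to its own fields without any extra lookup.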
xfrm_policy represents an IPSec policy. xfrm_dst “transform bundles” are cached in the xfrm_policy struct (field ->bundles). Stackable destination:
dst -. xfrm  .-> xfrm_state #1
 |---. child .-> dst -. xfrm .-> xfrm_state #2
                  |---. child .-> dst -. xfrm .-> xfrm_state #3
                                   |---. child .-> NULL
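
A stack of this shape can be modelled in user space as below. This is a minimal sketch that follows the earlier description in which the last entry of the chain is the plain routing entry without a transformer; dst_model, xfrm_state_model and walk_bundle() are hypothetical names, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct xfrm_state_model { int id; };   /* hypothetical SA stand-in */

struct dst_model {
    struct xfrm_state_model *xfrm;  /* NULL on the final routing entry */
    struct dst_model *child;        /* next entry in the stack, or NULL */
};

/* Count the transformations a packet would pass through and return the
 * final routing entry through *route. */
static int walk_bundle(struct dst_model *top, struct dst_model **route)
{
    int transforms = 0;
    struct dst_model *dst = top;

    assert(top != NULL && route != NULL);
    while (dst->xfrm != NULL) {
        transforms++;                /* one IPSec transformer, e.g. ESP or AH */
        assert(dst->child != NULL);  /* a transformer entry is never the last one */
        dst = dst->child;            /* descend to the next entry */
    }
    *route = dst;
    return transforms;
}
```

For a bundle with two transformers on top of a routing entry, walk_bundle() returns 2 and leaves the routing entry in *route.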
 
The following figure showing structure relationships is taken from [5].

 




High Level Data Structures Interactions.


Control Path

 




Data Path



Transform Control Interface (User/Kernel)

The PF_KEY and XFRM netlink interfaces are implemented as kernel modules in net/key/af_key.c (ipsec_pfkey_init) and net/xfrm/xfrm_user.c (xfrm_user_init). They provide the send/recv routines between user and kernel space.

XFRM Infrastructure Init

With IPSec enabled, the IP Route function ip_rt_init() calls xfrm_init() and xfrm4_init() to initialize the policy functions that are called in the output path.
net/xfrm/xfrm_policy.c/xfrm_init()
    register_pernet_subsys(&xfrm_net_ops) – registers the xfrm_net_init/exit functions
    xfrm_input_init()

xfrm_net_init()
    xfrm_state_init()
    xfrm_policy_init()
    xfrm_dst_ops_init
    xfrm_sysctl_init

xfrm_input_init()
    Creates a slab cache called secpath_cache for the sec_path structures that hold xfrm_state pointers.

net/ipv4/xfrm4_policy.c/xfrm4_init()
    dst_entries_init(&xfrm_dst_ops)
    xfrm4_state_init()
    xfrm4_policy_init()
    register_net_sysctl – net/ipv4; registers /proc variables





IPSec Output Processing

Some of the following notes are taken from [4].
net/ipv4/ip_output.c/ip_queue_xmit() is called from an upper layer protocol such as TCP.
ip_route_output_flow(), called by ip_route_output_ports(), is the central route lookup function. It is responsible for the route lookup and for creating the stackable destination in dst.
· This function calls net/xfrm/xfrm_policy.c/xfrm_lookup() to access the SPD via xfrm_sk_policy_lookup(). This calls xfrm_resolve_and_create_bundle.
· xfrm_lookup finds a match in the SPD and then checks the action in the xfrm_policy entry. If the action is XFRM_POLICY_BLOCK, it returns an error. If the action is XFRM_POLICY_ALLOW, then a "bundle" of destinations is accumulated and returned.
skb->refdst is then set to a dst_entry.
ip_local_out() is finally called after the IP headers are added, which in turn calls dst_output().
This calls all transformation routines via output routines pointing to esp6_output() etc., and uses the xfrm_state, which is also a pointer in the dst struct.
The last dst_entry of the destination stack is the final routing entry, with its output function pointer (->output) set to ip_output. The other dst_entries on top (up to nx transformations, see the for-loop in __xfrm4_bundle_create) have it set to xfrm4_output, xfrm6_output etc.
ip_output() finally calls NF_HOOK_COND with NF_INET_POST_ROUTING.

The return values from the functions are as follows:
· xfrm4_output calls the appropriate transformation function (e.g. esp_output). esp_output returns 0 on success; any other value indicates an error.
· xfrm4_output itself returns NET_XMIT_BYPASS in case of no error. If xfrm4_output returned 0, the for-loop in dst_output would exit immediately with a return statement after the output function was called.
· So NET_XMIT_BYPASS keeps the destination stack running until the final call to ip_output makes it return to ip_queue_xmit, which returns with the result. The following diagrams are taken from [2].
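In newer kernels the per-state loop is in xfrm_output_one(), which calls the proto->output routines, e.g. esp_output(); dst_output() is called once the loop is done. In simplified form, with error checks omitted:
do {
               err = x->outer_mode->output(x, skb);   // build the outer header for this mode
               err = x->repl->overflow(x, skb);       // replay (sequence number) overflow check
               skb_dst_force(skb);
               err = x->type->output(x, skb);         // e.g. esp_output()
               dst = skb_dst_pop(skb);                // move to the child dst_entry
               if (!dst)
                              break;                  // no route left: error path
               skb_dst_set(skb, dst);
               x = dst->xfrm;
} while (x);                                          // simplified: the kernel also stops at a tunnel-mode state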


IPSec Input Processing

Iked receives an IKE_SA_INIT message (notes from [2]).
It searches for the remote by the peer’s IP address and replies to IKE_SA_INIT using the algorithms in the remote.
It receives IKE_AUTH from the peer.
It validates the peer using the information in the remote.
It searches for the selector by the Traffic Selector payload in the message.
It finds the selector and retrieves policy, ipsec and sa.
It processes the request and replies with IKE_AUTH.

Details: The whole de-encapsulation is done in xfrm4_rcv_encap, except in case of nat-traversal, where udp_rcv comes into play. dst_input is not used for xfrm transformations. The main function is ip_local_deliver_finish.
Register IPSec Headers in IPv6 with inet6_protos[], ie. upper layer protocol handler array.

ESP Module
net/ipv4/esp4.c/esp4_init() gets called as part of module init. The init function calls
xfrm_register_type(&esp_type) and inet_add_protocol(&esp4_protocol)
· esp_type = { esp_init_state, esp_destroy, esp_input, esp_output, ... }
· esp4_protocol = { xfrm4_rcv, esp4_err }
xfrm4_rcv ultimately calls xfrm_input.

The packet enters as ESP or AH, and thus the protocol handler is xfrm4_rcv, which is called via ipprot->handler(skb).
xfrm4_rcv calls xfrm4_rcv_spi, which calls xfrm4_rcv_encap, finally calling xfrm_input(), which does all de-encapsulating transformations in a do-while loop, calling x->type->input. The xfrm_state{} pointer is kept in sec_path{} in sk_buff{}.
After processing IPsec, the kernel calls xfrm_policy_check() at the entrance of the upper layer process. When all transformation processing is done, the IP packet is put back into the CPU-specific packet queue by netif_rx.

NAT Traversal (with UDP)

INPUT:
- udp_rcv
|
--- udp_queue_rcv_skb
|
--- udp_encap_rcv (strips off the UDP header)
|
--- xfrm4_rcv_encap (usual procedure)
OUTPUT: - Done in esp_output as part of a stacked destination (e.g. net/ipv4/esp4.c)

RACOON2


The following code summary for the IKEv2 implementation on Linux is from the Racoon2 project [1]. Racoon2 is the successor of Racoon, which was developed by the KAME project. It supports the IKEv1, IKEv2, and KINK protocols and works on FreeBSD, NetBSD, Linux, and Mac OS X. Racoon2 is provided under a BSD-style license. The control protocol of IPSec used in Racoon2 is IKEv2, which is specified in RFC 5996 [9].

Racoon2 implements IKEv2, and its aim is to establish and maintain shared security parameters and authenticated keys between two IPSec end points.

The main()

iked/main.c/main()
    parse options
    iked/crypto_openssl.c/eay_init()
        ERR_load_crypto_strings()
        OpenSSL_add_all_algorithms()
        ENGINE_load_builtin_engines()
        ENGINE_register_all_complete()
    lib/cfsetup.c/rcf_read()
        reads config file and creates data structures
    sched_init()
    iked/ike_pfkey.c/sadb_init()
        Initialize sadb_request_list_head
        get pfkey_socket value and set ike_rcpfk_callback() pointer with call to lib/if_pfkeyv2.c/rcpfk_init()
            open PF_KEY i/f with rcpfk_open()
            Register with kernel, and set cb = ike_rcpfk_callback[] structure that has pointers to sadb_xxx_callback routines.
    iked/ike_conf.c/ike_conf_check_consistency()
        Data structure checks:
            Check rcf_default_head
            Check rcf_remote_head
            Check Selector section
    iked/ike_spmif.c/ike_spmif_init()
        spmif_socket = spmif_init()
            lib/if_spmd.c/open_spmif_local()
                fd = socket(PF_UNIX, SOCK_STREAM, 0)
            job_initqueue(&spmifh)
            login_spmif(fd)
    isakmp_init()
        oakley_dhinit()
            init DH parameters
    ikev2_init()
        ikev2_sa_init()
        ikev2_cookie_init()
        ikev2_periodic_task() – cookie_refresh()
    isakmp_open()
        open sockets to dst address
    iked_pidfile_create()
    iked_mainloop()



The interface to SPMD (SPMIF) uses Unix domain sockets. A structure spmifh of type spmif_handle is created and initialized as follows:
struct spmif_job {
    enum job_type {
        JOB_CANCELED, JOB_POLICY_ADD, JOB_POLICY_DELETE,
        JOB_FQDN_QUERY, JOB_SLID, JOB_MIGRATE
    } type;
    union {
        int (*generic)();
        int (*policy_add)(void *, int);
        int (*policy_delete)(void *, int);
        int (*fqdn_query)(void *, const char *);
        int (*slid)(void *, const char *);
        int (*migrate)(void *, int);
    } callback;
    void *tag;

    int fd;
    char buf[200];

    /* job queue */
    struct spmif_job *next;
};

struct spmif_handle {
    struct linereader *lr;
    struct spmif_job *job_head;
    struct spmif_job **job_tailp;
};
spmif_handle->job_head = NULL
spmif_handle->job_tailp = & spmif_handle->job_head
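
The job_head / job_tailp pair is the classic tail-pointer queue: job_tailp always holds the address of the pointer that must receive the next job, so appending needs no special case for an empty queue. The sketch below shows the idiom in isolation; struct job, struct job_queue and the queue_* functions are hypothetical names for this example and are not the Racoon2 routines.

```c
#include <assert.h>
#include <stddef.h>

struct job {
    int type;
    struct job *next;
};

struct job_queue {
    struct job *head;
    struct job **tailp; /* address of the pointer that receives the next job */
};

static void queue_init(struct job_queue *q)
{
    q->head = NULL;
    q->tailp = &q->head;
}

/* Append in O(1): works for both the empty and the non-empty queue. */
static void queue_push(struct job_queue *q, struct job *j)
{
    assert(q != NULL && j != NULL);
    j->next = NULL;
    *q->tailp = j;        /* links into head when empty, else into last->next */
    q->tailp = &j->next;
}

/* Remove and return the oldest job, or NULL if the queue is empty. */
static struct job *queue_pop(struct job_queue *q)
{
    struct job *j = q->head;

    if (j == NULL)
        return NULL;
    q->head = j->next;
    if (q->head == NULL)
        q->tailp = &q->head; /* queue drained: reset the tail pointer */
    return j;
}
```

Jobs come out in the order they were posted, which matches a request/reply protocol where replies arrive in request order.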

Ikev2 Main Loop

This is the main loop of the IKEv2 daemon. It waits on “select” and executes the FSM based on packets coming in from
· SPMD,
· PF_KEY and
· Network.
iked/main.c/iked_mainloop()
    FD_ZERO(&fdset)
    isakmp_fdset
    Add spmif_fd to FDSET
        spmif_fd = ike_spmif_socket()
    select()
    check data on sadb_socket, if true, call sadb_poll()
        iked/ikepfkey.c/rcpfk_handler()
            Recv PFKEY msg, using rcpfk_recv()
            Call handler – rcpfk_msg[sadb_msg_type].recvfunc()
            e.g. for SADB_ACQUIRE msg from kernel, it calls rcpfk_recv_acquire()
    check data on spmif_fd, if true, call iked/ike_spmif.c/ike_spmif_poll()
        spmif_handler(spmif_socket)
            Call handlers that execute the job, such as JOB_POLICY_ADD/DELETE, JOB_FQDN_QUERY etc.
    For all other FDs
        isakmp_handler(isakmp_sock)
            This is the only point of entry for IKE packets, i.e. from isakmp_handler()
            Process ikev2 pkts by calling ikev2_input()
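
The shape of such a loop iteration can be sketched with plain POSIX select(). This is an illustration under assumptions, not the iked code: struct fd_source, poll_sources_once() and the sample handler count_one_byte() are hypothetical names, and the real daemon also runs its scheduler between iterations.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

typedef void (*fd_handler)(int fd);

/* One input source: a descriptor and the routine that services it. */
struct fd_source {
    int fd;
    fd_handler handler;
};

static int bytes_seen; /* demo counter updated by the sample handler */

/* Sample handler: consume one byte so the descriptor stops being readable. */
static void count_one_byte(int fd)
{
    char c;

    if (read(fd, &c, 1) == 1)
        bytes_seen++;
}

/* Wait up to timeout_ms for any source to become readable and run the
 * handler of each ready one. Returns the number of handlers run, or -1. */
static int poll_sources_once(const struct fd_source *src, size_t n, int timeout_ms)
{
    fd_set fds;
    int maxfd = -1;
    struct timeval tv = { timeout_ms / 1000, (timeout_ms % 1000) * 1000 };

    assert(src != NULL);
    FD_ZERO(&fds);
    for (size_t i = 0; i < n; i++) {
        FD_SET(src[i].fd, &fds);
        if (src[i].fd > maxfd)
            maxfd = src[i].fd;
    }
    if (select(maxfd + 1, &fds, NULL, NULL, &tv) < 0)
        return -1;

    int handled = 0;
    for (size_t i = 0; i < n; i++) {
        if (FD_ISSET(src[i].fd, &fds)) {
            src[i].handler(src[i].fd);
            handled++;
        }
    }
    return handled;
}
```

A daemon would call poll_sources_once() in a loop with one entry each for the PF_KEY socket, the SPMD socket and the IKE UDP sockets.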


The following sections provide detail on the 3 main handlers of the IKEv2 daemon, i.e.
1) sadb_poll
2) ike_spmif_poll
3) isakmp_handler

sadb_poll()

lib/if_pfkeyv2.c/sadb_poll() - Has callbacks for each message from Kernel on PF_KEY socket. This calls functions in rcpfk_msg[] first and then this in turn calls functions in ike_rcpfk_callback[].
static struct pfkey_msgtype {
char *name;
int (*recvfunc) (caddr_t *, struct rcpfk_msg *);
} rcpfk_msg[] = {
{ "GETSPI", rcpfk_recv_getspi, },
{ "UPDATE", rcpfk_recv_update, },
{ "ADD", rcpfk_recv_add, },
{ "DELETE", rcpfk_recv_delete, },
{ "GET", rcpfk_recv_get, },
{ "ACQUIRE", rcpfk_recv_acquire, },
{ "REGISTER", rcpfk_recv_register, },
{ "EXPIRE", rcpfk_recv_expire, },
{ "X_SPDUPDATE", rcpfk_recv_spdupdate, },
{ "X_SPDADD", rcpfk_recv_spdadd, },
{ "X_SPDDELETE", rcpfk_recv_spddelete, },
{ "X_SPDGET", rcpfk_recv_spdget, },
{ "X_SPDDUMP", rcpfk_recv_spddump, },
{ "X_SPDEXPIRE", rcpfk_recv_spdexpire, },
{ "X_SPDDELETE2", rcpfk_recv_spddelete2, },
{ "X_MIGRATE", rcpfk_recv_migrate, }
};

iked/ike_pfkey.c
static struct rcpfk_cb ike_rcpfk_callback = {
sadb_getspi_callback,
sadb_update_callback,
sadb_expire_callback,
sadb_acquire_callback,
sadb_delete_callback,
sadb_get_callback,
};

ike_spmif_poll()

iked/ike_spmif.c/spmif_poll() has the following switch() statements.
case JOB_POLICY_ADD: parserep_policy_add(job, lr->lines, nline);
case JOB_POLICY_DELETE: parserep_policy_delete(job, lr->lines, nline);
case JOB_FQDN_QUERY: parserep_fqdn_query(job, lr->lines, nline);
case JOB_SLID: parserep_slid(job, lr->lines, nline);
case JOB_MIGRATE: parserep_migrate(job, lr->lines, nline);

 

isakmp_handler()

iked/isakmp.c/isakmp_handler() calls iked/ikev2.c/ikev2_input(), which implements the IKEv2 state machine.
IKEV2INPUT ikev2_input_dispatch[] = {
responder_state0_recv, /* Responder idling state */
initiator_ike_sa_init_recv, /* Initiator IKE_SA_INIT sent */
responder_ike_sa_auth_recv0, /* Responder IKE_SA_INIT sent */
initiator_ike_sa_auth_recv0, /* Initiator IKE_SA_AUTH sent */
responder_ike_sa_auth_recv, /* Responder IKE_SA_AUTH received */
initiator_ike_sa_auth_recv, /* Initiator IKE_SA_AUTH received */
ikev2_established_recv, /* should be CREATE_CHILD_SA or INFORMATIONAL */
ikev2_dying_recv, /* same as established, except no initiating */
ikev2_dead_recv,
};
iked/ikev2_impl.h
enum ikev2_state {
IKEV2_STATE_IDLING = 0,
IKEV2_STATE_INI_IKE_SA_INIT_SENT = 1,
IKEV2_STATE_RES_IKE_SA_INIT_SENT = 2,
IKEV2_STATE_INI_IKE_AUTH_SENT = 3,
IKEV2_STATE_RES_IKE_AUTH_RCVD = 4,
IKEV2_STATE_INI_IKE_AUTH_RCVD = 5,
IKEV2_STATE_ESTABLISHED = 6,
IKEV2_STATE_DYING = 7,
IKEV2_STATE_DEAD = 8
/* IKEV2_STATE_EAP = 9 */
/* IKEV2_STATE_ESTABLISHED_WAIT_INITIATOR, */
};
enum ikev2_child_state {
IKEV2_CHILD_STATE_IDLING = 0,
IKEV2_CHILD_STATE_GETSPI,
IKEV2_CHILD_STATE_GETSPI_DONE,
IKEV2_CHILD_STATE_WAIT_RESPONSE,
IKEV2_CHILD_STATE_MATURE,
IKEV2_CHILD_STATE_EXPIRED, /* XXX STATE_DONE */
IKEV2_CHILD_STATE_REQUEST_PENDING,
IKEV2_CHILD_STATE_REQUEST_SENT,
IKEV2_CHILD_STATE_NUM,
IKEV2_CHILD_STATE_INVALID /* to indicate invalid state */
};
iked/ikev2.c/ikev2_input()
    ikev2_check_payloads()
    get message_id from packet
    ikev2_find_sa()
    if ike_sa NOT found
        ikev2_conf_find(remote)
        if no config, use default remote config
        create ikev2 as responder
        ikev2_create_sa
    call pkt processing routine
        (*ikev2_input_dispatch[ike_sa->state])(ike_sa, pkt, remote, local)
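
Indexing an array of function pointers by the current state is a compact way to write a state machine. The sketch below shows the technique with three made-up states; enum sa_state, struct sa_model and the *_recv functions are hypothetical and much simpler than the real IKEv2 states and handlers.

```c
#include <assert.h>
#include <stddef.h>

enum sa_state { ST_IDLING = 0, ST_INIT_SENT, ST_ESTABLISHED, ST_NUM };

struct sa_model {
    enum sa_state state;
    int packets;        /* number of packets processed so far */
};

typedef int (*state_input_fn)(struct sa_model *sa, const char *pkt);

static int idling_recv(struct sa_model *sa, const char *pkt)
{
    (void)pkt;
    sa->packets++;
    sa->state = ST_INIT_SENT;   /* first message handled, wait for the next */
    return 0;
}

static int init_sent_recv(struct sa_model *sa, const char *pkt)
{
    (void)pkt;
    sa->packets++;
    sa->state = ST_ESTABLISHED;
    return 0;
}

static int established_recv(struct sa_model *sa, const char *pkt)
{
    (void)pkt;
    sa->packets++;              /* stays established */
    return 0;
}

/* One handler per state, in the same order as enum sa_state. */
static const state_input_fn input_dispatch[ST_NUM] = {
    idling_recv, init_sent_recv, established_recv,
};

/* Route a packet to the handler of the SA's current state. */
static int sa_input(struct sa_model *sa, const char *pkt)
{
    assert(sa != NULL);
    if ((unsigned)sa->state >= ST_NUM)
        return -1;              /* never index past the table */
    return input_dispatch[sa->state](sa, pkt);
}
```

The order of the table entries must match the numeric values of the enum, which is why the state enum above assigns explicit numbers to each state.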








Configuration

All configuration for the racoon2 daemons is in the racoon2.conf file.
The following diagram, taken from [3], shows the relationships between the directives.


     +---(selector_index)--- remote
     |                         ^
     |                         |
     |                  (remote_index)                        +-(sa_index)-> sa
     v                         |                              |
selector -+                    |     +-(ipsec_index)-> ipsec -+-(sa_index)-> sa
          |                    |     |
selector -+-(policy_index)-> policy -+-(ipsec_index)-> ipsec ---(sa_index)-> sa
          |                          |
selector -+                          +-(ipsec_index)-> ipsec ...


struct rcf_selector {
rc_vchar_t *sl_index;
int order; rc_type direction;
struct rc_addrlist *src, *dst ;
int upper_layer_protocol;
int next_header_including;
rc_vchar_t *tagged;
int reqid;
struct rcf_policy *pl;
struct rcf_selector *next;
};
struct rcf_default {
struct rcf_remote *remote;
struct rcf_policy *policy;
struct rcf_ipsec *ipsec;
struct rcf_sa *sa;
};
SPMD manages the SPD, while IKED interacts with the SAD. SPMD reads the configuration file for selector, policy and ipsec parameters and installs entries into the SPD using PF_KEY. The kernel returns an SPID (IPSec Policy ID) to SPMD, which then waits for queries from IKED. IKED waits for a PF_KEY SADB_ACQUIRE message from the kernel or a UDP IKE_SA_INIT message from the peer. The SADB_ACQUIRE message contains the SPID. IKED queries SPMD with the SPID to get the selector ID and then uses this ID to retrieve the other parameters. IKED can also add a policy to the SPD in the kernel through SPMD, via spmif_post_policy_add().
Basic configuration for Tunnel Mode
# cd /usr/local/racoon2/etc/racoon2
# cp vals.conf.sample vals.conf
Change the local and remote IP addresses for Transport/Tunnel mode

Generate pre-shared key using pskgen
# cd /usr/local/racoon2/etc/racoon2/psk
# pskgen -r -o test.psk
# od test.psk
0000000 106307 153425 117435 137575 023026 044255 150547 166447
0000020

# cp racoon2.conf.sample racoon2.conf
Uncomment the Tunnel mode IKEv2 or IKEv1 (initiator and responder)
include "/usr/local/racoon2/etc/racoon2/tunnel_ike.conf";

# cp default.conf.sample default.conf

# cp tunnel_ike.conf.sample tunnel_ike.conf

Start spmd and iked
/usr/local/racoon2/etc/racoon2/init.d/spmd start
/usr/local/racoon2/etc/racoon2/init.d/iked -ddd -F

1st Pkt and SADB_ACQUIRE from NETKEY to Racoon2

From [2]
The kernel sends a SADB_ACQUIRE message including IPsec policy ID to the key exchange daemons via PF_KEY socket.
Iked receives the message and gets the IPsec policy ID from the sadb_x_policy_id field.
Iked requests from spmd the identifier of the selector corresponding to the IPsec policy ID.
Iked receives the selector identifier from spmd.
Iked searches selector by the identifier and retrieves policy, remote, ipsec, sa.
Iked validates the key exchange protocol in the remote.
Iked processes the SADB_ACQUIRE.

In more detail:
There is no entry in the SAD
The kernel broadcasts SADB_ACQUIRE to the PF_KEY socket listeners in user space.
[ Note: The IKEv2 daemon, via sadb_init() in the main function, had opened a PF_KEY socket i/f with rcpfk_open(), registered RCT_SATYPE_AH/ESP/IPCOMP with the kernel and set up the callback routines in the “cb” variable to the sadb_xxx_callback() routines.
These routines are called from the rcpfk_msg[sadb_msg_type].recvfunc() functions when the IKEv2 daemon receives an SADB message from the kernel ]
Data on the PF_KEY socket wakes up “select”, which in turn calls sadb_poll() to get the packet from PF_KEY via rcpfk_recv().
The routine rcpfk_msg[sadb_msg_type].recvfunc() is called, which for an SADB_ACQUIRE message from the kernel is rcpfk_recv_acquire(). This in turn calls the function from the “cb” called sadb_acquire_callback().
sadb_acquire_callback() calls iked/isakmp.c/isakmp_initiate(), which initiates the negotiation. If it fails, it calls cb->acquire_error() with the error code.
isakmp_initiate() is called with sadb_initiator_request_method, whose initialization is shown further below.

What the IKEv2 daemon needs now is the selector_index, so that it can make an IKE_SA_INIT packet.
For this, isakmp_initiate() calls racoon_malloc() and sets the callback method to sadb_initiator_request_method.
It then calls iked/ike_spmif.c/ike_spmif_post_slid(), which calls lib/if_spmd.c/spmif_post_slid() as follows:

job->callback.slid = isakmp_initiate_cont()
job->callback_req=
job->tag= * isakmp_acquire_request
isakmp_acquire_request->callback_method = sadb_initiator_request_method
/* sadb_initiator_request_method used in response to SADB_ACQUIRE */
struct sadb_request_method sadb_initiator_request_method = {
sadb_getspi,
sadb_acquire_error,
sadb_update,
sadb_add,
sadb_delete,
sadb_get,
};
job->fd=Unix Socket
job->type=JOB_SLID
spmif_post_slid() posts a job of type JOB_SLID to the Unix domain socket.
The return value is sent over the same Unix domain socket back to iked_mainloop(), where select wakes up, ike_spmif_poll() is called, and it calls spmif_handler(). This finally calls lib/if_spmd.c/parserep_slid(), where the slid is extracted and the callback function, i.e. isakmp_initiate_cont(), is called with the slid. With the selector pointer, we get the policy and remote structures and finally call ikev2_initiate(req, policy, selector, rm_info) to trigger the FSM.




Backup

BEET [7]

+Sending (inner IPv4, outer IPv4)(4-4)
+=====================================
+inet_sendmsg
+ raw_sendmsg
+ ip_route_output_flow
+ __ip_route_output_key
+ xfrm_lookup
+ flow_cache_lookup
+ xfrm_policy_lookup // lookup IPsec policy
+ xfrm_find_bundle // lookup IPsec SA
+ __xfrm_selector_match
+ xfrm_tmpl_resolve // only if bundle was not found!
+ xfrm_state_find
+ xfrm_bundle_create // create output (dst) chain if bundle was not found
+ __xfrm4_bundle_create
+ ip_push_pending_frames
+ dst_output(skb) //this calls skb->dst->output();
+ xfrm4_output //This finally returns 4 (NET_XMIT_BYPASS) to dst_output();
+ xfrm4_encap
+ esp_output
+ xfrm_beet_output //change the ip header to outer.
+ dst_output(skb)
+ ip_output
+ ip_finish_output Or ip_fragment //depending on size of packet
+ // Returns 0 to dst_output(); which makes dst_output to come out of infinite loop.
+ dev_queue_xmit
+
+
+Receiving (inner IPv4, outer IPv4)(4-4)
+===========
+
+net_rx_action()
+e1000_clean() // dependent on network hardware
+e1000_clean_rx_irq()
+netif_receive_skb()
+ deliver_skb()
+ ret = pt_prev->func(skb, skb->dev, pt_prev);
+ ip_rcv()
+ nf_hook()
+ ip_rcv_finish()
+ ip_route_input()
+ dst_input()->ip_forward() or ip_input()
+ ip_input // remove the IPv4 header
+ ip_input_finish
+ ret = ipprot->handler(&skb, &nhoff);
+ xfrm4_rcv()
+ xfrm4_rcv_encap()
+ xfrm4_parse_spi()
+ xfrm_state_lookup() // lookup IPsec SA
+ xfrm_beet_input(skb, x) //To change to inner IP header.
+ nexthdr = x->type->input(x, xfrm.decap, skb) // == esp_input
+ esp_input() // process ESP based on inner address
+ returns 0 ;
+ /* beet handling in xfrm_rcv_spi */
+ netif_rx()
+ // ip_input_finish returns 0
+ // netif_receive_skb returns 0
+netif_receive_skb //Now we have an IPv4 packet. So the input flow is for v4 packet.
+ deliver_skb()
+ ret = pt_prev->func(skb, skb->dev, pt_prev);
+ ip_rcv()
+ nf_hook() //This calls ip_rcv_finish(skb)
+ ip_rcv_finish() //Here the skb->dst is NULL and so is filled for the input side.
+ ip6_route_input()
+ dst_input()->ip_forward() or ip_input()
+ ip_input // remove the IPv4 header
+ ip_input_finish
+ ...
+ ...
+ ...
+

Output Pkt ping from IPSec Tunnel Node (showing debugs till SADB_ACQUIRE)

[root@localhost racoon2]# more vals.conf
### Tunnel Mode Settings ###
# Your Network Address or Host Address (host-to-host tunnel mode)
MY_NET "0.0.0.0/0";
# Peer's Network Address or Host Address (host-to-host tunnel mode)
PEERS_NET "0.0.0.0/0";

# Your SGW Address
MY_GWADDRESS "192.168.2.9";

# Peer's SGW Address
# You don't need to specify if you're IKE responder
# talking to an IKE initiator behind NAT.
PEERS_GWADDRESS "192.168.2.10";

[root@localhost racoon2]# more racoon2.conf
interface
{
ike {
192.168.2.9 port 500;


Computer-1 (IP address 192.168.2.9)
#ping 192.168.2.1

[root@localhost racoon2]# ip xfrm policy list
src 0.0.0.0/0 dst 0.0.0.0/0
dir 4 priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
dir 3 priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
dir fwd priority 0 ptype main
tmpl src 192.168.2.10 dst 192.168.2.9
proto esp reqid 0 mode tunnel
src 0.0.0.0/0 dst 0.0.0.0/0
dir in priority 0 ptype main
tmpl src 192.168.2.10 dst 192.168.2.9
proto esp reqid 0 mode tunnel
src 0.0.0.0/0 dst 0.0.0.0/0
dir out priority 0 ptype main
tmpl src 192.168.2.9 dst 192.168.2.10
proto esp reqid 0 mode tunnel

[root@localhost init.d]# iked -ddd -F
2013-02-18 01:07:43 [INFO]: main.c:433:main(): starting iked for racoon2 20100526a
2013-02-18 01:07:49 [DEBUG]: ike_pfkey.c:621:sadb_acquire_callback(): sadb_acquire_callback: seq=11 satype=96 sa_src=192.168.2.9[0] sa_dst=192.168.2.10[0] samode=82 selid=1921
2013-02-18 01:07:49 [DEBUG]: if_spmd.c:829: SLID ok: 250 ike_tun_sel_out
2013-02-18 01:07:49 [DEBUG]: ikev2.c:756:ikev2_initiate(): creating new ike_sa
2013-02-18 01:07:49 [DEBUG]: ike_sa.c:412:ikev2_allocate_sa(): ikev2_create_sa((nil), 192.168.2.9[500], 192.168.2.10[500], 0xa028660)
2013-02-18 01:07:49 [DEBUG]: ike_sa.c:415:ikev2_allocate_sa(): sa: 0xa029158
2013-02-18 01:07:49 [DEBUG]: ikev2.c:798:ikev2_initiate(): child_sa: 0xa0292f0
2013-02-18 01:07:49 [DEBUG]: ikev2_child.c:139:ikev2_child_state_set(): child_sa 0xa0292f0 state IDLING -> GETSPI
2013-02-18 01:07:49 [DEBUG]: ike_pfkey.c:269:sadb_getspi(): sadb_getspi: seq=11, satype=96
2013-02-18 01:07:49 [DEBUG]: ike_pfkey.c:459:sadb_getspi_callback(): sadb_getspi_callback: seq=11, spi=0x0e707d78, satype=96, sa_src=192.168.2.10[0], sa_dst=192.168.2.9[0]
2013-02-18 01:07:49 [DEBUG]: ikev2_child.c:139:ikev2_child_state_set(): child_sa 0xa0292f0 state GETSPI -> GETSPI_DONE
2013-02-18 01:07:49 [DEBUG]: ikev2_proposal.c:564:ikev2_pack_proposal_sub(): ikev2_pack_proposal_sub:
2013-02-18 01:07:49 [DEBUG]: ikev2_proposal.c:572:ikev2_pack_proposal_sub(): proposal #1:
2013-02-18 01:07:49 [DEBUG]: ikev2_proposal.c:564:ikev2_pack_proposal_sub(): ikev2_pack_proposal_sub:
2013-02-18 01:07:49 [DEBUG]: ikev2_proposal.c:572:ikev2_pack_proposal_sub(): proposal #1:
2013-02-18 01:07:49 [DEBUG]: ikev2_proposal.c:587:ikev2_pack_proposal_sub(): protocol 1 spi_size 0
2013-02-18 01:07:50 [DEBUG]: dh.c:227:oakley_dh_generate(): compute DH's private.
2013-02-18 01:07:50 [DEBUG]: dh.c:227:oakley_dh_generate():
7a5cd06a 01571ddd 9a95360e 7148fbf2 6807b1bf 43be3764 d347ef05 dccf87f3
60b18f22 4c37fb8a e273ed35 09e5c283 5c7a9792 1fff3eb4 2bceb986 ab478a33
d903cba5 3e0e9fec fd858ff8 ffeee4eb 3b4192e7 2f76b797 4f2be931 d731efd4
6071745d 2a80ccb4 a8c66aa1 76512e85 b7ef4594 a9f6bec9 bc302d36 f7c61261
bc9deaa6 50b6b2d6 f4f9513a 63428d61 d4faed4a e391a162 59e1a61f 5a95e487
03a9f725 b1ce18c6 42fd3492 d713675b 368dd958 ec07f64d 91e8e3f3 93184343
99d46a8c dcf05495 28431137 1845e0e4 8d5ccef6 cb585263 e923ab3d 96fd6e93
2caa4007 f7a9256e f3bf8776 bd380cff e112f784 ba05ec71 21119be0 10d467aa
2013-02-18 01:07:50 [DEBUG]: dh.c:227:oakley_dh_generate(): compute DH's public.
2013-02-18 01:07:50 [DEBUG]: dh.c:227:oakley_dh_generate():
6f3e65ea af4d1d39 96482cd8 26d54f84 61e6e303 c3b44645 d6136174 7ac5a5be
a7aa129e 752ef1b1 2714229a 8b1a16d3 06655816 616d9d65 3007a32d da61bd83
f0bdfc70 c7a8065f 8e2bd1f0 dda24b70 d86546c4 b5445f87 c16a2d7e 9c51e219
3e043768 75990287 9767232b 9d9a2ca3 a34c8fdf 45497ce9 e55202eb dd1e9a13
13d94c7e 523d013e 94143312 8b85b6d9 79e610bb 94cb45b5 16b08b8c 27e36099
a8bf85cd 95d50730 c6b0b32f 77cd5ae2 55f65869 833c5cfb e44cd06c e8c27d95
f88b9a1c d84b7562 c9c0fdbc 7b764d13 f17aa829 328e1089 6bf85340 380e16b4
3381dd53 0b063de7 e1f75936 e226b291 48b57d4b 0c72b742 2dd7aca7 9766f80c
2013-02-18 01:07:50 [DEBUG]: ikev2_payload.c:719:ikev2_notify_payload(): ikev2_notify_payload(0, (nil), 0, 16388, 0xa0281e8, 20)
2013-02-18 01:07:50 [DEBUG]: ikev2_payload.c:719:ikev2_notify_payload(): ikev2_notify_payload(0, (nil), 0, 16389, 0xa027fe0, 20)
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:162:ikev2_packet_construct(): ikev2_packet_construct(34, 0x8, 0x0, 0xa029158, [0xa0240e8, 5])
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 0 type 33 (SA) data 0xa023e48 len 80
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 1 type 34 (KE) data 0xa0244b8 len 260
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 2 type 40 (NONCE) data 0xa023fb8 len 32
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 3 type 41 (NOTIFY) data 0xa0247a0 len 24
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 4 type 41 (NOTIFY) data 0xa029128 len 24
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:295:ikev2_packet_construct(): result 0xa024410
2013-02-18 01:07:50 [DEBUG]: ikev2.c:562:ikev2_transmit(): ikev2_transmit(0xa029158, 0xa024410) len 468
2013-02-18 01:07:50 [DEBUG]: isakmp.c:1678:isakmp_transmit_noretry(): transmit 0xa029218
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:324:sendfromto(): sockname 192.168.2.9[500]
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:326:sendfromto(): send packet from 192.168.2.9[500]
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:328:sendfromto(): send packet to 192.168.2.10[500]
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:508:sendfromto(): 1 times of 468 bytes message will be sent to 192.168.2.10[500]
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:512:sendfromto():
f7b1ad69 396db4ca 00000000 00000000 21202208 00000000 000001d4 22000054
00000050 01010008 0300000c 0100000c 800e00c0 0300000c 0100000c 800e0080
03000008 01000003 03000008 02000001 03000008 02000002 03000008 02000004
03000008 03000002 00000008 0400000e 28000108 000e0000 6f3e65ea af4d1d39
96482cd8 26d54f84 61e6e303 c3b44645 d6136174 7ac5a5be a7aa129e 752ef1b1
2714229a 8b1a16d3 06655816 616d9d65 3007a32d da61bd83 f0bdfc70 c7a8065f
8e2bd1f0 dda24b70 d86546c4 b5445f87 c16a2d7e 9c51e219 3e043768 75990287
9767232b 9d9a2ca3 a34c8fdf 45497ce9 e55202eb dd1e9a13 13d94c7e 523d013e
94143312 8b85b6d9 79e610bb 94cb45b5 16b08b8c 27e36099 a8bf85cd 95d50730
c6b0b32f 77cd5ae2 55f65869 833c5cfb e44cd06c e8c27d95 f88b9a1c d84b7562
c9c0fdbc 7b764d13 f17aa829 328e1089 6bf85340 380e16b4 3381dd53 0b063de7
e1f75936 e226b291 48b57d4b 0c72b742 2dd7aca7 9766f80c 29000024 8a955a65
45c55381 de1f2719 d7dd07e8 3f72387b 36c8901c 5a993376 e818ac07 2900001c
00004004 b21aadcd b80ecf64 eb75bb09 d9a325fd bdbc18fa 0000001c 00004005
2960ca7b 9506f1be 125f1113 6a34609b a02d7226
2013-02-18 01:07:50 [DEBUG]: isakmp.c:1656:isakmp_transmit(): sched 0xa023ff0
2013-02-18 01:07:52 [DEBUG]: ike_sa.c:225:ikev2_sa_periodic_task(): ike_sa: 0xa029158 state 1
2013-02-18 01:07:52 [DEBUG]: ike_sa.c:230:ikev2_sa_periodic_task(): child_sa: 0xa0292f0 state 2
2013-02-18 01:07:55 [DEBUG]: ike_sa.c:225:ikev2_sa_periodic_task(): ike_sa: 0xa029158 state 1
2013-02-18 01:07:55 [DEBUG]: ike_sa.c:230:ikev2_sa_periodic_task(): child_sa: 0xa0292f0 state 2
2013-02-18 01:07:58 [DEBUG]: ike_sa.c:225:ikev2_sa_periodic_task(): ike_sa: 0xa029158 state 1
2013-02-18 01:07:58 [DEBUG]: ike_sa.c:230:ikev2_sa_periodic_task(): child_sa: 0xa0292f0 state 2
2013-02-18 01:08:00 [DEBUG]: isakmp.c:1752:isakmp_retransmit(): retransmit 0xa029218
2013-02-18 01:08:00 [DEBUG]: sockmisc.c:328:sendfromto(): send packet to 192.168.2.10[500]
2013-02-18 01:08:00 [DEBUG]: sockmisc.c:508:sendfromto(): 1 times of 468 bytes message will be sent to 192.168.2.10[500]
2013-02-18 01:08:00 [DEBUG]: sockmisc.c:512:sendfromto():
f7b1ad69 396db4ca 00000000 00000000 21202208 00000000 000001d4 22000054
00000050 01010008 0300000c 0100000c 800e00c0 0300000c 0100000c 800e0080
03000008 01000003 03000008 02000001 03000008 02000002 03000008 02000004
03000008 03000002 00000008 0400000e 28000108 000e0000 6f3e65ea af4d1d39
96482cd8 26d54f84 61e6e303 c3b44645 d6136174 7ac5a5be a7aa129e 752ef1b1
2714229a 8b1a16d3 06655816 616d9d65 3007a32d da61bd83 f0bdfc70 c7a8065f
8e2bd1f0 dda24b70 d86546c4 b5445f87 c16a2d7e 9c51e219 3e043768 75990287
9767232b 9d9a2ca3 a34c8fdf 45497ce9 e55202eb dd1e9a13 13d94c7e 523d013e
94143312 8b85b6d9 79e610bb 94cb45b5 16b08b8c 27e36099 a8bf85cd 95d50730
c6b0b32f 77cd5ae2 55f65869 833c5cfb e44cd06c e8c27d95 f88b9a1c d84b7562
c9c0fdbc 7b764d13 f17aa829 328e1089 6bf85340 380e16b4 3381dd53 0b063de7
e1f75936 e226b291 48b57d4b 0c72b742 2dd7aca7 9766f80c 29000024 8a955a65
45c55381 de1f2719 d7dd07e8 3f72387b 36c8901c 5a993376 e818ac07 2900001c
00004004 b21aadcd b80ecf64 eb75bb09 d9a325fd bdbc18fa 0000001c 00004005
2960ca7b 9506f1be 125f1113 6a34609b a02d7226
2013-02-18 01:08:01 [DEBUG]: ike_sa.c:225:ikev2_sa_periodic_task(): ike_sa: 0xa029158 state 1
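
The 468-byte message dumped by sendfromto() above can be decoded by hand to check it against the ikev2_packet_construct() lines. The following is a minimal Python sketch, not part of racoon2: the function names are made up for illustration, and it assumes the standard IKEv2 layout of a 28-byte header followed by payloads that each start with a 4-byte generic header (next payload, flags, length).

```python
import struct

# IKE_SA_INIT request, copied from the sendfromto() hex dump in the log above.
PACKET_HEX = """
f7b1ad69 396db4ca 00000000 00000000 21202208 00000000 000001d4 22000054
00000050 01010008 0300000c 0100000c 800e00c0 0300000c 0100000c 800e0080
03000008 01000003 03000008 02000001 03000008 02000002 03000008 02000004
03000008 03000002 00000008 0400000e 28000108 000e0000 6f3e65ea af4d1d39
96482cd8 26d54f84 61e6e303 c3b44645 d6136174 7ac5a5be a7aa129e 752ef1b1
2714229a 8b1a16d3 06655816 616d9d65 3007a32d da61bd83 f0bdfc70 c7a8065f
8e2bd1f0 dda24b70 d86546c4 b5445f87 c16a2d7e 9c51e219 3e043768 75990287
9767232b 9d9a2ca3 a34c8fdf 45497ce9 e55202eb dd1e9a13 13d94c7e 523d013e
94143312 8b85b6d9 79e610bb 94cb45b5 16b08b8c 27e36099 a8bf85cd 95d50730
c6b0b32f 77cd5ae2 55f65869 833c5cfb e44cd06c e8c27d95 f88b9a1c d84b7562
c9c0fdbc 7b764d13 f17aa829 328e1089 6bf85340 380e16b4 3381dd53 0b063de7
e1f75936 e226b291 48b57d4b 0c72b742 2dd7aca7 9766f80c 29000024 8a955a65
45c55381 de1f2719 d7dd07e8 3f72387b 36c8901c 5a993376 e818ac07 2900001c
00004004 b21aadcd b80ecf64 eb75bb09 d9a325fd bdbc18fa 0000001c 00004005
2960ca7b 9506f1be 125f1113 6a34609b a02d7226
"""

# Payload type numbers as labelled in the log (type 33 = SA, 34 = KE, ...).
PAYLOAD_NAMES = {33: "SA", 34: "KE", 40: "NONCE", 41: "NOTIFY"}

IKE_HEADER_LEN = 28      # fixed IKE header size
GENERIC_HEADER_LEN = 4   # next payload (1), flags (1), payload length (2)


def parse_ike_header(packet):
    """Unpack the fixed IKE header (hypothetical helper, for illustration)."""
    (init_spi, resp_spi, next_payload, version,
     exchange_type, flags, message_id, length) = struct.unpack(
        "!8s8sBBBBII", packet[:IKE_HEADER_LEN])
    return {
        "initiator_spi": init_spi.hex(),
        "responder_spi": resp_spi.hex(),
        "next_payload": next_payload,
        "version": (version >> 4, version & 0x0F),  # (major, minor)
        "exchange_type": exchange_type,
        "flags": flags,
        "message_id": message_id,
        "length": length,
    }


def walk_payloads(packet, first_payload):
    """Follow the next-payload chain; return (type, data length) pairs."""
    payloads = []
    offset, kind = IKE_HEADER_LEN, first_payload
    while kind != 0 and offset + GENERIC_HEADER_LEN <= len(packet):
        next_kind, _flags, length = struct.unpack(
            "!BBH", packet[offset:offset + GENERIC_HEADER_LEN])
        if length < GENERIC_HEADER_LEN:
            break  # malformed length field, stop instead of looping forever
        payloads.append((kind, length - GENERIC_HEADER_LEN))
        offset += length
        kind = next_kind
    return payloads


packet = bytes.fromhex("".join(PACKET_HEX.split()))
header = parse_ike_header(packet)
payloads = walk_payloads(packet, header["next_payload"])

print(header)
for kind, data_len in payloads:
    print(PAYLOAD_NAMES.get(kind, "?"), kind, data_len)
```

The decoded header carries exchange type 34, flags 0x08, message ID 0 and a total length of 468, matching the arguments logged for ikev2_packet_construct() and ikev2_transmit(). The payload walk yields SA (80), KE (260), NONCE (32) and two NOTIFY payloads (24 each), the same data lengths the log reports.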



[root@localhost ~]# ip xfrm state list
src 192.168.2.10 dst 192.168.2.9
    proto esp spi 0x0e707d78 reqid 0 mode tunnel
    replay-window 0
    sel src 192.168.2.10/32 dst 192.168.2.9/32
src 192.168.2.9 dst 192.168.2.10
    proto esp spi 0x00000000 reqid 0 mode tunnel
    replay-window 0
    sel src 192.168.2.9/32 dst 192.168.2.1/32 proto udp sport 34796 dport 1025
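
When comparing such listings across test runs, it can help to turn the text into records. The following Python sketch parses only the fields that appear in the output above; the function name is hypothetical, and the field layout is assumed from this sample rather than from the iproute2 sources. An SPI of 0 usually marks an entry that is still waiting for the key exchange to finish, which would fit the retransmissions in the log, but that reading is an assumption and not something the output itself states.

```python
# Sample text copied from the "ip xfrm state list" output above.
SAMPLE = """\
src 192.168.2.10 dst 192.168.2.9
proto esp spi 0x0e707d78 reqid 0 mode tunnel
replay-window 0
sel src 192.168.2.10/32 dst 192.168.2.9/32
src 192.168.2.9 dst 192.168.2.10
proto esp spi 0x00000000 reqid 0 mode tunnel
replay-window 0
sel src 192.168.2.9/32 dst 192.168.2.1/32 proto udp sport 34796 dport 1025
"""


def parse_xfrm_states(text):
    """Group the listing into one dict per state (illustrative helper)."""
    states = []
    for line in text.splitlines():
        words = line.split()  # also tolerates indented continuation lines
        if not words:
            continue
        if words[0] == "src":
            # A line starting with "src" opens a new state entry.
            states.append({"src": words[1], "dst": words[3]})
        elif words[0] == "proto" and states:
            # "proto esp spi 0x... reqid 0 mode tunnel" is key/value pairs.
            fields = dict(zip(words[0::2], words[1::2]))
            states[-1].update(
                proto=fields["proto"],
                spi=int(fields["spi"], 16),
                mode=fields["mode"],
            )
        elif words[0] == "sel" and states:
            states[-1]["selector"] = " ".join(words[1:])
    return states


states = parse_xfrm_states(SAMPLE)
for state in states:
    note = "no SPI assigned yet" if state["spi"] == 0 else "SPI assigned"
    print(state["src"], "->", state["dst"], hex(state["spi"]), note)
```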



References

[3] Linux Documentation: /usr/src/kernel/docs
[7] XFRM: BEET IPsec mode for Linux: http://lwn.net/Articles/144899/
[10] http://www.croz.net/eng/xfrm-programming/
[14] Removing IP Route Cache: http://vger.kernel.org/~davem/columbia2012.pdf
[15] OpenSSL web page: http://www.openssl.org/


-->