IPSec: NETKEY and Racoon2
Introduction
IPSec was developed by the IETF and specified in RFCs 2401-2412 in 1998. IPSec creates a secure Control Tunnel by using a handshake protocol called Internet Key Exchange (IKE), which authenticates the end points of the tunnel to each other. IKE in turn creates Data Tunnels that use symmetric encryption between the end points.
An IPSec implementation has 2 parts: the IKEv2 Control protocol, e.g. Racoon2, which resides in User space, and a Data Path, e.g. NETKEY, which resides in Kernel space and uses the keys exchanged by the Control protocol. Both of these parts are described in this article.
Racoon2 uses the OpenSSL library [15] for its cryptographic operations.
The iproute2 package provides the CLI commands “ip xfrm state” and “ip xfrm policy” to request detailed information about the IPsec SAs and policies installed in the kernel. The -s option displays additional information, such as the number of transmitted or invalid messages. The setkey command from the ipsec-tools package provides similar information.
NETKEY
The NETKEY Kernel stack is based on XFRM and the “Stackable Destination”. An important structure in IPSec processing in the kernel is dst_entry.
The data structure dst_entry stores the protocol-independent information of cached routes. The IP route table entry rtable is a wrapper for dst_entry; it stores protocol-dependent info and the result of a routing lookup. The sk_buff struct points to the dst_entry structure.
dst_entry is used to assign a destination to a packet, which could be an external host machine or an internal packet handler. For an external destination, the dst_entry structure includes the interface to the neighbour struct, IPSec transformers, etc. A neighbour struct represents a computer that is reachable via Layer-2.
· dst->neighbour->hh_output() is invoked for packets to a destination present in the layer-2 header cache
· dst->neighbour->output() is used for network devices without a layer-2 header cache
If the ARP entry is valid, these pointers normally point to dev_queue_xmit(); otherwise they point to neigh_resolve_output().
In the usual case, there is only one dst_entry for every skb. With IPSec, there is a linked list of dst_entries, chained through the “child” pointer. Only the last one is for routing; all other dst_entries are for IPSec transformers and have the DST_NOHASH flag set, which indicates that the entry is not part of the routing cache. Each dst_entry also has a “path” pointer that points to the last dst_entry.
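The chain of dst_entries can be pictured with a small user-space model. This is only a sketch: the toy_* types and helpers below are hypothetical stand-ins invented for illustration and do not match the kernel's struct layouts.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct toy_xfrm_state { int spi; };

struct toy_dst {
    struct toy_dst *child;        /* next entry in the stack, NULL at the end */
    struct toy_dst *path;         /* shortcut to the final routing entry */
    struct toy_xfrm_state *xfrm;  /* non-NULL for IPSec transformer entries */
};

/* Link n entries into a stack and point every path at the last one. */
static void toy_dst_link(struct toy_dst *entries, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        entries[i].child = (i + 1 < n) ? &entries[i + 1] : NULL;
        entries[i].path = &entries[n - 1];
    }
}

/* Walk the child pointers and count the transformer entries. */
static size_t toy_dst_count_xfrm(const struct toy_dst *dst)
{
    size_t count = 0;
    for (; dst != NULL; dst = dst->child)
        if (dst->xfrm != NULL)
            count++;
    return count;
}

/* Follow child pointers to the routing entry at the bottom of the stack. */
static const struct toy_dst *toy_dst_last(const struct toy_dst *dst)
{
    while (dst != NULL && dst->child != NULL)
        dst = dst->child;
    return dst;
}
```

With two transformer entries stacked on one routing entry, toy_dst_count_xfrm() reports two transforms, and the path pointer of the first entry equals the entry returned by toy_dst_last().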
Some basic objects used by the XFRM architecture are (Ref: include/net/xfrm.h):
· policy rule, struct xfrm_policy (SPD entry)
· bundle of transformations (dst_entry), struct xfrm_dst (SA bundle)
· instance of a transformer, struct xfrm_state (SA)
· template to clone xfrm_state, struct xfrm_tmpl
struct dst_entry
· struct xfrm_state *xfrm; // represents IPSec Security Association (IPSec SA)
    o struct props – defines parameters for this SA (mode, aalgo, ealgo, calgo etc.)
    o struct dst_ops, allows higher-layer protocols to run protocol-specific functions that manipulate the entries. This provides the interface between the protocol-independent cache and L3 protocols that use the routing cache.
        § *neigh_lookup()->output
    o xfrm_mode – defines xfrm_tunnel_mode, xfrm_transport_mode etc. xfrm_state_afinfo is a struct added by a patch [6] to provide a convenient pointer from xfrm_state to address-specific functions, such as the output function for a family. E.g. esp_type gets registered during esp_init at this point.
        § struct xfrm_state_afinfo *afinfo;
· struct xfrm_type *type_map[IPPROTO_MAX]
    o xfrm_type – defines ah_type, esp_type, ipcomp_type, ipip_type
· (*input)(struct sk_buff *);
· (*output)(struct sk_buff *);
· struct dst_entry *child;
· struct net_device *dev;
Routing Cache and dst_entry relationships [11]
All routes are kept in the routing tables, called the FIB (Forwarding Information Base). The FIB is built by Routing Daemons and by CLI commands from administrators, and it is implemented using a Trie data structure (LC-Trie). Routing decisions can be based on many criteria, e.g. source address, TOS, firewall marking, etc. The routing table itself holds routes based on Destination address.
To enable faster lookup, a route cache is maintained. Its contents can be displayed with the CLI command “ip -s route show cache”. The Hash Table used for the Route Cache is keyed on DA, SA, TOS, Firewall MARK and Interface pointer. Entries are created on demand as pkts flow into and out of the router. The hash function is the Jenkins Hash, seeded with a random number as salt that is recomputed every 10 min [13, 14].
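A salted lookup key of this kind can be sketched in user space. The sketch below uses the Jenkins one-at-a-time variant as a stand-in; the kernel's own jhash routines differ, and the toy_flow_key structure, the serialization order and the power-of-two bucket count are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key; the field set follows the text, not the kernel. */
struct toy_flow_key {
    uint32_t daddr;   /* destination address */
    uint32_t saddr;   /* source address */
    uint32_t mark;    /* firewall mark */
    uint32_t iif;     /* interface index, standing in for the pointer */
    uint8_t  tos;     /* type of service */
};

/* Jenkins one-at-a-time hash over a byte buffer, seeded with a salt. */
static uint32_t toy_jenkins_oaat(const uint8_t *data, size_t len, uint32_t salt)
{
    uint32_t h = salt;
    for (size_t i = 0; i < len; i++) {
        h += data[i];
        h += h << 10;
        h ^= h >> 6;
    }
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* Serialize the key field by field (avoids struct padding) and pick a bucket.
 * buckets must be a power of two. */
static uint32_t toy_rt_hash(const struct toy_flow_key *k, uint32_t salt,
                            uint32_t buckets)
{
    uint8_t buf[17];
    size_t n = 0;
    const uint32_t words[4] = { k->daddr, k->saddr, k->mark, k->iif };

    for (size_t w = 0; w < 4; w++)
        for (int shift = 24; shift >= 0; shift -= 8)
            buf[n++] = (uint8_t)(words[w] >> shift);
    buf[n++] = k->tos;
    return toy_jenkins_oaat(buf, n, salt) & (buckets - 1);
}
```

Changing the salt re-keys the whole table, so an attacker who has found colliding flows for one salt cannot rely on them after the salt is regenerated.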
The IPv4 route cache was removed in Linux 3.6, so it is no longer present in the latest 3.X kernels.
struct xfrm_dst
xfrm_dst is a data structure that contains a bundle of transformations that are cached by the xfrm_policy structure [5]. A pointer to the skb->refdst entry can be type-cast to xfrm_dst, since the first member of xfrm_dst is of type dst_entry.
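The cast works because C guarantees that a pointer to a structure, suitably converted, points to its first member and vice versa. A minimal sketch of the idiom, using hypothetical toy_* types rather than the kernel definitions:

```c
#include <stddef.h>

/* Hypothetical, heavily reduced versions of the kernel structures. */
struct toy_dst_entry { int flags; };
struct toy_xfrm_policy { int action; };

struct toy_xfrm_dst {
    struct toy_dst_entry dst;        /* must stay the first member */
    struct toy_xfrm_policy *policy;
    int num_pols;
    int num_xfrms;
};

/* Compile-time check of the layout assumption the cast relies on. */
_Static_assert(offsetof(struct toy_xfrm_dst, dst) == 0,
               "dst must be the first member");

/* Recover the enclosing bundle from a pointer to its embedded dst_entry. */
static struct toy_xfrm_dst *toy_to_xfrm_dst(struct toy_dst_entry *dst)
{
    return (struct toy_xfrm_dst *)dst;
}
```

The cast is only valid when the dst_entry really is embedded in a bundle; a plain routing entry must not be converted this way.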
· struct dst_entry dst;
· struct xfrm_policy *policy
    o xfrm_tmpl xfrm_vec[]
    o u8 type, action
· int num_pols; num_xfrms
xfrm_policy represents an IPSec Policy. xfrm_dst “Transform Bundles” are cached in the xfrm_policy struct (field ->bundles). Stackable destination:
dst -. xfrm  .-> xfrm_state #1
 |---. child .-> dst -. xfrm .-> xfrm_state #2
                  |---. child .-> dst -. xfrm .-> xfrm_state #3
                                   |---. child .-> NULL
The
following figure showing structure relationships is taken from [5].
High Level Data Structures Interactions.
Control Path
Data Path
Transform Control Interface (User/Kernel)
ipsec_pfkey_init (net/key/af_key.c) and xfrm_user_init (net/xfrm/xfrm_user.c) are the init routines of two Kernel Modules that provide routines to send and receive messages between User and Kernel space.
XFRM Infrastructure Init
With IPSec enabled, the IP Route function ip_rt_init() calls
xfrm_init() and xfrm4_init() to initialize the policy functions that are called
in the output path.
net/xfrm/xfrm_policy.c/xfrm_init
    register_pernet_subsys(&xfrm_net_ops)
        registers xfrm_net_init/exit functions
        xfrm_net_init()
            xfrm_state_init()
            xfrm_policy_init()
            xfrm_dst_ops_init
            xfrm_sysctl_init
    xfrm_input_init()
        Creates a slab cache called secpath_cache, which holds the sec_path entries that reference xfrm_states.
net/ipv4/xfrm4_policy.c/xfrm4_init
    dst_entries_init(&xfrm_dst_ops)
    xfrm4_state_init()
    xfrm4_policy_init()
    register_net_sysctl – net/ipv4
        Register /proc variables
IPSec Output Processing
Some of the following notes are taken from [4]
net/ipv4/ip_output.c/ip_queue_xmit() is called from upper
layer protocol like TCP.
ip_route_output_flow()
called by ip_route_output_ports() is the central route lookup function which is
responsible for route lookup, and creating the stackable destination in dst.
· This function calls net/xfrm/xfrm_policy.c/xfrm_lookup() to access the SPD via xfrm_sk_policy_lookup(). This calls xfrm_resolve_and_create_bundle.
· xfrm_lookup (net/xfrm/xfrm_policy.c) finds a match in the SPD and then checks the action in the xfrm_policy entry. If the action is XFRM_POLICY_BLOCK, it returns an error. If the action is XFRM_POLICY_ALLOW, then a "bundle" of destinations is accumulated and returned.
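The BLOCK/ALLOW decision can be illustrated with a toy policy table. This is a sketch under assumptions: the toy_* names, the destination-only selector and the choice of -EPERM as the error value are made up for the example and are not taken from the kernel source.

```c
#include <errno.h>
#include <stddef.h>

enum toy_action { TOY_POLICY_ALLOW = 0, TOY_POLICY_BLOCK = 1 };

/* Hypothetical SPD entry with a destination-only selector. */
struct toy_policy {
    unsigned int dst_net;    /* selector: destination network */
    unsigned int dst_mask;   /* selector: network mask */
    enum toy_action action;
    int num_xfrms;           /* transforms that would go into the bundle */
};

/* Returns the number of transforms to stack for daddr, 0 when no policy
 * matches (plain routing), or -EPERM when the matching policy blocks. */
static int toy_xfrm_lookup(const struct toy_policy *spd, size_t n,
                           unsigned int daddr)
{
    for (size_t i = 0; i < n; i++) {
        if ((daddr & spd[i].dst_mask) != spd[i].dst_net)
            continue;                 /* selector does not match */
        if (spd[i].action == TOY_POLICY_BLOCK)
            return -EPERM;            /* matching policy forbids the flow */
        return spd[i].num_xfrms;      /* ALLOW: size of the bundle */
    }
    return 0;
}
```

The first matching entry decides the outcome, so the order of the table plays the role that policy priority plays in a real SPD.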
skb->refdst is set to a dst_entry.
ip_local_out() is finally called after the IP headers are in place, and in turn calls dst_output().
This calls all transform routines via output routines pointing to esp6_output() etc., and uses the xfrm_state that the dst struct also points to.
The last dst_entry of the destination stack is the final routing entry, with its output function pointer (->output) set to ip_output. The other dst_entries on top (up to nx transformations, see the for-loop in __xfrm4_bundle_create) have ->output set to xfrm4_output, xfrm6_output, etc.
ip_output() finally calls NF_HOOK_COND with NF_INET_POST_ROUTING.
The return values from the functions are as follows:
· xfrm4_output calls the appropriate transformation function (e.g. esp_output). esp_output returns 0 on success; any other value means an error occurred.
· xfrm4_output itself returns NET_XMIT_BYPASS when there is no error. If xfrm4_output returned 0, the for-loop in dst_output would be left immediately with a return statement after the output function was called.
· So NET_XMIT_BYPASS keeps the destination stack running until the final call to ip_output makes it return to ip_queue_xmit, which returns with the result.
The following diagrams are taken from [2].
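dst_output() is called from xfrm_output_resume() once xfrm_output_one() has applied the transforms. xfrm_output_one() in turn calls the proto->output routines, e.g. esp_output(), through x->type->output. In outline, its loop is:
do {
    err = x->outer_mode->output(x, skb);   /* add the outer (mode) header */
    err = x->repl->overflow(x, skb);       /* check the sequence number */
    skb_dst_force(skb);
    err = x->type->output(x, skb);         /* e.g. esp_output() */
    dst = skb_dst_pop(skb);                /* move on to the child dst */
    skb_dst_set(skb, dst);
    x = dst->xfrm;
} while (x);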
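The NET_XMIT_BYPASS convention described above can be modeled in a few lines of user-space C. This is a sketch, not kernel code: the toy_* types are hypothetical, and the value 4 for the bypass code is the one quoted for NET_XMIT_BYPASS in the BEET call flow later in this article.

```c
#include <stddef.h>

#define TOY_XMIT_BYPASS 4   /* value quoted for NET_XMIT_BYPASS in the text */

struct toy_skb;

/* One stage of the destination stack. */
struct toy_stage {
    struct toy_stage *child;
    int (*output)(struct toy_skb *skb);
};

struct toy_skb {
    struct toy_stage *dst;    /* current top of the destination stack */
    int transforms_applied;
    int sent;
};

/* Transformer stage: apply one transform, pop to the child stage and
 * ask the caller to keep going. */
static int toy_xfrm_output(struct toy_skb *skb)
{
    skb->transforms_applied++;
    skb->dst = skb->dst->child;
    return TOY_XMIT_BYPASS;
}

/* Final routing stage: hand the packet over and stop the loop. */
static int toy_ip_output(struct toy_skb *skb)
{
    skb->sent = 1;
    return 0;
}

/* Keep calling the current stage until one returns something other
 * than the bypass code: 0 on completion, anything else is an error. */
static int toy_dst_output(struct toy_skb *skb)
{
    for (;;) {
        int err = skb->dst->output(skb);
        if (err != TOY_XMIT_BYPASS)
            return err;
    }
}
```

With an ESP stage and an AH stage stacked on a routing stage, toy_dst_output() runs both transforms, reaches the routing stage and returns 0.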
IPSec Input Processing
Iked receives an IKE_SA_INIT message (notes from [2]).
It searches for the remote by the peer’s IP address and replies with IKE_SA_INIT, using the algorithm in the remote.
It receives IKE_AUTH from the peer.
It validates the peer using the information in the remote.
It searches for the selector by the Traffic Selector payload in the message.
It finds the selector and retrieves policy, ipsec and sa.
It processes the request and replies with IKE_AUTH.
Details: The whole de-encapsulation is done in xfrm4_rcv_encap, except in the case of NAT traversal, where udp_rcv comes into play. dst_input is not used for xfrm transformations. The main function is ip_local_deliver_finish.
For IPv6, the IPSec headers are registered with inet6_protos[], i.e. the upper layer protocol handler array.
ESP Module
net/ipv4/esp4.c/esp4_init() gets called as part of Module init. The init function calls xfrm_register_type(&esp_type) and inet_add_protocol(&esp4_protocol).
· esp_type = { esp_init_state, esp_destroy, esp_input, esp_output, … }
· esp4_protocol = { xfrm4_rcv, esp4_err }
xfrm4_rcv ultimately calls xfrm_input.
The packet enters as ESP or AH, so the protocol handler is xfrm4_rcv, which is called via ipprot->handler(skb).
xfrm4_rcv calls xfrm4_rcv_spi, which calls xfrm4_rcv_encap and finally xfrm_input(). xfrm_input() does all de-encapsulating transformations in a do-while loop by calling x->type->input. The xfrm_state{} pointer is kept in sec_path{} in sk_buff{}.
After processing IPsec, the kernel calls xfrm_policy_check() at the entrance of the upper layer processing. When all transformation processing is done, the IP packet is put back into the CPU-specific packet queue by netif_rx.
NAT Traversal (with UDP)
INPUT:
- udp_rcv
  |
  --- udp_queue_rcv_skb
      |
      --- udp_encap_rcv (strips off UDP)
          |
          --- xfrm4_rcv_encap (usual procedure)
OUTPUT:
- Done in esp_output as part of a stacked destination (e.g. net/ipv4/esp4.c)
RACOON2
The following code summary of the IKEv2 implementation in Linux is from the Racoon2 project [1]. Racoon2 is the successor of Racoon, which was developed by the KAME project. It supports the IKEv1, IKEv2, and KINK protocols, and it works on FreeBSD, NetBSD, Linux, and Mac OS X. Racoon2 is provided under a BSD-style license. The control protocol of IPSec used in racoon2 is IKEv2, which is explained in RFC 5996 [9].
Racoon2 implements IKEv2, and its aim is to establish and maintain shared security parameters and authenticated keys between two IPSec end points.
The main()
iked/main.c/main()
    parse options
    iked/crypto_openssl.c/eay_init()
        ERR_load_crypto_strings()
        OpenSSL_add_all_algorithms()
        ENGINE_load_builtin_engines()
        ENGINE_register_all_complete()
    lib/cfsetup.c/rcf_read()
        reads config file and creates data structures
    sched_init()
    iked/ike_pfkey.c/sadb_init()
        Initialize sadb_request_list_head
        get pfkey_socket value and set ike_rcpfk_callback() pointer with call to lib/if_pfkeyv2.c/rcpfk_init()
            open PF_KEY i/f with rcpfk_open()
            Register with kernel, and set cb = ike_rcpfk_callback[] structure that has pointers to sadb_xxx_callback routines.
    iked/ike_conf.c/ike_conf_check_consistency()
        Data Structure checks:
        Check rcf_default_head
        Check rcf_remote_head
        Check Selector section
    iked/ike_spmif.c/ike_spmif_init()
        spmif_socket = spmif_init()
            lib/if_spmd.c/open_spmif_local()
            fd = socket(PF_UNIX, SOCK_STREAM, 0)
            job_initqueue(&spmifh)
            login_spmif(fd)
    isakmp_init()
        oakley_dhinit()
            init DH parameters
    ikev2_init()
        ikev2_sa_init()
        ikev2_cookie_init()
        ikev2_periodic_task() – cookie_refresh()
    isakmp_open()
        open sockets to dst address
    iked_pidfile_create()
    iked_mainloop()
The interface to SPMD (SPMIF) uses Unix Domain Sockets. A handle, spmifh, of type struct spmif_handle is created and initialized as follows:
struct spmif_job {
    enum job_type {
        JOB_CANCELED, JOB_POLICY_ADD, JOB_POLICY_DELETE,
        JOB_FQDN_QUERY, JOB_SLID, JOB_MIGRATE
    } type;
    union {
        int (*generic)();
        int (*policy_add)(void *, int);
        int (*policy_delete)(void *, int);
        int (*fqdn_query)(void *, const char *);
        int (*slid)(void *, const char *);
        int (*migrate)(void *, int);
    } callback;
    void *tag;
    int fd;
    char buf[200];
    /* job queue */
    struct spmif_job *next;
};
struct spmif_handle {
    struct linereader *lr;
    struct spmif_job *job_head;
    struct spmif_job **job_tailp;
};
spmif_handle->job_head = NULL
spmif_handle->job_tailp = &spmif_handle->job_head
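Keeping job_tailp as a pointer to the last next field makes appending a constant-time operation with no special case for the empty queue. The sketch below shows the idiom on a reduced job structure; the toy_* names are hypothetical and this is not the racoon2 implementation.

```c
#include <stddef.h>

/* Reduced job and handle structures; only the queue fields are kept. */
struct toy_job {
    int type;
    struct toy_job *next;
};

struct toy_handle {
    struct toy_job *job_head;
    struct toy_job **job_tailp;   /* address of the last next pointer */
};

/* Empty queue: the tail pointer refers to the head pointer itself. */
static void toy_initqueue(struct toy_handle *h)
{
    h->job_head = NULL;
    h->job_tailp = &h->job_head;
}

/* Append in O(1): write through the tail pointer, then advance it. */
static void toy_enqueue(struct toy_handle *h, struct toy_job *job)
{
    job->next = NULL;
    *h->job_tailp = job;
    h->job_tailp = &job->next;
}

/* Remove the oldest job; reset the tail pointer when the queue drains. */
static struct toy_job *toy_dequeue(struct toy_handle *h)
{
    struct toy_job *job = h->job_head;
    if (job == NULL)
        return NULL;
    h->job_head = job->next;
    if (h->job_head == NULL)
        h->job_tailp = &h->job_head;
    job->next = NULL;
    return job;
}
```

Jobs come back in the order they were posted, which matches a request/reply protocol where replies arrive in request order.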
Ikev2 Main Loop
This is the main loop for the ikev2 daemon. It waits on “select” and executes the FSM based on packets coming in from
· SPMD,
· PF_KEY and
· Network.
iked/main.c/iked_mainloop()
    FD_ZERO(&fdset)
    isakmp_fdset
    Add spmif_fd to FDSET
        spmif_fd = ike_spmif_socket()
    select()
    check data on sadb_socket, if true, call sadb_poll()
        iked/ikepfkey.c/rcpfk_handler()
            Recv PFKEY msg, using rcpfk_recv()
            Call handler – rcpfk_msg[sadb_msg_type].recvfunc()
            e.g. for SADB_ACQUIRE msg from kernel, it calls rcpfk_recv_acquire()
    check data on spmif_fd, if true, call iked/ike_spmif.c/ike_spmif_poll()
        spmif_handler(spmif_socket)
            Call handlers that execute the job, such as JOB_POLICY_ADD/DELETE, JOB_FQDN_QUERY etc.
    For all other FDs
        isakmp_handler(isakmp_sock)
            This is the only place of entry, i.e. from isakmp_handler()
            Process ikev2 pkts, by calling ikev2_input()
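The core of such a loop is a select() call over the descriptor set followed by a check of which descriptor became readable. The helper below is a minimal sketch of that step; toy_wait_readable() and its timeout parameter are hypothetical and not part of racoon2.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Wait until one of the descriptors is readable. Returns the index of
 * the first readable descriptor, or -1 on timeout or error. */
static int toy_wait_readable(const int *fds, int nfds, int timeout_ms)
{
    fd_set rset;
    struct timeval tv;
    int maxfd = -1;

    FD_ZERO(&rset);
    for (int i = 0; i < nfds; i++) {
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (select(maxfd + 1, &rset, NULL, NULL, &tv) <= 0)
        return -1;
    for (int i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &rset))
            return i;   /* caller dispatches to the matching handler */
    return -1;
}
```

A daemon would call this in a loop with the PF_KEY, SPMD and network descriptors and invoke the handler that belongs to the returned index.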
The following sections provide detail on the 3 main handlers for the ikev2 daemon, i.e.
1) sadb_poll
2) ike_spmif_poll
3) isakmp_handler
sadb_poll()
lib/if_pfkeyv2.c/sadb_poll() has callbacks for each message from the Kernel on the PF_KEY socket. It first calls the functions in rcpfk_msg[], which in turn call the functions in ike_rcpfk_callback.
static struct pfkey_msgtype {
    char *name;
    int (*recvfunc) (caddr_t *, struct rcpfk_msg *);
} rcpfk_msg[] = {
    { "GETSPI", rcpfk_recv_getspi, },
    { "UPDATE", rcpfk_recv_update, },
    { "ADD", rcpfk_recv_add, },
    { "DELETE", rcpfk_recv_delete, },
    { "GET", rcpfk_recv_get, },
    { "ACQUIRE", rcpfk_recv_acquire, },
    { "REGISTER", rcpfk_recv_register, },
    { "EXPIRE", rcpfk_recv_expire, },
    { "X_SPDUPDATE", rcpfk_recv_spdupdate, },
    { "X_SPDADD", rcpfk_recv_spdadd, },
    { "X_SPDDELETE", rcpfk_recv_spddelete, },
    { "X_SPDGET", rcpfk_recv_spdget, },
    { "X_SPDDUMP", rcpfk_recv_spddump, },
    { "X_SPDEXPIRE", rcpfk_recv_spdexpire, },
    { "X_SPDDELETE2", rcpfk_recv_spddelete2, },
    { "X_MIGRATE", rcpfk_recv_migrate, },
};
iked/ike_pfkey.c
static struct rcpfk_cb ike_rcpfk_callback = {
    sadb_getspi_callback,
    sadb_update_callback,
    sadb_expire_callback,
    sadb_acquire_callback,
    sadb_delete_callback,
    sadb_get_callback,
};
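The two-level scheme, a table indexed by message type whose entries then call into an application-supplied callback structure, can be sketched as follows. All toy_* names, the two message types and the sample callback are hypothetical; the real tables are the ones listed above.

```c
#include <stddef.h>

/* Application-level callbacks, in the spirit of ike_rcpfk_callback. */
struct toy_cb {
    int (*acquire)(int seq);
    int (*expire)(int seq);
};

struct toy_msg {
    unsigned int type;          /* index into the message table */
    int seq;
    const struct toy_cb *cb;    /* callbacks registered by the daemon */
};

enum { TOY_ACQUIRE = 0, TOY_EXPIRE = 1, TOY_MSG_MAX };

/* Library-level receive functions: decode, then call the callback. */
static int toy_recv_acquire(struct toy_msg *m)
{
    return m->cb->acquire ? m->cb->acquire(m->seq) : -1;
}

static int toy_recv_expire(struct toy_msg *m)
{
    return m->cb->expire ? m->cb->expire(m->seq) : -1;
}

static const struct {
    const char *name;
    int (*recvfunc)(struct toy_msg *);
} toy_msgtab[TOY_MSG_MAX] = {
    [TOY_ACQUIRE] = { "ACQUIRE", toy_recv_acquire },
    [TOY_EXPIRE]  = { "EXPIRE",  toy_recv_expire },
};

/* First level: bounds-check the type and pick the receive function. */
static int toy_handler(struct toy_msg *m)
{
    if (m->type >= TOY_MSG_MAX || toy_msgtab[m->type].recvfunc == NULL)
        return -1;
    return toy_msgtab[m->type].recvfunc(m);
}

/* Sample daemon-side callback used to exercise the dispatch. */
static int toy_last_acquire_seq = -1;

static int toy_on_acquire(int seq)
{
    toy_last_acquire_seq = seq;
    return 0;
}
```

The split keeps message decoding in the library while the daemon only supplies the callbacks it cares about; unset callbacks and unknown types are rejected rather than dereferenced.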
ike_spmif_poll()
iked/ike_spmif.c/spmif_poll() has the following switch() statements.
case JOB_POLICY_ADD: parserep_policy_add(job, lr->lines, nline);
case JOB_POLICY_DELETE: parserep_policy_delete(job, lr->lines, nline);
case JOB_FQDN_QUERY: parserep_fqdn_query(job, lr->lines, nline);
case JOB_SLID: parserep_slid(job, lr->lines, nline);
case JOB_MIGRATE: parserep_migrate(job, lr->lines, nline);
isakmp_handler()
iked/isakmp.c/isakmp_handler() calls iked/ikev2.c/ikev2_input(), which implements the IKEv2 State Machine.
IKEV2INPUT ikev2_input_dispatch[] = {
    responder_state0_recv,          /* Responder idling state */
    initiator_ike_sa_init_recv,     /* Initiator IKE_SA_INIT sent */
    responder_ike_sa_auth_recv0,    /* Responder IKE_SA_INIT sent */
    initiator_ike_sa_auth_recv0,    /* Initiator IKE_SA_AUTH sent */
    responder_ike_sa_auth_recv,     /* Responder IKE_SA_AUTH received */
    initiator_ike_sa_auth_recv,     /* Initiator IKE_SA_AUTH received */
    ikev2_established_recv,         /* should be CREATE_CHILD_SA or INFORMATIONAL */
    ikev2_dying_recv,               /* same as established, except no initiating */
    ikev2_dead_recv,
};
iked/ikev2_impl.h
enum ikev2_state {
IKEV2_STATE_IDLING = 0,
IKEV2_STATE_INI_IKE_SA_INIT_SENT = 1,
IKEV2_STATE_RES_IKE_SA_INIT_SENT = 2,
IKEV2_STATE_INI_IKE_AUTH_SENT = 3,
IKEV2_STATE_RES_IKE_AUTH_RCVD = 4,
IKEV2_STATE_INI_IKE_AUTH_RCVD = 5,
IKEV2_STATE_ESTABLISHED = 6,
IKEV2_STATE_DYING = 7,
IKEV2_STATE_DEAD = 8
    /* IKEV2_STATE_EAP = 9 */
    /* IKEV2_STATE_ESTABLISHED_WAIT_INITIATOR, */
};
enum ikev2_child_state {
IKEV2_CHILD_STATE_IDLING = 0,
IKEV2_CHILD_STATE_GETSPI,
IKEV2_CHILD_STATE_GETSPI_DONE,
IKEV2_CHILD_STATE_WAIT_RESPONSE,
IKEV2_CHILD_STATE_MATURE,
IKEV2_CHILD_STATE_EXPIRED, /* XXX STATE_DONE */
IKEV2_CHILD_STATE_REQUEST_PENDING,
IKEV2_CHILD_STATE_REQUEST_SENT,
IKEV2_CHILD_STATE_NUM,
IKEV2_CHILD_STATE_INVALID /* to indicate invalid state */
};
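A state-indexed dispatch table of this kind can be exercised with a reduced initiator-side model. The sketch below is an illustration only: the toy_* states and handlers are hypothetical and far simpler than the racoon2 ones, while the exchange type numbers 34 (IKE_SA_INIT) and 35 (IKE_AUTH) are the values defined for IKEv2.

```c
#include <stddef.h>

enum toy_state {
    TOY_IDLING, TOY_INI_INIT_SENT, TOY_INI_AUTH_SENT, TOY_ESTABLISHED,
    TOY_STATE_NUM
};

enum toy_exchange { TOY_IKE_SA_INIT = 34, TOY_IKE_AUTH = 35 };

struct toy_ike_sa { enum toy_state state; };

typedef int (*toy_input_fn)(struct toy_ike_sa *sa, int exchange);

/* Nothing is expected while idling in this initiator-only model. */
static int toy_idle_recv(struct toy_ike_sa *sa, int exchange)
{
    (void)sa; (void)exchange;
    return -1;
}

/* IKE_SA_INIT reply accepted: the IKE_AUTH request goes out next. */
static int toy_init_sent_recv(struct toy_ike_sa *sa, int exchange)
{
    if (exchange != TOY_IKE_SA_INIT)
        return -1;
    sa->state = TOY_INI_AUTH_SENT;
    return 0;
}

/* IKE_AUTH reply accepted: the IKE SA is established. */
static int toy_auth_sent_recv(struct toy_ike_sa *sa, int exchange)
{
    if (exchange != TOY_IKE_AUTH)
        return -1;
    sa->state = TOY_ESTABLISHED;
    return 0;
}

static int toy_established_recv(struct toy_ike_sa *sa, int exchange)
{
    (void)sa; (void)exchange;
    return 0;   /* CREATE_CHILD_SA or INFORMATIONAL would be handled here */
}

/* One handler per state, in enum order. */
static const toy_input_fn toy_dispatch[TOY_STATE_NUM] = {
    toy_idle_recv, toy_init_sent_recv, toy_auth_sent_recv, toy_established_recv
};

/* Bounds-check the state before indexing the table. */
static int toy_input(struct toy_ike_sa *sa, int exchange)
{
    if ((unsigned int)sa->state >= TOY_STATE_NUM)
        return -1;
    return toy_dispatch[sa->state](sa, exchange);
}
```

Because the table order must match the enum order, adding a state means touching both; the bounds check guards against a corrupted state value indexing past the table.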
iked/ikev2.c/ikev2_input()
    ikev2_check_payloads()
    get message_id from packet
    ikev2_find_sa()
    if ike_sa NOT found
        ikev2_conf_find(remote)
        if no config, use default remote config
        create ikev2 as responder
        ikev2_create_sa
    call pkt processing routine
        (*ikev2_input_dispatch[ike_sa->state])(ike_sa, pkt, remote, local)
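Before any of this dispatching can happen, the fixed 28-byte IKE header has to be decoded to obtain the SPIs, exchange type and message ID. The following sketch parses that header according to the layout in RFC 5996; the toy_* names are hypothetical and this is not the racoon2 parser. The bytes used to check it are the first 28 bytes of the packet dump shown in the debug log later in this article.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_IKE_HDR_LEN 28

struct toy_ike_hdr {
    uint8_t  init_spi[8];     /* IKE SA initiator's SPI */
    uint8_t  resp_spi[8];     /* IKE SA responder's SPI, zero in the first message */
    uint8_t  next_payload;
    uint8_t  major, minor;    /* version nibbles */
    uint8_t  exchange_type;
    uint8_t  flags;
    uint32_t message_id;
    uint32_t length;          /* total message length in bytes */
};

/* Read a 32-bit big-endian (network order) value. */
static uint32_t toy_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is too short or the length
 * field is smaller than the header itself. */
static int toy_parse_ike_hdr(const uint8_t *buf, size_t len,
                             struct toy_ike_hdr *h)
{
    if (len < TOY_IKE_HDR_LEN)
        return -1;
    memcpy(h->init_spi, buf, 8);
    memcpy(h->resp_spi, buf + 8, 8);
    h->next_payload = buf[16];
    h->major = buf[17] >> 4;
    h->minor = buf[17] & 0x0f;
    h->exchange_type = buf[18];
    h->flags = buf[19];
    h->message_id = toy_be32(buf + 20);
    h->length = toy_be32(buf + 24);
    if (h->length < TOY_IKE_HDR_LEN)
        return -1;
    return 0;
}
```

For the logged packet this yields next payload 33 (SA), version 2.0, exchange type 34 (IKE_SA_INIT), flags 0x08 (initiator), message ID 0 and length 468, which agrees with the "len 468" reported by ikev2_transmit() in the log.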
Configuration
All configuration for the racoon2 daemons is in the racoon2.conf file.
The following picture shows the relationship between the directives. The diagram is taken from [3].
    +---(selector_index)--- remote
    |                          ^
    |                          |
    |                   (remote_index)                        +-(sa_index)-> sa
    v                          |                              |
selector -+                    |     +-(ipsec_index)-> ipsec -+-(sa_index)-> sa
          |                    |     |
selector -+-(policy_index)-> policy -+-(ipsec_index)-> ipsec ---(sa_index)-> sa
          |                          |
selector -+                          +-(ipsec_index)-> ipsec ...
struct rcf_selector {
    rc_vchar_t *sl_index;
    int order;
    rc_type direction;
    struct rc_addrlist *src, *dst;
    int upper_layer_protocol;
    int next_header_including;
    rc_vchar_t *tagged;
    int reqid;
    struct rcf_policy *pl;
    struct rcf_selector *next;
};
struct rcf_default {
    struct rcf_remote *remote;
    struct rcf_policy *policy;
    struct rcf_ipsec *ipsec;
    struct rcf_sa *sa;
};
SPMD manages the SPD, while IKED interacts with the SAD. SPMD reads the configuration file for selector, policy and ipsec parameters and installs entries into the SPD using PF_KEY. The kernel returns an SPID (IPSec Policy ID) to SPMD, which then waits for queries from IKED. IKED waits for a PF_KEY SADB_ACQUIRE msg from the kernel or a UDP SA_INIT message from the peer. The SADB_ACQUIRE message contains the SPID. IKED asks SPMD for the Selector ID that corresponds to the SPID and then uses this ID to retrieve the other parameters. IKED can also add a policy to the SPD in the Kernel through SPMD, using spmif_post_policy_add().
Basic configuration for Tunnel Mode
# cd /usr/local/racoon2/etc/racoon2
# cp vals.conf.sample vals.conf
Change Local and Remote IP addresses for Transport/Tunnel mode
Generate pre-shared key using pskgen
# cd /usr/local/racoon2/etc/racoon2/psk
# pskgen -r -o test.psk
# od test.psk
0000000 106307 153425 117435 137575 023026 044255 150547 166447
0000020
# cp racoon2.conf.sample racoon2.conf
Uncomment the Tunnel mode IKEv2 or IKEv1 (initiator and responder)
include "/usr/local/racoon2/etc/racoon2/tunnel_ike.conf";
# cp default.conf.sample default.conf
# cp tunnel_ike.conf.sample tunnel_ike.conf
Start spmd and iked
# /usr/local/racoon2/etc/racoon2/init.d/spmd start
# /usr/local/racoon2/etc/racoon2/init.d/iked -ddd -F
1st Pkt and SADB_ACQUIRE from NETKEY to Racoon2
From [2]
The kernel sends a SADB_ACQUIRE message including the IPsec policy ID to the key exchange daemons via the PF_KEY socket.
Iked receives the message and gets the IPsec policy ID from the sadb_x_policy_id field.
Iked asks spmd for the identifier of the selector that corresponds to the IPsec policy ID.
Iked receives the selector identifier from spmd.
Iked searches for the selector by the identifier and retrieves policy, remote, ipsec, sa.
Iked validates the key exchange protocol in the remote.
Iked processes the SADB_ACQUIRE.
In more detail:
There is no entry in the SAD.
The Kernel broadcasts SADB_ACQUIRE to the PF_KEY socket listeners in user space.
[Note: the ikev2 daemon, via sadb_init() in the main function, had opened a PF_KEY socket i/f with rcpfk_open(), registered RCT_SATYPE_AH/ESP/IPCOMP with the kernel, and set up the callback routines in the “cb” variable to the sadb_xxx_callback() routines. These routines are called from the rcpfk_msg[sadb_msg_type].recvfunc() functions when the ikev2 daemon receives a SADB message from the kernel.]
Data on the PF_KEY socket wakes up “select”, which in turn calls sadb_poll() to get the packet from PF_KEY via rcpfk_recv().
The routine rcpfk_msg[sadb_msg_type].recvfunc() is called. For a SADB_ACQUIRE message from the kernel this is rcpfk_recv_acquire(), which in turn calls the function registered in “cb”, sadb_acquire_callback().
sadb_acquire_callback() calls iked/isakmp.c/isakmp_initiate(), which initiates the negotiation. If it fails, it calls cb->acquire_error() with the error code.
isakmp_initiate() is called with sadb_initiator_request_method, which is initialized as shown further down.
What the ikev2 daemon needs now is the selector_index, so that it can make an SA_INIT packet.
For this, isakmp_initiate() calls racoon_malloc() and sets the callback method to sadb_initiator_request_method.
It then calls iked/ike_spmif.c/ike_spmif_post_slid(), which calls lib/if_spmd.c/spmif_post_slid() as follows:
job->callback.slid = isakmp_initiate_cont()
job->callback_req=
job->tag= * isakmp_acquire_request
isakmp_acquire_request->callback_method = sadb_initiator_request_method
/* sadb_initiator_request_method used in response to SADB_ACQUIRE */
struct sadb_request_method sadb_initiator_request_method = {
    sadb_getspi,
    sadb_acquire_error,
    sadb_update,
    sadb_add,
    sadb_delete,
    sadb_get,
};
job->fd=Unix Socket
job->type=JOB_SLID
spmif_post_slid() posts a job of type JOB_SLID to the Unix domain socket.
The return value is sent over the same Unix Domain socket back to iked_mainloop(), where the select wakes up on ike_spmif_poll() and calls spmif_handler(). This finally calls lib/if_spmd.c/parserep_slid(), where the slid is extracted and the callback function, i.e. isakmp_initiate_cont(), is called with the slid. With the selector pointer, we get the policy and remote structures and finally call ikev2_initiate(req, policy, selector, rm_info) to trigger the FSM.
Backup
BEET [7]
+Sending (inner IPv4, outer IPv4)(4-4)
+=====================================
+inet_sendmsg
+ raw_sendmsg
+ ip_route_output_flow
+ __ip_route_output_key
+ xfrm_lookup
+ flow_cache_lookup
+ xfrm_policy_lookup // lookup IPsec policy
+ xfrm_find_bundle // lookup IPsec SA
+ __xfrm_selector_match
+ xfrm_tmpl_resolve // only if bundle was not found!
+ xfrm_state_find
+ xfrm_bundle_create // create output (dst) chain if bundle was not found
+ __xfrm4_bundle_create
+ ip_push_pending_frames
+ dst_output(skb) //this calls skb->dst->output();
+ xfrm4_output //This finally returns 4 (NET_XMIT_BYPASS) to dst_output();
+ xfrm4_encap
+ esp_output
+ xfrm_beet_output //change the ip header to outer.
+ dst_output(skb)
+ ip_output
+ ip_finish_output Or ip_fragment //depending on size of packet
+ // Returns 0 to dst_output(); which makes dst_output to come out of infinite loop.
+ dev_queue_xmit
+
+
+Receiving (inner IPv4, outer IPv4)(4-4)
+===========
+
+net_rx_action()
+e1000_clean() // dependent on network hardware
+e1000_clean_rx_irq()
+netif_receive_skb()
+ deliver_skb()
+ ret = pt_prev->func(skb, skb->dev, pt_prev);
+ ip_rcv()
+ nf_hook()
+ ip_rcv_finish()
+ ip_route_input()
+ dst_input()->ip_forward() or ip_input()
+ ip_input // remove the IPv4 header
+ ip_input_finish
+ ret = ipprot->handler(&skb, &nhoff);
+ xfrm4_rcv()
+ xfrm4_rcv_encap()
+ xfrm4_parse_spi()
+ xfrm_state_lookup() // lookup IPsec SA
+ xfrm_beet_input(skb, x) //To change to inner IP header.
+ nexthdr = x->type->input(x, xfrm.decap, skb) // == esp_input
+ esp_input() // process ESP based on inner address
+ returns 0 ;
+ /* beet handling in xfrm_rcv_spi */
+ netif_rx()
+ // ip_input_finish returns 0
+ // netif_receive_skb returns 0
+netif_receive_skb //Now we have an IPv4 packet. So the input flow is for v4 packet.
+ deliver_skb()
+ ret = pt_prev->func(skb, skb->dev, pt_prev);
+ ip_rcv()
+ nf_hook() //This calls ip_rcv_finish(skb)
+ ip_rcv_finish() //Here the skb->dst is NULL and so is filled for the input side.
+ ip6_route_input()
+ dst_input()->ip_forward() or ip_input()
+ ip_input // remove the IPv4 header
+ ip_input_finish
+ ...
+ ...
+ ...
+
Output Pkt ping from IPSec Tunnel Node (showing debugs till SADB_ACQUIRE)
[root@localhost racoon2]# more vals.conf
### Tunnel Mode Settings ###
# Your Network Address or Host Address (host-to-host tunnel mode)
MY_NET "0.0.0.0/0";
# Peer's Network Address or Host Address (host-to-host tunnel mode)
PEERS_NET "0.0.0.0/0";
# Your SGW Address
MY_GWADDRESS "192.168.2.9";
# Peer's SGW Address
# You don't need to specify if you're IKE responder
# talking to an IKE initiator behind NAT.
PEERS_GWADDRESS "192.168.2.10";
[root@localhost racoon2]# more racoon2.conf
interface
{
ike {
192.168.2.9 port 500;
Computer-1 (IP address 192.168.2.9)
#ping 192.168.2.1
[root@localhost racoon2]# ip xfrm policy list
src 0.0.0.0/0 dst 0.0.0.0/0
    dir 4 priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
    dir 3 priority 0 ptype main
src 0.0.0.0/0 dst 0.0.0.0/0
    dir fwd priority 0 ptype main
    tmpl src 192.168.2.10 dst 192.168.2.9
        proto esp reqid 0 mode tunnel
src 0.0.0.0/0 dst 0.0.0.0/0
    dir in priority 0 ptype main
    tmpl src 192.168.2.10 dst 192.168.2.9
        proto esp reqid 0 mode tunnel
src 0.0.0.0/0 dst 0.0.0.0/0
    dir out priority 0 ptype main
    tmpl src 192.168.2.9 dst 192.168.2.10
        proto esp reqid 0 mode tunnel
[root@localhost init.d]# iked -ddd -F
2013-02-18 01:07:43 [INFO]: main.c:433:main(): starting iked for
racoon2 20100526a
2013-02-18 01:07:49 [DEBUG]:
ike_pfkey.c:621:sadb_acquire_callback(): sadb_acquire_callback: seq=11 satype=96
sa_src=192.168.2.9[0] sa_dst=192.168.2.10[0] samode=82 selid=1921
2013-02-18 01:07:49 [DEBUG]: if_spmd.c:829: SLID ok: 250
ike_tun_sel_out
2013-02-18 01:07:49 [DEBUG]: ikev2.c:756:ikev2_initiate():
creating new ike_sa
2013-02-18 01:07:49 [DEBUG]: ike_sa.c:412:ikev2_allocate_sa():
ikev2_create_sa((nil), 192.168.2.9[500], 192.168.2.10[500], 0xa028660)
2013-02-18 01:07:49 [DEBUG]: ike_sa.c:415:ikev2_allocate_sa():
sa: 0xa029158
2013-02-18 01:07:49 [DEBUG]: ikev2.c:798:ikev2_initiate():
child_sa: 0xa0292f0
2013-02-18 01:07:49 [DEBUG]:
ikev2_child.c:139:ikev2_child_state_set(): child_sa 0xa0292f0 state IDLING
-> GETSPI
2013-02-18 01:07:49 [DEBUG]: ike_pfkey.c:269:sadb_getspi():
sadb_getspi: seq=11, satype=96
2013-02-18 01:07:49 [DEBUG]: ike_pfkey.c:459:sadb_getspi_callback():
sadb_getspi_callback: seq=11, spi=0x0e707d78, satype=96,
sa_src=192.168.2.10[0], sa_dst=192.168.2.9[0]
2013-02-18 01:07:49 [DEBUG]:
ikev2_child.c:139:ikev2_child_state_set(): child_sa 0xa0292f0 state GETSPI
-> GETSPI_DONE
2013-02-18 01:07:49 [DEBUG]:
ikev2_proposal.c:564:ikev2_pack_proposal_sub(): ikev2_pack_proposal_sub:
2013-02-18 01:07:49 [DEBUG]:
ikev2_proposal.c:572:ikev2_pack_proposal_sub(): proposal #1:
2013-02-18 01:07:49 [DEBUG]:
ikev2_proposal.c:564:ikev2_pack_proposal_sub(): ikev2_pack_proposal_sub:
2013-02-18 01:07:49 [DEBUG]:
ikev2_proposal.c:572:ikev2_pack_proposal_sub(): proposal #1:
2013-02-18 01:07:49 [DEBUG]:
ikev2_proposal.c:587:ikev2_pack_proposal_sub(): protocol 1 spi_size 0
2013-02-18 01:07:50 [DEBUG]: dh.c:227:oakley_dh_generate():
compute DH's private.
2013-02-18 01:07:50 [DEBUG]: dh.c:227:oakley_dh_generate():
7a5cd06a 01571ddd 9a95360e 7148fbf2 6807b1bf 43be3764 d347ef05
dccf87f3
60b18f22 4c37fb8a e273ed35 09e5c283 5c7a9792 1fff3eb4 2bceb986
ab478a33
d903cba5 3e0e9fec fd858ff8 ffeee4eb 3b4192e7 2f76b797 4f2be931
d731efd4
6071745d 2a80ccb4 a8c66aa1 76512e85 b7ef4594 a9f6bec9 bc302d36
f7c61261
bc9deaa6 50b6b2d6 f4f9513a 63428d61 d4faed4a e391a162 59e1a61f
5a95e487
03a9f725 b1ce18c6 42fd3492 d713675b 368dd958 ec07f64d 91e8e3f3
93184343
99d46a8c dcf05495 28431137 1845e0e4 8d5ccef6 cb585263 e923ab3d
96fd6e93
2caa4007 f7a9256e f3bf8776 bd380cff e112f784 ba05ec71 21119be0
10d467aa
2013-02-18 01:07:50 [DEBUG]: dh.c:227:oakley_dh_generate():
compute DH's public.
2013-02-18 01:07:50 [DEBUG]: dh.c:227:oakley_dh_generate():
6f3e65ea af4d1d39 96482cd8 26d54f84 61e6e303 c3b44645 d6136174
7ac5a5be
a7aa129e 752ef1b1 2714229a 8b1a16d3 06655816 616d9d65 3007a32d
da61bd83
f0bdfc70 c7a8065f 8e2bd1f0 dda24b70 d86546c4 b5445f87 c16a2d7e
9c51e219
3e043768 75990287 9767232b 9d9a2ca3 a34c8fdf 45497ce9 e55202eb
dd1e9a13
13d94c7e 523d013e 94143312 8b85b6d9 79e610bb 94cb45b5 16b08b8c
27e36099
a8bf85cd 95d50730 c6b0b32f 77cd5ae2 55f65869 833c5cfb e44cd06c
e8c27d95
f88b9a1c d84b7562 c9c0fdbc 7b764d13 f17aa829 328e1089 6bf85340
380e16b4
3381dd53 0b063de7 e1f75936 e226b291 48b57d4b 0c72b742 2dd7aca7
9766f80c
2013-02-18 01:07:50 [DEBUG]: ikev2_payload.c:719:ikev2_notify_payload(): ikev2_notify_payload(0, (nil), 0, 16388, 0xa0281e8, 20)
2013-02-18 01:07:50 [DEBUG]: ikev2_payload.c:719:ikev2_notify_payload(): ikev2_notify_payload(0, (nil), 0, 16389, 0xa027fe0, 20)
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:162:ikev2_packet_construct(): ikev2_packet_construct(34, 0x8, 0x0, 0xa029158, [0xa0240e8, 5])
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 0 type 33 (SA) data 0xa023e48 len 80
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 1 type 34 (KE) data 0xa0244b8 len 260
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 2 type 40 (NONCE) data 0xa023fb8 len 32
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 3 type 41 (NOTIFY) data 0xa0247a0 len 24
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:170:ikev2_packet_construct(): payload 4 type 41 (NOTIFY) data 0xa029128 len 24
2013-02-18 01:07:50 [DEBUG]: ikev2_packet.c:295:ikev2_packet_construct(): result 0xa024410
2013-02-18 01:07:50 [DEBUG]: ikev2.c:562:ikev2_transmit(): ikev2_transmit(0xa029158, 0xa024410) len 468
2013-02-18 01:07:50 [DEBUG]: isakmp.c:1678:isakmp_transmit_noretry(): transmit 0xa029218
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:324:sendfromto(): sockname 192.168.2.9[500]
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:326:sendfromto(): send packet from 192.168.2.9[500]
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:328:sendfromto(): send packet to 192.168.2.10[500]
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:508:sendfromto(): 1 times of 468 bytes message will be sent to 192.168.2.10[500]
2013-02-18 01:07:50 [DEBUG]: sockmisc.c:512:sendfromto():
f7b1ad69 396db4ca 00000000 00000000 21202208 00000000 000001d4
22000054
00000050 01010008 0300000c 0100000c 800e00c0 0300000c 0100000c
800e0080
03000008 01000003 03000008 02000001 03000008 02000002 03000008
02000004
03000008 03000002 00000008 0400000e 28000108 000e0000 6f3e65ea
af4d1d39
96482cd8 26d54f84 61e6e303 c3b44645 d6136174 7ac5a5be a7aa129e
752ef1b1
2714229a 8b1a16d3 06655816 616d9d65 3007a32d da61bd83 f0bdfc70
c7a8065f
8e2bd1f0 dda24b70 d86546c4 b5445f87 c16a2d7e 9c51e219 3e043768
75990287
9767232b 9d9a2ca3 a34c8fdf 45497ce9 e55202eb dd1e9a13 13d94c7e
523d013e
94143312 8b85b6d9 79e610bb 94cb45b5 16b08b8c 27e36099 a8bf85cd
95d50730
c6b0b32f 77cd5ae2 55f65869 833c5cfb e44cd06c e8c27d95 f88b9a1c
d84b7562
c9c0fdbc 7b764d13 f17aa829 328e1089 6bf85340 380e16b4 3381dd53
0b063de7
e1f75936 e226b291 48b57d4b 0c72b742 2dd7aca7 9766f80c 29000024
8a955a65
45c55381 de1f2719 d7dd07e8 3f72387b 36c8901c 5a993376 e818ac07
2900001c
00004004 b21aadcd b80ecf64 eb75bb09 d9a325fd bdbc18fa 0000001c
00004005
2960ca7b 9506f1be 125f1113 6a34609b a02d7226
2013-02-18 01:07:50 [DEBUG]: isakmp.c:1656:isakmp_transmit(): sched 0xa023ff0
2013-02-18 01:07:52 [DEBUG]: ike_sa.c:225:ikev2_sa_periodic_task(): ike_sa: 0xa029158 state 1
2013-02-18 01:07:52 [DEBUG]: ike_sa.c:230:ikev2_sa_periodic_task(): child_sa: 0xa0292f0 state 2
2013-02-18 01:07:55 [DEBUG]: ike_sa.c:225:ikev2_sa_periodic_task(): ike_sa: 0xa029158 state 1
2013-02-18 01:07:55 [DEBUG]: ike_sa.c:230:ikev2_sa_periodic_task(): child_sa: 0xa0292f0 state 2
2013-02-18 01:07:58 [DEBUG]: ike_sa.c:225:ikev2_sa_periodic_task(): ike_sa: 0xa029158 state 1
2013-02-18 01:07:58 [DEBUG]: ike_sa.c:230:ikev2_sa_periodic_task(): child_sa: 0xa0292f0 state 2
2013-02-18 01:08:00 [DEBUG]: isakmp.c:1752:isakmp_retransmit(): retransmit 0xa029218
2013-02-18 01:08:00 [DEBUG]: sockmisc.c:328:sendfromto(): send packet to 192.168.2.10[500]
2013-02-18 01:08:00 [DEBUG]: sockmisc.c:508:sendfromto(): 1 times of 468 bytes message will be sent to 192.168.2.10[500]
2013-02-18 01:08:00 [DEBUG]: sockmisc.c:512:sendfromto():
f7b1ad69 396db4ca 00000000 00000000 21202208 00000000 000001d4
22000054
00000050 01010008 0300000c 0100000c 800e00c0 0300000c 0100000c
800e0080
03000008 01000003 03000008 02000001 03000008 02000002 03000008
02000004
03000008 03000002 00000008 0400000e 28000108 000e0000 6f3e65ea
af4d1d39
96482cd8 26d54f84 61e6e303 c3b44645 d6136174 7ac5a5be a7aa129e
752ef1b1
2714229a 8b1a16d3 06655816 616d9d65 3007a32d da61bd83 f0bdfc70 c7a8065f
8e2bd1f0 dda24b70 d86546c4 b5445f87 c16a2d7e 9c51e219 3e043768
75990287
9767232b 9d9a2ca3 a34c8fdf 45497ce9 e55202eb dd1e9a13 13d94c7e
523d013e
94143312 8b85b6d9 79e610bb 94cb45b5 16b08b8c 27e36099 a8bf85cd
95d50730
c6b0b32f 77cd5ae2 55f65869 833c5cfb e44cd06c e8c27d95 f88b9a1c
d84b7562
c9c0fdbc 7b764d13 f17aa829 328e1089 6bf85340 380e16b4 3381dd53
0b063de7
e1f75936 e226b291 48b57d4b 0c72b742 2dd7aca7 9766f80c 29000024
8a955a65
45c55381 de1f2719 d7dd07e8 3f72387b 36c8901c 5a993376 e818ac07
2900001c
00004004 b21aadcd b80ecf64 eb75bb09 d9a325fd bdbc18fa 0000001c
00004005
2960ca7b 9506f1be 125f1113 6a34609b a02d7226
2013-02-18 01:08:01 [DEBUG]: ike_sa.c:225:ikev2_sa_periodic_task(): ike_sa: 0xa029158 state 1
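The first bytes of the 468-byte message dumped in the log can be decoded by hand to check them against the values racoon2 prints. The following is a minimal Python sketch, not part of racoon2: it assumes the standard IKEv2 header layout from RFC 7296 (28 bytes, network byte order), and the function names and dictionary keys are chosen here for illustration only. Only the first 32 bytes of the dump are used, that is, the IKEv2 header plus the generic header of the first payload.

```python
import struct

# First 32 bytes of the dumped message: the 28-byte IKEv2 header followed
# by the 4-byte generic header of the first payload.
DUMP = ("f7b1ad69 396db4ca 00000000 00000000 "
        "21202208 00000000 000001d4 22000054")

# Only the payload and exchange types that appear in the log are named here.
PAYLOADS = {0: "NONE", 33: "SA", 34: "KE", 40: "NONCE", 41: "NOTIFY"}
EXCHANGES = {34: "IKE_SA_INIT"}


def parse_ike_header(data):
    """Decode the fixed 28-byte IKEv2 header (layout assumed from RFC 7296)."""
    (spi_i, spi_r, next_payload, version,
     exchange, flags, message_id, length) = struct.unpack("!8s8sBBBBII", data[:28])
    return {
        "initiator_spi": spi_i.hex(),
        "responder_spi": spi_r.hex(),
        "next_payload": next_payload,
        "version": (version >> 4, version & 0x0F),  # (major, minor)
        "exchange": exchange,
        "initiator": bool(flags & 0x08),            # I flag
        "response": bool(flags & 0x20),             # R flag
        "message_id": message_id,
        "length": length,
    }


def parse_payload_header(data, offset):
    """Decode a 4-byte generic payload header: next payload, flags, length."""
    next_payload, _critical, length = struct.unpack_from("!BBH", data, offset)
    return next_payload, length


raw = bytes.fromhex(DUMP.replace(" ", ""))
header = parse_ike_header(raw)
sa_next, sa_length = parse_payload_header(raw, 28)

print("exchange:", EXCHANGES.get(header["exchange"], header["exchange"]))
print("first payload:", PAYLOADS.get(header["next_payload"]))
print("total length:", header["length"])
print("payload after SA:", PAYLOADS.get(sa_next), "SA payload length:", sa_length)
```

The decoded fields line up with the log: exchange type 34 and flags 0x8 match the arguments of ikev2_packet_construct(), the total length of 0x1d4 is the 468 bytes reported by ikev2_transmit(), and the SA payload length of 84 is the logged 80 bytes of data plus the 4-byte generic header. The responder SPI is still all zeros because no reply has arrived, which is consistent with the retransmission ten seconds later.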
[root@localhost ~]# ip xfrm state list
src 192.168.2.10 dst 192.168.2.9
    proto esp spi 0x0e707d78 reqid 0 mode tunnel
    replay-window 0
    sel src 192.168.2.10/32 dst 192.168.2.9/32
src 192.168.2.9 dst 192.168.2.10
    proto esp spi 0x00000000 reqid 0 mode tunnel
    replay-window 0
    sel src 192.168.2.9/32 dst 192.168.2.1/32 proto udp sport 34796 dport 1025
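When checking many SAs it can help to pull the fields out of the "ip xfrm state list" output programmatically. The sketch below is a hypothetical helper, not part of iproute2: it runs a regular expression over a copy of the output shown above and flags entries whose SPI is zero. Treating an SPI of 0 as "not yet negotiated" is an assumption of this sketch; it fits the log above, where the IKE_SA_INIT request is still being retransmitted, but it should be confirmed with "ip -s xfrm state" on a real system.

```python
import re

# Copy of the "ip xfrm state list" output shown above.
SAMPLE = """\
src 192.168.2.10 dst 192.168.2.9
    proto esp spi 0x0e707d78 reqid 0 mode tunnel
    replay-window 0
    sel src 192.168.2.10/32 dst 192.168.2.9/32
src 192.168.2.9 dst 192.168.2.10
    proto esp spi 0x00000000 reqid 0 mode tunnel
    replay-window 0
    sel src 192.168.2.9/32 dst 192.168.2.1/32 proto udp sport 34796 dport 1025
"""

# Matches the "src ... dst ..." line together with the "proto ... spi ..."
# line that follows it; \s+ tolerates any line wrapping between the fields.
STATE_RE = re.compile(
    r"src (\S+) dst (\S+)\s+proto (\w+) spi\s+(0x[0-9a-fA-F]+)"
    r" reqid (\d+) mode (\w+)"
)


def parse_states(text):
    """Return one dict per SA found in the given output text."""
    states = []
    for match in STATE_RE.finditer(text):
        src, dst, proto, spi, reqid, mode = match.groups()
        states.append({
            "src": src,
            "dst": dst,
            "proto": proto,
            "spi": int(spi, 16),
            "reqid": int(reqid),
            "mode": mode,
        })
    return states


states = parse_states(SAMPLE)
for state in states:
    # Assumption: SPI 0 means the SA has not been negotiated yet.
    note = "SPI not yet assigned" if state["spi"] == 0 else "SPI assigned"
    print(f'{state["src"]} -> {state["dst"]} {state["proto"]} '
          f'spi=0x{state["spi"]:08x} ({note})')
```

The selector lines ("sel src ...") are deliberately not matched, since they carry no SPI; only the two state entries are returned.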
References
[2] USAGI IPv6 IPsec Development for Linux: http://hiroshi1.hongo.wide.ad.jp/hiroshi/papers/SAINT2004_kanda-ipsec.pdf
[3] Linux Documentation: /usr/src/kernel/docs
[7] XFRM: BEET IPsec mode for Linux: http://lwn.net/Articles/144899/
[10] http://www.croz.net/eng/xfrm-programming/
[13] Routing Cache Removal: http://vincent.bernat.im/en/blog/2011-ipv4-route-cache-linux.html
[14] Removing IP Route Cache: http://vger.kernel.org/~davem/columbia2012.pdf
[15] OpenSSL web page: http://www.openssl.org/