Fun with IPsec stateful failover

By stretch | Monday, August 17, 2009 at 2:00 a.m. UTC

One way to provide failover for IPsec tunnels is to simply configure two independent tunnels between two sites. While simple, this approach means maintaining twice the configuration and consuming twice the address space. Cisco IOS offers an alternative known as stateful IPsec failover, which terminates a single IPsec tunnel on multiple devices at one or both ends.

Consider the following topology of a branch site connected to a corporate headquarters:

topology.png

The branch pictured is just one of dozens which are to be configured similarly. We can use IOS's stateful IPsec failover feature to dual-home a single IPsec tunnel from the branch router (R4) to the two distribution routers (R1 and R2) using HSRP and SSO.

tunnel.png

Configure HSRP

First, an HSRP group must be configured on the two distribution routers:

R1(config)# interface f0/0
R1(config-if)# standby 1 name BRANCH-5-TUNNEL
R1(config-if)# standby 1 ip 10.0.0.15

R2(config)# interface f0/0
R2(config-if)# standby 1 name BRANCH-5-TUNNEL
R2(config-if)# standby 1 ip 10.0.0.15

Further HSRP tweaks, such as setting custom timers or adding interface tracking, can be made as you would expect (and are recommended for a real-world deployment).
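
As a rough sketch of such tweaks, the following shortens the HSRP timers and ties the group's priority to a second interface. The timer values, the decrement of 20, and the tracked interface FastEthernet0/1 (standing in for an inside-facing link) are assumptions for illustration, not part of the lab above:

R1(config)# interface f0/0
R1(config-if)# standby 1 timers 1 3
R1(config-if)# standby 1 track FastEthernet0/1 20

The same lines would be applied to R2. With tracking in place, losing the tracked interface lowers the router's HSRP priority; whether the peer then takes over also depends on preemption, which is disabled in this lab.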

Verify that HSRP is functioning before proceeding:

R2# show standby
FastEthernet0/0 - Group 1
  State is Active
  2 state changes, last state change 00:00:11
  Virtual IP address is 10.0.0.15
  Active virtual MAC address is 0000.0c07.ac01
  Local virtual MAC address is 0000.0c07.ac01 (v1 default)
  Hello time 3 sec, hold time 10 sec
  Next hello sent in 1.936 secs
  Preemption disabled
  Active router is local
  Standby router is unknown
  Priority 100 (default 100)
  Group name is "BRANCH-5-TUNNEL" (cfgd)

Enabling SSO

Stateful switchover (SSO) is an IOS feature which can provide inter-device service synchronization and stateful failover. Here we'll be using it to provide stateful failover for our IPsec tunnel terminated on the two distribution routers.

First we enable inter-device redundancy for our HSRP (standby) group:

R1(config)# redundancy inter-device
R1(config-red-interdevice)# scheme standby BRANCH-5-TUNNEL

Upon configuring inter-device redundancy, you may receive this notice on one of the routers:

% Standby scheme configuration cannot be processed now group BRANCH-5-TUNNEL
 is not in active state

This simply indicates that this is the standby HSRP router. The router will need to be reloaded before the redundancy scheme configuration can take effect.

Second, we define an Inter-process Communication (IPC) association. IPC configuration can look a bit odd until you become familiar with its hierarchy, but it's actually pretty simple in concept. We'll start by creating a new association to define the redundancy relationship between R1 and R2:

R1(config)# ipc zone default
R1(config-ipczone)# association 1
R1(config-ipczone-assoc)# protocol ?
  sctp  SCTP transport configuration

R1(config-ipczone-assoc)# protocol sctp
R1(config-ipc-protocol-sctp)#

Stream Control Transmission Protocol (SCTP) is used to synchronize state across the routers. We complete the SSO configuration by defining the local and remote endpoints of the SCTP connection. The physical address of the HSRP interface on each router is used; the port number is arbitrary, so long as R1's local port matches R2's remote port and vice versa.

R1(config-ipc-protocol-sctp)# local-port 5005
R1(config-ipc-local-sctp)# local-ip 10.0.0.1
R1(config-ipc-local-sctp)# exit
R1(config-ipc-protocol-sctp)# remote-port 5005
R1(config-ipc-remote-sctp)# remote-ip 10.0.0.2

Repeat this configuration on R2, swapping IP addresses where appropriate. Completed, the configurations on R1 and R2 respectively look like this:

! R1
redundancy inter-device
 scheme standby BRANCH-5-TUNNEL
!
ipc zone default
 association 1
  no shutdown
  protocol sctp
   local-port 5005
    local-ip 10.0.0.1
   remote-port 5005
    remote-ip 10.0.0.2

! R2
redundancy inter-device
 scheme standby BRANCH-5-TUNNEL
!
ipc zone default
 association 1
  no shutdown
  protocol sctp
   local-port 5005
    local-ip 10.0.0.2
   remote-port 5005
    remote-ip 10.0.0.1

R1 and R2 need to be rebooted for the redundancy configuration to take effect.

R1# reload
R2# reload

Upon rebooting, the redundancy configuration on the standby router will trigger a second reload, as indicated by the following log message:

%RF_INTERDEV-4-RELOAD: % RF induced self-reload. my state = NEGOTIATION peer
 state = STANDBY COLD

After the standby router has rebooted a second time, the commands show redundancy states and show redundancy inter-device can be used to verify the redundant operation:

R1# show redundancy states
   my state = 8  -STANDBY HOT
 peer state = 13 -ACTIVE
       Mode = Duplex
    Unit ID = 0

Maintenance Mode = Disabled
   Manual Swact = Enabled
 Communications = Up

client count = 12
 client_notification_TMR = 30000 milliseconds
       RF debug mask = 0x0

R1# show redundancy inter-device
Redundancy inter-device state: RF_INTERDEV_STATE_STDBY
  Scheme: Standby
  Groupname: BRANCH-5-TUNNEL Group State: Standby
  Peer present: RF_INTERDEV_PEER_COMM
  Security: Not configured

R2# show redundancy states
   my state = 13 -ACTIVE
 peer state = 8  -STANDBY HOT
       Mode = Duplex
    Unit ID = 0

Maintenance Mode = Disabled
   Manual Swact = Enabled
 Communications = Up

client count = 12
 client_notification_TMR = 30000 milliseconds
       RF debug mask = 0x0

IKE and ISAKMP

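To keep things simple, a minimal IKE/ISAKMP configuration is provided here, using a pre-shared key and the default policy parameters. A configuration this bare is not acceptable for real-world deployments. Note, however, that Cisco's documentation for this feature states that PKI is not supported with stateful failover and that only pre-shared keys are supported for IKE, so a production deployment would need a strong key rather than a switch to RSA.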

The following cryptographic configuration will be identical on both R1 and R2:

crypto isakmp policy 1
 authentication pre-share
crypto isakmp key DontUsePresharedKeys address 172.16.0.18
!
crypto ipsec transform-set MyTransformSet esp-aes esp-sha-hmac 

It is important that R1 and R2 be configured with identical cryptographic policies and keys in order for the IPsec tunnel hand-off to succeed in the event of a failure.

A similar configuration is performed on the branch router (R4):

crypto isakmp policy 1
 authentication pre-share
crypto isakmp key DontUsePresharedKeys address 10.0.0.15
!
crypto ipsec transform-set MyTransformSet esp-aes esp-sha-hmac
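
The ISAKMP policy used in this section leaves the encryption algorithm, hash, and Diffie-Hellman group at their IOS defaults. A sketch that spells these out might look like the following. The specific choices (AES-256, SHA, group 5) are assumptions rather than part of the original lab, and whatever is chosen must match on R1, R2, and R4:

crypto isakmp policy 1
 encryption aes 256
 hash sha
 authentication pre-share
 group 5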

Crypto Maps

Unfortunately, IPsec stateful failover does not yet support virtual tunnel interfaces (VTIs), so we'll have to make do with crypto maps. For the sake of simplicity, we'll limit encrypted traffic to that between 10.99.0.0/16 and 172.16.5.0/24. The following configuration is applied to R1 and R2:

ip access-list extended BRANCH-5-ACL
 permit ip 10.99.0.0 0.0.255.255 172.16.5.0 0.0.0.255
!
crypto map BRANCH-5-MAP 10 ipsec-isakmp 
 set peer 172.16.0.18
 set transform-set MyTransformSet 
 match address BRANCH-5-ACL
 reverse-route

The reverse-route parameter triggers the creation of a static route for local traffic headed toward the far end of the tunnel. If using a dynamic routing protocol in your lab, be sure to redistribute these static routes on R1 and R2 into the protocol so that they are preferred over the dynamic routes. Otherwise, local traffic may be routed out via the standby router only to be dropped; traffic can only be encrypted by the active router in the pair (as it maintains the active IPsec security associations).
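
As a minimal sketch, assuming OSPF process 1 is the dynamic routing protocol in use on the inside (neither the protocol nor the process ID comes from the lab above), the redistribution could look like this on both R1 and R2:

R1(config)# router ospf 1
R1(config-router)# redistribute static subnets

The metric or metric type may need adjusting so that the redistributed route wins over any dynamically learned path to the branch.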

The final bit of configuration on our distribution routers is to apply the crypto map with stateful failover capability:

R1(config)# interface f0/0
R1(config-if)# crypto map BRANCH-5-MAP redundancy BRANCH-5-TUNNEL stateful

R2(config)# interface f0/0
R2(config-if)# crypto map BRANCH-5-MAP redundancy BRANCH-5-TUNNEL stateful

Because encryption works best with something decrypting at the other end, let's add a mirror crypto map to our branch router (R4):

ip access-list extended CORPORATE-ACL
 permit ip 172.16.5.0 0.0.0.255 10.99.0.0 0.0.255.255
!
crypto map CORPORATE-MAP 10 ipsec-isakmp 
 set peer 10.0.0.15
 set transform-set MyTransformSet 
 match address CORPORATE-ACL

This is then applied to the appropriate interface:

R4(config)# interface f0/0
R4(config-if)# crypto map CORPORATE-MAP 

At this point, traffic between 10.99.0.0/16 and 172.16.5.0/24 should be flowing properly through the encrypted tunnel.
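
A few show commands can help confirm this. The first two are standard IOS commands for inspecting IKE and IPsec security associations. The standby keyword and show crypto ha are recalled from Cisco's stateful failover documentation and should be treated as assumptions, since availability may vary by IOS release and platform:

R4# show crypto isakmp sa
R4# show crypto ipsec sa
R1# show crypto isakmp sa standby
R1# show crypto ipsec sa standby
R1# show crypto ha

R1 is queried with the standby keyword here because it is the standby router at this point in the lab.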

Testing Failover

Determine the active router. For the purpose of this lab, it is currently R2:

R2# show redundancy states 
   my state = 13 -ACTIVE 
 peer state = 8  -STANDBY HOT 
       Mode = Duplex
    Unit ID = 0
...
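
Before triggering the failure, it can help to start a long-running ping across the tunnel from the branch so that any loss during the hand-off is visible. The addresses below are hypothetical, chosen from the two protected ranges (10.99.0.0/16 and 172.16.5.0/24) rather than taken from the lab; 172.16.5.1 is assumed to be an address configured on R4's LAN interface:

R4# ping 10.99.0.1 source 172.16.5.1 repeat 1000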

We can observe the failover behavior by shutting down R2's external interface to simulate an outage:

R2(config)# int f0/0
R2(config-if)# shutdown
%HSRP-5-STATECHANGE: FastEthernet0/0 Grp 1 state Active -> Init
%RF_INTERDEV-4-RELOAD: % RF induced self-reload. my state = ACTIVE peer state = STANDBY HOT

While IPsec stateful failover works as advertised, resulting in only minimal traffic disruption during the IPsec association hand-off, it has one little side-effect: the entire router is reloaded. R1 immediately assumes the active role, and R2 eventually reloads to become the hot standby, completing the exchange:

R1# show redundancy states
   my state = 13 -ACTIVE 
 peer state = 8  -STANDBY HOT 
       Mode = Duplex
    Unit ID = 0

While this works, scuttling the current running IOS and reloading the entire router can hardly be considered an elegant response to a simple interface failure. This is particularly limiting in hub-and-spoke topologies like the one examined here; with a dozen branch tunnels terminated on a pair of distribution routers, either could easily be the active router for any number of tunnels.

Given the kamikaze nature of state transitions, coupled with the lack of VTI support, stateful IPsec failover seems unready for real-world deployment.

About the Author

Jeremy Stretch is a network engineer living in the Raleigh-Durham, North Carolina area. He is known for his blog and cheat sheets here at Packet Life. You can reach him by email or follow him on Twitter.

Posted in Security

Comments


Lafcadio Wluki (guest)
August 17, 2009 at 6:46 a.m. UTC

Great article! Tried to implement it before on 2811, but no luck... AFAIK it's only supported on 7xxx routers?


Jeff (guest)
August 17, 2009 at 5:36 p.m. UTC

You had me all the way until 'no VTI support'...bummer


roberto@ipnetworks.it (guest)
August 17, 2009 at 5:56 p.m. UTC

Check on feature navigator for "Stateful Failover for IPsec": http://www.cisco.com/go/fn

Cisco 1841/28xx/37XX/38xx/7200/7301



Andrew (guest)
August 17, 2009 at 10:08 p.m. UTC

I'm about to roll this out using 3825's (the lowest end device supported when we first looked at this feature) but it has been difficult to find code that supports all the features we need. I had to put in a TAC case to find code that didn't have broken RRI. (12.4.15T9)

Anyhow once it's up, it seems to work as advertised. Often only 1-3 dropped pings between failover.

I had no idea about VTI, but that's good to know as I would've expected it to be supported. (A common problem with Cisco in my experience)

Anybody work with this technology and dynamic routing? (on the inside) What's the best approach to get routes from this pair to adjacent routers, since the IPSEC is tied to the same device as the active HSRP?


Abe (guest)
August 18, 2009 at 5:17 p.m. UTC

While I'm sure there are other technical reasons to use a router as a stateful IPSec termination point, routers have never really provided good "stateful" anything to this day...

Also, while I understand this blog focuses primarily on router/switch technologies, for this type of application you really have to consider the ASA/PIX platform in HA mode. They provide seamless failover functionality, maintaining stateful connections and providing a good IPSec VPN termination point. It just seems to fit the bill much better than a traditional router platform (of course there are downsides to this: limited dynamic routing functionality, etc).

Maybe all this boils down to having a good design/architectural blueprint. I guess picking the correct platform to do the job at hand is just as important as understanding how to configure a service on a device.

- a


Supernovae
February 22, 2011 at 9:00 a.m. UTC

As this was an entry from August 2009, does anyone know if SSO with VTI is supported now?

~S


jywzh86 (guest)
February 23, 2011 at 5:13 a.m. UTC

I shut down R2's external interface, then R2 is reloaded. After R2 is booted, it becomes the active router again, and the vpn client cannot connect unless I connect it. Why?


morkfard (guest)
April 2, 2011 at 3:27 p.m. UTC

@ jywzh86:
I've run into the same issue when running a quick test on a pair of 2821s as my hubs. Admittedly, I didn't investigate further as to why that happened. Thinking about it now, perhaps the HSRP preemption on the primary router that I typically use when configuring HSRP was the cause.


Bored (guest)
May 4, 2012 at 6:08 p.m. UTC

According to the document I am reading at the moment, this feature only supports PSK for IPSec, not RSA.

"Public key infrastructure (PKI) is not supported when used with stateful failover. (Only preshared keys for IKE are supported.)"

http://www.cisco.com/en/US/docs/ios-xml/ios/sec_conn_vpnav/configuration/12-4t/sec-state-fail-ipsec.html#GUID-12B9F69D-EE16-4CB6-81DD-427E9A2AC014


JPI
August 30, 2012 at 8:46 a.m. UTC

Many thanks for this article. It inspired me to create a simpler IPSEC failover setup using smaller Cisco 881 routers that do not support SSO.

I have a question regarding routing when using IPSEC failover.

I am using pairs of Cisco 881 at branches in failover setup by applying the crypto map to the HSRP process of the outside interface. This works fine and failover is fast. Routing is static: default route pointing to the internet interface Fa4/WAN. VPN is IPSEC. VPN hub is ASA-5520. All local-LAN traffic is matched by the crypto-acl (source: LOCAL, destination: ANY). LAN routing is static.

Config of the primary:

interface FastEthernet4
 description Fa4 outside (x.y.z.148/26)
 ip address x.y.z.148 255.255.255.192
 ip access-group outside-in in
 standby 4 ip x.y.z.147
 standby 4 timers msec 500 msec 1500
 standby 4 preempt delay reload 120
 standby 4 name hsrp-outside
 standby 4 track 1 decrement 20
 standby 4 track 2 decrement 20
 ip tcp adjust-mss 1452
 crypto map IPSEC redundancy hsrp-outside
 no shutdown
 no cdp enable
!
!# track1 tracks lo0 as the manual failover switch
!# track2 tracks the inside interface (line-protocol)

crypto isakmp policy 10
 encr aes 256
 authentication pre-share
 group 5
crypto isakmp key very-secret-psk address ASA.hub.outside.135 no-xauth
crypto isakmp keepalive 20 periodic
!
crypto ipsec security-association replay window-size 256
!
crypto ipsec transform-set ESP-AES-256-SHA esp-aes 256 esp-sha-hmac
crypto ipsec df-bit clear
!
crypto map IPSEC 10 ipsec-isakmp
 set peer ASA.hub.outside.135
 set security-association lifetime seconds 28800
 set transform-set ESP-AES-256-SHA
 match address 101
!

!# crypto-acl:
access-list 101 permit ip branch.local.lan.0 0.0.3.255 any
access-list 101 permit ip branch.local.transfer.176 0.0.0.15 any
access-list 101 permit ip host branch.local.loopback.5 any
access-list 101 permit ip host branch.local.loopback.6 any

ip route 0.0.0.0 0.0.0.0 x.y.z.129 name "Internet"
ip route branch.local.loopback.6 255.255.255.255 branch.local.transfer.179 name "VPN02 Loopback1"
ip route branch.local.lan.0 255.255.252.0 branch.local.transfer.180 name "LAN Networks"

I did not find a solution for the following:

I want to manage both 881 via the active IPSEC tunnel. E.g. ssh to the inside or loopback interface from central site, Tacacs and syslog is sourced from loopback interface which are both inside the tunnel. While primary is crypto-active and fully reachable via VPN tunnel, the secondary has its static default route pointing to the ISPs router connected to at its own outside interface. With this routing the secondary is not reachable via IPSEC tunnel active at the primary - only via non-tunneled outside IP.

-> What is needed is a kind of dynamic or HSRP-state-dependent temporary routing for the passive router, pointing to the inside of the active router. Or some totally different VPN setup? Any ideas?

Btw: should I tweak the MTU at any interface additionally to the TCP-MSS adjustment?

Thanks in advance.


PTP (guest)
February 19, 2015 at 6:04 p.m. UTC

How would it be possible to create a hub and spoke model with static IPSEC HA, IE having two routers on both ends? This assumes the hub is a single router.


Marek (guest)
December 27, 2015 at 3:41 p.m. UTC

Has something changed if VTI is in use?
