
ClusterXL

Under the hood

CPUG 2011 Chur Switzerland


Wednesday, September 14, 2011

(c) Valeri Loukine 2011

About author
Valeri Loukine

CCMA 0019
Ex-Check Point
Senior Security Consultant - Dimension Data
Email: varera@gmail.com
Blog: http://checkpoint-masterarchitect.blogspot.com/

Agenda

Understanding the Cluster Elements
  CCP
  State synchronization
  Pnote
Check Point Solutions: aka ClusterXL (HA, Load Sharing)
Advanced features and problematic scenarios
3rd party clusters
Some Troubleshooting


CCP

Cluster Control Protocol (CCP) runs over UDP port 8116.
CCP runs on all interfaces (in ClusterXL).

Note: When VLANs are used, CCP runs only on the lowest VLAN ID.
(Not true for VSX)


CCP is in charge of

Health status reports
Cluster member probing
State change commands
Querying for cluster membership
State table synchronization


CCP modes
Multicast or Broadcast
To change: cphaconf set_ccp STATE (STATE is multicast or broadcast)
$FWDIR/boot/ha_boot.conf
The MAC address used for the multicast is determined with a special algorithm.

Checking CCP state


# cphaprob -a if

Required interfaces: 3
Required secured interfaces: 1

eth0       UP      non sync(non secured), multicast
eth1       UP      non sync(non secured), multicast
eth2       UP      sync(secured), multicast

Virtual cluster interfaces: 2

eth0       192.168.10.1
eth1       10.1.1.1
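
For scripted checks, the interface lines of this output can be picked apart with a small parser. The following is a minimal Python sketch, assuming the column layout of the sample on this slide; the `DOWN` label and the helper name `parse_interfaces` are assumptions made for illustration, not part of any Check Point tool.

```python
import re

# Sample text copied from the slide; real output may differ between versions.
SAMPLE = """\
Required interfaces: 3
Required secured interfaces: 1

eth0       UP      non sync(non secured), multicast
eth1       UP      non sync(non secured), multicast
eth2       UP      sync(secured), multicast

Virtual cluster interfaces: 2

eth0       192.168.10.1
eth1       10.1.1.1
"""

# Groups: interface name, link state, sync role, CCP mode.
# "DOWN" is an assumed label for the non-UP state.
IF_LINE = re.compile(
    r"^(\S+)\s+(UP|DOWN)\s+(non sync|sync)\((?:non )?secured\),\s*(\w+)$"
)


def parse_interfaces(text):
    """Return one dict per monitored interface line found in the text."""
    result = []
    for line in text.splitlines():
        match = IF_LINE.match(line.strip())
        if match:
            name, link, role, ccp_mode = match.groups()
            result.append({
                "name": name,
                "up": link == "UP",
                "sync": role == "sync",
                "ccp_mode": ccp_mode,
            })
    return result


interfaces = parse_interfaces(SAMPLE)
sync_ifs = [i["name"] for i in interfaces if i["sync"]]
ccp_modes = {i["ccp_mode"] for i in interfaces}
print(sync_ifs, ccp_modes)
```

The virtual interface lines are deliberately ignored here, since they carry no link state or CCP mode.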


State Sync
Used to exchange kernel table information between cluster members
Composed of two phases: Full Sync and Delta Sync


Full Sync
Happens upon boot
fwd communication on port 256
Does not have to be on the Sync interface


Delta Sync
Done over CCP (UDP 8116)
Updates changes in kernel tables incrementally
Happens with every operation done to a synchronized kernel table


How it works
When a cluster member starts, it requests Full Sync before becoming a Standby member
Full Sync replicates all existing kernel tables and existing connections information


How it works
Upon Full Sync completion, the cluster member changes its state to Standby
From now on, Delta Sync occurs
Only changes are synced
Some may be left unsynced (configurable)

Tuning Sync


Sync summary
Global - supports all kernel table operations
Transparent - does not require direct awareness of its existence
Serves both ClusterXL and third parties without significant changes



Sync summary
User-mode application information is not synced! (Security Servers, etc.)
May require some performance and bandwidth
Can be tuned - the administrator can choose some services not to be synced


fw ctl pstat - sync


Sync:
    Version: new
    Status: Able to Send/Receive sync packets
    Sync packets sent:
     total : 209693, retransmitted : 166, retrans reqs : 129, acks : 54
    Sync packets received:
     total : 134755, were queued : 221, dropped by net : 101
     retrans reqs : 29, received 26 acks
     retrans reqs for illegal seq : 0
     dropped updates as a result of sync overload: 0
    Callback statistics: handled 11 cb, average delay : 1, max delay : 1
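
To watch these counters over time, the "label : number" pairs can be extracted and compared between runs. The sketch below is a hedged Python illustration that assumes the line layout of the sample on this slide; the function name `parse_sync_counters` and the idea of tracking a retransmission ratio are assumptions for illustration, not an official health metric.

```python
import re

# Counter lines copied from the slide; the layout may vary between versions.
SAMPLE = """\
Sync packets sent:
 total : 209693, retransmitted : 166, retrans reqs : 129, acks : 54
Sync packets received:
 total : 134755, were queued : 221, dropped by net : 101
 retrans reqs : 29, received 26 acks
 retrans reqs for illegal seq : 0
 dropped updates as a result of sync overload: 0
"""

PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*:\s*(\d+)")


def parse_sync_counters(text):
    """Collect 'label : number' pairs, keyed by (section, label)."""
    counters = {}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Sync packets"):
            section = "sent" if "sent" in line else "received"
            continue
        for label, value in PAIR.findall(line):
            counters[(section, label.strip())] = int(value)
    return counters


counters = parse_sync_counters(SAMPLE)
# Share of sent sync packets that had to be retransmitted.
retransmit_ratio = counters[("sent", "retransmitted")] / counters[("sent", "total")]
print(f"retransmitted: {retransmit_ratio:.4%}")
```

Values written without a colon, such as "received 26 acks", are not captured by this simple pattern.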


Under the hood


Pnote
Critical device, AKA a Problem Notification (pnote)
If a critical device stops functioning, this is defined as a Failure
fwd, cphad are predefined
Also checked: policy (filter), sync and interfaces

Pnote
To check: cphaprob list
Can be used to cause a failover by adding a new faulty device


cphaprob list
Built-in Devices:

Device Name: Interface Active Check
Current state: OK

Registered Devices:

Device Name: cphad
Registration number: 2
Timeout: 2 sec
Current state: OK
Time since last report: 0 sec

Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Time since last report: 0.8 sec
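
A monitoring script could turn this listing into records and flag any device whose state is not OK. The following Python sketch assumes the "Key: value" layout of the sample on this slide; `parse_devices` is a hypothetical helper name.

```python
# Listing copied from the slide; field names are taken from it as-is.
SAMPLE = """\
Built-in Devices:

Device Name: Interface Active Check
Current state: OK

Registered Devices:

Device Name: cphad
Registration number: 2
Timeout: 2 sec
Current state: OK
Time since last report: 0 sec

Device Name: fwd
Registration number: 3
Timeout: 2 sec
Current state: OK
Time since last report: 0.8 sec
"""


def parse_devices(text):
    """Group 'Key: value' lines into one dict per 'Device Name' entry."""
    devices = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not sep or not value:
            continue  # blank lines and headers such as "Built-in Devices:"
        if key == "Device Name":
            current = {"name": value}
            devices.append(current)
        elif current is not None:
            current[key] = value
    return devices


devices = parse_devices(SAMPLE)
not_ok = [d["name"] for d in devices if d.get("Current state") != "OK"]
print("devices:", [d["name"] for d in devices], "not OK:", not_ok)
```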


Register new device


cphaprob -d <device> -t <timeout(sec)> -s <ok|init|problem> [-p] register
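
When the registration is driven from a script, it helps to build the argument list from the syntax above and validate the state first. This is a minimal Python sketch that only assembles the command and does not run it; the device name `test_device`, the timeout value and the helper name `register_command` are hypothetical values chosen for illustration.

```python
import shlex

# States accepted by -s, as shown in the syntax on this slide.
VALID_STATES = ("ok", "init", "problem")


def register_command(device, timeout_sec, state, p_flag=False):
    """Build the argv for 'cphaprob ... register' without executing it."""
    if state not in VALID_STATES:
        raise ValueError(f"state must be one of {VALID_STATES}, got {state!r}")
    argv = ["cphaprob", "-d", device, "-t", str(timeout_sec), "-s", state]
    if p_flag:
        argv.append("-p")  # optional [-p] switch from the slide
    argv.append("register")
    return argv


# Registering a device in the "problem" state is the failover trick
# mentioned on the Pnote slide.
cmd = register_command("test_device", 0, "problem")
print(shlex.join(cmd))
```

On a gateway the resulting list could be handed to `subprocess.run`, which is left out here so the sketch stays runnable anywhere.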


Clusters basic requirements

OS must be the same.
FW-1 version must be the same.
Installed products must be the same.
NOTE: Check Point recommends that customers use the same hardware.

ClusterXL basics


ClusterXL
CP clustering product (CCP), UDP 8116
Same for both HA and LS solutions
Supports Solaris, SPLAT and Linux, not IPSO
4 modes of operation:
  HA: Legacy and New
  LS: Multicast and Unicast

HA new mode


HA new mode
Active - Standby roles
CCP runs on multicast by default
Active member answers ARP requests (who-has) for the VIP with its physical MAC address


HA new mode
Sync is done
If Active fails, Standby takes over and becomes Active
By default no secondary failover
CCP can be switched to broadcast (flooding the VIP segment)

HA new mode
# cphaprob stat

Cluster Mode:   New High Availability (Active Up)

Number     Unique Address  Assigned Load   State

1 (local)  172.18.100.5    100%            active
2          172.18.100.6    0%              standby
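
The member table in this output is easy to read programmatically, for example to alert when the local member is not in the expected state. Below is a minimal Python sketch, assuming the column order Number, Unique Address, Assigned Load, State as in the sample; `parse_members` is a hypothetical helper name.

```python
import re

# Sample copied from the slide; the layout is assumed, not guaranteed.
SAMPLE = """\
Cluster Mode:   New High Availability (Active Up)

Number     Unique Address  Assigned Load   State

1 (local)  172.18.100.5    100%            active
2          172.18.100.6    0%              standby
"""

# Groups: member number, optional "(local)" marker, address, load, state.
ROW = re.compile(
    r"^(\d+)\s*(\(local\))?\s+(\d+\.\d+\.\d+\.\d+)\s+(\d+)%\s+(.+)$"
)


def parse_members(text):
    """Return one dict per member row of the table."""
    members = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            number, local, address, load, state = match.groups()
            members.append({
                "number": int(number),
                "local": local is not None,
                "address": address,
                "load": int(load),
                "state": state.strip(),
            })
    return members


members = parse_members(SAMPLE)
local_member = next(m for m in members if m["local"])
print(local_member["address"], local_member["state"])
```

Because the state is captured as free text, a value such as "active (pivot)" from the Load Sharing (Unicast) output would be kept intact as well.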


HA legacy mode


HA legacy mode
Linux only
Both members are configured to have the same IP addresses and the SAME MAC addresses on clustered interfaces

Managed through private interfaces



LS multicast


LS multicast mode
Both members process traffic
ARP requests (who-has) are answered with a virtual multicast MAC shared among members
All members receive the packet
Random decision to process



LS multicast mode
# cphaprob stat

Cluster Mode:   Load Sharing (Multicast)

Number     Unique Address  Assigned Load   State

1          192.10.0.1      33%             active
2          192.10.0.2      33%             active
3 (local)  192.10.0.3      33%             active


LS pivot (unicast)


LS pivot mode
Pivot always answers ARP requests (who-has) with its physical MAC
It always gets packets, but forwards some of them to other cluster members
Forwarding is done on the receiving network; the original source MAC is replaced
Load is not equally shared



LS pivot mode
# cphaprob stat

Cluster Mode:   Load Sharing (Unicast)

Number     Unique Address  Assigned Load   State

1 (local)  10.10.10.57     30%             active (pivot)
2          10.10.10.61     70%             active


Advanced parameters


Advanced parameters
Asymmetric Routing
Session from standby (Forwarding)
Block new Conns
Different subnet
Magic MAC
Disconnected interfaces

Asymmetric Routing

C2S packet goes through one cluster member
S2C packet goes through another


Asymmetric Routing
What's the problem?

Race conditions (syn/syn-ack/ack)
Features without sync (Security Servers)
NATed and encrypted connections
Data connections


Asymmetric Routing
Resolution:

Flush and Ack mechanism - holds a packet that made a change in the kernel table until the change is synced successfully
Sticky Decision Function
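
The Flush and Ack idea can be illustrated with a toy model. The Python sketch below is not Check Point's implementation; it only mimics the behaviour described on this slide, with a hypothetical `Member` class whose `connections` set stands in for a synchronized kernel table.

```python
class Member:
    """Toy cluster member; 'connections' stands in for a synced kernel table."""

    def __init__(self, name):
        self.name = name
        self.connections = set()
        self.held = []      # (connection, packet) pairs waiting for a sync ack
        self.released = []  # packets that were allowed to leave the member

    def accept_new_connection(self, conn, packet, peer):
        """Record the change locally, flush it to the peer, hold the packet."""
        self.connections.add(conn)
        self.held.append((conn, packet))
        return peer.apply_update(conn)  # the peer's ack

    def apply_update(self, conn):
        """Apply a sync update from the other member and acknowledge it."""
        self.connections.add(conn)
        return conn

    def on_ack(self, acked_conn):
        """Release packets whose table change the peer has confirmed."""
        waiting = []
        for conn, packet in self.held:
            if conn == acked_conn:
                self.released.append(packet)
            else:
                waiting.append((conn, packet))
        self.held = waiting


member_a, member_b = Member("member_a"), Member("member_b")
conn = ("10.0.0.1", 40000, "192.0.2.10", 80)

ack = member_a.accept_new_connection(conn, "SYN", peer=member_b)
held_before_ack = list(member_a.held)  # the SYN has not left member_a yet
member_a.on_ack(ack)
print(member_a.released, conn in member_b.connections)
```

Because the SYN is released only after the peer knows the connection, a SYN-ACK that returns through the other member finds a matching table entry, which is the race the slide refers to.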


Decision Function


SDF - when?
FTP - the data connections are passed through the same cluster member as the control connection
NATed connections, including Static NAT and Hide NAT
VPN, including encrypted connections generated from SecuRemote/SecureClient or from another VPN gateway

SDF - limitations
Some connection types are not recognized by SDF - default DF will be used
SDF does not work with SecureXL - acceleration will be stopped
Does not work for VPN routing

Session from Standby


If a session starts from Standby:
To the server, it goes directly from Standby
From the server, it goes to the Active member, which then forwards the connection to Standby


Block new conns


If sync is at risk, new connections should not be processed.
Error message:
FW-1: State synchronization is in risk. Please examine your synchronization network to avoid further problems!


Block new conns


fw_sync_block_new_conns
Enable load detection - set to 0
Disable load detection - set to -1
FW-1 default is -1; VSX default is 0

Different Subnet
When the VIP is not on the same subnet as the physical member IP addresses
Automatic ARP is not supported; local.arp is required
May need some additional static routes

Magic MAC
Used by CCP on Layer 2
Belongs to all members on all interfaces
Forward MAC is used to forward packets


Magic MAC
fwha_mac_magic 0xfe
fwha_mac_forward_magic 0xfd


Disconnected Interfaces
Interfaces that do not run CCP
Sync interface must NOT be disconnected
In 3rd party clusters, all interfaces except for the sync interface will not be monitored



Disconnected Interfaces
$FWDIR/conf/discntd.if, reboot
May define in topology as private
No need to list them in 3rd party

3rd party clusters

Were many vendors
Now Crossbeam and IPSO, what else?
ClusterXL - only Sync


Cluster member state


Cluster member state


Active
Active Attention
Down
Ready
Standby
Initializing

Active
Everything is good
Passing traffic


Active Attention
Something is wrong in the cluster
I am passing traffic


Down
One of the critical devices is down
Not passing traffic


Ready
Upgraded, old version member is Active
Not passing traffic


Standby
Everything is good
Not passing traffic


Initializing
Cluster member is booting up, ClusterXL product is already running
VPN-1 Pro is not yet ready
Full Sync is not completed

Troubleshooting tools


CLI
cphaprob list
cphaprob -a if
cphaprob state
fw ctl pstat (check sync data)
fw ctl debug -m cluster xxx



fw ctl debug flags

conf - Configuration related kdebug messages
if - Interface tracking and validation
stat - Cluster module state change
select - Packet selection including DF
ccp - Cluster control packet handling
pnote - Pnote device

fw ctl debug flags

mac - MAC address sync
forward - forwarding layer debug
df - decision function
drop - drops caused by SDF
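
To avoid typos when enabling cluster debugging, a script can validate flag names against the list from these two slides before building the command. This Python sketch only assembles the command in the `fw ctl debug -m cluster xxx` form shown on the CLI slide and does not execute it; the helper name `cluster_debug_command` is hypothetical, and the flag list is limited to what the slides mention.

```python
import shlex

# Flag names and meanings as listed on the two debug-flag slides.
CLUSTER_DEBUG_FLAGS = {
    "conf": "configuration related messages",
    "if": "interface tracking and validation",
    "stat": "cluster module state change",
    "select": "packet selection including DF",
    "ccp": "cluster control packet handling",
    "pnote": "pnote device",
    "mac": "MAC address sync",
    "forward": "forwarding layer debug",
    "df": "decision function",
    "drop": "drops caused by SDF",
}


def cluster_debug_command(flags):
    """Build the argv for 'fw ctl debug -m cluster <flags>' after validation."""
    unknown = [f for f in flags if f not in CLUSTER_DEBUG_FLAGS]
    if unknown:
        raise ValueError("unknown cluster debug flag(s): " + ", ".join(unknown))
    return ["fw", "ctl", "debug", "-m", "cluster", *flags]


command = cluster_debug_command(["pnote", "stat"])
print(shlex.join(command))
```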

Other tips
Snoop (still using UDP port 8116 traffic)
fw monitor (forwarded packets may cause confusion)

Questions And Answers


Thank You For Your Time!
