
Design Considerations for Citrix

XenApp/XenDesktop 7.6 Disaster Recovery


Citrix Solutions Lab White Paper
This paper examines the issues and concerns around building a disaster recovery plan and solution, the
possible use cases that may occur, and how a team of engineers within the Citrix Solutions Lab
approaches building a disaster recovery solution.

November 2015

Prepared by: Citrix Solutions Lab

Table of contents
Section 1: Overview ................................................................................... 5
Executive summary ...................................................................................................... 5
Audience ................................................................................................................... 5
Disaster Recovery vs. High Availability ........................................................................ 6
Defining types of Disaster Recovery ......................................................................... 6
Defining what is critical .............................................................................................. 7

Section 2: Defining the Environment .......................................................... 8


Service Descriptions..................................................................................................... 9
User to Site Assignment ............................................................................................. 10
User Counts by Region ........................................................................................... 10
Regional Site 1 Network Diagram ........................................................................... 11
Regional Site 2 Network Diagram ........................................................................... 12
Cold DR Site Network Diagram ............................................................................... 13
Software ..................................................................................................................... 14
Hardware .................................................................................................................... 14
Servers .................................................................................................................... 14
Network ................................................................................................................... 15
Storage....................................................................................................................... 16
Use Cases.................................................................................................................. 17

Section 3: Deployment ............................................................................. 18


Configuration Considerations ..................................................................................... 19
Region Server Pools .................................................................................................. 22
Failover Process ........................................................................................................ 25

Section 4: Conclusion............................................................................... 27
Section 5: Appendices.............................................................................. 28
Appendix A ................................................................................................................. 28
citrix.com

References .............................................................................................................. 28
Appendix B ................................................................................................................. 29
High Level Regional Diagrams ................................................................................ 29
Appendix C ................................................................................................................. 31
Identifying Services and Applications for DR/HA ..................................................... 31

Section 1: Overview
Executive summary
There is much conversation around executing disaster recovery for a data center, and utilizing high
availability wherever possible. However, what are the requirements around disaster recovery, and how
does it differ from high availability? How do they work together to ensure your systems and applications
are up and available, no matter what?
This white paper looks at understanding disaster recovery and high availability. As with most things in life,
there are trade-offs. The more resilient to failure you want to be, the more it is going to cost. How do
these trade-offs affect you? There is the old-fashioned approach of writing everything of importance to
tape, storing the tape off-site, and waiting for a disaster to occur. Tape is a very low cost option, but it
could take days or weeks to rebuild your environment. At the other end of the spectrum, today's
technology makes it possible to run everything active/active, essentially operating two complete data
centers in two different locations. The two-data-center option is extremely resilient, but also extremely
costly. Put simply, you are betting that a disaster will eventually affect at least one of your sites.
What exactly needs to be up and running as quickly as possible after a failure of your data center? Where
does high availability come into play to help? This document looks at some of these questions, and asks
a few more, to help you understand and make good decisions in building a disaster recovery plan.
This project is not looking at sizing, scaling, or performance, but at design considerations for disaster
recovery. In the Solutions Lab, a team of engineers including lab hardware specialists, network
specialists, storage specialists, architects, and Citrix experts were challenged to build a disaster recovery
solution for a fictitious company defined by Solutions Lab Management. This document shows how the
company was defined, how the team architected and then implemented a solution, and some of the
issues and problems they uncovered: flaws in their plan and things they did not expect or anticipate. The
resulting plan was compared to how companies such as Citrix handle disaster recovery, and it was
found to be very similar. The team had an advantage in that they were able to build the company data
center to fit their design, not try to fit a design to an existing data center. Hopefully what they learned and
uncovered will assist you as you think about building your own disaster recovery plan.
Note that a major component of any disaster recovery solution is the storage and storage vendor used.
The concerns are around the amount of data to be moved between the sites and the acceptable delta
between data synchronizations. For this paper, we worked with EMC, utilizing their storage solution to
achieve our defined goals.

Audience
This paper was written for IT experts, consultants, and architects tasked with designing a disaster
recovery plan.


Disaster Recovery vs. High Availability


Before we can proceed, we need to align on some definitions and terms. For this paper, High Availability
(HA) is focused more on the server level, and is configured in such a manner that the end user
experiences little to no downtime. Recovery can be automatic, simply failing over to another host, server,
or instance of the application. HA is often thought of in terms of N+1: the addition of one more server
(physical or virtual) or application than is required. If five physical servers were required to support the
workload, then six would be configured, with the load distributed across all six servers. If any single server
fails, the remaining five can pick up the workload without significantly affecting the user experience. With
software like Citrix XenDesktop, the same approach applies. If one delivery controller/broker, provisioning
server, or SQL server is not sufficient to support the workload, a second one is deployed. Depending on
the software, this can be either Active/Active, where all instances are actively processing, or Active/Passive,
where the standby instance only becomes active on failure of the first system. In XenDesktop, we always
recommend an Active/Active deployment.
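To illustrate, the N+1 sizing described above reduces to simple arithmetic. The host and user counts in this sketch are illustrative examples, not figures from the lab configuration:

```python
# N+1 capacity planning: deploy one more host than the workload requires,
# so the loss of any single host still leaves enough capacity.
def n_plus_1(required_hosts, users):
    """Return (deployed hosts, users per host normally, users per host
    after a single-host failure), assuming an evenly balanced load."""
    deployed = required_hosts + 1
    return deployed, users / deployed, users / (deployed - 1)

deployed, normal, degraded = n_plus_1(required_hosts=5, users=600)
# 6 hosts deployed; 100 users per host normally, 120 after one host fails.
```

The per-host figure after failure is what must stay within each host's tested capacity for the N+1 model to hold.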
Disaster Recovery (DR) implies a complete disaster, no access to the site or region, total failure. The
recovery will require manual intervention at some point, and the response times for being operational
again are defined by the disaster recovery specifics. We will talk more about this later in this paper.

Defining types of Disaster Recovery


For HA, we talked in terms of Active/Active and Active/Passive. These terms define how the HA
components act: either all are up and supporting users, or one awaits a failure event and then picks up
the load. The same terms can be applied to DR:

Active/Passive (A/P), referred to as planned or cold:

  o  Once a disaster strikes, the second site must be brought up entirely
  o  Only as current as the last backup
  o  Could have hardware sitting idle, waiting for a disaster

Active/Active (A/A), referred to as hot sites:

  o  Everything is replicated in the disaster site
  o  Duplicate hardware
  o  Everything that occurs on the primary site also occurs on the secondary site
  o  Load balanced

Active/Warm (A/W), referred to as reactive or warm:

  o  Some components online, ready
  o  Must define priority recovery
  o  When disaster occurs, provision capacity as needed

In A/P, depending on how quickly you need to be back up and running, it may be as simple as backing up
to tape and, in a disaster, restoring from tape to available hardware. This is the lower-cost solution, but not
very resilient or quick to recover. A/A has duplicate hardware and software running and supporting
users. In a multi-site scenario, each site must have enough additional hardware to support the user
failover. A/A recovers from a disaster much more quickly, but carries a much higher Capital
Expenditure (CAPEX) for hardware. Essentially, each site holds a complete duplicate set of underutilized
hardware waiting for a disaster. With A/W, the plan is to define what is critical to the company and what
must be recovered as quickly as possible, and to have enough bandwidth at the other site(s) to support
that requirement. Once the most critical environment is defined, the rest of the company can be dealt
with. This does require some extra hardware in each region, but the resources and costs can be better
managed.


Defining what is critical


In an A/A deployment, the assumption is that everything is critical and must be up and running. In an A/P
deployment, immediate uptime is not a priority. For A/W, however, we must define what is critical, and
which users are critical. The following terms are used going forward:

Mission Critical (MC): Highest Priority

  o  Requires continuous availability
  o  Breaks in service are very impactful on the company business
  o  Availability required at almost any price
  o  Mission critical users are the highest priority in the event of a failure

Business Critical (BC): High Priority

  o  Requires continuous availability, though short breaks in service are not catastrophic
  o  Availability required for effective business operation
  o  Business critical users have a less stringent recovery time

Business Operational / Productivity (PR): Medium Priority

  o  Contributes to efficient business operation, but does not greatly affect the business
  o  Regular users; may not fail over, or do so as a final step

As stated earlier, we created a fictitious company for this disaster recovery plan scenario. This company
has a single Mission Critical application and a single Business Critical application, and associated users.
The company president defined the acceptable response times and requirements, including a desire to
have a warm failover for mission- and business-critical users, and a passive failover for the rest of the
company. The following sections highlight the development and implementation of the plan.


Section 2: Defining the Environment


For this setup, the fictitious business was structured as one business with two regional sites. The
business requires availability of both the company database (considered Mission Critical) and Exchange
(considered Business Critical). Region 1 focuses on company infrastructure and Region 2 focuses on a
call center. MC and BC users are spread across multiple groups in each region. This setup must also be
able to handle the total failure of both Regions 1 and 2 at the same time.
In a single region failure, the recovery goals for our setup are for MC applications and users to be back up
and running within two hours with minimal data loss. BC applications and users must be back up and
running within four hours with up to 60 minutes of acceptable data loss. If Regions 1 and 2 both fail, the
third site must be up and running within five days with no more than 24 hours of acceptable data loss.
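As a sketch, the recovery objectives above can be captured in a small table and checked against the results of a failover drill. The tier keys are our own shorthand, and treating "minimal data loss" as zero hours is an assumption for illustration:

```python
# Recovery objectives from the scenario, expressed as RTO/RPO in hours.
RECOVERY_OBJECTIVES = {
    "mission_critical":  {"rto_hours": 2,   "rpo_hours": 0},   # "minimal" loss modeled as 0
    "business_critical": {"rto_hours": 4,   "rpo_hours": 1},   # up to 60 minutes of loss
    "cold_dr":           {"rto_hours": 120, "rpo_hours": 24},  # 5 days, 24 hours of loss
}

def meets_objective(tier, recovery_hours, data_loss_hours):
    """Check a measured recovery time and data loss against the targets."""
    target = RECOVERY_OBJECTIVES[tier]
    return (recovery_hours <= target["rto_hours"]
            and data_loss_hours <= target["rpo_hours"])

# A drill restoring BC service in 3.5 hours with 45 minutes of lost data passes:
assert meets_objective("business_critical", 3.5, 0.75)
```

Writing the objectives down in this form makes it easy to score each disaster recovery test against the numbers the business agreed to.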

For a closer look at this diagram by region, see Appendix B at the end of the paper.


Service Descriptions
This table defines our MC, BC and PR services and applications and our considerations in handling them
in our setup.

Mission Critical: Microsoft SQL Sample Database (Northwind)

  o  Description: The SQL sample Northwind database is used along with a web server. This
     represents the Call Center mission critical application database.
  o  Configuration: The SQL sample database is deployed at all locations. Replication is
     handled by the storage backend.
  o  Requirements: In case of major failure, the database must be delivered from the DR
     data center.

Business Critical: Microsoft Exchange / Outlook

  o  Description: Access to email data for business critical users from the Exchange
     database.
  o  Configuration: The database is replicated between primary and secondary locations
     using Exchange database copies. The database is backed up every 4 hours to the
     storage in the DR location.
  o  Requirements: A maintenance message must be presented to external users when the
     database is not available.

Business Operational / Productivity: Microsoft Office and file shares

  o  Description: All users use Microsoft Office to create and review documents. Documents
     are stored on file server shares synced between regions. Microsoft Office is
     published on XenApp.
  o  Configuration: DFS Replication is configured between primary sites, and a file-based
     backup is performed to the DR location every 8 hours.
  o  Requirements: In case of disaster, a limited set of users must have access to the DR
     file share location. Published Microsoft Office must be unavailable to users when the
     file share is not available.


User to Site Assignment


Each regional site in our setup has different types of users. Region 1 is focused on HR and engineering.
Region 2 is focused on call center users. A majority of the users are hosted shared desktop users; the
remaining users are VDI users, either pooled or dedicated.

User Counts by Region


The table below shows the breakdown of users by region and how they are organized within the regions.

Region 1 User Counts     Mission Critical    Business Critical    Business Operational / PR

Engineering              30                  60                   560
HR                       10                  10                   20
Management               5                   5                    -
Region 1 Grand Total     45                  75                   580

Region 2 User Counts     Mission Critical    Business Critical    Business Operational / PR

Call Center              20                  60                   520
Engineering              10                  -                    50
HR                       -                   25                   -
Management               10                  -                    -
Region 2 Grand Total     40                  90                   570


Regional Site 1 Network Diagram

For Region 1, the server configuration consisted of:

  o  Three physical servers running XenServer, hosting infrastructure VMs.
  o  Four physical XenApp hosts in a single delivery group, as a 3+1 HA model supporting
     the business operational users.
  o  Four physical hosts running XenServer configured as a pool, in a 3+1 HA model
     supporting the mission- and business-critical users. This pool supported the
     following configuration:
       - 30 Windows 8.1 Dedicated VDI VMs
       - 90 Windows 8.1 Random Pooled VDI VMs
       - 5 Windows 2012 R2 Multi-user XA/HSD VMs supporting 80 users
  o  The Region 2 failover pool in Region 1 is four XenServer hosts in a 3+1 model
     supporting the following configuration:
       - 25 Windows 8.1 Dedicated VDI VMs
       - 25 Windows 8.1 Random Pooled VDI VMs
       - 5 Windows 2012 R2 Multi-user XA/HSD VMs supporting 80 users
       - 3 SQL 2014 VMs in a cluster (Call Center database failover)




Regional Site 2 Network Diagram

For Region 2, the server configuration consisted of:

  o  Three physical servers running XenServer, hosting infrastructure VMs, including the
     SQL call center cluster.
  o  Four physical XenApp hosts in a single delivery group, as a 3+1 HA model supporting
     the business operational users.
  o  Four physical hosts running XenServer configured as a pool, in a 3+1 HA model
     supporting the mission- and business-critical users. This pool supported the
     following configuration:
       - 25 Windows 8.1 Dedicated VDI VMs
       - 95 Windows 8.1 Random Pooled VDI VMs
       - 5 Windows 2012 R2 Multi-user XA/HSD VMs supporting 80 users
  o  The Region 1 failover pool in Region 2 is four XenServer hosts in a 3+1 model
     supporting the following configuration:
       - 30 Windows 8.1 Dedicated VDI VMs
       - 10 Windows 8.1 Random Pooled VDI VMs
       - 5 Windows 2012 R2 Multi-user XA/HSD VMs supporting 80 users




Cold DR Site Network Diagram

For the DR site, the Region 1 disaster recovery site was set up with four XenServer hosts in a 3+1 HA
model supporting the following configuration:

  o  Windows 8.1 Dedicated VDI VMs
  o  Windows 2012 R2 Multi-user XA/HSD VMs
  o  Infrastructure VMs

The Region 2 disaster recovery site was set up with four XenServer hosts in a 3+1 HA model supporting:

  o  Windows 8.1 Dedicated VDI VMs
  o  Windows 2012 R2 Multi-user XA/HSD VMs
  o  Infrastructure VMs

Note: The networks for Region 1 and Region 2 in this site are set up with the same IP ranges as in the
original regional sites.



Software
The following is a list of software components deployed in the environment:
Component                                      Version

Virtual Desktop Broker                         XenDesktop 7.6 Platinum Edition FP2
VDI Desktop Provisioning                       Provisioning Services 7.6
Endpoint Client                                Citrix Receiver for Windows 4.2 (ICA)
Web Portal                                     Citrix StoreFront 3.0
License Server                                 Citrix License Server 11.12.1
Office                                         Microsoft Office 2013
Virtual Desktop OS (Pooled VDI)                Microsoft Windows 8.1 x64
Virtual Desktop OS (Hosted Shared Desktops)    Microsoft Windows Server 2012 R2 Datacenter
Database Server                                Microsoft SQL Server 2014
Hypervisor                                     XenServer 6.5 SP1
Network Appliance                              NetScaler VPX, NS11.0: Build 62.10.nc
WAN Optimization                               CloudBridge WAN Accelerator CBVPX 7.4.1
Storage Network                                Brocade 5100 switch
Storage DR                                     For XtremIO: EMC RecoverPoint 4.1 SP2 P1;
                                               for Isilon: OneFS 7.2 SyncIQ

Note: All software is updated to run the latest hotfixes and patches.

Hardware
Servers
The hardware used in this configuration consisted of blade servers with two-socket Intel Xeon E5-2670
processors @ 2.60 GHz, 192 GB of RAM, and two internal hard drives.


Network
VMs were utilized as site edge devices that helped route traffic between regions. The perimeter network
(also known as a DMZ) had a firewall between itself and the internet and another firewall between the
perimeter network and production network.
NetScaler Global Site Load Balancing (GSLB) was used to determine which region the user is sent to. If
available, users are sent to their primary region. When the primary region is not available, users are sent
to their secondary region. A pair of NetScaler VPX appliances per region was utilized for authentication,
access, and VPN communications. Additionally, a pair of NetScaler Gateway VPX appliances was
utilized per region to allow connectivity into the XenApp/XenDesktop environment. CloudBridge VPX
appliances were utilized for traffic acceleration and optimization between regions. NetScaler CloudBridge
Connector was configured for IPSec tunneling.
The following diagram is a detailed architectural design of our network implementation.


Storage
Storage was configured using EMC XtremIO All-Flash Storage and Isilon Clustered NAS systems.
Storage Network for EMC XtremIO was configured with Brocade Fibre Channel SAN switches. The
following diagram gives a high-level view for Region 1. As stated previously, failover to a DR site requires
manual intervention, so the concern in syncing data comes down to a math problem: how much data do
you need to sync between sites, and what size pipe connects them? Those two factors determine how
long the sync will take. Can you sync in the time allowed? If not, you must either reduce the amount of
data or increase the pipe speed.
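This math problem can be worked directly. The following sketch computes replication time from the data delta and link speed; the 80% link-efficiency figure and the example sizes are assumptions for illustration, not measured values:

```python
# Estimate how long a site-to-site sync takes for a given data delta.
def sync_hours(delta_gb, link_mbps, efficiency=0.8):
    """Hours to move delta_gb over a WAN link of link_mbps, at the given
    effective utilization (protocol overhead, contention, etc.)."""
    gigabits = delta_gb * 8                    # GB -> gigabits
    effective_mbps = link_mbps * efficiency    # usable throughput
    return gigabits * 1000 / effective_mbps / 3600

# 500 GB of changed data over a 1 Gbps link at 80% efficiency:
hours = sync_hours(500, 1000)
# About 1.39 hours; if that exceeds the allowed window, reduce the delta
# (smaller, more frequent syncs) or increase the pipe.
```

Running this against your acceptable data-loss delta tells you immediately whether the current link can keep the DR copy within the objective.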

One thing to look at is the LUNs, or storage repositories. Our design created multiple volumes for mission
critical data and business critical data, and scheduled syncs accordingly. It is crucial that you work with
the storage vendor to get the proper configuration.


Use Cases
The following use cases define the possible scenarios that must be considered and, for our case study,
the users that must be supported. At minimum, the mission critical and business critical users must be
supported.
Use Case 1

The sites are configured as Active/Active using NetScaler GSLB.

If the Region 1 site fails, mission- and business-critical users will be able to connect and log on to
the Region 2 site with the same data resources as were available in the Region 1 site.

With the Region 1 site back online, NetScaler GSLB will direct users to the correct site, as Region
1 site users log off from the Region 2 site and then log back into the Region 1 site.

A maximum of 120 users will have warm HA failover capability from Region 1 to Region 2.

Use Case 2

The sites are configured as Active/Active using NetScaler GSLB.

If the Region 2 site fails, mission- and business-critical users will be able to connect and log on to
the Region 1 site with the same data resources as were available in the Region 2 site.

With the Region 2 site back online, NetScaler GSLB will direct users to the correct site, as Region
2 site users log off from the Region 1 site and then log back into the Region 2 site.

A maximum of 130 users will have warm HA failover capability from Region 2 to Region 1.

Use Case 3: Cold DR

  o  The sites are configured as Active/Passive, with the goal of failing over only the
     mission critical users from the Region 1/Region 2 sites to the DR site.
  o  This site will be based on backup data from Region 1 and Region 2 and will go live
     within 5 days.
  o  Manual process to switch to the DR site.
       - When users log in to the DR site, they should have any changes/modifications to
         their dedicated environment in the DR site environment. There is potential for
         data loss between the last site-to-site copy and the failover. Once failed over
         to the DR site, when Region 1/Region 2 return online, and after allowing
         appropriate time for replication between sites, logins should connect to
         Region 1/Region 2 and the changes should be reflected there.
  o  The cold DR site will contain a subset of the regional sites, including networking,
     infrastructure, and dedicated VDIs.
       - This approach allows us to both easily recover from disaster with backups and
         later rebuild the regional sites from the DR site data.
  o  Mission Critical users will have primary access to the cold DR site, followed by
     Business Critical, and then the rest of the company, depending on timelines and
     disaster impact.
       - A maximum of 45 users will have cold DR access from Region 1.
       - A maximum of 40 users will have cold DR access from Region 2.



Section 3: Deployment
This document is not a step-by-step manual for building this configuration, but a guide to help you
understand what needs to be done. Wherever possible, Citrix documentation was followed for deployment
and configuration. The following configuration sections highlight any deviations or areas of importance to
help with a successful deployment.
Implementing the software breaks down into two major areas: first, putting the correct software into each
region; second, configuring NetScaler for GSLB.
The process followed for deployment was:
1. Deploy XenServer pools.
2. Create required AD groups and DHCP scopes.
3. Prepare the SQL environment (SQL AlwaysOn). PVS 7.6 adds support for AlwaysOn.
4. Deploy the XenDesktop environment.
5. Deploy StoreFront servers and connect them to XenDesktop.
6. Deploy the PVS environment and create the required vDisks.
7. Configure NetScaler GSLB; create the site and service.
8. Configure NetScaler Gateway in Active/Passive mode and update the StoreFront configuration.
9. Deploy the Microsoft Exchange environment.
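The ordering above matters because later steps depend on earlier ones. As an illustration only, the sequence can be treated as a dependency check; the step names and dependency edges below are our shorthand for the numbered list, not Citrix tooling:

```python
# Each deployment step paired with the steps it depends on.
DEPLOY_STEPS = [
    ("xenserver_pools", []),
    ("ad_dhcp",         ["xenserver_pools"]),
    ("sql_alwayson",    ["ad_dhcp"]),
    ("xendesktop",      ["sql_alwayson"]),          # XD site DB lives on AlwaysOn
    ("storefront",      ["xendesktop"]),
    ("pvs_vdisks",      ["sql_alwayson", "xendesktop"]),
    ("gslb",            ["storefront"]),
    ("gateway",         ["gslb", "storefront"]),
    ("exchange",        ["ad_dhcp"]),
]

def valid_order(steps):
    """True if every step's dependencies were deployed before it."""
    done = set()
    for name, deps in steps:
        if not all(d in done for d in deps):
            return False
        done.add(name)
    return True

assert valid_order(DEPLOY_STEPS)
```

Writing the plan this way makes it easy to verify that a re-ordered or DR-site rebuild sequence still satisfies every prerequisite.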

The NetScaler configurations are straightforward; nothing special was done in configuring StoreFront.
This was a typical XenDesktop and NetScaler Gateway configuration. Two StoreFront servers were
configured to be load balanced by NetScaler.
NetScaler GSLB is where the focus is:

  o  Using LB method StaticProximity: Region 1 users will be sent to Region 1 if it is
     online; otherwise, the users will be sent to Region 2, and vice versa.
  o  Using location settings in NetScaler to define the primary regions of the clients'
     local DNS servers and for the GSLB sites and services.
  o  Users, regardless of region, use the same Fully Qualified Domain Name (FQDN) (i.e.,
     desktop.domain.com); NetScaler running ADNS will answer authoritatively with the IP
     of the primary site.
  o  Once the user is redirected to the proper site, the user authenticates at the Access
     Gateway, and is then redirected to the local StoreFront to get access to resources.
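The StaticProximity decision described above can be modeled in a few lines. This is a simplified sketch with hypothetical region names, not actual NetScaler configuration:

```python
# Simplified model of GSLB static proximity across two regional sites.
def gslb_target(user_region, site_up):
    """Send users to their home region if online, otherwise to the other
    region (NetScaler ADNS would answer with that site's IP)."""
    other = {"region1": "region2", "region2": "region1"}[user_region]
    if site_up.get(user_region):
        return user_region
    if site_up.get(other):
        return other
    return None  # both regional sites down: cold DR procedures apply

assert gslb_target("region1", {"region1": True,  "region2": True})  == "region1"
assert gslb_target("region1", {"region1": False, "region2": True})  == "region2"
assert gslb_target("region1", {"region1": False, "region2": False}) is None
```

The `None` branch corresponds to Use Case 3: when neither regional site answers, the manual cold DR failover takes over.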

Additionally, NetScaler CloudBridge Connector is configured for IPSec tunneling:

  o  An IPSec tunnel for AD replication and server/client communication is created using
     the outbound connection.
  o  A second IPSec tunnel is created for site-to-site data replication.


Configuration Considerations
The following defines some of the specific configurations applied to the environment:
XenApp/XenDesktop

  o  Regional Sites R1 and R2
       - 2 Delivery Controllers per primary regional site
       - FMA services use SSL on the Controllers, and the XML Service ports were changed
         from HTTP to HTTPS to secure traffic communication
       - XD/XA database on the AlwaysOn SQL group
       - SSL to VDA feature of XenApp and XenDesktop 7.6
  o  Hosted shared desktops
       - 5 Machine Catalogs: Physical XA HSD, XA HSD MC, XA HSD BC, XA HSD MC Failover,
         XA HSD BC Failover
       - 5 Delivery Groups matching the catalogs
  o  Pooled VDI desktops
       - 4 Machine Catalogs: PR, BC, PR Failover, BC Failover
       - 4 Delivery Groups
  o  Dedicated VDI desktops
       - Must have unique Site database naming
       - 4 Machine Catalogs: MC, BC, MC Failover, BC Failover
       - 4 Delivery Groups



VDI Virtual Desktops

  o  Pooled Random VDI desktops
       - VDI VMs streamed from PVS vDisk
  o  Dedicated VDI desktops
       - Static VMs
       - My Documents must be redirected to a network location on the file share

XenApp / HSD

  o  Deployed in two models
       - Physical hosts in an N+1 HA model, manually installed on hardware
       - Virtualized XA HSD VMs in an N+1 HA model, streamed from PVS vDisks

User Profile Manager

  o  Hosted Shared Desktop user profile data: \\FS01\ProfileData\HSD\#SAMAccountName#
  o  Hosted Virtual Desktop user profile data: \\FS01\ProfileData\HVD\#SAMAccountName#
  o  Hosted Virtual Desktop user profile data: \\FS01\ProfileData\MC\#SAMAccountName#
  o  User profile and folder redirection policies

StoreFront VMs

  o  SSL configured to secure traffic communication
  o  2 StoreFront servers (HA), load balanced by NetScaler VPX
  o  Authentication is configured on NetScaler Gateway

License Server VM

  o  2 HA license servers
  o  SSL configured to secure traffic communication
  o  Windows 2012 RDS licenses
  o  Citrix License Server



Isilon Scale-out NAS for each site

  o  4 X410 nodes
       - 34 TB HDD + 1.6 TB SSD
       - 128 GB RAM (8 x 16 GB)

Provisioning Services

  o  2 PVS server VMs in HA
  o  PVS DB server configured on SQL AlwaysOn
  o  Utilizing a remote storage location for vDisks on each PVS; remote storage attached
     to the PVS VMs as a 2nd drive via file server and SMB/CIFS
       - Separate vDisk store locations for Mission Critical and Business Critical vDisks
         on the file server via SMB/CIFS
       - Regular vDisks located on local file servers
  o  Multihomed
       - Utilizing the Guest VLAN as the management interface
       - Utilizing the PXE VLAN for the streaming interface
  o  DHCP for the PVS network/PXE VLAN
  o  Cache in device RAM with overflow on hard disk
       - 256 MB for Windows 8.1 VDI
       - 2048 MB for XA HSD

NetScaler VPX VMs

  o  2 LB VPX in HA mode
       - LDAP authentication
       - AG VIP
       - VPN
       - GSLB for regional sites
  o  2 VPXs for LB of StoreFront and XML services


Region Server Pools


The following defines the VM breakdown per region for the different pools required within the
infrastructure environment. In all cases, the VMs were balanced across XenServer hosts, and VMs were
configured in an HA model; a minimum of two VMs for each required application.
Region 1:

  o  2 XenDesktop Brokers
  o  2 StoreFront VMs
  o  2 License Server VMs
  o  2 Provisioning Services VMs
  o  2 File Server VMs
  o  3 SQL 2014 Database Server VMs (AlwaysOn)
  o  2 AD DC VMs
  o  4 Exchange server VMs
       - 2 Mailbox
       - 2 Client Access
  o  Perimeter network
       - 1 Firewall/Router VM
       - 2 NetScaler VPX VMs (HA model, user access)
       - 2 CloudBridge VPX VMs (HA model, Active/Passive, site-to-site user access WAN
         optimization)
       - 2 CloudBridge VPX VMs (HA model, Active/Passive, site-to-site data replication)
       - 2 NetScaler VPX VMs (HA model, data replication)
  o  R2 HA failover pool
       - 5 XA HSD VMs
       - 25 Pooled VDI VMs
       - 25 Dedicated VDI VMs
       - 3 SQL Server VMs (Call Center cluster)



Region 2:

  o  2 XenDesktop Brokers
  o  2 StoreFront VMs
  o  2 License Server VMs
  o  2 Provisioning Services VMs
  o  3 SQL 2014 Database Server VMs (SQL cluster)
  o  3 SQL 2014 Database Server VMs (AlwaysOn)
  o  2 File Server VMs
  o  2 AD DC VMs
  o  4 Exchange server VMs
       - 2 Mailbox
       - 2 Client Access
  o  Perimeter network
       - 1 Firewall/Router VM
       - 2 NetScaler VPX VMs (HA model, user access)
       - 2 CloudBridge VPX VMs (HA model, Active/Passive, site-to-site user access WAN
         optimization)
       - 2 CloudBridge VPX VMs (HA model, Active/Passive, site-to-site data replication)
       - 2 NetScaler VPX VMs (HA model, data replication)
  o  R1 HA failover pool
       - 5 XA HSD VMs
       - 10 Pooled VDI VMs
       - 30 Dedicated VDI VMs



Region 3:

Infrastructure pool to support Region 3:

  o  2 AD DC VMs
  o  1 VM to handle backups from Regions 1 and 2

Region 1 infrastructure pool:

  o  2 AD DC VMs
  o  2 Delivery Controllers
  o  2 StoreFront VMs
  o  2 License Server VMs
  o  1 File Server VM
  o  2 SQL 2012 Database Server VMs (AlwaysOn)
  o  4 Exchange server VMs
       - 2 Mailbox
       - 2 Client Access

Region 2 infrastructure pool:

  o  2 AD DC VMs
  o  2 Delivery Controllers
  o  2 StoreFront VMs
  o  2 License Server VMs
  o  1 File Server VM
  o  2 SQL 2014 Database Server VMs (AlwaysOn)
  o  2 SQL 2014 Database Server VMs (SQL cluster)
  o  4 Exchange server VMs
       - 2 Mailbox
       - 2 Client Access

Perimeter network:

  o  1 Firewall/Router VM
  o  2 NetScaler VPX VMs (R1/R2 access)
       - VIP per region (2 VIPs)

Note: The infrastructure VMs for Regions 1 and 2 were duplicated in Region 3 for networking purposes.
With the networks set correctly in Region 3, no network changes were required in the infrastructure or
VHD files once Regions 1 and 2 were brought up.


Failover Process
The dedicated VMs present the biggest challenge in a failure. To address this, matching VMs are
pre-created in each region to receive the other region's dedicated desktops; however, no storage is
attached to these VMs. In the event of a failure, each of these VMs is assigned the proper VHD file from
the backup storage location. For fail-back after the failed region is back online, the dedicated VM VHD
files are deleted in the failed region, copied back from the failover region, and attached to the proper
VMs. This ensures the latest version of each dedicated VM is restarted after the fail-back.
Note: In dealing with the dedicated VMs, we realized that we had to name the VHD files and their
associated files carefully to ensure that the correct VHD file is connected to the correct VM during
failover and fail-back.
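The naming discipline this note describes lends itself to a simple automated check. The sketch below is illustrative only: the `-dedicated.vhd` suffix convention and the VM names are our assumptions, not what the lab used. It pairs backed-up VHD files with their pre-created failover VMs and flags any VM it cannot match, rather than guessing:

```python
def match_vhds_to_vms(vm_names, vhd_filenames, suffix="-dedicated.vhd"):
    """Pair each pre-created failover VM with its backed-up VHD file.

    Assumes the hypothetical convention that a VM named 'W7-USER042'
    owns the file 'W7-USER042-dedicated.vhd'.
    Returns (matches, unmatched_vms).
    """
    # Index the backup files by the VM-name stem of the filename.
    vhds_by_stem = {}
    for name in vhd_filenames:
        if name.endswith(suffix):
            vhds_by_stem[name[: -len(suffix)]] = name

    matches, unmatched = {}, []
    for vm in vm_names:
        if vm in vhds_by_stem:
            matches[vm] = vhds_by_stem[vm]
        else:
            unmatched.append(vm)  # flag rather than guess at an attachment
    return matches, unmatched
```

Running such a check before attaching storage catches a mis-named backup before it becomes a mis-attached desktop.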
If there is a failure in either Region 1 or Region 2 (what's called a warm failover), a few steps need to be
taken, and the actions differ depending on the failure. If it is a network access issue, or the Internet is
down, the dedicated VMs in the failed region are placed in Maintenance Mode in Citrix Studio and shut
down. The latest storage backup of the dedicated VMs must be made available in the surviving region,
and the storage for each VM needs to be attached individually to the pre-created VMs already present
there. A Group Policy applied to the dedicated VMs' OU imports the registry value listing the local
delivery controller host names, allowing the VDAs to register with the local delivery controllers. The
pooled VDI and XA HSD VMs on the local delivery site are also taken out of Maintenance Mode and
brought online.
For Region 2, the SQL database for the call center application is brought online as well. Depending on
the type of failure, you may need to power down the failed region's firewall to force failover to the other
region.
Once those steps are completed, you boot the mission-critical and business-critical user VMs;
mission- and business-critical data is kept in sync between the sites. You can then communicate
availability to your users. End users use the same URL as always, with GSLB redirecting as required.
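The warm-failover actions above are order-sensitive, which makes them a natural fit for a scripted runbook. The following sketch only names the steps; every step name is ours, and each callable would wrap the real Citrix Studio, XenCenter, or SQL Server action:

```python
# Ordered warm-failover runbook; each entry is a placeholder for the
# real action performed in Citrix Studio, XenCenter or SQL Server.
WARM_FAILOVER_STEPS = [
    "maintenance_mode_failed_region_dedicated_vms",
    "shut_down_failed_region_dedicated_vms",
    "attach_latest_vhd_backups_to_precreated_vms",
    "apply_gpo_with_local_delivery_controller_list",
    "exit_maintenance_mode_pooled_vdi_and_xa_hsd",
    "bring_call_center_sql_online",        # Region 2 failover only
    "boot_mission_critical_user_vms",
    "boot_business_critical_user_vms",
    "notify_users",                        # same URL; GSLB redirects
]

def run_failover(actions):
    """Execute each step in order, stopping at the first failure.

    `actions` maps step names to callables returning True on success;
    missing steps default to a no-op so the sketch stays runnable.
    """
    completed = []
    for name in WARM_FAILOVER_STEPS:
        if not actions.get(name, lambda: True)():
            return completed, name  # report where the runbook stalled
        completed.append(name)
    return completed, None
```

Keeping the order in one list makes it obvious where a failed step leaves the process, which matters when different failure types skip different steps.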
For fail-back after recovery of the failed region has completed, the steps are to sync all storage back to
the failed site, perform the necessary steps for the dedicated VMs, bring the applications back online, and
bring up the users.
In a full loss of both Region 1 and Region 2, the DR site, Region 3, needs to be brought online. The
physical servers are powered up, making the XenServer pools accessible. The latest database and
Exchange information is imported, and the infrastructure and user VDI VMs are restored and brought
online. A new URL is required to log in, so once the site is up, any new information, such as the new
access URL, needs to be given to your users.



The following defines the steps required to recover Region 3 and bring it back online:

Active Directory, DNS and DHCP
  o Import Domain Controllers from backup and restore Active Directory functionality
  o Update DNS records for StoreFront / Access Gateway / Exchange MX
  o Create DHCP scopes

NetScaler
  o Rebuild NetScaler components, including NetScaler Gateway

XenServer
  o Turn the existing XenServer pools on

XenDesktop Environment
  o Import SQL VMs and restore the XenDesktop, PVS and Call Center application databases
  o Import StoreFront, XenDesktop and PVS VMs and test connectivity to the databases

Exchange Environment
  o Import Client Access and Mailbox Servers and restore databases

File Services
  o Restore access to file services, user data and UPM

External DNS
  o Update external DNS records for the Access Gateway URLs
  o Update external MX records for email
  o Update Outlook Anywhere, ActiveSync, etc. DNS records
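These bring-up steps have hard ordering dependencies (the XenServer pools before any VM import, SQL before the Delivery Controllers). One way to encode and sanity-check such an ordering is a topological sort; the dependency graph below is our reading of the list above, not an official Citrix sequence:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each service maps to the set of services it depends on.
# The edges reflect our interpretation of the recovery list.
DR_DEPENDENCIES = {
    "xenserver_pools": set(),
    "ad_dns_dhcp": {"xenserver_pools"},
    "netscaler": {"ad_dns_dhcp"},
    "file_services": {"ad_dns_dhcp"},
    "sql": {"ad_dns_dhcp"},
    "xendesktop_pvs_storefront": {"sql", "file_services"},
    "exchange": {"ad_dns_dhcp"},
    "external_dns": {"netscaler", "exchange", "xendesktop_pvs_storefront"},
}

# static_order() raises CycleError if the runbook contradicts itself,
# so an impossible ordering is caught before a real disaster.
recovery_order = list(TopologicalSorter(DR_DEPENDENCIES).static_order())
```

Encoding the runbook this way also documents *why* a step sits where it does, which is easy to lose in a flat checklist.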


Section 4: Conclusion
As stated in the beginning, the goal of this project was to challenge a group of engineers with creating a
disaster recovery plan for a fictitious company. This meant understanding what was mission critical,
business critical, and normal day-to-day work, and what applications and data needed to be ready in case
of a disaster. This also meant understanding user needs for issues like dedicated VMs. This paper
highlights and defines some of the issues around creating a disaster recovery environment. This is not a
how-to, step-by-step manual, but a guide to help you understand the issues and concerns in doing
disaster recovery, and things to consider when defining your disaster plan. It shows you how the Citrix
Solutions Lab team of engineers defined, designed, and implemented a DR plan for a fictitious company.
This may not be the optimal solution for your company, but it is one you can use as a baseline of
considerations and operational steps when you create a disaster recovery plan for your own company.
During the process of deploying and testing, there were some realizations and changes made. One of the
first was around failing back after a failover: how to handle the data. Do you sync back, or delete and
copy back? Our decision was to delete and copy back, ensuring the original site is clean and up to date.
Another realization was around the configuration of GSLB and the failed site. Since preparing the
failover site for access requires manual intervention, there is the potential for GSLB to redirect users to
the failover site before it is ready. Users could hit a StoreFront before any personal desktops or
applications are available for them, although they would still have access to any common applications or
desktops.
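One common mitigation for this race, sketched below purely as an assumption on our part (the lab did not implement it), is to point the GSLB monitor for the failover site at a readiness probe that operators flip only after the manual steps finish. The flag path is hypothetical:

```python
import os

# Hypothetical flag path; an operator creates this file only after the
# manual failover-site preparation steps have all completed.
READY_FLAG = "/var/run/dr/failover_site_ready"

def site_ready(flag_path=READY_FLAG):
    """Readiness probe a GSLB service monitor could target, so GSLB
    marks the failover site up only once it is actually prepared."""
    return os.path.exists(flag_path)
```

Until the flag exists, the monitor reports the site down and GSLB keeps users away; removing the flag during fail-back reverses the flow.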
We used two different SQL approaches: AlwaysOn availability groups for our infrastructure environment
and clustering for our database application. This was done by design in the lab to show the issues and
considerations around both.
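The two approaches also look different from the client side. As a hedged sketch (the server names are hypothetical), an AlwaysOn availability group is usually reached through its listener with the `MultiSubnetFailover` keyword enabled, while a failover cluster instance is reached through the cluster's virtual network name:

```python
# Hypothetical names; substitute your own listener / cluster network name.
ALWAYSON_CONN = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=ag-listener.example.local;"   # availability group listener
    "Database=CitrixSite;"
    "MultiSubnetFailover=Yes;"            # try all listener IPs in parallel
    "Trusted_Connection=Yes;"
)

CLUSTER_CONN = (
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=sqlclus-vnn.example.local;"   # cluster virtual network name
    "Database=CallCenter;"
    "Trusted_Connection=Yes;"
)
```

The practical difference: after an AlwaysOn failover the client reconnects quickly via whichever replica now owns the listener, while a cluster failover looks like a brief outage on a single name.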
One thing our fictitious company's president was less than thrilled with was the capital expense of
supporting high availability between the two main regions plus a third region for total failover: hardware
that is not fully utilized. This is, however, a cost of doing business.
However, with the recent introduction of Citrix Workspace Cloud, an alternative has emerged that we are
reworking our fictitious company toward. Rather than having additional hardware in Regions 1 and 2,
what if there were a cloud site running at a minimum, waiting for a region to fail, that could spin up
what is needed to support the failure? Essentially, what is needed in the cloud is a NetScaler VPX for
connectivity, an AD server, a SQL AlwaysOn server, and an Exchange server. This keeps the mission-
critical and business-critical environments in sync. You can then determine what else may be required to
support each region. The one current caveat is that no cloud currently supports desktop operating
systems; VDI users get server operating systems running in a desktop mode. This is not a major issue for
pooled VDI users, but it does become something to be solved for dedicated VDI users.
Will the cloud work for you? Should you use additional hardware in your regions? What are your recovery
times? How much of your environment is actually mission critical? These are questions we hope you are
now considering as you build a disaster recovery plan for your company.


Section 5: Appendices
Appendix A
References
EMC Storage
http://www.emc.com/en-us/storage/storage.htm?nav=1
Brocade Storage Network
http://www.brocade.com/en/products-services/storage-networking/fibre-channel.html
XenApp
http://www.citrix.com/products/xenapp/overview.html
XenDesktop
http://www.citrix.com/products/xendesktop/overview.html
NetScaler
http://www.citrix.com/products/netscaler-application-delivery-controller/overview.html
CloudBridge
http://www.citrix.com/products/cloudbridge/overview.html
Citrix CloudBridge Data Sheet
https://www.citrix.com/content/dam/citrix/en_us/documents/products-solutions/cloudbridge-data-sheet.pdf


Appendix B
High Level Regional Diagrams


Appendix C
Identifying Services and Applications for DR/HA
This section identifies all the applications, services and data items for planning within our setup.

Call Center
Type: Database and App
Description: Main application for call center activity required for company mission critical function
Level: Mission Critical
Primary Location: Region 2 (West Coast), Region 1, R3/DR in case of failover or disaster
Access Methods:

Local Web Browser

Published App Web Browser

Data: SQL Database

actual test database - Microsoft SQL Sample StoreFront

Data Location: SQL 2014 Cluster


Systems:

SQL 2014 Database servers

Web Servers

Notes:

Database servers and database must be made accessible in R1 and R3/DR in case of fail-over or
disaster

Both the database and the web site for it would need to be created

http://businessimpactinc.com/install-northwind-database/

https://msdn.microsoft.com/en-us/library/vstudio/tw738475%28v=vs.100%29.aspx

Exchange
Type: Service
Description: Email service, required for internal and external communication
Level: Business Critical
Primary Location: Region 1 & 2, R3/DR in case of disaster

Some Exchange databases are region specific

Access Methods:

Local Outlook Application

Published Outlook Application

Web Outlook

Data: Exchange Databases




Data Location: Exchange Servers
Systems:

Exchange Mailbox Servers

Exchange Client Access Servers

Notes

Exchange will need to be accessible in DR scenario in R3/DR for mission-critical users

Microsoft Office
Type: Application
Description: Productivity applications for regular office work
Level:

Outlook - Business Critical

other office apps - Productivity

Primary Location: Region 1 & 2, R3/DR in case of disaster


Access Methods:

Local Outlook Application

Published Outlook Application

Web Outlook

Data:

Outlook Data File

Outlook Address Book

Exchange Mailbox

Exchange Address Book

Data Location:

Exchange Servers

User Outlook file location (redirected from My Documents to UPM storage?)

Systems:

Exchange Mailbox Servers

Exchange Client Access Servers

Notes:

Outlook needs to be available in all regions in case of failover for business critical users.

XenDesktop
Type: Service
Description: Virtual Desktop Brokering and management system, required for virtual desktop access and
assignment
Level: Mission Critical


Primary Location: Region 1 & 2, R3/DR in case of disaster
Data: XD Site Databases, region specific.
Data Location: SQL AlwaysOn HA Group
Systems:

XD Delivery Broker Server VMs

Citrix Licensing Server VMs

Notes:

Must be available in all regions for mission- and business-critical users to be able to access
desktops.

For R3/DR, the XenDesktop database and the SQL servers supporting it must be brought
up before the XD Delivery Controllers

Licensing server must be available for XenDesktop functionality to allow user connections

StoreFront
Type: Service
Description: Web Portal into the XenDesktop environment, required for user session access
Level: Mission Critical
Primary Location: Region 1 & 2, R3/DR in case of disaster
Access Methods: Web Browser, Citrix Receiver
Data: SF configuration
Data Location: SF servers
Systems: Storefront Server VMs
Notes:

Must be available in all regions for mission- and business-critical users to be able to access
desktops.

Provisioning Services
Type: Service
Description: Virtual desktop VM streaming and deployment system, required to launch the virtual
desktop VMs
Level: Mission Critical
Primary Location: Region 1 & 2, R3/DR in case of disaster
Access Methods: PXE and DHCP for the Virtual Desktop VMs
Data:

PVS Farm Databases

vDisks

Data Location:

Farm Database - SQL AlwaysOn HA Group


vDisks - File Servers

Systems:

PVS Server VMs

File Servers (for vDisks)

Notes:

Licensing server must be available for PVS functionality to allow virtual desktop launch

User Profiles
Type: Data
Description: User data required for all users' work on virtual desktops
Level: Mission Critical
Primary Location: Region 1 & 2, R3/DR in case of disaster
Access Methods: SMB
Data: User personal data, including redirected My Documents
Data Location: UPM File Servers
Systems: File Server VMs


Corporate Headquarters - Fort Lauderdale, FL, USA
India Development Center - Bangalore, India
Latin America Headquarters - Coral Gables, FL, USA
Online Division Headquarters - Santa Barbara, CA, USA
UK Development Center - Chalfont, United Kingdom
Silicon Valley Headquarters - Santa Clara, CA, USA
EMEA Headquarters - Schaffhausen, Switzerland
Pacific Headquarters - Hong Kong, China

About Citrix
Citrix (NASDAQ:CTXS) is leading the transition to software-defining the workplace, uniting virtualization, mobility management, networking
and SaaS solutions to enable new ways for businesses and people to work better. Citrix solutions power business mobility through secure,
mobile workspaces that provide people with instant access to apps, desktops, data and communications on any device, over any network
and cloud. With annual revenue in 2014 of $3.14 billion, Citrix solutions are in use at more than 330,000 organizations and by over 100
million users globally. Learn more at www.citrix.com



Copyright 2015 Citrix Systems, Inc. All rights reserved. XenApp, XenDesktop, XenServer, CloudBridge, and NetScaler are trademarks of
Citrix Systems, Inc. and/or one of its subsidiaries, and may be registered in the U.S. and other countries. Other product and company names
mentioned herein may be trademarks of their respective companies.