August 2016
Version 3.0a
Prepared by
TBD
TBD
Table of Contents
1 Introduction
1.1 Purpose
SIPOC
Business Continuity and Disaster Recovery Overview, , Version 3.0a
Prepared by
"Document1"
TBD
4 Summary
1 Introduction
1.1 Purpose
This guide is intended to provide an introduction to the concepts relating to Business
Continuity and Disaster Recovery (BC/DR).
2 Business Continuity and Disaster Recovery Concepts
Disaster events can be categorized into two types: forecasted and un-forecasted. A forecasted
event is one where the impact can be foreseen (such as a weather system event like a hurricane)
and can be mitigated through prior planning. Un-forecasted events are those where the
organization does not have a mitigation plan in place, either due to the immediate timing of the
event itself (such as an earthquake or cyber security attack) or the realization of previously
accepted risk factors.
Disaster scenarios, major attack vectors or incident types are the events that could lead to a
major disruption or crisis/emergency for the business. Organizations will identify specific
forecasted threats and their probabilities and impact in the organization's Risk Assessment (RA)
and Business Impact Assessment (BIA). Most of these assessments focus on forecasted risks to
the company and/or the operations of a specific organizational unit. Strategically, it is important
to look at the overall impact of the scenario on the organization. Disaster Management is often
divided into manageable areas in order to manage risk, plan for, and react to forecasted and
un-forecasted disaster events.
2.2 Enterprise Risk Management (ERM)
Often found in the organization's finance group, this team forecasts potential threats to the
business for the board of directors and shareholders. Enterprise Risk Management (ERM)
looks at competitive threats, natural and manmade threats, regulatory changes and government
and market changes. The ERM team's primary purpose is to map out the forecasted impact of
strategic mistakes. This forecasting process requires due diligence and can take up a significant
amount of time and energy. When analyzing disasters, the primary goal is to understand how
much damage (money, assets, and destroyed supply chains) the organization can withstand.
ERM has its roots in insurance, loss control and compliance [1]. Common risk areas include:
[1] Additional information can be found in the Business Continuity Institute's Good Practice Guidelines
(http://www.thebci.org/index.php/resources/the-good-practice-guidelines) and the MIT Sloan School of business
(http://sloanreview.mit.edu/tag/risk-identification)
Figure 2: Business Continuity Lifecycle
Business Continuity Policy and Charter - Most organizations will have a policy stating
the strategy and executive support in times of disaster.
Risk Assessments - These identify and analyze potential risks and threats to the overall
organization's performance before a disaster event is realized.
Business Impact Analysis - This determines the impact of specific disasters on specific
operational functions. This is commonly defined as a systematic, repeatable and
substantially defensible analysis to identify, measure, and validate the potential impacts an
interruption would cause to a business process.
Continuity Requirements - These determine specific continuity performance metrics for
specific supply chains, systems and processes, including desired recovery time objectives
and recovery point objectives.
Business Continuity Standards
While there is a significant set of regulations and laws pertaining to continuity and resiliency
for governments and publicly traded companies in different countries, there is also a growing
list of accepted international standards from the International Organization for Standardization
(ISO): ISO 22301 (standard) and ISO 22313 (implementation). Note that ISO 22301 and 22313
should be viewed as a minimum bar and not as the final goal of the organization. Even if an
organization passes an ISO 22301 audit, that does not mean it has an effective business continuity
or disaster recovery capability. Other common standards and regulations which cover business
continuity and disaster recovery include:
Standard                        Purpose
ISO 14001                       Environmental management systems - Requirements with guidance for use
ISO/PAS 22399                   Societal security - Guideline for incident preparedness and operational continuity management
ISO/IEC 24762                   Information technology - Security techniques - Guidelines for information and communications technology disaster recovery services
ISO/IEC 27031                   Information technology - Security techniques - Guidelines for information and communication technology readiness for business continuity
BS 25999-1                      Business continuity management - Code of practice, British Standards Institution (BSI)
SI 24001                        Security and continuity management systems - Requirements and guidance for use, Standards Institution of Israel
NFPA 1600                       Standard on disaster/emergency management and business continuity programs, National Fire Protection Association (USA)
Business Continuity Guideline   Central Disaster Management Council, Cabinet Office, Government of Japan, 2005
ANSI/ASIS SPC.1                 Organizational Resilience: Security, Preparedness, and Continuity Management Systems - Requirements with Guidance for Use
SS 540: 2008                    Singapore Standard for Business Continuity Management
ANSI/ASIS/BSI BCM.01            Business Continuity Management Systems: Requirements with Guidance for Use
When starting a BC/DR oriented project, it is crucial to use the organization's Business Impact
Assessment (BIA) and Risk Assessment (RA) to define needs in responding to a disaster. These
needs will help the organization define its disaster response strategy from an IT perspective.
Often the IT organization will have the RA/BIA and Continuity Requirements (CR)
documented, or will know the crucial business services and the corresponding dependent IT systems.
It is important that the disaster recovery plan align to the RA/BIA and Continuity Requirements
(CR) of the organization. The DR plan should be tightly scoped to the targeted services and
supply chains and should forecast the impact on adjacent dependent IT assets and processes.
It is important to note that the IT organization must in most cases own its own RA, BIA and CR.
While a BC/DR project can present options and observations, the engagement cannot represent
itself as owning the final strategy or guaranteeing risk scenarios to the customer (board of
directors, shareholders, or political leaders). It is generally regarded as bad practice for
customers to delegate their strategic business continuity decisions to a third party. Promoting a
disaster recovery plan without regard to the customer's business continuity requirements, the
customer's emergency implementation capability or the critical dependencies (people, processes
and IT systems) is generally regarded as irresponsible.
Figure: Plan-Do-Check-Act (PDCA) cycle. Recoverable labels: PLAN (Establish), DO (Implement & Operation), ACT (Review and Improve); stakeholders driving requirements, vision and direction; stakeholders realizing the results.
Plan-Do-Check-Act (PDCA) is an essential top-down approach to help make sure business continuity
strategies are aligned with the executive needs of the organization. However, it must be
complemented with a bottom-up capability perspective to help make sure that the strategy can be
implemented by
the disaster response team. When DR plans are designed from the top down without regard to
the response capabilities of the organization, the ability to run under pressure with limited staff
in a disaster is often compromised. Often, the tools, training and muscle memory of the teams
will determine whether the organization recovers effectively using the disaster recovery plan.
The Recovery Point Objective (RPO) is the maximum amount of data, measured in time, that can be
lost in case of a disruption. It answers the question, "To what point in time can I recover?"
The Recovery Time Objective (RTO) is the maximum amount of time allowed from the
disruption until the business functions, including data, are brought back. It answers the
question, "At what point in time can I expect business operations to continue?"
The RPO and RTO figures most often found in a Service Level Agreement (SLA) are focused on the
regular backup and recovery processes. For Disaster Recovery (DR), the RPO and RTO
figures would normally be higher. Realistic values need to be set based on the Business
Continuity (BC) plan. As seen in the figure below, RPO and RTO are ideally business-driven numbers
that roll up from the RPO and RTO of the technical components that make up the business
application.
Figure 5: RPO/RTO
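The roll-up of component figures into application-level figures can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the Component class, the roll_up function and the sample values are hypothetical, and the roll-up rules (slowest component when recovering in parallel, sum of times when recovering sequentially, worst component RPO) are simplifying assumptions.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Component:
    """One technical component of a business application (hypothetical model)."""
    name: str
    rto: timedelta  # time needed to restore this component
    rpo: timedelta  # maximum data-loss window for this component


def roll_up(components: list[Component], parallel: bool = True) -> tuple[timedelta, timedelta]:
    """Return (rto, rpo) for the application as a whole.

    Assumption: with parallel recovery the application is back when its slowest
    component is back; with sequential recovery the component times add up.
    Assumption: the application can lose as much data as its worst component.
    """
    if not components:
        raise ValueError("at least one component is required")
    rtos = [c.rto for c in components]
    rto = max(rtos) if parallel else sum(rtos, timedelta())
    rpo = max(c.rpo for c in components)
    return rto, rpo


# Illustrative values only.
components = [
    Component("database", rto=timedelta(hours=4), rpo=timedelta(minutes=15)),
    Component("app server", rto=timedelta(hours=1), rpo=timedelta(hours=1)),
    Component("file share", rto=timedelta(hours=2), rpo=timedelta(hours=24)),
]

parallel_rto, app_rpo = roll_up(components)
sequential_rto, _ = roll_up(components, parallel=False)
print(parallel_rto, sequential_rto, app_rpo)
```

Comparing the rolled-up values with the business-driven targets shows quickly which component is the limiting factor; here the file share would dominate the application RPO.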
SIPOC
Most business continuity teams look at the organization as a whole and work on a resiliency
strategy that restores each specific business operation as a whole. The SIPOC
acronym (a Six Sigma methodology) stands for suppliers, inputs, processes, outputs, and
customers. This model is often used to assist groups in understanding the interrelationships of
their processes and how work is currently performed within each process.
Dependency Categories
A common terminology is used when outlining business dependencies (non-IT assets)
during BC/DR analysis. It is critical to understand how technology can impact each of the areas
listed below, and familiarity with these terms helps BC/DR planning address the needs of
the organization's business. These terms include, but are not limited to:
A Supplier is any person, entity or organization that provides inputs to the current
process. A supplier can provide information, data, documents, guidelines, transactions,
supplies, equipment or raw material. An internal supplier is internal to the organization,
such as a team or business group, and provides inputs for the process in question. An
external supplier is an external entity or organization providing inputs to the process.
An Input is anything which feeds into the process as a document, guidelines, product,
data, transaction, specialized equipment or raw material.
Workforce, for the purpose of the Dependency Analysis, is any employee or non-payroll
worker. These would include vendors and independent contractors.
A Location is a place where something is or could be located; a site, such as a specific
building name or number.
An Application refers to a computer program or group of programs designed for end
users. Applications are self-contained programs that perform a well-defined set of tasks
under user control.
A Vendor is a business entity contracted to provide a service or infrastructure element to
customers or clients. Vendors can be any third-party provider, regardless of the service
they provide.
Data and Vital Records refer to any data or information required to perform your
process. Data and vital records can be electronic or hard copy and reside in a number of
different formats or locations.
Specialized Equipment is any specialized equipment, machine or tool required to
perform your process. Your list of equipment and tools should not include normal office
equipment and supplies such as laptops, PCs, printers, copiers, fax machines, paper, pens
and general desk supplies.
A Partnership is a formal contractual relationship established to provide regular
business services between two companies.
An Output is anything that was created by the process such as a document, transaction,
product, data or information and given to a customer of the process. An output of your
process can be an input of another related process. Example: Customer Balance can
be an input to a Collections Process. An output can be given to an internal customer,
external customer, or a related process.
A Customer is any team, business unit or organization that receives a product or service
from your process. Internal customers are colleagues, departments or groups inside the
organization who receive products, services, support, or information from your process.
External customers are individuals or organizations outside of the organization who
usually pay for your products and services, or who are an extension
of your process under a contractual relationship.
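The dependency categories defined above can be captured as a simple record per business process. The following Python sketch is illustrative only: the ProcessDependencies class, its field names and the sample entries are hypothetical, chosen to mirror the terms in the list.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ProcessDependencies:
    """Dependencies of one business process, grouped by category (hypothetical model)."""
    process: str
    suppliers: list[str] = field(default_factory=list)
    inputs: list[str] = field(default_factory=list)
    workforce: list[str] = field(default_factory=list)
    locations: list[str] = field(default_factory=list)
    applications: list[str] = field(default_factory=list)
    vendors: list[str] = field(default_factory=list)
    data_and_vital_records: list[str] = field(default_factory=list)
    specialized_equipment: list[str] = field(default_factory=list)
    partnerships: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    customers: list[str] = field(default_factory=list)

    def undocumented_categories(self) -> list[str]:
        """Return the categories for which nothing has been recorded yet."""
        return [
            f.name
            for f in fields(self)
            if f.name != "process" and not getattr(self, f.name)
        ]


# Sample entries are made up, apart from the Customer Balance example used above.
collections = ProcessDependencies(
    process="Collections",
    inputs=["Customer Balance"],
    customers=["Finance department"],
)
undocumented = collections.undocumented_categories()
print(undocumented)
```

A list of undocumented categories gives the analysis team a simple checklist of what still needs to be collected for each process.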
Technical Dependency Analysis (TDA) is a process to identify all technology and process
components and key personnel needed to keep a specific IT capability operational. Common TDA
questions asked by the Business Continuity Team for each critical system include:
Application & Infrastructure Support Team
Recovery Time Capability (RTC), or Recovery Time Estimate (RTE) if it has not been tested
Recovery Point Capability (RPC), or Recovery Point Estimate (RPE) if it has not been tested
Identify the Primary Production Site
Identify the Failover Site (if one exists)
Identify the applications that the primary application depends on
Identify critical systems that are dependent on the primary application
Identify all Single Points of Failure (SPOF) for the primary application
Has Disaster Recovery (DR) been implemented (not backups)?
Is the Disaster Recovery Plan (DRP) available?
Has the Disaster Recovery Plan (DRP) been tested?
What is the last Disaster Recovery Plan (DRP) test date?
The Technical Dependency Analysis (TDA) examines the application(s) and supporting
infrastructure that a process depends on to determine, at a minimum, the following:
The data collected from this analysis will be used to identify gaps between the business process
recovery requirements and the recovery capabilities of the applications and supporting
infrastructure.
Recovery Time Capability means the technical dependency has been proven through a
test and may or may not meet the RTO requirement.
Recovery Time Estimate means the technical dependency has not been proven through
a test and the RTC has not been validated.
Recovery Point Capability means the technical dependency has been proven through a
test and may or may not meet the RPO requirement.
Recovery Point Estimate means the technical dependency has not been proven
through a test and the RPC has not been validated.
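The gap identification described above can be sketched as a comparison between the requirement (RTO/RPO) and the capability or estimate recorded for each technical dependency. This is a minimal sketch under stated assumptions: the TechnicalDependency class, the find_gaps function and the sample figures are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class TechnicalDependency:
    """Recovery figures for one application or infrastructure element (hypothetical model)."""
    name: str
    recovery_time: timedelta   # RTC when tested, otherwise RTE
    recovery_point: timedelta  # RPC when tested, otherwise RPE
    tested: bool               # True when a DR test has proven the figures


def find_gaps(dep: TechnicalDependency, rto: timedelta, rpo: timedelta) -> list[str]:
    """Compare one dependency against the business requirement and list the gaps."""
    time_label = "RTC" if dep.tested else "RTE"
    point_label = "RPC" if dep.tested else "RPE"
    findings = []
    if dep.recovery_time > rto:
        findings.append(f"{dep.name}: {time_label} {dep.recovery_time} exceeds RTO {rto}")
    if dep.recovery_point > rpo:
        findings.append(f"{dep.name}: {point_label} {dep.recovery_point} exceeds RPO {rpo}")
    if not dep.tested:
        findings.append(f"{dep.name}: figures are estimates and have not been validated by a test")
    return findings


# Illustrative values only.
rto, rpo = timedelta(hours=4), timedelta(hours=1)
tested_db = TechnicalDependency("payroll-db", timedelta(hours=6), timedelta(minutes=30), tested=True)
untested_web = TechnicalDependency("payroll-web", timedelta(hours=2), timedelta(minutes=30), tested=False)

db_gaps = find_gaps(tested_db, rto, rpo)
web_gaps = find_gaps(untested_web, rto, rpo)
print(db_gaps + web_gaps)
```

Note that an untested dependency is reported even when its estimate meets the requirement, since an estimate has not been validated.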
Performing the due diligence of TDA will lead towards the development of service dependency
maps, which outline the dependent systems and services for each application providing
capability to an organization's specific business functions.
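A service dependency map lends itself to a simple graph representation. The sketch below is an assumption-laden illustration, not a description of any particular tool: the DEPENDS_ON map and its service names are hypothetical, and the two functions answer the questions "what does this service need?" and "what is impacted if this service fails?".

```python
from collections import deque

# Hypothetical service dependency map: each key depends on the listed services.
DEPENDS_ON = {
    "payroll": ["payroll-db", "active-directory"],
    "payroll-db": ["san-storage"],
    "intranet": ["active-directory", "web-farm"],
    "web-farm": ["san-storage"],
    "active-directory": [],
    "san-storage": [],
}


def all_dependencies(service: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every service that `service` needs, directly or indirectly."""
    seen: set[str] = set()
    queue = deque(depends_on.get(service, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also protects against cycles
        seen.add(current)
        queue.extend(depends_on.get(current, []))
    return seen


def impacted_by(failed: str, depends_on: dict[str, list[str]]) -> set[str]:
    """Return every service that depends on `failed`, directly or indirectly."""
    return {s for s in depends_on if failed in all_dependencies(s, depends_on)}


print(sorted(all_dependencies("payroll", DEPENDS_ON)))
print(sorted(impacted_by("san-storage", DEPENDS_ON)))
```

A service that appears in the impact set of many business functions, such as the shared storage in this made-up map, is a candidate Single Point of Failure.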
1. Kick-off and positioning: Provides all attendees of the program with detailed information on
the goals, planning and their roles, as well as a common language for talking about
disaster recovery.
2. Service Mapping: Collects all information about the business function or service, its
components, the parties involved and the different agreements. Insights gathered in this
session are essential in planning for the recovery from a disaster.
3. Scenario identification: Identifying all possible DR scenarios and ranking those on
probability, impact and mitigations already in place. Information from this session is used
to validate the coverage of the technical recovery scenarios and evaluate if the
information from the Business Impact Assessment is complete. Based on the identified
scenarios, the response and the corresponding processes are designed.
4. Information Needs and War Room Facilities: Based on the identified recovery
scenarios and their constraints, information needs and facility requirements are identified
(email not available, no phone, no access to the office, etc.). This area is operated by the
Response Leadership Team. It will be crucial to design a strategy that the response team can
easily run. Excess complexity and manual processes are the root of most response
failures.
5. Responsibility and accountability: Based on the prior workshops, a RACI matrix will be
set up for extending and maintaining the DRP and for running a disaster recovery. Furthermore,
Critical Success Factors (CSF) and Key Performance Indicators (KPI) will be set up to
measure and extend the BC/DR.
While the information gathered and documented is valuable, it is also quite volatile.
Embedding disaster recovery in the change process will allow the organization to update the
DR information as part of the changes that are implemented. It is a good practice to assign an
owner to the information and set relevant review intervals to verify that the information is
up to date.
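The scenario identification workshop (step 3 above) ranks DR scenarios on probability, impact and the mitigations already in place. One way to make that ranking repeatable is a simple scoring function. The following is a sketch only: the 1-to-5 scales, the halving of the score for mitigated scenarios and the sample scenarios are assumptions for illustration, not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    probability: int  # 1 (rare) to 5 (frequent); hypothetical scale
    impact: int       # 1 (minor) to 5 (severe); hypothetical scale
    mitigated: bool   # True when a mitigation is already in place


def risk_score(scenario: Scenario) -> float:
    """Probability times impact, halved when a mitigation exists (assumed weighting)."""
    score = float(scenario.probability * scenario.impact)
    return score / 2 if scenario.mitigated else score


def rank(scenarios: list[Scenario]) -> list[Scenario]:
    """Return the scenarios ordered from highest to lowest risk score."""
    return sorted(scenarios, key=risk_score, reverse=True)


# Illustrative values only.
scenarios = [
    Scenario("datacenter power loss", probability=2, impact=4, mitigated=True),
    Scenario("ransomware outbreak", probability=3, impact=5, mitigated=False),
    Scenario("hurricane", probability=1, impact=5, mitigated=False),
]
ranked_names = [s.name for s in rank(scenarios)]
print(ranked_names)
```

The weighting is deliberately simple; the point is that the ranking is recorded and can be reviewed when the information is updated through the change process.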
Needless to say, when a real disaster hits, it is the emergency response team that will mobilize
with the executive leadership team to stabilize the organization, its people, processes, partners
and customers in their time of need.
Often BC/DR plans ignore the implementation capabilities of the emergency team and assume
the organization will have access to its most talented technical team to fix or restore critical
systems. To use a practical example, a plan may assume that the most skilled Active Directory
administrators will be available to restore the Active Directory service for the company after a
disaster has occurred. IT leadership often assumes the incident response team that addresses
common outages can manage a major disaster. These assumptions are often ill-founded in a real
disaster. As a guideline, it is safe to assume an IT organization will have access to 50% of its
employees operating at 50% mental capacity under stress. As a general practice, BC/DR plans should
incorporate this assumption in the recovery capabilities of the organization.
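The 50%/50% guideline can be turned into a quick planning calculation. The sketch below is a rough illustration: the function names are hypothetical, and the assumption that recovery time scales inversely with effective capacity is a simplification that will not hold for every task.

```python
def effective_capacity(staff_available: float = 0.5, performance: float = 0.5) -> float:
    """Fraction of normal team output assumed to be available during a disaster."""
    if not (0 < staff_available <= 1 and 0 < performance <= 1):
        raise ValueError("both fractions must be in the range (0, 1]")
    return staff_available * performance


def stressed_recovery_hours(normal_hours: float, capacity: float) -> float:
    """Scale a recovery estimate, assuming effort is inversely proportional to capacity."""
    return normal_hours / capacity


capacity = effective_capacity()                     # 0.5 * 0.5 = 0.25
estimate = stressed_recovery_hours(4.0, capacity)   # a 4-hour task becomes 16 hours
print(capacity, estimate)
```

Under these assumptions, half the staff at half capacity leaves roughly a quarter of normal output, so a recovery task that takes four hours under normal conditions should be planned at about sixteen.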
For a variety of reasons, regular organizational bureaucracy is often ill-equipped to handle the
pressure and rapid pace that massive disaster management requires. Emergency Response requires
a clear command model with focused teams to quickly and effectively rebuild the organization's
systems and services. While there are a variety of approaches, most successful Emergency Response
teams utilize a variation of the Incident Command System (ICS). ICS is an internationally
recognized operational command and control model to mobilize, assess and triage the crisis
and to incorporate and responsibly orchestrate all available talent while working with
critical partners, government organizations and key stakeholders. ICS as a process has been
maturing and proving itself for decades. The reason why organizations use ICS: it consistently
works.
An example of an IT Incident Command System (ICS) model for Disaster Management is provided
in the diagram below:
Figure: IT Incident Command System model. Recoverable label: Executive Leadership Team.
These technology response teams will support the Operations Lead on targeted missions to
restore key organization functions and supply chains. The Operations Lead directs all
response/tactical actions. For government organizations, the Operations Lead will take on a variety
of responsibilities working with other ICS leadership.
Typical IT restoration will consist of multiple teams in a separate IT Recovery Team or injected
into a task force unit. Needs assessments will be triaged to focus on the most important
systems first. IT restoration encompasses two types of missions:
It will be important for the DR strategy to be automated and standardized as much as possible to
help these teams be successful. Often, the IT Recovery Team is managing hundreds of separate
uncoordinated DR missions in a major disaster. As a recommended practice, when evaluating
the technical dependencies of an organization, examine the organization's incident response
capabilities carefully to help make sure they can complete the DR plan developed.
3 Microsoft Cloud-Based Disaster Recovery Capabilities
Cloud infrastructures provide availability constructs such as upgrade domains which define
boundaries of failure. These boundaries differ based on public or private cloud offerings and
each model's capabilities are outlined in the sections below.
Virtual machine mobility for on-premises infrastructures running Windows Server 2012 R2
Hyper-V is supported by two technologies: Hyper-V Live Migration and Hyper-V Storage
Migration.
Hyper-V Live Migration makes it possible to move running virtual machines from one physical
host to another with no effect on the availability of the virtual machines or the services running
within them. Hyper-V Live Migration is divided into two categories:
Shared Storage-based live migration. In this instance, the hard disk of each virtual
machine is stored on either a Cluster Shared Volume (CSV) or a central SMB file share and live
migration occurs over either TCP/IP or the SMB transport. You then perform a live migration of the
virtual machines from one server to another while their storage remains on the
CSV or SMB share.
Shared-nothing live migration. In this case, the live migration of a virtual machine
from one non-clustered Hyper-V host to another begins when the hard drive storage of
the virtual machine is mirrored to the destination server over the network. Then you
perform the live migration of the virtual machine to the destination server while it
continues to run and provide network services.
Windows Server 2012 R2 also supports Live storage migration, which supports the movement of
virtual hard disks that are attached to a virtual machine that is running. This provides the
flexibility to manage storage without affecting the availability of virtual machine workloads,
perform maintenance on storage subsystems, upgrade storage-appliance firmware and
software, and balance loads while the virtual machine is in use. Live storage migration is
supported for virtual hard disks on shared and non-shared storage subsystems (when using
Hyper-V over SMB designs).
Virtual machine replication for on-premises infrastructures is supported by the Windows Server
2012 R2 Hyper-V Replica feature. Hyper-V Replica provides a workload-agnostic failure recovery
solution by providing asynchronous replication of virtual machines over standard network
protocols (HTTP or HTTPS) from one Hyper-V host or cluster to another remote Hyper-V host or
cluster without relying on storage arrays or other software replication technologies. Windows
Server 2012 R2 Hyper-V Replica supports replication between source and target Hyper-V servers
(or clusters) which can be physically co-located or geographically separated. It can further
support extending replication from the target server to a third server through the extended
replication feature. Hyper-V Replica tracks the write operations on the primary virtual machine
and replicates these changes to the replica server at a configurable frequency of 15 minutes, 5
minutes or 30 seconds. Additional recovery points can be configured to be stored for 24
hours. Hyper-V Replica also supports both planned and unplanned failover scenarios with
advanced logic such as TCP/IP re-addressing of the host as part of the failover process.
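The replication frequencies listed above relate directly to the RPO that a workload can achieve. The sketch below selects a frequency for a required RPO. It is a planning illustration only, not Hyper-V configuration code: the pick_frequency function is hypothetical, and it assumes that worst-case data loss is roughly one replication interval, ignoring transfer time and missed replication cycles.

```python
from datetime import timedelta
from typing import Optional

# Replication frequencies named in the text above.
REPLICA_FREQUENCIES = [
    timedelta(seconds=30),
    timedelta(minutes=5),
    timedelta(minutes=15),
]


def pick_frequency(required_rpo: timedelta) -> Optional[timedelta]:
    """Return the longest replication frequency that does not exceed the required RPO.

    Assumption: worst-case data loss is about one replication interval.
    Returns None when even the shortest interval is longer than the RPO.
    """
    candidates = [f for f in REPLICA_FREQUENCIES if f <= required_rpo]
    return max(candidates) if candidates else None


print(pick_frequency(timedelta(minutes=10)))
print(pick_frequency(timedelta(seconds=10)))
```

When the function returns None, asynchronous replication at these intervals cannot meet the requirement under the stated assumption, and either the RPO or the technology choice needs to be revisited.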
Virtual machine backup for on-premises infrastructures is provided through backup software
which supports the Hyper-V Volume Shadow Copy Services (VSS) Writer. The ability to back up
open files is required to provide business continuity and VSS creates frozen copies of open files,
helping to make sure that virtual machines do not have to be put into hibernation or be shut
down before a consistent backup can be made. In a virtualized data center, there are three
commonly used backup types: host-based, guest-based, and a SAN-based snapshot. The
following table contrasts these types.
Backup Capability Host-Based Guest-Based SAN Snapshot
Note that the use of SAN volume snapshots is highly dependent on the storage vendor's level of
VSS and Hyper-V integration. SAN volume snapshots are typically block-level, and they only
utilize storage capacity as blocks change on the originating volume.
System Center 2012 R2 Data Protection Manager allows disk-based and tape-based data
protection and recovery for Hyper-V servers. Data Protection Manager supports the protection
of standalone or clustered computers running Hyper-V in failover clusters using shared (cluster
shared volumes) or SMB storage.
Azure Site Recovery (ASR): On-premises BCDR for physical instances (including MSCS clusters) and
for virtual instances running on VMware or pre-2012 Hyper-V is available using ASR. ASR enables
your organization to meet stringent disaster recovery needs, eliminate the impact of local
backups, and manage application uptime to meet high availability requirements. ASR uses
advanced technologies like Continuous Data Protection (CDP), Asynchronous Replication over
IP, Application Failover/Failback, and WAN Optimization for disaster recovery of data.
CDP technology enables ASR to capture data for recovery purposes and lets you choose
any recovery point in time from which to recover lost or corrupted data.
A key consideration when deploying workloads to hybrid cloud environments is that the
organization is mixing availability constructs between what they provide internally through on-
premises cloud infrastructures and what the public cloud provider has exposed through various
service offerings. This mixing of constructs means that BC/DR planning of a workload which
spans public and private cloud infrastructures must consider both the capabilities and SLAs
provided by both environments to assess availability and recovery needs.
Azure provides a wide range of capabilities which support the availability of workloads spanning
on-premises and Public Cloud Infrastructure as a Service (IaaS)-based solutions. These
capabilities change rapidly with each new release and an overview of currently available IaaS
services is provided below.
First, availability of workloads hosted in Azure virtual machines is achieved by using multiple
virtual machines for continuity. This provides general availability of the workload during local
network failures, local disk-hardware failures, and any planned downtime that the platform
might require. Availability of a workload comprised of multiple virtual machines is achieved by
adding them to an availability set. Availability sets are directly related to fault domains and
update domains in cloud infrastructures. A fault domain in Azure is defined so as to avoid single
points of failure, like the network switch or power unit of a rack of servers. When multiple virtual
machines are connected together in a cloud service, an availability set can be used to help
make sure that the virtual machines are located in different fault domains. The following
diagram shows two availability sets, each of which contains two virtual machines.
Azure periodically updates the underlying infrastructure that hosts the instances of running
workloads and during that process a virtual machine is shut down when an update is applied. An
update domain is used to help make sure that not all of the virtual machine instances are
updated at the same time. When you assign multiple virtual machines to an availability set,
Azure helps to make sure that the virtual machines are assigned to different update domains.
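The effect of spreading virtual machines across fault domains and update domains can be illustrated with a small model. This is a hypothetical sketch: the round-robin rule and the domain counts used as defaults are assumptions chosen for illustration, and the actual placement is decided by the Azure platform.

```python
def assign_domains(vm_names: list[str], fault_domains: int = 2, update_domains: int = 5) -> dict:
    """Spread VMs round-robin over fault and update domains.

    The domain counts and the round-robin rule are illustrative assumptions;
    they do not describe how the platform actually places virtual machines.
    """
    return {
        name: {"fault_domain": i % fault_domains, "update_domain": i % update_domains}
        for i, name in enumerate(vm_names)
    }


def survivors(placement: dict, failed_fault_domain: int) -> list[str]:
    """Return the VMs that keep running when one fault domain is lost."""
    return sorted(
        name for name, domains in placement.items()
        if domains["fault_domain"] != failed_fault_domain
    )


placement = assign_domains(["web-0", "web-1", "web-2"])
print(placement)
print(survivors(placement, failed_fault_domain=0))
```

With a single virtual machine there is nothing to spread, which is why the workload must run on multiple virtual machines in an availability set before these constructs provide any protection.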
As discussed previously, the Azure virtual machine availability concepts are not the
same as those of on-premises Hyper-V. To support high availability for workloads hosted in Azure,
multiple virtual machines per application or role must be created, and Azure constructs such as
availability sets and load balancing must be utilized. Additional information about these
constructs can be found in the Infrastructure-as-a-Service Product Line Architecture Fabric
Architecture Guide.
The Azure Site Recovery Service contributes to your business continuity and disaster recovery
(BCDR) strategy by orchestrating replication, failover and recovery of virtual machines and
physical servers. Machines can be replicated to Azure, or to a secondary on-premises datacenter.
Azure Site Recovery Service is a hybrid cloud service which coordinates and manages the
protection of VMware virtual machines located in private cloud infrastructures managed by
VMware ESX servers. Azure Site Recovery Service orchestrates failover of these virtual machines
from one on-premises ESX host or cluster to another on-premises ESX host or cluster located in
a secondary location.
Azure Site Recovery Service uses the concept of vaults in Azure to store configuration data
related to the protection of single and multi-tier workloads which are defined as Recovery Plans.
See the following configuration example.
Recovery Plans are linear orchestration plans which allow for the grouping of virtual machines
into one or more failover groups. Recovery Plans also allow for the addition of manual steps
and the insertion of automation (scripts) which can be run as part of a failover event. When
combined, these features support many of the requirements for failing over multi-tier
applications whose workloads span multiple virtual machines. While there are many
technologies available which provide protection of virtual machines themselves, very few
recovery solutions exist which provide the fabric management infrastructure with the
intelligence to see multiple virtual machines as composed applications and services with
differing failover needs and actions for each tier.
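The structure of a recovery plan (ordered failover groups, with manual steps or scripts inserted between them) can be modeled in a few lines. This is a conceptual sketch only and does not use or represent the Azure Site Recovery API: the FailoverGroup and RecoveryPlan classes, the machine names and the actions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FailoverGroup:
    """A set of virtual machines that fail over together (hypothetical model)."""
    name: str
    virtual_machines: list[str]
    # Scripts or manual steps that run before the group's machines fail over.
    pre_actions: list[Callable[[], str]] = field(default_factory=list)


@dataclass
class RecoveryPlan:
    """A linear plan: groups are processed strictly in list order."""
    name: str
    groups: list[FailoverGroup]

    def run(self) -> list[str]:
        """Simulate the plan and return the ordered log of steps."""
        log = []
        for group in self.groups:
            for action in group.pre_actions:
                log.append(action())
            for vm in group.virtual_machines:
                log.append(f"failover {vm}")
        return log


plan = RecoveryPlan(
    name="three-tier application",
    groups=[
        FailoverGroup("data tier", ["sql-01"]),
        FailoverGroup(
            "application tier",
            ["app-01", "app-02"],
            pre_actions=[lambda: "manual: confirm the database is online"],
        ),
    ],
)
steps = plan.run()
print(steps)
```

The ordering is the point of the model: the data tier is brought up first, a manual check gates the next tier, and only then do the application servers fail over.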
Azure Site Recovery Service additionally automates planned and unplanned failover activities
across sites and supports the TCP/IP readdressing needs when failover is performed across
separate network segments. Finally, recovery plans can be tested in isolation without disruption
to the running workload, supporting activities such as organizational BC/DR drills and plan
verification.
COMPONENT: Configuration Server
DEPLOYMENT: Deploy as an Azure standard A3 virtual machine in the same subscription as Site Recovery. You set up this server in the Azure Site Recovery portal.
DETAILS: This server coordinates communication between protected machines, the Process Server, and Master Target servers in Azure. It sets up replication and coordinates recovery in Azure when failover occurs.

COMPONENT: Master Target Server
DEPLOYMENT: Deploy as an Azure virtual machine, either as a Windows server based on a Windows Server 2012 R2 gallery image (to protect Windows machines) or as a Linux server based on a OpenLogic
DETAILS: It receives and retains replicated data from your protected machines using attached VHDs created on blob storage in your Azure storage account.

COMPONENT: On-Premises Machines
DEPLOYMENT: On-premises virtual machines running on a VMware hypervisor, or physical servers running Windows or Linux.
DETAILS: You set up replication settings that apply to virtual machines and servers. You can fail over an individual machine or, more commonly, fail over as part of a recovery plan containing multiple virtual machines that fail over together.

COMPONENT: Mobility Service
DEPLOYMENT: Installs on each virtual machine or physical server you want to protect. Can be installed manually or pushed and installed automatically by the Process Server.
DETAILS: The service takes a VSS snapshot of data on each protected machine and moves it to the Process Server, which in turn replicates it to the Master Target server.

COMPONENT: Azure Site Recovery Vault
DEPLOYMENT: Set up after you've subscribed to the Site Recovery service.
DETAILS: You register servers in a Site Recovery vault. The vault coordinates and orchestrates data replication, failover, and recovery between your on-premises site and Azure.

COMPONENT: Replication Mechanism
DEPLOYMENT: Over the Internet: Communicates and replicates data between protected on-premises servers and Azure using a secure SSL/TLS communication channel. VPN/ExpressRoute: Communicates and replicates data between on-premises servers and Azure over a VPN connection. You'll need to set up a site-to-site VPN or an ExpressRoute connection between the on-premises site and your Azure network.
DETAILS: Neither option requires you to open any inbound network ports on protected machines.
FEATURE REFERENCE
capabilities to further enhance their availability. Examples of this include SQL Server Always-On
Availability Groups support in Azure and enhancements to AD DS support in virtualized
environments. As outlined in the BC/DR concepts, it is important to include workload availability
constructs as Recovery Point Capabilities when determining RTO/RPO for cloud-based solutions.
4 Summary
Business Continuity and Disaster Recovery planning is a required element of any cloud-based
workload deployment. This document is meant to serve as a framework for applying BC/DR
concepts in workload planning and design for public, private and hybrid cloud environments.
These concepts and capabilities can be applied to various applications and services and
therefore require analysis of each workload's capabilities and support for the constructs
discussed earlier. Along with this guide, a series of workload-specific scenario guides is
available to outline practical application of the framework outlined in this
document to various workloads.