Business Continuity Planning Process

Section 3 : Business Continuity
Introduction to Business
Continuity
Chapter Objective
After completing this chapter, you will be able to:
• Define Business Continuity and Information Availability
• Detail impact of information unavailability
• Define BC measurement and terminologies
• Describe BC planning process
• Detail BC technology solutions
What is Business Continuity (BC)
• Business Continuity is preparing for, responding to, and recovering

from an application outage that adversely affects business operations
• Business Continuity solutions address unavailability and degraded
application performance
• Business Continuity is an integrated and enterprise wide process and
set of activities to ensure “information availability”
What is Information Availability (IA)
• IA refers to the ability of an infrastructure to function according to

business expectations during its specified time of operation
• IA can be defined in terms of three parameters:
• Reliability
• The components delivering the information should be able to function without failure,
under stated conditions, for a specified amount of time
• Accessibility
• Information should be accessible at the right place and to the right user
• Timeliness
• Information must be available whenever required
Causes of Information Unavailability
Disaster (<1% of Occurrences)

Natural or man made
Flood, fire, earthquake
Contaminated building
Unplanned Outages (20%)

Failure
Database corruption
Component failure
Human error
Planned Outages (80%)

Competing workloads
Backup, reporting
Data warehouse extracts
Application and data restore
Impact of Downtime
Lost Productivity Know the downtime costs (per Lost Revenue
• Number of employees hour, day, two days...) • Direct loss
impacted (x hours out * • Compensatory payments
hourly rate) • Lost future revenue
• Billing losses
• Investment losses
Damaged Reputation Financial Performance
• Customers • Revenue recognition

• Suppliers • Cash flow
• Financial markets • Lost discounts (A/P)
• Banks • Payment guarantees
• Business partners • Credit rating
• Stock price
Other Expenses
Temporary employees, equipment rental, overtime
costs, extra shipping costs, travel expenses...
Impact of Downtime
• Average cost of downtime per hour = average productivity loss per

hour + average revenue loss per hour
• Where:
• Productivity loss per hour = (total salaries and benefits of all
employees per week) / (average number of working hours per week)
• Average revenue loss per hour = (total revenue of an organization per
week) / (average number of hours per week that an organizations is
open for business)
Measuring Information Availability
MTTR – Time to repair or ‘downtime’
Response Time Recovery Time
Detection Repair Restoration
Time
Incident Diagnosis Recovery Incident
Detection Repair time MTBF – Time between

elapsed failures or ‘uptime’
time
• MTBF: Average time available for a system or component to perform its

normal operations between failures
• MTTR: Average time required to repair a failed component
IA = MTBF / (MTBF + MTTR) or IA = uptime / (uptime + downtime)
Availability Measurement – Levels of ‘9s’ Availability
% Uptime % Downtime Downtime per Year Downtime per Week
98% 2% 7.3 days 3hrs 22 min
99% 1% 3.65 days 1 hr 41 min
99.8% 0.2% 17 hrs 31 min 20 min 10 sec
99.9% 0.1% 8 hrs 45 min 10 min 5 sec
99.99% 0.01% 52.5 min 1 min
99.999% 0.001% 5.25 min 6 sec
99.9999% 0.0001% 31.5 sec 0.6 sec

BC Terminologies
• Disaster recovery
• Coordinated process of restoring systems, data, and infrastructure required
to support ongoing business operations in the event of a disaster
• Restoring previous copy of data and applying logs to that copy to bring it to a
known point of consistency
• Generally implies use of backup technology
• Disaster restart
• Process of restarting from disaster using mirrored consistent copies of data
and applications
• Generally implies use of replication technologies
BC Terminologies (Cont.)
Recovery Point Objective (RPO) Recovery Time Objective (RTO)
• Point in time to which systems and • Time within which systems,
data must be recovered after an applications, or functions must be
outage recovered after an outage
• Amount of data loss that a business • Amount of downtime that a
can endure business can endure and survive
Weeks Weeks
Tape Backup Tape Restore
Days Days
Periodic Replication Disk Restore
Hours Hours
Asynchronous Replication Manual Migration
Minutes Minutes
Synchronous Replication Global Cluster

Seconds Seconds
Recovery-point objective Recovery-time objective

Business Continuity Planning (BCP) Process
• Identifying the critical business functions

• Collecting data on various business processes within those functions
• Business Impact Analysis (BIA)
• Risk Analysis
• Assessing, prioritizing, mitigating, and managing risk
• Designing and developing contingency plans and disaster recovery
plan (DR Plan)
• Testing, training and maintenance
Business Continuity (BC) Planning Lifecycle
BC planning must follow a disciplined approach like any other planning

process. Organizations today dedicate specialized resources to develop and
maintain BC plans. From the conceptualization to the realization of the BC
plan, a lifecycle of activities can be defined for the BC process. The BC planning
lifecycle includes five stages:
1. Establishing objectives
2. Analyzing
3. Designing and developing
4. Implementing
5. Training, testing, assessing, and maintaining
Business Continuity (BC) Planning Lifecycle
Figure. BC planning lifecycle

Establishing objectives
• Determine BC requirements.
• Estimate the scope and budget to achieve requirements.
• Select a BC team by considering subject matter experts from all areas of the
business, whether internal or external.
• Create BC policies.
Analyzing
• Collect information on data profiles, business processes, infrastructure
support, dependencies, and frequency of using business infrastructure.
• Identify critical business needs and assign recovery priorities.
• Create a risk analysis for critical areas and mitigation strategies.
• Conduct a Business Impact Analysis (BIA).
• Create a cost and benefit analysis based on the consequences of data
unavailability.
• Evaluate options.
Designing and developing
• Define the team structure and assign individual roles and responsibilities. For
example, different teams are formed for activities such as emergency
response, damage assessment, and infrastructure and application recovery.
• Design data protection strategies and develop infrastructure.
• Develop contingency scenarios.
• Develop emergency response procedures.
• Detail recovery and restart procedures.
Implementing
• Implement risk management and mitigation procedures that include backup,
replication, and management of resources.
• Prepare the disaster recovery sites that can be utilized if a disaster affects
the primary data center.
• Implement redundancy for every resource in a data center to avoid single
points of failure.
Training, testing, assessing, and maintaining
• Train the employees who are responsible for backup and replication of
business-critical data on a regular basis or whenever there is a
modification in the BC plan.
• Train employees on emergency response procedures when disasters are
declared.
• Train the recovery team on recovery procedures based on contingency
scenarios.
• Perform damage assessment processes and review recovery plans.
• Test the BC plan regularly to evaluate its performance and identify its
limitations.
• Assess the performance reports and identify limitations.
• Update the BC plans and recovery/restart procedures to reflect regular
changes within the data center.
BC Technology Solutions
• The following are the solutions and supporting technologies that

enable business continuity and uninterrupted data availability:
• Fault tolerant configuration
• To avoid single-point of failure
• Multi-pathing software
• Backup and replication
• Backup recovery
• Local replication
• Remote replication
Implementation of Fault Tolerance
Clustered Servers
Redundant Arrays
Heartbeat Connection
Redundant Ports
Client
FC Switches
IP
Storage Array Storage Array
Remote Site
Redundant Network
Redundant Paths
Redundant FC Switches
Multi-pathing Software
• Configuration of multiple paths increases data availability

• Even with multiple paths, if a path fails I/O will not reroute unless
system recognizes that it has an alternate path
• Multi-pathing software helps to recognize and utilizes alternate I/O
path to data
• Multi-pathing software also provide the load balancing
• Load balancing improves I/O performance and data path utilization
Backup and Replication
• Local Replication
• Data from the production devices is copied to replica devices within the
same array
• The replicas can then be used for restore operations in the event of data
corruption or other events
• Remote Replication
• Data from the production devices is copied to replica devices on a remote
array
• In the event of a failure, applications can continue to run from the target
device
• Backup/Restore
• Backup to tape has been a predominant method to ensure business
continuity
• Frequency of backup is depend on RPO/RTO requirements
Chapter Summary
Key points covered in this chapter:

• Importance of Business Continuity
• Types of outages and their impact to businesses
• Information availability measurements
• Definitions of disaster recovery and restart, RPO and RTO
• Business Continuity technology solutions overview
 Check Your Knowledge
• Which concerns do business continuity solutions address?

• “Availability is expressed in terms of 9s.” Explain the relevance of the
use of 9s for availability, using examples.
• What is the difference between RPO and RTO?
• What is the difference between Disaster Recovery and Disaster
Restart?
• Provide examples of planned and unplanned downtime in the
context of storage infrastructure operations.
• What are some of the Single Points of Failure in a typical data center
environment?

Business Continuity Planning Process

Hochgeladen von

Dokumentinformationen

Originalbeschreibung:

Originaltitel

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

Business Continuity Planning Process

Hochgeladen von

Copyright:

Verfügbare Formate

Section 3 : Business Continuity

• Business Continuity is preparing for, responding to, and recovering

• IA refers to the ability of an infrastructure to function according to

Disaster (<1% of Occurrences)

Unplanned Outages (20%)

Planned Outages (80%)

Damaged Reputation Financial Performance

• Customers • Revenue recognition

• Average cost of downtime per hour = average productivity loss per

Response Time Recovery Time

Detection Repair Restoration

Detection Repair time MTBF – Time between

• MTBF: Average time available for a system or component to perform its

% Uptime % Downtime Downtime per Year Downtime per Week

98% 2% 7.3 days 3hrs 22 min

99% 1% 3.65 days 1 hr 41 min

99.8% 0.2% 17 hrs 31 min 20 min 10 sec

99.9% 0.1% 8 hrs 45 min 10 min 5 sec

99.99% 0.01% 52.5 min 1 min

99.999% 0.001% 5.25 min 6 sec

99.9999% 0.0001% 31.5 sec 0.6 sec

Synchronous Replication Global Cluster

Recovery-point objective Recovery-time objective

• Identifying the critical business functions

BC planning must follow a disciplined approach like any other planning

Figure. BC planning lifecycle

• The following are the solutions and supporting technologies that

Storage Array Storage Array

• Configuration of multiple paths increases data availability

Key points covered in this chapter:

• Which concerns do business continuity solutions address?

Das könnte Ihnen auch gefallen