
1. Project initiation

• Setting the budget
• Determining the team leader
• Starting the BCP process

Determine the scope. A properly defined scope helps maximize the effectiveness of the
BCP.

Team responsibilities:

a) Identifying regulatory and legal requirements that must be complied with
b) Identifying all possible threats and risks
c) Estimating the probability of these threats and correctly identifying their loss
potential
d) Performing a BIA
e) Outlining the priority in which departments, systems, and processes must be up and
running before any others
f) Developing the procedures and steps to resume business functions following a
disaster
g) Assigning tasks to the employee roles, or individuals, that will complete those tasks
during a crisis situation
h) Documenting plans, communicating plans to employees, and performing necessary
training and drills

2. Business impact assessment
3. Recovery strategy
4. Plan design and development
5. Implementation
6. Testing
7. Monitoring and maintenance

The 4 primary elements of BCP are:

• Scope and plan initiation
• Business impact analysis (includes vulnerability assessment)
• Business continuity plan development
• Plan approval and implementation

Scope and Plan initiation:

Steps involved in the scope and plan initiation include creating an account of the work required,
listing the resources to be used and defining the management practices to be employed.

A BCP committee should be formed and given the responsibility to create, implement and test
the plan.

Business Impact Analysis:

The purpose of a BIA is to create a document to be used to help understand what impact a
disruptive event would have on the business.

The business impact analysis has 3 primary goals:

Criticality Prioritization: Critical business units must be identified and prioritized.

Downtime Estimation: Estimate the maximum tolerable downtime (MTD).

Resource Requirements: Identify resource requirements for the critical processes.
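
The criticality prioritization goal can be illustrated with a minimal sketch that orders business functions by their MTD, smallest first. The function names and MTD values below are hypothetical, and ranking purely by MTD is an assumption made for illustration; a real BIA would weigh more factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessFunction:
    name: str
    mtd_hours: int  # maximum tolerable downtime in hours (illustrative value)


def prioritize(functions):
    """Order functions from most to least time-critical (smallest MTD first)."""
    return sorted(functions, key=lambda f: f.mtd_hours)


# Hypothetical inventory of business functions.
functions = [
    BusinessFunction("payroll", 72),
    BusinessFunction("order processing", 4),
    BusinessFunction("email", 24),
]

recovery_order = [f.name for f in prioritize(functions)]
print(recovery_order)  # ['order processing', 'email', 'payroll']
```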

The five steps of the BIA process are:

a) identification of priorities
b) risk identification
c) likelihood assessment
d) impact assessment
e) resource prioritization.

Contingency Planning

There is a general 6-step approach to contingency planning:

1. Identify critical business functions.
2. Identify the resources and systems that support these critical functions.
3. Estimate potential disasters.
4. Select planning strategies: how to recover the critical resources and evaluate
alternatives. A disaster recovery and contingency plan usually consists of emergency
response, recovery and resumption activities.
5. Implement strategies.
6. Test and revisit the plan.

Plan Approval and Implementation:

Plan approval and implementation consists of:

1. Approval by senior management. (APPROVAL)

2. Creating an awareness of the plan enterprise-wide. (AWARENESS)

3. Maintenance of the plan, including updating when needed. (MAINTENANCE)

= IMPLEMENTATION!

Maximum tolerable downtime (MTD) or maximum tolerable outage (MTO): The MTD is the
maximum length of time a business function can be inoperable without causing
irreparable harm to the business.

Recovery time objective (RTO): The target amount of time within which the function must be
recovered.

Recovery Point Objective

The Recovery Point Objective (RPO) is the amount of data loss or system inaccessibility
(measured in time) that an organization can withstand. “If you perform weekly backups,
someone made a decision that your company could tolerate the loss of a week’s worth of data. If
backups are performed on Saturday evenings and a system fails on Saturday afternoon, you have
lost the entire week’s worth of data. This is the recovery point objective. In this case, the RPO is
1 week.”
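
The weekly-backup scenario above can be worked through in a short sketch. The dates are hypothetical and chosen only so that both fall on a Saturday; `data_loss` is an illustrative helper, not part of any backup tool.

```python
from datetime import datetime, timedelta


def data_loss(last_backup: datetime, failure: datetime) -> timedelta:
    """Everything written after the last successful backup is lost at failure time."""
    if failure < last_backup:
        raise ValueError("failure precedes the last backup")
    return failure - last_backup


# Weekly backups on Saturday evening; the system fails the next Saturday afternoon.
last_backup = datetime(2024, 1, 6, 20, 0)   # Saturday 20:00 (hypothetical date)
failure = datetime(2024, 1, 13, 15, 0)      # following Saturday 15:00

rpo = timedelta(weeks=1)
loss = data_loss(last_backup, failure)

print(loss)         # 6 days, 19:00:00
print(loss <= rpo)  # True: almost a full week is lost, still within the 1-week RPO
```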

Work Recovery Time (WRT) describes the time required to configure a recovered system.

MTD = RTO + WRT

The goal of BCP is to ensure that:

RTO < MTD
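
As a rough sketch of how these relations might be checked, the snippet below compares a planned RTO and WRT against an MTD set by the business. The hour values and the helper names `total_recovery_hours` and `fits_within_mtd` are assumptions made for illustration.

```python
def total_recovery_hours(rto_hours: int, wrt_hours: int) -> int:
    """Time to recover the system (RTO) plus time to configure it (WRT)."""
    return rto_hours + wrt_hours


def fits_within_mtd(rto_hours: int, wrt_hours: int, mtd_hours: int) -> bool:
    """True when recovery plus configuration completes within the MTD."""
    return total_recovery_hours(rto_hours, wrt_hours) <= mtd_hours


# Hypothetical figures for one business function.
mtd = 24   # business can tolerate 24 hours of downtime
rto = 8    # planned time to recover the system
wrt = 4    # planned time to configure the recovered system

print(total_recovery_hours(rto, wrt))   # 12
print(fits_within_mtd(rto, wrt, mtd))   # True
print(fits_within_mtd(22, 4, mtd))      # False: 26 hours exceeds the 24-hour MTD
```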

The BIA comprises two processes. First, critical assets are identified. Second, a
comprehensive risk assessment is conducted.

Automated Call Trees

Automated call trees automatically contact all BCP/DRP team members after a disruptive event.
Third-party BCP/DRP service providers may provide this service.

Mean Time between Failures (MTBF) quantifies how long a new or repaired system will run
before failing.

Mean Time to Repair (MTTR)

The Mean Time to Repair (MTTR) describes how long it will take to recover a specific failed
system. It is the best estimate for reconstituting the IT system so that business continuity may
occur.
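
MTBF and MTTR are often combined into a steady-state availability estimate, MTBF / (MTBF + MTTR). That relation is not stated in these notes; it is a commonly used approximation, and the figures below are hypothetical.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Approximate fraction of time a system is up: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system: runs 990 hours between failures, takes 10 hours to repair.
uptime_fraction = availability(990, 10)
print(f"{uptime_fraction:.2%}")  # 99.00%
```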

Minimum Operating Requirements

Minimum Operating Requirements (MOR) describe the minimum environmental and
connectivity requirements needed to operate computer equipment.

• A redundant site is an exact production duplicate of a system that has the capability to
seamlessly operate all necessary IT operations without loss of services to the end user of
the system.
• A hot site is a location that an organization may relocate to following a major disruption
or disaster. It is important to note the difference between a hot and redundant site.
Hot sites can quickly recover critical IT functionality; recovery time may even be measured in
minutes instead of hours. However, a redundant site will appear as operating normally to
the end user no matter what the state of operations is for the IT program.
• A warm site has some aspects of a hot site, for example, readily accessible hardware and
connectivity, but it will have to rely upon backup data in order to reconstitute a system
after a disruption. An organization will have to be able to withstand an MTD of at least
1-3 days in order to consider a warm site solution.
• A cold site is the least expensive recovery solution to implement. It does not include
backup copies of data, nor does it contain any immediately available hardware.

Three concepts used to create a level of fault tolerance and redundancy in transaction processing:

– Electronic vaulting: Refers to the transfer of backup data to an off-site location. This is
primarily a batch process of dumping the data through communications lines to a server at an
alternative location.

– Remote journaling: Refers to the parallel processing of transactions to an alternate site. A
communication line is used to transmit live data as it occurs.

– Database shadowing: Uses the live processing of remote journaling but creates even more
redundancy by duplicating the database sets to multiple servers.

Full Backups
A full system backup means that every piece of data is copied and stored on the backup
repository. Conducting a full backup is time consuming, bandwidth intensive, and resource
intensive. However, a full backup ensures that all necessary data is captured.

Differential Backups
Differential backups operate similarly to incremental backups (described below) except for one
key difference: differential backups archive data that have changed since the last full backup.
For example, a site performs a full backup every Sunday and daily differential backups from
Monday through Saturday. If data are lost after the Wednesday differential backup, only two
tapes are required for restoration: the Sunday full backup and the Wednesday differential backup.

Incremental Backups
Incremental backups archive data that have changed since the last full or incremental backup.
For example, a site performs a full backup every Sunday, and daily incremental backups from
Monday through Saturday. If data are lost after the Wednesday incremental backup, four tapes
are required for restoration: the Sunday full backup, as well as the Monday, Tuesday, and
Wednesday incremental backups.
Incremental backups are fast to create but take longer to recover from.
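
The tape counts in the incremental and differential examples can be reproduced with a small sketch. It assumes the schedule described in these notes (a full backup on Sunday, then one backup per day); the function name and tape labels are illustrative only.

```python
DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]


def tapes_to_restore(scheme, last_backup_day):
    """List the tapes needed to restore, given the most recent backup day."""
    if scheme not in ("incremental", "differential"):
        raise ValueError(f"unknown scheme: {scheme}")
    idx = DAYS.index(last_backup_day)
    tapes = ["Sunday full"]
    if idx == 0:
        return tapes  # only the full backup exists so far
    if scheme == "incremental":
        # Every incremental since the full backup must be replayed in order.
        tapes += [f"{day} incremental" for day in DAYS[1:idx + 1]]
    else:
        # A differential already holds all changes since the full backup.
        tapes.append(f"{DAYS[idx]} differential")
    return tapes


incremental = tapes_to_restore("incremental", "Wednesday")
differential = tapes_to_restore("differential", "Wednesday")
print(len(incremental), incremental)
print(len(differential), differential)
```

With Wednesday as the last backup day, the incremental scheme needs four tapes and the differential scheme needs two, matching the examples in the text.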

Time taken to create a backup:

Full backup > Differential backup > Incremental backup

Time taken to recover from a backup:

Full backup < Differential backup < Incremental backup

Tape-Rotation Strategies

Tape-rotation strategies can range from simple to complex.

• Simple: A simple tape-rotation scheme uses one tape for every day of the week and then
repeats the pattern the following week. One tape can be for Monday, one for Tuesday,
and so on. You add a set of new tapes each month and then archive the previous month's
set. After a predetermined number of months, you put the oldest tapes back into use.
• Grandfather-father-son (GFS): Uses 3 sets of tapes, namely 7 daily tapes (the son),
4 weekly tapes (the father), and 12 monthly tapes (the grandfather). Once per week a son
tape graduates to father. Once every 5 weeks a father tape graduates to grandfather.
After running for a year this method ensures there are backup tapes available for the
past 7 days, weekly tapes for the past 4 weeks, and monthly tapes for the past 12 months.
• Tower of Hanoi: This tape-rotation scheme is named after a mathematical puzzle. It
involves using five sets of tapes, each set labeled A through E. Set A is used every other
day; set B is used on the first non-A backup day and is used every 4th day; set C is used
on the first non-A or non-B backup day and is used every 8th day; set D is used on the
first non-A, non-B, or non-C day and is used every 16th day; and set E alternates with set
D.
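
The Tower of Hanoi schedule described above can be generated with a short sketch. It relies on one observation: the set for backup day n (counting from 1) follows the number of times n divides evenly by 2, capped at set E. The function name and the 1-based day numbering are assumptions made for illustration.

```python
SETS = "ABCDE"


def hanoi_set(day: int) -> str:
    """Return the tape set (A-E) used on a 1-based backup day."""
    if day < 1:
        raise ValueError("day numbering starts at 1")
    # Count trailing zero bits: how many times the day number divides by 2.
    trailing_zeros = (day & -day).bit_length() - 1
    # Cap at the last set so that E takes every slot D does not.
    return SETS[min(trailing_zeros, len(SETS) - 1)]


schedule = "".join(hanoi_set(day) for day in range(1, 17))
print(schedule)  # ABACABADABACABAE
```

In this sketch set A lands on days 1, 3, 5, and so on; B on days 2, 6, 10; C on days 4, 12, 20; D on days 8, 24, 40; and E on days 16, 32, 48, which is the alternation with D described in the text.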

Database Recovery
1. Electronic Vaulting
Electronic vaulting is the batch process of electronically transmitting data that is to be
backed up on a routine, regularly scheduled time interval. It is used to transfer bulk
information to an offsite facility. Electronic vaulting is a good tool for data that need to
be backed up daily or possibly even hourly. It solves two problems at the same
time.
2. Remote journaling
Remote journaling setups transfer copies of the database transaction logs containing the
transactions that occurred since the previous bulk transfer. Data transfers still occur in a
bulk transfer mode, but they occur on a more frequent basis, usually once every hour and
sometimes more frequently.
3. Remote Mirroring
A live database server is maintained at the backup site. The remote server receives copies
of the database modifications at the same time they are applied to the production server at
the primary site. Therefore, the mirrored server is ready to take over an operational role at
a moment’s notice.

Testing DR plans
Checklist/Read-through Test: Copies of the DR plan and continuity plan are distributed to each
functional area for review.

Structured Walk-Through/Tabletop Test: Group comes together to walk through scenarios in
detail.

Simulation Test: DR team or groups of employees come together to simulate a specific
scenario.

Parallel Test: Done to ensure that critical systems can perform adequately at the off-site
facility. The systems are moved to the alternate site and processing takes place.

Full Interruption Test: Original site is actually shut down and processing takes place at the
alternate site.