Contents
Overview
Executive Summary
Solutions Overview
Data Guard and RecoverPoint Comparison
Disaster Recovery Requirement Scenarios
Overview
Executive Summary
Oracle and EMC each offer disaster recovery solutions for Oracle customers using EMC
storage. Oracle offers Oracle Data Guard, a native capability included with Oracle
Database Enterprise Edition. EMC offers RecoverPoint. By combining the best
attributes of both technologies, Oracle and EMC together can offer a holistic
disaster recovery solution at the lowest total cost. This short paper was prompted
by requests from Oracle-on-EMC customers for a comparative assessment of Oracle
Data Guard and EMC RecoverPoint. Its objective is to establish a comprehensive list
of criteria, features, and capabilities for an effective disaster recovery strategy
and to use that list as a framework for evaluating RecoverPoint and Data Guard. The
details of this comparison are provided in a later section of this document.
An effective disaster recovery solution needs to be able to satisfy the service level
agreements for Recovery Time Objectives (RTO), Recovery Point Objectives (RPO)
and Oracle data corruption protection.
Recovery Time Objective (RTO) is the amount of time that a customer can be without
access to data while the data is being recovered. Recovery Point Objective (RPO) is
the point in time to which data must be restored in order to resume processing
business transactions; it determines how much recent data can be lost. RTO and RPO
for mission critical business data need to be established when evaluating disaster
recovery solutions.
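The two objectives can be checked against a recorded outage with simple timestamp arithmetic. The following is a minimal sketch, not part of either product; the class and function names are hypothetical, and it assumes RPO is expressed as a tolerable window of lost changes and RTO as a tolerable outage duration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RecoveryObjectives:
    """Service-level targets for one database (illustrative values only)."""
    rpo: timedelta  # maximum tolerable data loss, expressed as a time window
    rto: timedelta  # maximum tolerable time without access to the data


def achieved_rpo(failure_time: datetime, last_replicated_change: datetime) -> timedelta:
    """Data loss exposure: changes made after the last replicated change are lost."""
    return failure_time - last_replicated_change


def achieved_rto(failure_time: datetime, service_restored: datetime) -> timedelta:
    """Downtime: how long users had no access to the data."""
    return service_restored - failure_time


def meets_objectives(targets: RecoveryObjectives,
                     failure_time: datetime,
                     last_replicated_change: datetime,
                     service_restored: datetime) -> bool:
    """True only if both the RPO and the RTO targets were satisfied."""
    return (achieved_rpo(failure_time, last_replicated_change) <= targets.rpo
            and achieved_rto(failure_time, service_restored) <= targets.rto)


# Example: a zero data loss RPO with a 60 second RTO.
failure = datetime(2024, 1, 1, 12, 0, 0)
targets = RecoveryObjectives(rpo=timedelta(0), rto=timedelta(seconds=60))
ok = meets_objectives(targets, failure,
                      last_replicated_change=failure,
                      service_restored=failure + timedelta(seconds=45))
```

In this example `ok` is true because no committed change was missing at the standby and service resumed within 60 seconds; moving `last_replicated_change` even a few seconds before the failure would violate the zero data loss target.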
Data corruptions typically fall into two categories: physical and logical. Physical
corruptions are usually caused by the failure of components in the infrastructure
stack, such as the HBA, file system, operating system, memory, NIC, RAID, switches,
and controllers. Logical corruptions are caused by human errors such as deleting the
wrong tables, running the wrong jobs, or updating or deleting the wrong set of
tables or rows. Requirements for tolerance to physical and logical data corruptions
also need to be established when evaluating disaster recovery solutions.
Oracle Data Guard is a compelling solution for Oracle customer environments with
the following requirements:
Solutions Overview:
For Oracle customers on EMC storage, several solutions are available today to
protect mission critical business data from disasters, data corruptions and other
primary site failures. An effective disaster recovery solution should quickly
restore business operations, provide business continuity and data protection, and
make effective use of all assets while they are in the standby role.
Refer to the following link for more info on Oracle Data Guard:
http://www.oracle.com/technology/deploy/availability/htdocs/DataGuardOverview.html
EMC RecoverPoint
EMC RecoverPoint is a solution that can provide both business continuity and data
protection for Oracle database environments. RecoverPoint is an out-of-band
replication product from EMC. It uses a fabric-based out-of-band appliance approach
that tracks every write to the source database storage volumes and mirrors it in
parallel to the local RecoverPoint appliance. All the writes are journaled, pooled
and replicated to the remote disaster recovery site to provide Continuous Remote
Replication.
EMC RecoverPoint uses time stamped history volumes and ‘Bookmark’ features to
provide bookmarked recovery points. These recovery points can be immediately
accessed and mounted back to production environments within a very short time.
RecoverPoint uses the following bandwidth reduction technology features: Delta
Differentials, Hot Spot Identification, De-Dup, and Algorithmic Compression. These
features can help reduce the overall network utilization of the disaster recovery
solution. RecoverPoint also provides the capability to throttle the network
bandwidth utilized by the disaster recovery solution. RecoverPoint is database
agnostic and can be used to replicate heterogeneous database environments. It can
also be used to replicate mission critical data that resides outside the database
in operating system files. EMC RecoverPoint has no distance limitations.
The following diagram illustrates RecoverPoint network-based Continuous Remote
Replication.
[Figure: RecoverPoint Continuous Remote Replication between two sites, each with a
SAN and EMC and third-party storage, connected over a WAN. Legend:]
RecoverPoint splitter drivers
– Intercepts server writes (block-level)
– Resides on host or in fabric
– Mirrors write to RecoverPoint appliance
RecoverPoint appliance
– Journal writes to History Volume
– Manages and prioritizes resources
– Compresses data for WAN transfer
– Distributes changes to remote site
– Manages recovery processes
– Supports instant access to protected data
Journal
– Tracks all data changes to every protected LUN
– Utilizes bookmarks for application-aware recovery
– Repository for live data updates
– Provisioned from existing SAN LUNs
– Dynamically compressed, which saves storage
Provides advanced functionality
– Policy-based bandwidth management
– 3–15-times data compression
– Lowers cost of IP infrastructure
Supports heterogeneous environments
– Works with EMC and third-party storage
– True any-to-any volume replication
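The idea of a time-stamped journal with bookmarks can be illustrated with a toy model. This is only a sketch under simplifying assumptions (one volume, whole-block writes, an in-memory journal); the `Journal` class and its methods are hypothetical and do not correspond to any RecoverPoint interface.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy time-stamped write journal for a single replicated volume."""
    timestamps: list = field(default_factory=list)  # write times, ascending
    writes: list = field(default_factory=list)      # (block_number, data) per write
    bookmarks: dict = field(default_factory=dict)   # bookmark name -> timestamp

    def record_write(self, ts: float, block: int, data: bytes) -> None:
        """Append one intercepted block write; writes must arrive in time order."""
        if self.timestamps and ts < self.timestamps[-1]:
            raise ValueError("writes must be journaled in time order")
        self.timestamps.append(ts)
        self.writes.append((block, data))

    def bookmark(self, name: str, ts: float) -> None:
        """Label a point in time so it can be found again by name."""
        self.bookmarks[name] = ts

    def image_at(self, ts: float) -> dict:
        """Rebuild the volume image by replaying writes up to and including ts."""
        upto = bisect_right(self.timestamps, ts)
        image = {}
        for block, data in self.writes[:upto]:
            image[block] = data  # later writes to the same block win
        return image

    def image_at_bookmark(self, name: str) -> dict:
        """Rebuild the volume image as of a named bookmark."""
        return self.image_at(self.bookmarks[name])


journal = Journal()
journal.record_write(1.0, block=7, data=b"v1")
journal.bookmark("before_batch_job", 1.5)
journal.record_write(2.0, block=7, data=b"bad")
journal.record_write(3.0, block=9, data=b"x")
image = journal.image_at_bookmark("before_batch_job")
```

Here `image` contains only block 7 with its original contents, because the two later writes fall after the bookmark. A real journal would also need retention limits and crash consistency, which this sketch ignores.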
Refer to the following link for more info on EMC RecoverPoint:
http://software.emc.com/products/software_az/recoverpoint.htm
Data Guard and RecoverPoint Comparison
Comparison criteria, each followed by the Data Guard and RecoverPoint entries:

Replication Type
Data Guard: Host-based replication where Oracle transmits redo data from primary database transactions and applies the changes to the standby database.
RecoverPoint: Fabric-based out-of-band appliance that transmits and applies changed blocks.

Supported Replication Modes
Data Guard: Maximum Protection (Synchronous), Maximum Availability (Synchronous), Maximum Performance (Asynchronous). All modes support Continuous Data Protection (CDP) using Oracle Flashback Database.
RecoverPoint: Continuous Remote Replication (CRR): Async only.

Recovery Point Objective (RPO): Can it support zero data loss?
Data Guard: Yes. Maximum Protection and Maximum Availability modes support zero data loss. Minimal data loss is supported with Maximum Performance mode (ASYNC).
RecoverPoint: RecoverPoint does not support zero data loss since it can’t support synchronous replication. Minimal data loss is supported with Asynchronous mode.

What gets transmitted over the network for an Oracle database replication
Data Guard: Data Guard only transmits primary database redo log writes to the disaster recovery site.
RecoverPoint: The RecoverPoint solution transmits primary database redo log writes, data file writes, other online logfile member writes, archive log file writes, control file writes and flashback log writes.

Recovery Time Objective (RTO): Can it support zero downtime?
Data Guard: A Data Guard standby database is always mounted and in continuous recovery mode. This enables very fast failovers, achieving near zero downtime should the primary site fail.
RecoverPoint: RTO is largely dependent on the host reconfiguration on the DR side. The Oracle database is not mounted; activating the standby database as primary requires some time and involves manual administrative tasks.

Failover capability: Support for Automatic Failover during unplanned downtime?
Data Guard: Yes. The 'Fast-Start Failover' capability allows Data Guard to automatically and quickly fail over to a standby database in the event of loss of the primary database. There is no need for any manual steps to invoke the failover. When the old primary is repaired, Data Guard automatically reinstates it as the new standby database. Additional features in the Oracle tech stack can be leveraged to enable failing over the application as well.
RecoverPoint: No. RecoverPoint does not automatically mount and open the database. It is an administrative task. For Continuous Remote Replication (CRR), the image at the secondary site must be promoted to a host on the secondary site. The database needs to be mounted and opened.

Fail-Back capability
Data Guard: With Flashback Database configured, it is possible to quickly resynchronize the failed primary and fail back to the primary side with zero data loss.
RecoverPoint: It is an administrative task, including replicating the changed data back to the primary site, then mounting to the image as presented in the journal volume at the primary site.

Switchover capability for planned downtime
Data Guard: Data Guard provides an integrated switchover capability for planned maintenance such as hardware or OS upgrades and patching. Switchover can be executed with downtime measured in seconds because a Data Guard standby database is always mounted.
RecoverPoint: RecoverPoint provides switchover capability as well. This switchover procedure can be scripted using the RecoverPoint CLI and the script can be invoked when a switchover is needed. It can be done manually as well using the RecoverPoint GUI with a few clicks.

Distance Limitations / Latency
Data Guard: None
RecoverPoint: None

Hardware/Software requirements for data transport between primary and standby sites
Data Guard: Data is transported from the primary side to the standby side over the standard TCP/IP based network using existing Oracle Net Services.
RecoverPoint: RecoverPoint Appliances use pairs of Active-Active servers that write order, compress, and replicate the data to pairs of RecoverPoint Appliances at the secondary DR site. In addition, you need to use a host-based agent to do the write split, or you can get a SANTap module from Cisco or Brocade to do the write split at the SAN switch level.

License cost for Replication
Data Guard: Available at no extra charge with the Oracle Database Enterprise Edition. Active Data Guard, which provides advanced features for read access to a synchronized physical standby database, is a separately priced option in Oracle Database 11g.
RecoverPoint: Licensed by terabyte for RecoverPoint, or licensed by array pair for RecoverPoint/SE (for Microsoft Windows / EMC CLARiiON environments).

FC to IP conversion
Data Guard: Not needed
RecoverPoint: Not needed if the write splits are configured using host-based drivers. If write splits are configured at the SAN level, FC to IP conversion is completed automatically by the RecoverPoint Appliance, which has an HBA connection to the FC initiator and target, and also via Ethernet GigE ports to the WAN.

Re-Purpose capability of the database at DR site for reporting or test purposes during normal operations
Data Guard: Logical Standby databases are always mounted and open read-write, making them very flexible for offloading the primary database of ad-hoc queries and reports. A Physical Standby database is always mounted, and when using the Oracle Active Data Guard option, can be open read-only while being actively synchronized with the primary database. This enables read-only queries and reports that need access to up-to-date information to be offloaded to a physical standby database. In addition, Data Guard Snapshot Standby enables a standby database to be opened read-write to process transactions that are independent of the primary database for test or other purposes. A snapshot standby continues to receive and archive primary database changes while it is open read-write. This provides continuous data protection, and the local archive is used to quickly resynchronize the standby database with the primary once test activity is complete. These features allow standby databases to be used to enhance primary database performance and support test activities, improving ROI without compromising RTO/RPO objectives.
RecoverPoint: RecoverPoint allows the mounting of a read/write point-in-time image of the replicated data to a host on the disaster recovery side for reporting purposes. This mounted copy and the reporting usage have no effect on the ongoing journaling and do not interrupt or interfere with the disaster recovery replication solution.

Compression
Data Guard: In Oracle Database 11g, Data Guard compresses data that is transmitted to resolve a redo gap, caused by the standby database falling behind a primary database due to a network disconnect or a standby server outage.
RecoverPoint: Compression and De-dup features are available with RecoverPoint for all the data transmissions over the network.

Bandwidth usage throttling capability
Data Guard: Bandwidth usage throttling is not available with Data Guard. The Data Guard design goal is to reduce RPO. Bandwidth usage throttling conflicts with that goal, since anything that delays transmission will increase the risk of data loss.
RecoverPoint: Bandwidth used for replication can be throttled with RecoverPoint.

Server utilization at the DR site
Data Guard: Complete utilization of the server and storage at the DR side for ad-hoc queries, reporting, test, cloning and backup activities. Standby servers can also be used to host other databases and applications while in standby role.
RecoverPoint: With RecoverPoint, you can utilize the standby servers to host other databases and applications. If RTO is flexible, servers on the DR side can be fully utilized for test/dev and other purposes during normal operations. When there is a primary side failure, the server on the DR side needs to be re-purposed to serve as the new primary database server. RecoverPoint does not allow you to utilize the storage being consumed by the replicated volumes as you can with Data Guard to offload the primary database of ad-hoc queries and reports against an up-to-date replica of the primary.

Network Utilization / Bandwidth cost
Data Guard: Only redo data is sent over the WAN with Data Guard (as opposed to sending data files, redo files, control files, etc.). Since each change on the primary side is only sent once, Data Guard transport mechanisms are designed to efficiently utilize network bandwidth and minimize RPO.
RecoverPoint: Data files, control files, and archived and online redo log files need to be replicated. In addition, if the flash recovery area is also on the source volumes, then the flashback logs will get replicated as well. So each change on the primary side is sent multiple times to the remote site. This increases the volume of data that must be sent over a WAN. RecoverPoint uses bandwidth reduction technology features (such as Delta Differentials, Hot Spot Identification, De-Dup, and Algorithmic Compression) to reduce overall WAN utilization.

Oracle System Change Number (SCN) awareness and capability to flashback to a specific Oracle SCN
Data Guard: Within an Oracle database, when a user commits a transaction, the transaction is assigned a System Change Number (SCN), which Oracle saves along with the transaction’s redo entries in the redo log. The SCN-aware Data Guard implementation takes full advantage of Oracle flashback features and provides granular recovery capabilities and the capability to flash back to an SCN.
RecoverPoint: RecoverPoint is not Oracle SCN aware. (Note: There was a one-off custom version of RecoverPoint for Oracle 9i on Solaris 8 that was Oracle SCN aware. Going forward with Oracle 10g, RecoverPoint is NOT Oracle SCN aware.)

Capability to replicate just a subset of the database
Data Guard: Yes. In a Data Guard Logical Standby configuration, the Oracle-provided DBMS_LOGSTDBY package can be used to skip applying DML or DDL to selected tables or entire schemas in the standby database. However, in general, Oracle recommends Oracle Streams for more granular replication requirements.
RecoverPoint: RecoverPoint replication is at the LUN level. It is theoretically possible to replicate a subset of the database if you isolate the subset of the database objects to physically reside on separate LUNs.

Oracle data corruption protection & Resiliency
Data Guard: Data Guard only propagates the redo data, and Oracle validates redo data before it is applied to the standby database. Data Guard detects physical corruptions that are caused by the underlying storage subsystem and that the storage subsystem cannot detect. One example is the lost write detection feature in Oracle Database 11g using a physical standby database, which can detect if a lost write has occurred on either the primary or standby database. This ensures data protection from various physical corruptions and eliminates the possibility of corrupt data being propagated to the DR site. Data Guard logical standby (SQL Apply) can be used to provide additional protection against logical corruptions.
RecoverPoint: Data corruptions introduced on the primary production database will be replicated to the DR site since there is no Oracle data corruption detection possible with RecoverPoint.

State of the Standby database at the DR site during replication
Data Guard: In a physical standby configuration, the standby database is always mounted. In a logical standby configuration, the standby database is always open. In both cases, data is validated before it is applied.
RecoverPoint: In a RecoverPoint configuration, the target volumes that are being replicated are not mounted while the replication is going on. During a fail-over, the replication process is turned off and the database is mounted and opened. This will be the first opportunity to evaluate the condition of the database.

Bi-directional replication of a database
Data Guard: Replication is one direction at any given time.
RecoverPoint: Replication is one direction at any given time.

Support for Federated DB
Data Guard: A separate standby database is required for each distinct primary database.
RecoverPoint: Yes. A single fabric-based replication strategy with RecoverPoint can more easily handle Federated DB architectures.

Network outage impact on primary database availability
Data Guard: No impact on the primary database.
RecoverPoint: No impact on the primary database.

Rolling database upgrades with minimal downtime
Data Guard: Yes. Oracle Data Guard SQL Apply supports database software upgrades for major release and patchset upgrades (from Oracle Database 10g onwards) in a rolling fashion with near zero database downtime.
RecoverPoint: No support for rolling database upgrades.

Support all Operating Systems
Data Guard: Yes
RecoverPoint: Most operating systems supported. Refer to the EMC support matrix.

Pause and Restart replication easily?
Data Guard: Yes
RecoverPoint: Yes

Support for heterogeneous storage environments between primary and DR site
Data Guard: Yes
RecoverPoint: Yes

Different Oracle versions between Primary and DR side?
Data Guard: The same release of Oracle Enterprise Edition is required on the primary side and all the standby sides. The only exception is during rolling database upgrades.
RecoverPoint: The same release of Oracle is required on both the Primary and DR side.

Number of vendors involved in the DR strategy for an Oracle infrastructure
Data Guard: Since Data Guard is part of the Oracle Enterprise Edition, a DR strategy using Oracle Data Guard requires less integration since it does not have any other vendor dependency.
RecoverPoint: Multiple vendors are involved in the implementation of a DR strategy for an Oracle database. RecoverPoint introduces a replication technology that is database agnostic and will work with heterogeneous storage arrays. Any component swap has implications for DR protection and the overall robustness of the solution.

Consistency group across multiple applications, databases and file systems for storage infrastructure wide continuous data protection
Data Guard: Data Guard is intended to protect individual Oracle databases only. It cannot provide storage infrastructure wide continuous data protection.
RecoverPoint: Yes. EMC offers a capability to group all the LUNs that make up the entire EMC storage infrastructure into a consistency group and perform a consistent split on multiple application, database, and file system related LUNs. This capability provides continuous data protection for the entire storage infrastructure.
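The Network Utilization / Bandwidth cost comparison above turns on two opposing factors: how many times each change crosses the WAN, and how much compression reduces it. A crude back-of-the-envelope sketch follows. The function name, the daily redo volume, and the copies-per-change factor are all assumptions for illustration; only the 3 to 15 times compression range comes from this document, and real write volumes for data files and logs will not be exact multiples of redo volume.

```python
def daily_wan_gb(redo_gb: float, copies_per_change: float,
                 compression_ratio: float = 1.0) -> float:
    """Rough estimate of GB sent over the WAN per day.

    redo_gb: redo generated per day on the primary (assumed input).
    copies_per_change: how many times each change crosses the WAN; 1 for
        redo-only shipping, more when data file, redo log, archive log,
        control file and flashback log writes are all replicated.
    compression_ratio: 3.0 means the data shrinks to one third.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return redo_gb * copies_per_change / compression_ratio


redo_per_day = 100.0  # hypothetical workload

redo_only = daily_wan_gb(redo_per_day, copies_per_change=1)
block_level_low = daily_wan_gb(redo_per_day, copies_per_change=4, compression_ratio=3.0)
block_level_high = daily_wan_gb(redo_per_day, copies_per_change=4, compression_ratio=15.0)
```

Under these assumed numbers, block-level replication sends more than redo-only shipping at 3 times compression and less at 15 times compression, which is why measured compression ratios for the actual workload matter more than either vendor's general claim.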
Disaster Recovery Requirement Scenarios
Below are multiple scenarios to help provide tangible examples of use cases in which Oracle Data Guard and
EMC RecoverPoint can be combined to deliver a holistic solution for disaster recovery.
Scenario 1:
Several mission critical databases have RPOs and RTOs that range from a zero data loss RPO and an RTO of 60
seconds during production hours to RPOs and RTOs measured in minutes. Additionally, there is a requirement
for fast point in time recovery up to 8 hours prior to the current point in time to recover from user error. Data
protection and availability are very important, so any capability that can protect against hardware induced data
corruptions is highly desired. The applications using these databases require high Quality of Service for
executing business transactions, so it is also desirable to offload from the primary database the overhead of long
running ad-hoc queries and reporting that require read access to current information (the replica must be
within 15 seconds of the primary database). Additionally, the customer would like to reduce the downtime required
for database upgrades and reduce the storage required for additional test databases.
Solution for Scenario 1: An Oracle Data Guard physical standby database in Maximum Availability mode using
synchronous replication is configured to meet the most stringent RPO requirement. Fast-Start Failover is
configured to meet the most stringent RTO requirement. For databases with RPO/RTO in minutes, Data Guard
Maximum Performance mode is used. Oracle data validation, including Oracle Lost-Write Detection, is utilized
to prevent data corruptions. Flashback Database with an 8-hour retention period is used to meet the CDP
requirement. The standby databases are used to offload long running ad-hoc queries and reporting during peak
production hours using Active Data Guard (this allows physical standby databases to remain synchronized while
open for read access). During off-peak hours the same standby databases are used to offload fast incremental
backups from the primary using Oracle Recovery Manager block-change tracking. In emergencies or during
off-peak hours, the standby databases can be opened read/write for testing database changes and application
fixes using Data Guard Snapshot Standby, and then quickly resynchronized with their primary database when
testing is complete. Data remains protected the entire time the snapshot standby is being used for test purposes
(the primary database continues to ship data to the snapshot standby, where it is archived for protection and used
to resynchronize the standby database once testing is complete). During periods of planned maintenance,
database rolling upgrades are executed using the same physical standby databases and the Data Guard 11g
feature called Transient Logical Standby.
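The mapping from a database's RPO to a Data Guard protection mode used in this solution can be written down as a small rule. This is a hedged sketch rather than Oracle tooling: the function and parameter names are hypothetical, and the distinction between the two synchronous modes (Maximum Protection stalls the primary rather than allow unprotected changes, Maximum Availability does not) is drawn from general Oracle documentation, not from this paper.

```python
from datetime import timedelta
from enum import Enum


class ProtectionMode(Enum):
    MAXIMUM_PROTECTION = "Maximum Protection (synchronous)"
    MAXIMUM_AVAILABILITY = "Maximum Availability (synchronous)"
    MAXIMUM_PERFORMANCE = "Maximum Performance (asynchronous)"


def choose_mode(rpo: timedelta,
                stall_primary_if_standby_unreachable: bool = False) -> ProtectionMode:
    """Pick a protection mode from the RPO, following the Scenario 1 reasoning."""
    if rpo < timedelta(0):
        raise ValueError("rpo cannot be negative")
    if rpo > timedelta(0):
        # Some data loss is tolerable, so asynchronous transport is sufficient.
        return ProtectionMode.MAXIMUM_PERFORMANCE
    if stall_primary_if_standby_unreachable:
        # Zero data loss even at the cost of primary availability.
        return ProtectionMode.MAXIMUM_PROTECTION
    # Zero data loss while keeping the primary available, as in Scenario 1.
    return ProtectionMode.MAXIMUM_AVAILABILITY


strict_db = choose_mode(timedelta(0))
relaxed_db = choose_mode(timedelta(minutes=5))
```

With these inputs `strict_db` is Maximum Availability and `relaxed_db` is Maximum Performance, matching the two tiers of databases described in the scenario.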
Scenario 2:
The customer has 10 different production databases hosted on a single SAN. The databases are a mix of Oracle
and non-Oracle DBMS. RPO and RTO of up to 8 hours are acceptable. There is no concern for consolidating
storage required to offload reporting, testing or backups. The primary database workload is light – with no
concern for offloading queries, reports, or other long-running operations. There is no concern for rolling database
upgrades – extended periods of planned downtime are acceptable. Network bandwidth is limited and is a
primary concern, thus more data loss exposure and downtime are acceptable trade-offs for a solution that can
compress all data transmission. Given the less stringent requirements for data protection and availability, a single
replication solution is preferred to address databases from multiple vendors as well as data residing outside the
database in operating system files.
Solution for Scenario 2: A fabric-based replication strategy using EMC RecoverPoint with compression and
bandwidth throttling is well suited to this scenario.
Conclusion
With over 55,000 joint customers, including each other, Oracle and EMC provide the information infrastructure
for most of the world’s most critical information. Both companies offer customers various options for data
protection and availability. By combining the best of both technologies from Oracle and EMC, IT
organizations can deploy a robust enterprise disaster recovery strategy at the lowest cost. Use the guidelines
and detailed analysis contained in this document to analyze your cost, recovery point, and recovery time
objectives and to design the solution that is best for your organization.
Disaster Recovery Strategies for Oracle on EMC storage customers
Contributors: Shankar Jayaganapathy, Bill Gaynor, David Wallace, Matt Sebastian, Jamie Shikiya, and Alex D’Anna