automated recovery
Simplify and accelerate nondisruptive disaster recovery testing
Increase ROI by leveraging existing landscape investments in DR
Table of contents
Executive summary
Business case
Solution overview
Key results
Introduction
Purpose
Scope
Audience
Terminology
Technology introduction
Overview
EMC RecoverPoint
EMC Symmetrix VMAX 10K
Virtual Provisioning
EMC VNX5300
EMC Unisphere
VMware vCenter Site Recovery Manager
SAP ERP 6.0
Basics of RecoverPoint replication
Replication modes
Main components
EMC RecoverPoint appliance
Write splitters
Replication sets and consistency groups
Storage Replication Adapter
Journal volumes
Initial replication
Solution architecture
Overview
Physical design
Hardware resources
Software resources
Virtual machine resources
SAP application environment
Executive summary
Business case
For most businesses, the growth in eCommerce and the focus on consumer experience
have led to more opportunity and more competition. Simply stated, businesses have
expanded beyond traditional geographic boundaries and markets, interacting with
consumers directly. Capitalizing on this opportunity requires:
Responsiveness: Consumers have choices. If they are not satisfied with the
service level they receive, they go elsewhere to find what they need, leading to
severe productivity loss.
Ultimately, these business requirements translate into challenges for SAP IT,
including:
Traditional disaster recovery (DR) solutions, such as an offsite tape restore, may no
longer be adequate to meet increasingly demanding recovery time objectives (RTO) and
recovery point objectives (RPO). Storage replication solutions can satisfy these requirements.
However, maintaining an idle data center for disaster recovery may not be suitable for
some organizations either. These challenges demand a solution that offers effective,
affordable, and efficient disaster recovery protection for these critical SAP business
functions, without wasting any previous investments.
Solution overview
This solution provides customers with a strategy that enables DR, using VMAX and
VNX series interchangeably as production and DR storage for SAP applications.
The solution team designed the architecture for central SAP installations, distributed
implementations (database, central services, and application are on separate virtual
machines), and heterogeneous EMC storage solutions such as VNX and VMAX
families.
We implemented this using a Symmetrix VMAX 10K storage array together with an
existing VNX5300 storage array to form a bi-directional disaster recovery solution. We
achieved this by replicating both storage devices to each other with EMC
RecoverPoint and designing automated recovery plans using VMware vCenter Site
Recovery Manager (SRM).
This solution demonstrates the following capabilities:
Automated failback
This solution facilitates automated reversal of the data centers' roles after
failover. After the original production site is operational, failback is just a
matter of performing another failover.
Key results
This solution significantly improves business agility and scales out for many common
needs for SAP production and nonproduction systems:
Introduction
Purpose
This white paper demonstrates how EMC RecoverPoint can be integrated
with VMware vCenter SRM to perform seamless recovery and storage migration
between EMC VNX and EMC Symmetrix VMAX systems.
It describes the benefits in the following scenarios:
After migrating business data from VNX to Symmetrix VMAX 10K, relocating the
existing VNX5300 from the production site to the recovery site
Configuring two active production data centers to mutually protect each other
Scope
Audience
Terminology
Term
Definition
Asynchronous replication
CDP
CLR
Consistency group
CRR
Deduplication
Gatekeepers
Journal volume
KPI
PIT
Replica volume
Replication set
Repository volume
RPA
RecoverPoint appliance.
RPO
RTO
Synchronous replication
Technology introduction
Overview
Key components
Component
Role
EMC RecoverPoint
EMC VNX5300
EMC Unisphere
DR failover coordination
Note
While Symmetrix VMAX and VNX models were used in the solution testing,
this solution is generally applicable to other models as well.
EMC RecoverPoint
EMC Symmetrix VMAX 10K
The Symmetrix VMAX 10K system is built on the highly scalable EMC Virtual Matrix
Architecture, which enables the storage to grow seamlessly and cost-effectively from
an entry-level, single-bay, single-engine configuration to a six-bay, four-engine
system with 512 GB cache memory, 1,080 drives, and up to 1.5 PB usable capacity.
Built for simplicity and ease-of-use, Symmetrix VMAX 10K is 100 percent virtually
provisioned, with storage tiering automatically managed by EMC Fully Automated
Storage Tiering for Virtual Pools (FAST VP). Symmetrix VMAX 10K supports Flash,
Serial Attached SCSI (SAS), and Near-Line SAS (NL-SAS) drives with RAID 1, 5, and 6
protection options.
Symmetrix VMAX 10K systems are preconfigured and ready for same-day installation
and startup. They support all EMC Symmetrix monitoring and management tools,
including the latest enhanced Symmetrix Management Console, which provides
simpler installation and management with smart wizards.
The integrated RecoverPoint splitter for Symmetrix VMAX 10K enables local and
remote replication for flexible and efficient RPOs and RTOs.
Virtual Provisioning
EMC Virtual Provisioning simplifies storage configuration and management,
improves capacity utilization, and enhances performance by enabling the creation of
thin devices. Thin devices present more capacity to applications than what is
physically allocated in the storage array.
Physical storage that supplies disk space to the thin devices comes from a thin pool.
The thin pool is composed of data devices that provide the actual physical storage.
You can add or remove data devices to grow or shrink thin pools nondisruptively.
Virtual Provisioning supports local and remote replication with RecoverPoint on VMAX
10K.
Virtual Provisioning is described in detail in EMC Solutions Enabler Symmetrix Array
EMC VNX5300
The EMC VNX5300 storage array is a member of the VNX series storage platforms, which
deliver innovation and enterprise capabilities for file, block, and object storage in a
scalable, affordable, and easy-to-use solution. Designed to meet the high-performance,
high-scalability requirements of midsize and large enterprises, the VNX
series enables enterprises to grow, share, and cost-effectively manage multiprotocol
environments.
The VNX series is powered by quad-core processors, making it two to three times
faster than its predecessor. The VNX processors support the demands of advanced
storage capabilities such as Virtual Provisioning, compression, and deduplication.
VNX arrays incorporate the RecoverPoint splitter, which supports unified file and
block replication for local data protection and disaster recovery.
EMC Unisphere
EMC Unisphere is the central management platform for EMC Symmetrix VMAX and
VNX series, providing a single combined view of file and block systems, with all
features and functions available through a common interface. Unisphere is optimized
for virtual applications and provides industry-leading VMware integration. It
automatically discovers virtual machines and VMware ESX servers and provides
end-to-end, virtual-to-physical mapping.
VMware vCenter Site Recovery Manager
VMware vCenter SRM is a disaster recovery framework that integrates with EMC
RecoverPoint to automate recovery of VMware datastores so that it becomes as
simple as pressing a button.
SRM is an extension to VMware vCenter that enables integration with array-based
replication, discovery, and management of replicated datastores, and automated
migration of inventory from one vCenter to another. SRM does not replicate any data,
but, instead, leverages an external replication solution such as RecoverPoint. SRM
servers coordinate the operations of the replicated storage arrays and vCenter servers
on the production and recovery sites so that, as virtual machines on the production
site are shut down, virtual machines on the recovery site start up and resume the
responsibility of providing the same services, using the data replicated from the
production site.
Migration of protected inventory and services from one site to the other is controlled
by a recovery plan that specifies the order in which virtual machines are shut down
and started up, the computing resources that are allocated, and the networks they
can access. Integrated with RecoverPoint, SRM also enables testing of recovery plans,
using a temporary copy of the replicated data, in a way that does not disrupt ongoing
operation on either site.
SAP ERP 6.0
SAP ERP 6.0, powered by the SAP NetWeaver technology platform, is a world-class,
fully integrated enterprise resource planning (ERP) application that fulfills the core
business needs of midsize and large enterprises across all industries and market
sectors. SAP ERP 6.0 delivers a comprehensive set of integrated, cross-functional
business processes and can serve as a solid business-process platform that supports
continued growth, innovation, and operational excellence.
[Figure: RecoverPoint replication modes. Continuous Data Protection (CDP): production
volume (PRD) with a local replica. Continuous Remote Replication (CRR): PRD with a
remote replica. Concurrent Local and Remote Replication (CLR): PRD with both a local
replica and a remote replica.]
Concurrent local and remote replication (CLR) for combined local and remote
protection
Write splitters
Journals
Each consistency group has its own set of journal volumes, which allows for differing
retention periods across the consistency groups. Each consistency group has two or
three journal volumes assigned to it:
Initial replication
Marking: If the RPAs are unable to transfer to the replica journal, the location of
the changes is stored in the RPAs as well as on the production journal volume.
When contact with the remote site is reestablished, the remote replica is
synchronized, but only at those locations that were marked as having changed.
No marking/no replication: The splitter does not write to the RPAs. This can be
caused by a manually disabled consistency group or by a disaster on the
production site (no RPAs are available).
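The marking behavior described above can be illustrated with a minimal sketch. It is a toy model rather than RecoverPoint code: plain dictionaries stand in for the production and replica volumes, and the class and method names are hypothetical. While the replica is unreachable, only the locations of changed blocks are recorded; when the link returns, only those locations are copied.

```python
class MarkingTracker:
    """Toy model of marking mode (hypothetical names, not a RecoverPoint API)."""

    def __init__(self):
        self.marked = set()  # locations changed while the replica was unreachable

    def write(self, production, replica, block, data, link_up):
        production[block] = data
        if link_up:
            replica[block] = data       # normal replication path
        else:
            self.marked.add(block)      # remember only where the change happened

    def resync(self, production, replica):
        """Copy only the marked locations, then clear the marks."""
        for block in sorted(self.marked):
            replica[block] = production[block]
        copied = len(self.marked)
        self.marked.clear()
        return copied


production, replica = {}, {}
tracker = MarkingTracker()
tracker.write(production, replica, 0, b"a", link_up=True)
tracker.write(production, replica, 1, b"b", link_up=False)
tracker.write(production, replica, 1, b"c", link_up=False)  # same block marked once
copied = tracker.resync(production, replica)
```

Because a set records each location once, repeated writes to the same block during the outage cost a single copy at resynchronization.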
Solution architecture
Overview
This solution used a Symmetrix VMAX 10K storage array with an existing VNX5300
storage array to form a bi-directional disaster recovery environment. In this solution,
we replicated both storage devices to each other with EMC RecoverPoint and
designed automated recovery plans using SRM.
Physical design
Figure 2. Solution architecture
We used two data centers to host SAP systems VM2 and VM3:
1. Site A: Contained the VMAX 10K; production site of VM2 and recovery site of VM3.
2.
We configured the two pairs of RPAs contained in each data center (RPA1 and
RPA2 on site A, RPA3 and RPA4 on site C) as a cluster.
3.
4.
5.
VM2: SAP IDES ERP, Oracle, SUSE Linux, distributed SAP installation
VM2 was initially installed on the VNX5300 array and then migrated to the
VMAX 10K array. This DR solution can also be used to migrate SAP systems from
VNX to VMAX storage.
VM3: SAP IDES ERP, SQL Server, Windows, central SAP system installation
VM3 was independently installed on the VMAX array and then migrated to the VNX
array. This use case demonstrates bi-directional recovery.
6.
Hardware resources
Hardware
Quantity
Configuration
Storage (production site A)
Storage (site C)
VMware ESXi server
Network switch
1 Gb Ethernet switches
FC switch
RecoverPoint appliance
Software resources
Table 4 details the software resources for the solution.
Table 4.
Software
Version
Purpose
1.0.0.1018
1.1
7.4
EMC RecoverPoint
3.5
VMware vSphere
5.0.0 B623860
VMware vCenter
5.0.0 B755629
5.0.1 B633117
2.0
2.6.32.24
2008 R2 x64
Virtual machine resources
Role
Purpose
Quantity
Configuration
Critical production
Critical production
Critical production
Non-critical production
SAP application environment
Specifications
VM2
VM3
Number of datastores, type
2, VMFS
1, VMFS
Operating system
Windows 2008 R2
Database
application
Size
SAP application
Installation type
Distributed
Central
Solution configuration
Overview
Configuring replication
Replication configuration
SAP-specific customization
Configure RecoverPoint.
a. Install RecoverPoint.
b. Configure RecoverPoint access to the splitters on the VMAX 10K and VNX arrays.
Configure SRM.
not limited to array-based replication. However, whenever you use array-based
replication, as in this solution, each of the array's LUNs replicates in one direction
only. Two LUNs in the same array can replicate in different directions from each other.
Customization specific to SAP
This solution used the flexibility of the SRM recovery plan by inserting operating
system commands that perform the previous tasks during a failover. This fully
automates the startup process until all SAP systems are ready for login.
To achieve the desired level of automation, the test environment was prepared as
described in the following sections.
Heterogeneous test environment
We considered diversity when we built the test environment for a more extensive
scope and more credible results. This includes SAP releases, types of databases and
operating systems. We also tested the solution across components with non-uniform
time zones.
Replication mode
We set up CLR for both VM2, in the consistency group SiteA_PRD, and VM3, in the
consistency group SiteC_TST, as shown in Figure 3 and Figure 4.
Figure 3.
Figure 4.
Startup priorities
SRM allows virtual machines to be started up in a particular order by assigning a
priority. Priority 1 virtual machines are started up first, followed by priority 2 virtual
machines, and so on. We changed the priorities of the virtual machines as follows:
Priority 1: ASCS, to make the SAP shared folders available for mounting
Priority 2: Database instance, to ensure that the database is started before the
SAP application instances are started
Figure 5.
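The priority ordering described above can be sketched in a few lines. This is an illustration only, not SRM code: the virtual machine names are hypothetical, and placing the SAP application instances in priority 3 is an assumption, since the text states only that they start after the database.

```python
from itertools import groupby


def startup_batches(vms):
    """Group VM names into startup batches, lowest priority number first."""
    ordered = sorted(vms, key=lambda vm: vm[1])
    return [[name for name, _ in batch]
            for _, batch in groupby(ordered, key=lambda vm: vm[1])]


# Hypothetical inventory as (name, priority) pairs; priority 3 is an assumption.
vms = [("sap-app", 3), ("sap-db", 2), ("sap-ascs", 1)]
batches = startup_batches(vms)
```

Each inner list is one batch; a batch is started only after the previous one, which mirrors the ASCS, then database, then application order.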
Network settings
We configured two static IP addresses for each virtual machine:
You can achieve this by configuring two network interface cards (NICs) for each virtual
machine or by creating a script that binds the internal IP addresses as virtual. We
used the former in this solution.
/etc/hosts file
SRM updates only the entry of the virtual machine's own hostname. Ensure that the
internal IP addresses are used before and after recovery. Steps are as follows:
1. Create a hosts.i file as a template that maps the hostnames to the internal IP
addresses.
2. Insert a script as a post-startup recovery step to reset the hosts files with
contents from the hosts.i file to keep consistency.
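A minimal sketch of such a reset step follows. It is not the script used in the solution (which is shown only as a screenshot); the function name is hypothetical, and the locations of hosts.i and the hosts file are supplied by the caller because the paper does not state them.

```python
import shutil
from pathlib import Path


def reset_hosts(template: Path, hosts: Path) -> None:
    """Overwrite the hosts file with the contents of the hosts.i template."""
    if not template.is_file():
        raise FileNotFoundError(template)
    # Copy to a temporary file first, then swap it in, so that a failed copy
    # never leaves a half-written hosts file behind.
    tmp = hosts.with_name(hosts.name + ".tmp")
    shutil.copyfile(template, tmp)
    tmp.replace(hosts)
```

On a Linux guest the call might look like `reset_hosts(Path("/etc/hosts.i"), Path("/etc/hosts"))`, assuming the template is kept next to the hosts file and the script runs with sufficient privileges.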
Figure 6 shows the scripts that were applied to the automated application startup
after powering on.
Figure 6.
Figure 7. Hosts file
For SAP logon, we created two entries: one for logging in to production site A and
the other for recovery site C, as shown in Figure 8. Note that full IP addresses are
not shown in Figure 8 and Figure 9.
Figure 8. SAP logon
We created two logon groups using transaction code SMLG, which correspond to the
two sets of IP addresses. Users can simply switch the logon group used in the SAP GUI
after the failover, as shown in Figure 9.
Figure 9.
Load-generation environment
We used LoadRunner for every test in this solution to run successive updates in SAP
tables and to simulate a real-world environment. The timestamps on the table entries
inserted by LoadRunner served as counterchecks of the timestamps recorded in SRM
or RecoverPoint, but at the SAP level. Timestamps from LoadRunner updates provide
more accurate application-level RTO and RPO key performance indicators (KPIs) for
testing. The tests were considered successful when the results from
SRM/RecoverPoint and SAP were consistent.
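As a rough illustration of how application-level KPIs can be derived from such timestamps, the sketch below computes RPO as the gap between the newest table entry that survived recovery and the moment of the outage, and RTO as the time from the outage until the SAP system was usable again. These definitions, the function name, and the sample values are assumptions for illustration; they are not measurements from this solution.

```python
from datetime import datetime, timedelta


def application_kpis(last_recovered_entry, outage_start, service_restored):
    """Return (rpo, rto) as timedeltas derived from application timestamps."""
    rpo = outage_start - last_recovered_entry   # updates lost before the outage
    rto = service_restored - outage_start       # time until users could log in again
    return rpo, rto


# Made-up sample timestamps, all assumed to be in the same time zone.
rpo, rto = application_kpis(
    last_recovered_entry=datetime(2013, 1, 1, 10, 0, 0),
    outage_start=datetime(2013, 1, 1, 10, 0, 30),
    service_restored=datetime(2013, 1, 1, 10, 25, 45),
)
```

Comparing these values with the times reported by SRM or RecoverPoint gives the consistency check described above.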
Bi-directional recovery testing
We performed the recovery tests on both VM2 and VM3 systems to validate
bi-directional recovery. However, because the procedure was the same, for brevity this
white paper shows only the screenshots of the more complex VM2 system.
System integrity and quality control
We took timestamps carefully to ensure consistency among all components. We
performed SAP checks before and after testing for each scenario according to the
best practices to ensure that the failed-over system was healthy technically and
functionally.
Note
These controls were observed, but the raw data and screenshots are not
shown in this paper.
In the context of operational efficiency and effectiveness, this section aims to test
and demonstrate the following capabilities:
Disaster recovery
This solution allows the client to quickly automate the entire failover operation
with a few clicks of a button.
Failback testing
Because site A may be geographically closer to users, or applications running
previously on it are presumably more resource intensive, failback to the original
site is recommended. This operation reinstates failed-over systems on VNX to
their original servers as soon as possible. This design allows for rapid failback
to reduce downtime for the operation.
PIT recovery
The solution aims to show how easy it is to perform PIT recovery, eliminating
the need to restore entire databases and roll them forward to just before the
logical corruption occurred.
Test scenario 1: Nondisruptive failover testing
Test scenario
An organization originally has a central site C that contains SAP production systems
VM2 and VM3 on VNX5300. A second auxiliary site A that contains VMAX 10K was
built. VM2 will be failed over/migrated from site C (VNX5300) to site A (VMAX 10K)
using this solution.
To prepare the migration, we simulated disaster recovery failover on VM2 and VM3 on
site C while both were in production.
Test objectives
The test objectives were:
Performing failover testing on the recovery site without shutting down any
virtual SAP system
Test procedure
The test procedure was as follows:
1.
2. Took timestamps and performed test recovery on VM2 and VM3. During
testing, we did a health check on both systems to verify integrity.
3. Logged in to the SAP GUI, took the timestamp, and performed an SAP health
check.
4. Checked the availability logs of VM2 and VM3 on site C to see if the
production systems had gone down during testing.
Findings
Table 7 shows the findings of this test.
Table 7.
Success criteria
Findings
Result
Passed
Passed
Passed
Passed
Initiation of the testing required only three steps. Additional manual steps were
not needed until completion, as shown in Figure 10.
Figure 10.
Figure 11.
Conclusion
A fully consistent bi-directional SAP DR was easily made available for DR testing in
less than 30 minutes while primary sites were available for production users with
continued DR replication.
EMC Transforms IT for SAP: Fully Automated Disaster Recovery
EMC Symmetrix VMAX 10K, EMC VNX5300, EMC RecoverPoint, VMware vCenter Site Recovery
Manager
Test scenario 2: Disaster recovery
Test scenario
After a successful simulation in scenario 1, the same production SAP system VM2 on
site C was failed over to site A for planned migration/disaster recovery. To simulate
the disaster, we terminated the database processes and triggered disaster recovery
from site C to site A.
Test objectives
The test objectives were:
Test procedure
The test procedure was as follows:
1.
2.
3.
Logged in to the SAP GUI, took the timestamp, and performed an SAP health
check.
4.
Findings
Table 8 shows the findings of this test.
Table 8.
Success criteria
Findings
Result
Passed
Passed
Passed
Initiation of the testing required only four steps. Additional manual steps were
not needed until completion, as shown in Figure 12.
Figure 12.
Figure 13.
Conclusion
The actual failover required only four steps. A fully consistent SAP DR was available in
less than 30 minutes. Manual intervention was not needed after the recovery was
initiated. This technique can be used to migrate systems from VNX to VMAX or vice
versa.
Test scenario 3: Failback
Test scenario
VM2 was on site A after recovery in scenario 2. Assume that site C is fully
operational again after the crash. We then re-established protection between the two
sites, but with the roles reversed for VM2: site A was the protected site, replicating to
recovery site C. VM3 remained unchanged.
Test objectives
The test objectives were:
Test procedure
The test procedure was as follows:
1.
2.
3.
Logged on to the SAP GUI, took the timestamp, and performed an SAP health
check.
4.
5.
Findings
Table 9 shows the findings of this test.
Table 9.
Success criteria
Findings
Result
Reprotection is 100%
successful
Passed
Passed
Passed
Passed
Initiation of reprotection required just four steps. Additional manual steps were
not needed until completion, as shown in Figure 14. Then another four steps
were needed for the failover.
Figure 14. Reprotection procedure
Figure 15 shows how reprotection reversed the direction of replication so that further
post-processing was no longer necessary.
Figure 15.
Figure 16.
Conclusion
Reprotection needed only four steps and the actual failback needed another four
steps. All DR changes to databases and file systems were reinstated to the original
production site in less than 30 minutes.
Test scenario 4: Point-in-time recovery
Test scenario
The system was corrupted because of an application logic error after a transport. We
performed a point-in-time (PIT) recovery to restore the system to its state before the
transport.
Test objectives
The test objectives were:
Test procedure
The test procedure was as follows:
1.
Populated the table VBAK and captured a timestamped screenshot of the final
entries.
2.
3.
4.
5.
a.
b.
Recovered production.
c.
d.
Resumed production.
e.
Logged on to the SAP GUI, displayed the table VBAK, and captured a
timestamped screenshot of the final entries.
Findings
Table 10 shows the findings of this test.
Table 10.
Success criteria
Findings
Result
Passed
Passed
Passed
Figure 17.
Figure 18 shows that the final bookmarked entry was Sales Doc 3451802 inserted at
23:12:15. Figure 18 shows the System Status window on the right-hand side.
Figure 18.
Figure 19 shows that user activity resumed after bookmarking and the table was
populated further. We captured this image approximately 6 minutes after capturing
Figure 18.
Figure 19.
Figure 20 shows the RecoverPoint status in the middle of the restore process. We
captured this image 10 minutes after capturing the previous image.
Figure 20.
Figure 21 shows the status after the PIT recovery. Note that the result of the PIT
recovery is correct and consistent, even though the time zone of the Management GUI
(Figure 20, Windows time) is 8 hours ahead of the RecoverPoint server time and 12
hours ahead of the SAP GUI time. The latest entry was the same sales document, that
is, Sales Doc 3451802.
Figure 21.
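When the components run in different time zones, as described above, timestamps are best compared as absolute instants rather than as wall-clock readings. The sketch below illustrates this with Python's standard library. The absolute UTC offsets and the date are assumptions chosen only to reproduce the 8-hour and 12-hour differences mentioned in the text; the paper does not state the actual zones.

```python
from datetime import datetime, timedelta, timezone

# Assumed offsets: only the differences (8 h and 12 h) come from the text.
GUI_TZ = timezone(timedelta(hours=8))    # Management GUI (Windows time)
RP_TZ = timezone(timedelta(hours=0))     # RecoverPoint server
SAP_TZ = timezone(timedelta(hours=-4))   # SAP GUI

# One instant as displayed by the SAP GUI (date is made up for illustration).
sap_time = datetime(2013, 1, 1, 23, 12, 15, tzinfo=SAP_TZ)
gui_time = sap_time.astimezone(GUI_TZ)
rp_time = sap_time.astimezone(RP_TZ)

# The wall-clock readings differ, but all three describe the same instant.
same_instant = sap_time == gui_time == rp_time
gui_ahead_of_sap = gui_time.replace(tzinfo=None) - sap_time.replace(tzinfo=None)
gui_ahead_of_rp = gui_time.replace(tzinfo=None) - rp_time.replace(tzinfo=None)
```

Normalizing every captured timestamp this way before comparison avoids mistaking a time zone offset for a replication gap.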
Conclusion
Data recovery from a corruption can occur in minutes without the need for a full
database restore or file system recovery.
Conclusion
Summary
Findings
This solution offers effective, affordable, and efficient disaster recovery protection for
critical business functions without wasting previous investments. This strategy
significantly improves business agility and scales out for common SAP production
system and test system needs, providing:
Initiating the failover testing requires only three steps. A fully consistent
bi-directional SAP DR is easily made available for DR testing in less than 30
minutes, while the production systems remain completely undisrupted.
Additional manual steps are not required until completion.
The actual failover requires only four steps to initiate. A fully consistent SAP DR
is available in less than 30 minutes. Manual intervention is not required after
the recovery is initiated. This technique can be used for VNX/VMAX migrations.
Initiation of reprotection requires only four steps. The actual failback requires
another four steps. All DR changes to databases and file systems are reinstated
to the original production site in less than 30 minutes.
Data recovery from a corruption can occur in minutes without the need for a full
database or file system recovery using RecoverPoint point-in-time recovery.
The solution implements control measures to protect against human errors and
technical limitations:
We performed the tests using super user accounts for convenience, but the
solution is capable of role-based access control (RBAC) to manage
configuration and administrative activities in a more granular sense, if needed.
For more information, refer to Site Recovery Manager Administration: vCenter
Site Recovery Manager 5.1.
While the entire failover process is automated, the trigger itself is manual. This
ensures that a failover can be initiated only after proper processes are followed,
executive approvals have been granted, and a state of disaster (or resumption
of normal operations) has been declared.
The failover operation can be triggered with one or two steps. However, we put
confirmation prompts in place to prevent inadvertent executions.
References
White papers
Business Continuity and Disaster Recovery for Oracle 11g Enabled by EMC
Symmetrix VMAX 10K, EMC RecoverPoint, and VMware vCenter Site Recovery
Manager
Product documentation
EMC Solutions Enabler Symmetrix Array Controls CLI Version 7.3 Product Guide
SAP notes
CDP continuously captures and stores data modifications locally, enabling local
recovery from any PIT with no data loss. CDP supports both synchronous and
asynchronous replication. Figure 22 illustrates the CDP workflow.
Figure 22. CDP workflow
2. The splitter splits the write request (duplicates and distributes it) to both the
protected volume and the RPA.
3. (3a) The RPA sends an acknowledgment to the splitter on receiving the write. (3b) At
the same time, the RPA moves the data into the local journal volume, along with
the timestamp and any bookmarks for the write operation.
4. After the write request has been secured in the journal, it is applied to
the target replica volumes. The write order is preserved during distribution.
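The steps above can be sketched as a toy model. It is a simplification under stated assumptions, not RecoverPoint's implementation: dictionaries stand in for volumes, a list stands in for the journal, integers stand in for timestamps, and all names are hypothetical. The `image_at` method shows why a journal that keeps every write with its timestamp allows recovery to an earlier point in time.

```python
class CdpSketch:
    """Toy model of the CDP write path (hypothetical, for illustration only)."""

    def __init__(self):
        self.protected = {}   # production volume
        self.journal = []     # ordered (timestamp, block, data, bookmark) entries
        self.replica = {}     # local replica volume
        self.applied = 0      # number of journal entries already distributed

    def write(self, block, data, timestamp, bookmark=None):
        # Step 2: the splitter sends the write to the protected volume and the RPA.
        self.protected[block] = data
        # Step 3: the RPA journals the write with its timestamp and any bookmark.
        self.journal.append((timestamp, block, data, bookmark))
        return "ack"

    def distribute(self):
        # Step 4: apply journaled writes to the replica in their original order.
        for _, block, data, _ in self.journal[self.applied:]:
            self.replica[block] = data
        self.applied = len(self.journal)

    def image_at(self, timestamp):
        """Rebuild the volume image as of a point in time by replaying the journal."""
        image = {}
        for ts, block, data, _ in self.journal:
            if ts <= timestamp:
                image[block] = data
        return image


cdp = CdpSketch()
cdp.write(7, "old", timestamp=1)
cdp.write(7, "new", timestamp=2, bookmark="before transport")
cdp.distribute()
```

Here the replica ends up holding the latest data, while `cdp.image_at(1)` still yields the earlier contents of block 7.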
Continuous remote replication
Continuous remote replication (CRR) supports synchronous and asynchronous
replication between remote sites over a wide area network (WAN). CRR supports
synchronous replication when the remote sites are connected through Fibre Channel
(FC) and provides zero RPO. Asynchronous replication provides crash-consistent
protection and recovery to specific points-in-time, with minimal RPO.
Figure 23 illustrates the CRR workflow.
Figure 23. CRR workflow
2. The splitter duplicates the write and sends it simultaneously to the protected
volume and to its local RPA.
3.
4. When the package is received on a remote site, the receiving RPA verifies the
checksum first to ensure integrity and then decompresses the data.
5. The remote RPA writes the data to the journal volume on the remote site.
6. The write is then distributed to the target replica volumes through an RPA FC
connection after the data has been stored safely in the journal. The write
order is preserved during distribution.
Concurrent local and remote replication
CLR is a combination of CRR and CDP and provides concurrent local and remote data
replication. Figure 24 illustrates the CLR workflow.
Figure 24. CLR workflow
2. The splitter duplicates the write and sends it simultaneously to the protected
volume and to its local RPA.
3. (3b) The write request is then sent to the local journal and applied to the
local replica volume. This completes the CDP sequence.
4. When the package is received on a remote site, the receiving RPA verifies the
checksum first to ensure integrity, and then decompresses the data.
5. The remote RPA writes the data to the journal volume on the remote site.
6.
CDP is normally used for PIT recovery where business data or application logic has
been severely compromised. On the other hand, CRR is normally used for disaster
recovery. CLR thus implies that two replicas are made for every volume protected
(local and remote) to protect from both scenarios.
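The remote-side handling described in the CRR and CLR workflows (verify the checksum, then decompress, then journal the writes in order) can be sketched as follows. This is an illustration under stated assumptions: CRC-32 and zlib stand in for whatever checksum and compression RecoverPoint actually uses, the length-prefixed package layout is invented for the example, and the function names are hypothetical.

```python
import zlib
from typing import List, Tuple


def pack(writes: List[bytes]) -> Tuple[bytes, int]:
    """Sending side (assumed): frame, compress, and checksum a batch of writes."""
    framed = b"".join(len(w).to_bytes(4, "big") + w for w in writes)
    payload = zlib.compress(framed)
    return payload, zlib.crc32(payload)


def receive(payload: bytes, checksum: int, journal: List[bytes]) -> None:
    """Remote RPA side: verify first, then decompress, then journal in order."""
    # Step 4: reject the package before touching its contents if it is corrupt.
    if zlib.crc32(payload) != checksum:
        raise ValueError("checksum mismatch: package rejected")
    raw = zlib.decompress(payload)
    # Step 5: append each write to the remote journal, preserving write order.
    offset = 0
    while offset < len(raw):
        size = int.from_bytes(raw[offset:offset + 4], "big")
        offset += 4
        journal.append(raw[offset:offset + size])
        offset += size
```

Verifying before decompressing means a corrupted package never reaches the journal, so the replica is only ever built from intact, correctly ordered writes.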
Configuring the RecoverPoint splitter on the storage array requires the following
volumes to be provisioned beforehand.
Step 1: Creating LUNs and replica volumes
Create the following LUNs for RecoverPoint usage:
One repository volume (at least 3 GB) for each RPA cluster.
One replica volume each in the protected and recovery site for every
consistency group.
Three journal volumes (production journal, local journal, and remote journal
volumes) for every site.
Figure 25 shows how we created the gatekeeper volumes using the Unisphere for
VMAX GUI.
Figure 25.
The size of a replica volume must be equal to or greater than the size of the
protected volume. For example, if the protected volume is 500 GB, CDP or CRR
consumes an additional 500 GB and CLR consumes at least 1,000 GB,
excluding the journal volumes. In other words, CLR consumes twice the replica
capacity of CDP or CRR alone.
The journal size depends on the RPO. A longer RPO requires a larger journal.
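The replica sizing rule above reduces to simple arithmetic. The helper below is a minimal sketch; the function and table names are hypothetical, and the result is a lower bound because a replica volume may be larger than the protected volume.

```python
# Number of replica copies kept per protected volume, per the text above.
REPLICAS_PER_MODE = {"CDP": 1, "CRR": 1, "CLR": 2}


def min_replica_capacity_gb(protected_gb: float, mode: str) -> float:
    """Minimum replica capacity in GB, excluding journal volumes."""
    if mode not in REPLICAS_PER_MODE:
        raise ValueError(f"unknown replication mode: {mode!r}")
    return protected_gb * REPLICAS_PER_MODE[mode]
```

For the 500 GB example, `min_replica_capacity_gb(500, "CLR")` gives 1,000 GB, matching the figure quoted in the text.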
SAP systems in the same consistency group fail over together as one
unit. Assigning a separate consistency group to each SAP system is highly
recommended.
The size of the journal volumes should reflect the RPO. You need to determine
the expected peak change in the environment to calculate the minimum size of
the journal volume.
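\[
\text{Minimum journal size} = 1.05 \times \frac{(\text{new data writes per second}) \times (\text{rollback time in seconds})}{1 - 0.2}
\]
Note
Target side log size is 20 percent (0.2) according to the best practice and an
additional 5 percent is allocated (1.05) for internal use.
For example, to calculate the minimum journal size to satisfy a 48-hour rollback
requirement (172,800 seconds) with an estimated 10 Mb new data writes per second:
\[
1.05 \times \frac{10\ \text{Mb/s} \times 172{,}800\ \text{s}}{1 - 0.2} = 2{,}268{,}000\ \text{Mb} \approx 283.5\ \text{GB}
\]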
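The journal sizing calculation can also be expressed as a small helper. This is a sketch that assumes the sizing rule is the change rate multiplied by the rollback window, divided by the fraction of the journal not reserved for the target-side log (1 - 0.2), then scaled by the 5 percent internal allowance (1.05), as the note describes. The function names are hypothetical, and the conversion to gigabytes uses decimal units with 8 bits per byte.

```python
def min_journal_size_megabits(write_rate_mbit_s: float,
                              rollback_seconds: float,
                              target_log_fraction: float = 0.2,
                              internal_overhead: float = 1.05) -> float:
    """Minimum journal size in megabits under the assumed sizing rule."""
    if not 0 <= target_log_fraction < 1:
        raise ValueError("target_log_fraction must be in [0, 1)")
    changed_data = write_rate_mbit_s * rollback_seconds
    return internal_overhead * changed_data / (1 - target_log_fraction)


def megabits_to_gigabytes(megabits: float) -> float:
    """Convert megabits to decimal gigabytes (8 bits per byte, 1,000 MB per GB)."""
    return megabits / 8 / 1000
```

For the 48-hour example, `min_journal_size_megabits(10, 172_800)` evaluates to about 2,268,000 Mb, or roughly 283.5 GB after conversion.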
Figure 26.
Alternatively, the following command can be executed from Solutions Enabler for
every device to be tagged:
symconfigure -sid <Symmetrix ID> -cmd "set dev <device list or
device range> attribute=RCVRPNT_TAG;" commit
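When many devices need tagging, the command line can be generated rather than typed by hand. The helper below is a minimal sketch that only builds the argument list and does not run anything. It assumes the `-sid` flag and a single quoted `-cmd` string as used by the Solutions Enabler CLI; the function name and the sample device range are hypothetical.

```python
from typing import List


def recoverpoint_tag_command(symmetrix_id: str, devices: str) -> List[str]:
    """Build the symconfigure argument list that tags devices for RecoverPoint.

    `devices` is passed through unchanged, so it can be a single device,
    a device list, or a device range in whatever form symconfigure accepts.
    """
    if not symmetrix_id or not devices:
        raise ValueError("symmetrix_id and devices are required")
    command = f"set dev {devices} attribute=RCVRPNT_TAG;"
    return ["symconfigure", "-sid", symmetrix_id, "-cmd", command, "commit"]
```

Returning a list rather than a single string avoids shell quoting problems if the result is later handed to `subprocess.run`, because the whole `set dev ...` statement stays one argument.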
Figure 27.
Figure 28.
It is a best practice to also tag the RecoverPoint gatekeepers, journal volumes, and
repository volumes.
Configuring RecoverPoint
Figure 29.
Step 2: Configuring RecoverPoint access to the splitters on the VMAX 10K and VNX
arrays
You need to configure the VMAX 10K and VNX splitters in RecoverPoint to bridge
the replication. We did this by using the New Splitter wizard in RecoverPoint through
Splitter > Add New Splitter, as shown in Figure 30.
Figure 30.
Figure 31.
Configuring SRM
Which site should be considered as the production site and which should
be considered as the recovery site?
Installation prerequisites
Before installing SRM, ensure that the following prerequisites are met:
Each site must contain a vCenter server with at least one vSphere data center
being managed.
The recovery site must support array-based replication with the protected site.
The recovery site must have access to the same public and private networks as
the protected site, although not necessarily to the same range of
network addresses.
Step 1: Installing SRM virtual machines and the SRA for integration to RPA
Perform the following steps:
1. Install a database server on each site for SRM use. The databases will store
   the recovery plans and inventory information.
2.
3.
4. Install the SRM plug-in on at least one vSphere client. SRM configurations are
   transparent to both sites, but it is highly recommended to install the SRM
   plug-in on both sites.
For detailed steps, refer to VMware vCenter Site Recovery Manager Installation and
Configuration: vCenter Site Recovery Manager 5.1.
Step 2: Establishing connections between protected and recovery sites
Before configuring SRM, change the management mode of each consistency group to
"Group is managed by SRM, RecoverPoint can only monitor". You can change this on
the Policy tab after selecting the consistency group, as shown in Figure 32.
Figure 32.
When this is completed, the two SRM servers must be connected in SRM to establish
reciprocity. You can do this through the Connect to Remote Site wizard, by specifying
the IP/port details and authentication credentials of the other (remote) SRM server,
as shown in Figure 33.
Figure 33.
The new SRM interface contains an easy-to-follow guide tab that explains the steps
throughout the configuration process.
Step 3: Configuring RecoverPoint array managers
After the protected and recovery sites have been paired, configure the RecoverPoint
array managers (through the SRAs installed earlier) for SRM to discover replicated
devices, compute datastore groups, and initiate storage operations. To do this, use
the Add Array Manager wizard, as shown in Figure 34.
Figure 34.
Typically, you need to perform this step only once after the sites are connected.
Additional reconfiguration is not needed unless there is a change in connection
information or credentials.
After this step, all configurations can be done on just one SRM and will automatically
be synchronized to the other SRM server.
Step 4: Configuring protection groups
You can create a protection group after configuring the array managers.
A protection group is used to organize virtual machines that should be failed over
together, such as in this solution where the SAP database, SAP Central Services
instance, and three application instances are failed over as one entire set.
You can create many protection groups, but all virtual machines assigned to a
protection group must belong to the same datastore group (consistency group).
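The constraint that all virtual machines in a protection group share one datastore group can be checked before running the wizard. The function below is a sketch of that check only. It does not call SRM, and the function name and the virtual machine and group names used in the usage note are hypothetical.

```python
from typing import Dict, List


def datastore_group_for(vms: List[str],
                        datastore_group_of: Dict[str, str]) -> str:
    """Return the single datastore group shared by `vms`, or raise.

    `datastore_group_of` maps each virtual machine name to the datastore
    group (consistency group) that holds its files.
    """
    if not vms:
        raise ValueError("a protection group needs at least one virtual machine")
    groups = {datastore_group_of[vm] for vm in vms}
    if len(groups) != 1:
        raise ValueError(
            f"virtual machines span several datastore groups: {sorted(groups)}")
    return groups.pop()
```

For instance, a database VM and its application VMs that all map to a group named `cg-sap1` pass the check, while adding a VM from `cg-sap2` raises an error and signals that a second protection group is needed.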
The Create Protection Group wizard automatically detects the datastores protected by
RecoverPoint and the virtual machines inside them. You need to specify which of the
listed virtual machines to add to a protection group, as shown in Figure 35.
Figure 35.
Figure 36.
In this solution, we created separate protection groups for VM2 and VM3,
because the two systems reside on different datastores and replicate toward
each other's site.
Step 5: Configuring inventory mappings and placeholder virtual machines
As shown in Figure 37, the Resource mappings tab allows you to define which
resources on the recovery site will serve as the counterparts to the ones on the
protected site, such as network, resource pool, and virtual machines. SRM cannot
protect a virtual machine unless it has valid inventory mappings for key virtual
machine resources.
SRM derives the resource assignments of a newly created placeholder from the
inventory mappings established on the protected site.
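How a placeholder's resources follow from inventory mappings can be sketched with plain dictionaries. This is an illustration, not SRM's API: the function name, the resource kinds, and the resource names in the usage note are all hypothetical.

```python
from typing import Dict


def placeholder_resources(vm_resources: Dict[str, str],
                          mappings: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Derive recovery-site resources for a placeholder virtual machine.

    `vm_resources` maps a resource kind (for example "network") to the name
    used on the protected site. `mappings` holds, per kind, the mapping from
    protected-site names to recovery-site names.
    """
    placeholder: Dict[str, str] = {}
    for kind, protected_name in vm_resources.items():
        try:
            placeholder[kind] = mappings[kind][protected_name]
        except KeyError:
            # Without a valid mapping the virtual machine cannot be protected.
            raise LookupError(
                f"no {kind} mapping for {protected_name!r}") from None
    return placeholder
```

A VM on network `prod-net` in pool `sap-pool` would get, say, `dr-net` and `dr-sap-pool` if both mappings exist; if either is missing, the lookup fails, mirroring the rule that a VM needs valid mappings for its key resources.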
EMC recommends that you group virtual machines into folders for easier
management.
Note
Figure 37.
2. By using the dr-ip-customizer utility in the bin folder of the SRM installation
   path, generate the existing IP table and specify the file name and location
   where it will be saved.
   C:\Program Files (x86)\VMware\VMware vCenter Site Recovery Manager\bin>dr-ip-customizer.exe --cfg ..\config\vmware-dr.xml --cmd generate -o <csv filename to generate> --vc <VMware vCenter server hostname>
3. Edit the CSV file and customize the existing IP settings according to the IP
   settings on the recovery site. Figure 38 shows the CSV file opened in a typical
   spreadsheet application. Note that full IP addresses are not shown in this
   figure.
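For a large number of virtual machines, the CSV edit can be scripted instead of done by hand in a spreadsheet. The sketch below moves addresses from a protected-site subnet to a recovery-site subnet while keeping the host part. The column name is supplied by the caller, because the exact layout of the generated file is not shown here; the `IP Address` header and the subnets in the usage note are assumptions, and the function names are hypothetical.

```python
import csv
import io
import ipaddress


def remap_ip(address: str, source_net: str, target_net: str) -> str:
    """Move `address` from source_net to target_net, keeping its host offset."""
    src = ipaddress.ip_network(source_net)
    dst = ipaddress.ip_network(target_net)
    if src.prefixlen != dst.prefixlen:
        raise ValueError("source and target networks must be the same size")
    ip = ipaddress.ip_address(address)
    if ip not in src:
        return address  # leave addresses outside the source subnet untouched
    offset = int(ip) - int(src.network_address)
    return str(dst.network_address + offset)


def remap_csv(text: str, column: str, source_net: str, target_net: str) -> str:
    """Rewrite one address column of CSV text; empty cells are left as-is."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[column]:
            row[column] = remap_ip(row[column], source_net, target_net)
        writer.writerow(row)
    return out.getvalue()
```

For example, `remap_csv(text, "IP Address", "10.1.0.0/24", "10.2.0.0/24")` would turn `10.1.0.25` into `10.2.0.25`. Review the rewritten file before applying it, since gateways, DNS servers, and subnet masks usually need site-specific values as well.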
Figure 38.
4.
Figure 39.
Test: Online simulated recovery; SRM performs all recovery steps inside a test
network so that the test can run without disrupting the virtual machines in the
production environment.
Recovery: The actual migration of virtual machines in the recovery plan to the
recovery site. There are two types that differ only in error tolerance:
Figure 40 shows an SRM recovery plan. The steps shown vary depending on the
function being run.
Figure 40.
Recovery plans are flexible and customizable. You can adjust the
workflow as needed, for example by:
Customizing timeouts to allow time for a step to complete before the next
step starts
Creating message prompts that force the plan to pause until the user
acknowledges them
Adding custom recovery steps for the SRM server or the recovered virtual
machine using scripts