
Wipro Storage Practice

Issues and Challenges in BCP/DR Implementation

Wipro Infotech CUSTOMER FIRST


SRIKANT ATTRAVANAM
CONSULTANT – STORAGE SERVICES
Evolution of BCP

                        80s                  90s                 00s
Era                     Traditional          Dot.com             E-business
Business requirements   Restore and recover  High availability   24x7
Driven by               Regulation           Mission critical    Competition
Recovery focus          Hardware             Hardware            (App. driven)
Expectations            Days/hours           Minutes/seconds     Continuous
Decision                Optional                                 Mandatory

Source: http://www.snia.org
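Information Service Availability

[Figure: availability (vertical axis) plotted against investment (horizontal axis). Both rise through the following levels, lowest first:]
• Stable backups
• Disk/volume management (mirroring)
• Local high availability (clustering)
• Replication
• Disaster recovery
• BCP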
Information Service Availability

The time to restore business operations is a function of COST, COMPLEXITY and AVAILABILITY REQUIREMENTS.

[Figure: cost and complexity (vertical axis) against recovery time (horizontal axis: sec/min, min/hours, hours/days).]
• Continuous availability: mirroring, sync. replication
• Rapid recovery: async. replication
• Slow recovery: tape backup

Availability Myths of 9’s

% Uptime    % Downtime    Downtime / Year    Downtime / Week
98          2             7.3 days           3.4 hrs
99          1             3.65 days          1.7 hrs
99.9        0.1           8.76 hrs           10.1 min
99.99       0.01          52.56 min          1 min
99.999      0.001         5.26 min           6 sec
99.9999     0.0001        31.5 sec           0.6 sec

If there are 7 components in the solution, each with 99.99% availability, overall availability comes down to 99.93%.
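
The downtime figures and the 7-component result above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a 365-day year and components in series (all must be up for the service to be up); the function names are illustrative and not from the slides.

```python
# Illustrative helpers; names are assumptions, not part of the original deck.
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(availability_pct: float) -> float:
    """Return the yearly downtime (hours) implied by an uptime percentage."""
    return (100.0 - availability_pct) / 100.0 * HOURS_PER_YEAR


def serial_availability(component_pcts: list[float]) -> float:
    """Return overall availability (%) when every component must be up."""
    overall = 1.0
    for pct in component_pcts:
        overall *= pct / 100.0  # probabilities multiply for a serial chain
    return overall * 100.0


if __name__ == "__main__":
    for pct in (98, 99, 99.9, 99.99, 99.999, 99.9999):
        print(f"{pct}% uptime -> {downtime_hours_per_year(pct):.4f} h/year")
    print(f"7 x 99.99% -> {serial_availability([99.99] * 7):.2f}%")
```

Multiplying availabilities is only valid for independent components in series; redundant (parallel) components would raise the overall figure instead of lowering it.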
Enablers of BCP (Disaster)

Some of the basic enablers required for BCP are protection for:
• HARDWARE
• APPLICATIONS
• NETWORK (IP AND STORAGE)

What does a DISASTER mean for your organization?

Disaster recovery is the invocation of previously developed policies and procedures for data recovery when a threat turns into a REAL EVENT.
Key elements to enable BCP

• Hardware Protection
  Some of the common techniques used today are
  » RAID
  » Clustering
• Data Protection
  Some of the common techniques used today are
  » Backups to secondary media
  » Snapshots
  » Replication
• Disaster Recovery Planning


Disaster Recovery Planning (DR)

• A disaster is any event that disrupts normal business processes.
• A disaster recovery plan is a set of procedures to
  » Avoid or reduce the risk of a disaster
  » Minimize the effects of a disaster
  » Quickly re-establish business critical processes after a disaster
• Disaster recovery planning involves an analysis of business processes and the attendant continuity needs.
Why are DP and DR so important?

• Data is increasingly the lifeblood of today’s enterprise.
• Business liability depends heavily on the safety and accessibility of data.
• Data drives competitive advantage and market share.
• Government and legal regulations mandate data protection and privacy.
• Murphy’s Laws: “Anything that can go wrong will go wrong.” “If anything simply cannot go wrong, it will anyway.”
Source: http://www.drplanning.org/
Before embarking on a DR plan...

It is important to remember that:

• Management buy-in is crucial for success.
• It is better to have an alternate plan than none at all.
  » Look beyond “just backups”
• DR planning is a continuous process.
• Consult with peers in your industry who have implemented or are looking at implementing a DR plan.
• DR for IT infrastructure and services is only one component of an effective business continuity plan.
• No single plan can fit every organization’s goals.
Key to successful DR Implementation

• Pre-Planning, Project Initiation
• Risk and Business Impact Analysis
• Requirements Definition
• Mitigation Plan Development
• Evaluate DR/BCP Solutions
• Solution Implementation
• Testing/Exercising
• Sustenance Program
Pre-Planning, Project Initiation

• Building up the “Management Buy In”
  » Document the impact of an extended loss to operations and key business functions.
  » Illustrate past failures and their adverse impact on the organization and its customer base.
  » Build a case around protecting data in accordance with government regulations or meeting legal requirements that help the company avoid liability.
  » Describe how the plan can be used as an advantage in the marketplace.
      • Reduces the risk of losing existing and new customers due to downtime
      • Reduces client risk
      • Shows the company’s organizational maturity
  » NPV (Net Present Value) is a compelling reason for assessing the value of managing risk.
  » Provide management with a comprehensive understanding of the total effort required to develop and maintain an effective recovery plan.
  » Get an agreement of support from the CIO/CEO.
Pre-Planning, Project Initiation

• DR Planning Awareness Program
  » Evangelize the need for and benefits of a DR plan to all the affected constituents.
• Schedule Interviews
  » Develop an understanding of the company’s business processes. This should cover as many of the business units and locations as possible.
  » Assess the resources available internally.
  » Build support within your organization for the success of the project and its further maintenance.

PROFESSIONAL DR TEAM FORMATION
Risk and Business Impact Analysis

• A Disaster Recovery Plan should prioritize and assess the impact of these (and other) possible events:
  » Computer software or hardware failures
  » Power disruptions or failures
  » Computer shutdowns due to worms, viruses, hackers, etc.
  » Loss of key personnel
  » Natural disasters (floods, storms, earthquakes, fire)
  » Man-made disasters (civil strife, terrorist acts, international war)
• A Business Impact Assessment Report should document the impact of the above events on
  » Every major business operation (Customer Support, Sales, Services, HR, Finance, Marketing, Legal, etc.)
  » Critical applications and systems
  » List of restored functions in order of priority

Define “DISASTER” for every IT asset (h/w, application)


Requirements Definition

• Document existing systems and processes. At a minimum, this should cover
  » Information systems (email, DB, file servers, ERP, etc.)
  » Network and operations services
  » Voice communications
  » Technology support
  » Key business unit priorities and processes
  » Analysis of the BIA report
• Use the following metrics for developing the requirements
  » RPO (Recovery Point Objective)
  » RTO (Recovery Time Objective)
  » RCA (Risk Coverage Allocation)
      • Personnel, capital costs, ongoing costs
  » NPV of risk
• Develop the base project management plan
  » Dependency based

Prioritize all your requirements
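
The "NPV of risk" metric above can be sketched as a standard net-present-value calculation over the expected yearly loss a DR investment avoids. Every number below (investment, outage probability, outage cost, running cost, discount rate) is hypothetical and chosen only to show the arithmetic; none of it comes from the slides.

```python
def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value; cash_flows[0] occurs today (year 0)."""
    return sum(cf / (1.0 + rate) ** year for year, cf in enumerate(cash_flows))


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    upfront_investment = 500_000.0      # cost of building the DR solution
    yearly_running_cost = 100_000.0     # links, site rental, drills
    outage_probability = 0.05           # chance of a major outage per year
    outage_cost = 6_000_000.0           # business loss if it happens unprotected

    expected_loss_avoided = outage_probability * outage_cost
    yearly_net_benefit = expected_loss_avoided - yearly_running_cost

    flows = [-upfront_investment] + [yearly_net_benefit] * 5  # 5-year horizon
    print(f"NPV of risk mitigation at 10%: {npv(0.10, flows):,.0f}")
```

A positive result suggests the expected loss avoided outweighs the cost of the plan over the chosen horizon; the outcome is very sensitive to the assumed outage probability.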


Recovery Metrics

[Figure: two time scales meeting in the middle, each marked Secs, Mins, Hours, Days, Weeks outward from the center. Each side carries a Cost label.]
• Recovery Point Objective (lost data): max data loss you can tolerate
• Recovery Time Objective (time to resume business): max downtime you can tolerate

Define and freeze the SLAs for all applications
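
One way to make the two objectives above concrete is to compare a protection scheme's worst-case data loss and restore time against an application's RPO and RTO. This is a sketch under stated assumptions: the class names, field names and all numbers are hypothetical, and worst-case data loss is taken to be the full interval between recovery points.

```python
from dataclasses import dataclass


@dataclass
class RecoveryObjective:
    rpo_minutes: float  # max data loss the business can tolerate
    rto_minutes: float  # max downtime the business can tolerate


@dataclass
class ProtectionScheme:
    name: str
    copy_interval_minutes: float  # time between recovery points (0 if synchronous)
    restore_minutes: float        # estimated time to resume service


def meets_objective(scheme: ProtectionScheme, objective: RecoveryObjective) -> bool:
    """True if the scheme satisfies both the RPO and the RTO.

    Worst case, a failure strikes just before the next copy, so the data
    lost equals the whole copy interval.
    """
    return (scheme.copy_interval_minutes <= objective.rpo_minutes
            and scheme.restore_minutes <= objective.rto_minutes)


if __name__ == "__main__":
    sla = RecoveryObjective(rpo_minutes=60, rto_minutes=240)  # hypothetical SLA
    schemes = [
        ProtectionScheme("nightly tape backup", 24 * 60, 8 * 60),
        ProtectionScheme("async replication", 15, 120),
        ProtectionScheme("sync replication", 0, 5),
    ]
    for s in schemes:
        print(f"{s.name}: {'meets' if meets_objective(s, sla) else 'misses'} the SLA")
```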
Mitigation Plan Development

• Plan Scope, Objectives and Assumptions
• Assemble Team
  » Define Team Responsibilities and Roles
      • All critical functions should have multiple owners
  » Maintain a Personnel Directory
      • Every team member should have a hard copy
  » Plan Progress Binder
• Develop prevention processes
  » Maintain good general housekeeping
  » Observe physical security procedures
  » Observe information security procedures
• Recovery Preparedness
  » Cold Sites and Hot Sites
  » On Site Spares
  » Remote skills and manpower allocation
  » Latest set of Recovery Documentation
Evaluate DR/BCP Solutions

• Classifying solutions
  » Primary (local) vs. Remote
  » Active vs. Passive
• Classify servers/applications/databases
• Local Solutions
  » High Availability through RAID, Clustering
  » Internal vs. External Storage
  » SAN Attached Storage subsystems
  » RAID across multiple storage subsystems
  » Backup to Tape or Disk
      • Media issues
      • Failure rate of backups
      • Recovery issues
  » Snapshots
  » Extended SAN Solutions (MAN) via DWDM
• Remote Solutions (>120 km)
  » Offsite Storage of Tape
  » Replication
      • Second site
      • Service provider
  » SAN Extensions
      • FCIP
Evaluate DR/BCP Solutions

• Primary (Local) vs. Remote
  » High Availability through local or WAN based clusters
  » Application level clustering (databases, etc.); could be Active-Active
  » Replication solutions
      • Volume, File and App level
  » Distance Limitations
  » Storage based replication or Host based replication
• Active-Passive
  » Standby servers
  » No performance impact on app
      • Distance agnostic
• Infrastructure
  » UPS, Generators
  » Use multiple vendors for WAN connectivity
  » Backup lines

Generate a CBA (cost-benefit analysis) for these solutions and arrive at the best!
Example DP/DR Scenario

[Figure: network diagram. Labels: Branch Office: Second Data Center; 10 TB DAS; WAN (Public/Private); Monitoring; Firewall + Router (two instances); Backup Server; Service Provider: Remote DR Site; Tape Library; Database Servers; App Servers; Cold Site – Secure Offsite Tape Vault; NAS Filer; Redundant SAN Fabric; Storage Arrays; SAN Attached Tape Libraries.]
Network/WAN based considerations

Data replication to remote sites requires additional planning:

• IP Address Space
• DNS Resolution
• Redundant DR Links
  » The distance between sites determines the options
  » Multiple link providers
  » At least one high speed link desirable
• Initial data synchronization strategy

A practical observation: with reasonable latency and 30% network overhead, one can expect to transfer 630 MB/hour on an E1 link.
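
The E1 estimate above can be roughly reproduced from the nominal E1 line rate of 2.048 Mb/s. The sketch below assumes overhead is a flat fraction of the line rate and ignores latency effects; the function name is illustrative. With these assumptions the result is about 645 decimal megabytes (about 615 MiB) per hour, which brackets the slide's figure.

```python
E1_LINE_RATE_MBPS = 2.048  # nominal E1 rate in megabits per second


def effective_mb_per_hour(line_rate_mbps: float, overhead_fraction: float) -> float:
    """Return usable throughput in decimal megabytes per hour.

    overhead_fraction is the share of the line rate consumed by protocol
    overhead (0.30 means 30% of the bandwidth carries no payload).
    """
    payload_bits_per_second = line_rate_mbps * 1e6 * (1.0 - overhead_fraction)
    payload_bytes_per_hour = payload_bits_per_second / 8 * 3600
    return payload_bytes_per_hour / 1e6


if __name__ == "__main__":
    mb = effective_mb_per_hour(E1_LINE_RATE_MBPS, 0.30)
    print(f"E1 with 30% overhead: {mb:.1f} MB/hour")
```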
An Info Tabloid

Moving 10 TB requires:
• 2.25 hours using OC-192 (10 Gb/s)
• 9 hours using OC-48 (2.5 Gb/s)
• 14 hours using “2G” FC (1600 Mb/s)
• 28 hours using “1G” FC (800 Mb/s)
• 35.7 hours using OC-12 (622 Mb/s)
• 6 days using OC-3 (155 Mb/s)

...if the pipe is fully utilized!
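
The figures above follow from dividing the data volume by the link rate. A minimal sketch, assuming decimal units (1 TB = 10^12 bytes), the rounded link rates quoted on the slide, and an illustrative helper name `transfer_hours`; the `utilization` parameter is an added assumption for modelling a pipe that is not fully used.

```python
def transfer_hours(terabytes: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Return hours needed to move `terabytes` over a link of `link_mbps`.

    utilization is the fraction of the link actually available (1.0 = fully
    utilized pipe, as the slide assumes).
    """
    total_bits = terabytes * 1e12 * 8
    usable_bits_per_second = link_mbps * 1e6 * utilization
    return total_bits / usable_bits_per_second / 3600


if __name__ == "__main__":
    links = {
        "OC-192": 10_000,
        "OC-48": 2_500,
        '"2G" FC': 1_600,
        '"1G" FC': 800,
        "OC-12": 622,
        "OC-3": 155,
    }
    for name, mbps in links.items():
        print(f"{name}: {transfer_hours(10, mbps):.1f} hours")
```

Calling `transfer_hours(10, 155, utilization=0.5)` shows how quickly the initial synchronization window grows when only half of an OC-3 is available.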


Solution Implementation

• Detailed Project Planning
• Solution Training and Trials – Test bed
• Scheduling
  » Downtime
  » Vendor/Service provider coordination
  » DR Drills on test bed
• Preventive measures
  » Strict Change Control procedures for every production application
  » Back up data before changing
Testing/Exercising

• Testing Goals and Strategies
  » Define the test purpose and approach
  » Identify the test team
      • Use non-DR team members (INVOLVE END USERS)
  » Match DR requirements
  » Prioritize
      • Most important first vs. least impact first
• Testing Procedures
  » Detailed recovery procedures for every restore and possible disaster
• Initial Test Report
  » Detailed report of the testing plan
• Analyze test results and modify the plans as appropriate
  » Retest!

TEST AND DOCUMENT


Sustenance Program

» Establish a Corporate DR Cycle
    • Should include all the above considerations
    • Periodic DR Drill
» Result evaluation by a review council
    • Include top management
» Document shortcomings and failures
    • Gaps between requirements and objectives met
» Retraining
    • Refreshing the DR team
» Update the recovery documentation periodically
Look Out for …

• Lack of management commitment.
• Under-allocation of resources.
• Lack of periodic testing of the plan.
• Lack of upkeep on the documentation.
• Failing to adapt plans to organizational changes.
  » New systems
  » Different priorities

Never take DR related decisions in a hurry.
Patience and persistence are the most important cornerstones for achieving tough “milestones”.
…Slow and steady can still win the race….
Why Wipro

• Strong Technical Expertise across multiple product platforms
• Large pool of technical resources
• Extensive Experience in Systems Integration
• Well Defined processes in place
• Breadth of Service Offering
• Best-in-class Service Infrastructure
• Strong & enduring relationship with Principals
People

Wipro Accredited Service Eng.: 1200+
Oracle: 40+
Sun certified Engineers: 100+
Cluster (SUN, Veritas): 50
CA certified engineers: 40+
Veritas/Legato: 22
Linux: 25+
iPlanet (Directory & Messaging): 15+
CITRIX: 11
Storage (SUN, HDS & EMC): 18
Project Management: 25+

Cisco Certified Professionals
CCIE: 06
CCNP: 40
CCDP: 10
CCIQ: 10
CCNA: 140+

3COM: 25+
Wireless (Breeze Com and Cisco): 15+
NORTEL (Alteon): 10
Cisco/HPOV/CA-TNG: 50+

Domains and Team Strength
[Table: domains listed are e-Commerce, ERP/SCM/CRM, Data Warehousing, Infrastructure Management, Network Specialists, Client/Server, Legacy, System Software, Telecom/Datacom and ASIC Design; team-strength figures shown are 750+, 150+, 100+, 500+, 140+, 1500+, 1000+ and 300+.]
Wipro Storage Solution Stack

[Figure: solution stack diagram. Groups: Primary Storage Solutions; Secondary Storage Solutions; Storage Networking (FC Switches & HBAs); Value Add Solutions. Block labels: DAS Solutions, NAS Solutions, SAN Solutions, Disk Solutions, Tape Solutions, Backup Solutions, Policy Design, SRM Solutions, Server, OS Solutions, Replication, Migration Solutions, Clustering Solutions, Solution Integration, Application Migration, DR Solution, DR Strategy, Risk Analysis, DR Consulting.]

End - to- end solutions provider

Complete Solution Design and Architecture

Project Management

Best of Breed Technology and Products

Strong Practices and Processes


Wipro System Integration
Partial Customer List

• DELHI METRO
• MTNL
• BSNL
• NIC
• Cognizant Sys
• HDFC BANK
• Goodlass Nerolac
• South Indian Bank Ltd.
• Karnataka Bank Ltd.
WE HAVE DONE IT IN THE PAST...
WE ARE DOING IT AT PRESENT...
and
WE ARE GEARED UP FOR THE FUTURE...

THANK YOU
