DR SIMPLE Series: Disaster Recovery
By Johnson Koruthu Palamkuzhiyil
You all know about Business Continuity and Disaster Recovery. Disaster recovery seems pretty self-explanatory, but is there any difference between disaster recovery and business continuity planning? Let's explore further. Disaster recovery has been in existence since the 1970s, ever since mainframes have been in use, and many of us understand why it is critical to a business.
A disaster can be defined as (i) a complete outage, (ii) a partial disruption that leaves essential/vital resources incapable of delivering their desired outcome, or (iii) a critical interruption caused by a natural catastrophe, a pandemic outbreak, a human-induced event (including an act of terrorism), or an unnatural event arising from an environmental or structural test that triggered an unpredicted change to that environment or ...
Business Continuity describes the processes and procedures that an organization has laid down to ensure that its essential/vital functions can continue during and after a disaster. Business continuity planning seeks to prevent or minimize interruption of critical services and to re-establish the full functioning of services as quickly as possible.
Hence, disaster recovery planning in Information Technology (IT) is a subset of Business Continuity. It ensures that an organization's essential/vital resources are protected and secured in the event of a disaster, so that the business can swiftly restart operations with minimal impact to data and give those who depend on the application access to all their essential resources.
Although business continuity is important to any enterprise, it may not be practical for any but the largest organizations, by business and revenue, to maintain a fully functioning solution throughout a disaster. The first step in business continuity planning is deciding which of the organization's functions are essential/vital and apportioning the available budget accordingly. Once the crucial components are identified, fail-over mechanisms can be put in place. New technologies, such as global mirroring and backup on the cloud, make it feasible for an organization to maintain up-to-date copies of data in geographically dispersed locations, so that data access can continue uninterrupted even if one location is disabled.
In many cases, businesses are prone to ignoring disaster recovery because a disaster seems an unlikely event with a remote chance of occurrence. Hence, business continuity planning should take a more comprehensive approach to making sure an enterprise can keep its business active, not only after a natural calamity but also in the event of smaller but still damaging disruptions, including the illness or sudden departure of key staff, and other medium-level disruptions that businesses face from time to time.
Therefore, business continuity can also be described as the process of ensuring the continuance of vital functions. A trigger that impacts business continuity can originate from any source. The examples below show how an event that seems simple can cause spiraling effects that put the business at risk:
(i) A key staff member working on a critical project with an aggressive timeline suddenly falls ill, and the project comes to a complete standstill.
(ii) A change performed to upgrade an application or platform fails to meet its desired outcome, and roll-back is not an option or the files needed to restore the previous version are found to be invalid.
(iii) A container loaded on a trailer rolls off the highway and crashes into an electricity supply station, disrupting power to an entire block, including an adjacent data center that needs that electricity to maintain its air-conditioning.
IT Business Continuity planning is the process of identifying those groups or parts of the business that are deemed essential/vital and need to be restored in the shortest available time to ensure minimal impact to the business and its customers. Business Continuity Management (BCM), also referred to as Business Continuity Planning (BCP), for IT services and IT infrastructure is the combination of Work-force Continuity Management (WCM) and Application Continuity Management (ACM).
A business continuity plan should mainly consist of 3 elements: (a) Resilience, (b) Contingency and (c) Recovery. Thus, a business continuity plan should include: a business resilience plan, which specifies an organization's planned strategies during and immediately after a disaster; a business resumption plan, which specifies a means of maintaining essential services at the crisis-hit location; a contingency plan, which specifies a means of dealing with events that can seriously impact the organization; and a disaster recovery plan, which specifies a means of recovering the business functions at an alternate location.
Therefore, an advanced-level BCP in an IT environment must plan for the resilience, contingency and recovery of information technology data, physical assets, facilities and human capital in the event of an unpredicted disruption capable of causing severe impact to the business, leading to loss of revenue or criminal/civil lawsuits. A BCP need not be specific to terrorist incidents or apply to just one major disruption such as a major fire, flooding or power fault; it should contain an overall contingency plan that covers all such incidents.
A seasoned professional with 15 years of total IT service management experience. I started in customer service for an Internet Service Provider (ISP) serving local customers, and thereafter for other organizations and products. After [?] years, I moved to Infrastructure Management (IM) for an organization that provided the infrastructure needed to enable SaaS applications to do their business on the cloud. I moved to IBM from there and I have been in IBM for over 5 years now.
In 2010, I moved to Disaster Recovery, and since then I have continued as their offering program manager for essential/vital applications that are hosted in the US. I've been in client-facing roles where I represent delivery and compliance as the Single Point of Contact (SPOC) for the Business Continuity / Disaster Recovery (BC/DR) program.
This document is based on my experience, learning and observations as a DR program manager, and I certainly expect to hear feedback and comments from experts, readers and anyone who is interested in knowing or sharing more about the subject and helping me to improve.
A disaster recovery solution can be expensive and may not be adequately utilized, in terms of productivity and profitability, when compared to the business gains from a production environment. This is one of the reasons why organizations may not pay enough attention to it. I hope to convince readers how important a disaster recovery program is to their organization. Just as every organization has dedicated departments that head Security, Quality/Compliance and Infrastructure, Disaster Recovery should be able to grow along those lines. In fact, DR solutions can be made far less expensive or better utilized. A simple example is the pay-as-you-go plan over Cloud.
Planning: It is recommended that all essential/vital applications have a Business Continuity Plan (BCP) to minimize the impact to the business in the event of a disaster. A BCP for an IT infrastructure consists of Work Force Continuity (WCM) and Application Continuity, which includes system and application recovery. My current role is to manage and supervise the Disaster Recovery (DR) solution for applications that have chosen a DR plan, and to help customers meet their business continuity goals. To do that, I'm involved through all phases, starting with the solution design phase, which uses a Business Impact Analysis (BIA) to recommend the most suitable and cost-effective DR solution that satisfies the business needs.
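
As a rough illustration of how a Business Impact Analysis can feed the solution design, the Python sketch below ranks applications by estimated loss and separates the essential/vital ones from those that can wait. Everything in it is an assumption made for illustration: the application names, the loss figures, the 4-hour threshold and the `prioritize` helper are hypothetical and do not come from a real BIA.

```python
from dataclasses import dataclass


@dataclass
class Application:
    name: str
    hourly_loss: float      # estimated loss per hour of outage (hypothetical figure)
    tolerable_hours: float  # longest outage the business says it can accept


def prioritize(apps, vital_threshold_hours=4.0):
    """Split applications into essential/vital and deferrable, highest loss first."""
    ranked = sorted(apps, key=lambda a: a.hourly_loss, reverse=True)
    vital = [a.name for a in ranked if a.tolerable_hours <= vital_threshold_hours]
    deferrable = [a.name for a in ranked if a.tolerable_hours > vital_threshold_hours]
    return vital, deferrable


# Hypothetical portfolio used only to exercise the function.
apps = [
    Application("payroll", hourly_loss=2_000, tolerable_hours=48),
    Application("order_entry", hourly_loss=50_000, tolerable_hours=2),
    Application("customer_portal", hourly_loss=12_000, tolerable_hours=4),
]
vital, deferrable = prioritize(apps)
```

With these made-up inputs, `order_entry` and `customer_portal` land in the essential/vital group and `payroll` is deferred. A real BIA would weigh more factors (regulatory exposure, dependencies, data loss tolerance) than this two-field model.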
Implementation: An IT infrastructure BCP can consist of an active-active solution (with a hot site, rapid recovery or highly available data) or an active-passive solution (with a cold site, an off-site location or standby data). In most cases, however, a DR solution requires a parallel infrastructure to meet the business continuity needs. There are several types of DR infrastructure to choose from, depending on the business needs. The DR infrastructure can be a stand-alone dedicated infrastructure, a shared hosted environment or the latest concept, cloud-based disaster recovery. Depending on the criticality to the business, each business unit, or the business as a whole, can choose to include in the DR plan either just the most essential/vital part of the infrastructure or the whole environment. The DR infrastructure can be located within the same unit of the data center (DC) or at least [?]0 miles away from the primary unit (this is ideally recommended for large applications). In some cases, businesses prefer to have their DR infrastructure located in a different Geo, but this involves massive cost for data transportation, and businesses would also need to align to the policies of that region in addition to the policies of their local region. In order to implement the right solution, businesses need to identify the key critical components, resources and applications that drive their business. They also need to consider the reliability, redundancy and 'reachability' (in simple terms, the ability to recover quickly in the event of a disaster) of the DR service provider/site to choose the correct solution. I assist the business in developing a DR plan that best fits its business goals.
Note: Data mirroring is also a solution deployed under BCP where near-continuous availability is needed, and it can be brought up even when a disaster has not occurred.
Validation: A DR exercise plan (DREP) consists of a sequence of activities clearly defined in a time-line, which needs to be executed with precision in order to meet the desired Recovery Time Objective (RTO). Tasks can be assigned to differently skilled resources and may need to be performed either in sequence or in parallel with another set of tasks. Having a well-defined time-line is one-half of the success criteria of a DR test. Hence, a DR time-line is a tabulated document within a DREP that defines each task, its sequence and everyone who needs to be part of that task. Further, the DREP also contains the escalation path for both system and application support personnel, along with details of the solution designed for the business to ensure the application is restored from ground zero within the given RTO. This solution needs to be tested and validated thoroughly and regularly to ensure it will be effective when the business needs it. The DR planners help the business test its DR solution and record the results, along with any findings, which determine how effective the DR solution is. A DR test also helps identify the best approach and calculate the actual time needed to recover, thus setting the window for recovery in the event of a true disaster. Any defects identified during testing are mitigated during or immediately after the DR test, and the DR plan is updated. A DR test can consist of a live full-system test, also known as a complex system test, in a controlled environment (at an off-site DR infrastructure) or in a maintenance window (at the primary location itself), or of a paper test (which is categorized as an offline test). The test involves coordinating between multiple infrastructure and application delivery personnel and ensuring the application is thoroughly tested before end-users are allowed to access it again.
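
The relationship between a time-line of sequential and parallel tasks and the RTO can be sketched in Python. This is only an illustration under stated assumptions: each task is given an estimated duration and a list of tasks that must finish first, and the task names, durations and RTO values are hypothetical rather than taken from a real DREP. The sketch also assumes the time-line contains no circular dependencies.

```python
from functools import lru_cache

# Hypothetical DR time-line: task -> (duration in minutes, predecessor tasks).
# Tasks that share a predecessor but not each other can run in parallel.
TIMELINE = {
    "declare_disaster": (15, []),
    "restore_database": (90, ["declare_disaster"]),
    "restore_app_servers": (60, ["declare_disaster"]),
    "redirect_network": (30, ["declare_disaster"]),
    "validate_application": (
        45,
        ["restore_database", "restore_app_servers", "redirect_network"],
    ),
}


def recovery_minutes(timeline):
    """Return the earliest time (in minutes) at which every task can be finished."""

    @lru_cache(maxsize=None)
    def finish(task):
        duration, predecessors = timeline[task]
        # A task can start only when its slowest predecessor has finished.
        start = max((finish(p) for p in predecessors), default=0)
        return start + duration

    return max(finish(task) for task in timeline)


def meets_rto(timeline, rto_minutes):
    """True when the planned time-line fits inside the Recovery Time Objective."""
    return recovery_minutes(timeline) <= rto_minutes
```

With these made-up figures the longest chain is declare, restore database, validate, so the plan would fit a 4-hour RTO but not a 2-hour one. Durations measured during an actual DR test would replace the estimates.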
Documentation: Having correct and up-to-date documentation is another crucial factor in the success of the recovery. The documentation has to follow a specific format so that it is easily understood by all personnel involved in the recovery. It should be stored where it can be easily accessed in the event of a disaster, yet meet all internal and external IT governing policies to ensure only authorized personnel have access to it. This is also one of the requirements in most regulatory policies, such as SOX or BS 25999 (now ISO 22301). It is equally necessary to ensure that the role, responsibilities and tasks of each person are clearly documented to avoid any confusion during restoration of the system and application. I ensure that the documentation is reviewed and then approved before publishing.
Compliance: Many IT standards and regulatory boards, as well as most internal IT policies, have clearly defined standards, and it is important to meet those guidelines and standards. We ensure that the DRP, the DR exercise plan document and the DR results document meet the IT standard and policy under which the application is governed.
Support/Maintenance: This involves getting all personnel from the system and application side into one large group and scheduling discussions to validate the DREP, DRER and DRP. This is another crucial step in the success of a DR test, as well as in recovering the application in the event of a disaster. Each person needs to be clearly aware of their tasks and when to execute them; otherwise the DR test or the recovery will fail. The important task for the offering manager is to ensure all teams participate in the DR pre-test preparation, which usually starts 3-11 weeks before the scheduled date of testing, and continue through the entire system or paper test phase, which ends with the application coming back online. The resources continue to be engaged for the post-DR review, where the issues identified during the test are recorded in the problem log and an action plan is identified to mitigate each issue permanently. Once the primary infrastructure is operational, data needs to be transported back or migrated from the DR site to the primary site with minimal or no impact.
New Opportunities: While primary applications scale with the business as needed, it is equally important for the DR solution to scale in line with the primary infrastructure. However, the business can choose either to keep a bare DR infrastructure, in which case the upgrade is limited, or to match everything at the primary site, which involves a higher cost that may not be consumed or utilized enough to be justified for small applications. The application may also add new features or programs that become increasingly essential/vital over time, while others may cease to be essential/vital to the business. A proper DR plan needs to consider all these aspects to ensure the right business areas and functionality are covered in its scope. My role is to help the customer assess their business capacity and planning needs, and to propose new solutions or upgrades to their existing solution, which is usually done during the yearly contract renewal discussions.
On Cloud: You may have heard of Disaster Recovery as a Service (DRaaS) on Cloud computing. It can be described as a solution that is made easily available as an 'on-demand service', which helps keep the cost lower than having a dedicated infrastructure for DR and makes it available within a shorter window. However, cloud infrastructure is all about having the right balance between system resources, type of business and location of the primary business unit.
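
The cost argument for an on-demand DR service can be made concrete with some simple arithmetic, sketched here in Python. The pricing model (a fixed monthly standby fee plus an hourly rate while the DR environment is running) and all the figures are assumptions chosen for illustration; actual DRaaS pricing varies by provider and contract.

```python
def dedicated_annual_cost(monthly_cost):
    """Yearly cost of a dedicated DR infrastructure, paid whether it is used or not."""
    return 12 * monthly_cost


def on_demand_annual_cost(monthly_standby_fee, hourly_rate, hours_used):
    """Yearly cost of a pay-as-you-go DR service: standby fee plus metered usage."""
    return 12 * monthly_standby_fee + hourly_rate * hours_used


def break_even_hours(monthly_cost, monthly_standby_fee, hourly_rate):
    """Hours of use per year at which both options cost the same."""
    fixed_difference = dedicated_annual_cost(monthly_cost) - 12 * monthly_standby_fee
    return fixed_difference / hourly_rate


# Hypothetical year: two 48-hour DR tests and no real disaster.
dedicated = dedicated_annual_cost(10_000)
on_demand = on_demand_annual_cost(1_000, 50, hours_used=96)
```

Under these assumed prices the on-demand option stays cheaper until the DR environment runs for roughly a quarter of the year, which is why the balance between resources, type of business and location matters when choosing.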
