
Chapter 9

Business Continuity Planning and Disaster Recovery

BCP and DR (770)


An organization is dependent on resources,
personnel and tasks performed on a daily
basis to be healthy and profitable. Loss or
disruption of these resources can be
detrimental, causing great damage or
even complete destruction of the business.
A business MUST have a plan to deal with
unforeseen events.

BCP and DR (770)


Business Continuity Planning is a broad approach
to ensure that a business can function in the
event of disruption of normal data processing
operations.
Disaster Recovery Planning is a subset of BCP.
The goal of a DRP is to minimize the effects of a
disaster and take necessary steps to ensure that
the resources, personnel and business
processes are able to resume operation in a
timely manner.

Terms for This Chapter


Business Continuity Plan: a document
describing how an organization responds to an
event to ensure critical business functions
continue without unacceptable delay or change.
Business Continuity Planning: planning to help
organizations identify the impacts of potential
data processing and operational disruptions and
data loss, and formulate recovery plans to ensure
the availability of data processing and operational
resources.
(more)

Terms
Business Impact Analysis: the process of analyzing
all business functions within the organization to
determine the impact of a data processing
outage.
Business Resumption Planning: BRP develops
procedures to initiate the recovery of business
operations immediately following an outage or
disaster.
(more)

Terms (pg 665 ISC book)


Contingency Plan: a document providing the
procedures for recovering a major application or
information system network in the event of an
outage or disaster.
Continuity of Operations Plan: a document
describing the procedures and capabilities to
sustain an organization's essential strategic
functions at an alternate site for up to 30 days.
(more)

Terms
Crisis Communications Plan: a document that
outlines the procedures for disseminating status
reports to personnel and the public in the event
of an outage or disaster.
Critical System: the hardware and software
necessary to ensure the viability of a business
unit or organization during an interruption in
normal data processing support.
(more)

Terms
Critical Business Functions: the
business functions and processes that
MUST be restored immediately to ensure
the organization's assets are protected,
goals are met and the organization is in
compliance with any regulations and legal
responsibilities.
(more)

Terms
Cyber Incident Response Plan: strategies to
detect, respond to and limit the consequences of
cyber incidents.
Disaster Recovery Plan: a plan that provides
detailed procedures to facilitate recovery of
capabilities at an alternate site.
Disaster Recovery Planning: the process to
develop and maintain a Disaster Recovery Plan
(more)

Objectives of the BCP (771)


The objectives of BCP are the following
Provide an immediate response to emergency
situations
Protect lives and ensure safety*
Reduce business impact
Resume critical business functions
Reduce confusion during a crisis
Ensure survivability of the business
Get up and running ASAP after a disaster

Business Continuity Planning

BCP Overview (771)


The goal of a BCP is ultimately to help a
company resume its business
functions as soon as possible after a
damaging event. If you think about it, a
BCP is really part of the larger security
program. As such, a BCP should be part of
the security policy*

Steps in BCP (overview) (772)


ISC states 5 Phases in BCP. We will outline them now, and
detail them later.
1. Project Initialization: establish a project team and
obtain management support
2. Conduct BIA: identify time-critical business processes
and determine maximum outages
3. Identify Preventative Controls
4. Recovery Strategy: identify and select the appropriate
recovery alternatives to meet the recovery time
requirements.
(more)

Creating the BCP (overview) (772)


5. Develop the contingency plan: document the
results of the BIA findings and recovery
strategies in a written plan
6. Testing, Awareness, and Training: establish
the processes for testing the recovery
strategies, maintaining the BCP, and ensuring
that those involved are aware and trained in
the recovery strategies.
7. Maintenance: maintain the plan

BCP: Phase 1 (776)


Project Management and Initialization:
In this step
we must solidify management's support, because
without management support, NOTHING will be
successful.
Develop a Continuity Planning Policy
Statement: lays out the scope of the BCP
project, roles and members, and goals.
(more)

BCP: Phase 1 (776)


We then must identify a Business
Continuity Coordinator* (the BCP team
leader)
Establish a BCP team
What types of people/roles should be on the
team? Can anyone think of certain positions
that should make up the team? (pg 776)
Which people will be chosen for the team?
(more)

BCP: Phase 2 (BIA) (778)


Phase 2 of the BCP steps is to conduct a
Business Impact Analysis. In short, this
step is to outline what procedures and
resources the company depends on, how
important each process is and how long
the business can do without each
resource. The formal steps are
covered next.

Phase 2: BIA (overview) (778)


1. Select individuals to interview to determine
what processes* we have to protect
2. Create data gathering techniques to gather
data about these processes
3. Identify the company's critical business
functions/processes
4. Identify the resources these processes depend
on
(more)

Phase 2: BIA (overview) (778)


5. Calculate how long these functions can
survive without these resources
6. Identify vulnerabilities and threats to
these processes
7. Calculate the risk for each business
process
8. Document findings and report them to
management

BCP Phase 2: Step 1 (779)


Select Interviewees

In this step the BCP committee needs to
identify the types of people that will be part
of the BIA gathering sessions.
These people should represent the different
departments that make up the business.
After determining the general roles, we need
to find the actual employees that
fill these roles, so we can interview them.

BCP Phase 2: Step 2 Determine Information Gathering Techniques


In this step the BCP team must create
data gathering techniques to use when
interviewing and gathering other
information to support the BCP objectives
(surveys, questionnaires etc).

BCP Phase 2: Step 3 Identify Critical Business Functions


Based on the information gathered by the
interviews and the data gathering
techniques, we need to now identify which
business processes and functions are
critical for the successful operation of the
business.

BCP Phase 2: Step 4 Analyze Information


Once we know what the important processes
are, we need to determine the
resources* that these processes depend
upon. These resources can be all kinds of
things such as servers, data, people,
buildings etc! (not just IT related things)
Determine the cost, whether qualitative or
quantitative

BCP Phase 2: Step 5 Determine MTD and Prioritization (781)


Now we need to prioritize and calculate the maximum time
we can survive without the business processes identified
in Step 3. This maximum time is called the Maximum
Tolerable Downtime (MTD)*. Some common MTD
classifications follow.
Keep in mind when prioritizing things, we have to use
quantitative and qualitative analysis to determine just
what is critical. For example, loss of some process might
not cause immediate financial loss, but could damage
reputation or competitive advantage, and that damage
could be devastating.
(more)

BCP Phase 2: Step 5 (782)


Here are some common MTD classifications
that you should memorize*
Critical: 1-4 hours
Urgent: 24 hours
Important: 72 hours
Normal: 7 days
Nonessential: 30 days
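
A minimal sketch of how these tiers could be applied when ranking processes. It assumes each figure is an upper bound on the MTD; the function name and the sample processes are hypothetical and only serve as illustration.

```python
# Hypothetical helper: map a process's maximum tolerable downtime (in hours)
# to the classification tiers listed on the slide. Treating each figure as an
# upper bound is an assumption made for this sketch.
MTD_TIERS = [
    (4, "Critical"),            # 1-4 hours
    (24, "Urgent"),             # 24 hours
    (72, "Important"),          # 72 hours
    (7 * 24, "Normal"),         # 7 days
    (30 * 24, "Nonessential"),  # 30 days
]


def classify_mtd(mtd_hours: float) -> str:
    """Return the first tier whose upper bound covers the given MTD."""
    for limit_hours, label in MTD_TIERS:
        if mtd_hours <= limit_hours:
            return label
    return "Nonessential"  # anything longer than 30 days


# Example processes and MTD values are invented for illustration.
processes = {"order entry": 2, "payroll": 60, "archive search": 400}
# Shortest MTD first: these are the processes to recover first.
ranked = sorted(processes.items(), key=lambda item: item[1])
for name, hours in ranked:
    print(f"{name}: MTD {hours}h -> {classify_mtd(hours)}")
```

Sorting by MTD gives the recovery priority order directly: the shorter the tolerable downtime, the earlier the process has to come back.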

BCP Phase 2: Step 6 - Threats


Now we need to identify vulnerabilities and
threats to these processes and the
resources that are required for them.
(remember Risk Management/Risk
Analysis!)
On the next slide we will examine some
example threats.

BCP Phase 2: Step 6


Some examples are:
Equipment malfunction
Hacking
Failure in utilities (power, WAN connections)
Critical personnel becoming unavailable
Vendors going out of business
Data Corruption
Physical Damage (hurricane, earthquake)

BCP Phase 2: Step 7


Determine the probability/risk for each
business function.
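
The slide does not say how the risk is calculated. One common quantitative approach from risk analysis is the annualized loss expectancy: ALE = SLE x ARO, where SLE = asset value x exposure factor. The sketch below assumes that approach; the class name and all figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    """One threat to a business function (all values are illustrative)."""
    name: str
    asset_value: float      # value of the affected function or resource
    exposure_factor: float  # fraction of the value lost per incident (0..1)
    annual_rate: float      # expected incidents per year (ARO)

    def single_loss_expectancy(self) -> float:
        # SLE = asset value x exposure factor
        return self.asset_value * self.exposure_factor

    def annualized_loss_expectancy(self) -> float:
        # ALE = SLE x ARO
        return self.single_loss_expectancy() * self.annual_rate


threats = [
    Threat("power failure", 200_000, 0.10, 2.0),
    Threat("hurricane", 1_000_000, 0.50, 0.05),
]
# Highest expected yearly loss first.
for t in sorted(threats, key=Threat.annualized_loss_expectancy, reverse=True):
    print(f"{t.name}: ALE = {t.annualized_loss_expectancy():,.0f}")
```

A frequent but small loss can outrank a rare catastrophe in yearly terms, which is one reason the qualitative side (reputation, safety) still has to be weighed alongside the numbers.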

BCP Phase 2: Step 8


Once we have done this research, we must
document and provide our findings to
management. Note that at this point we really
have not started creating a Business
Continuity Plan yet; we've just done the
research. Once management reviews the
findings and gives the OK to proceed, we
will actually develop the plan*

BCP Phase 3: Identify Preventative Controls (786)


Pretty straightforward, though a lot of work.
Now that we know what we need to
protect and the threats involved, look at
ways to PREVENT these problems from
occurring, so we never have to worry
about dealing with them. This is really just
doing a Risk Analysis and determining
cost-effective countermeasures.

BCP Phase 4: Recovery Strategies (788)


OK, now we are at the stage where we
actually are developing a PLAN for
business continuity. Before was just initial
research and getting management to give
us the OK to develop a plan.
(more)

BCP Phase 4: Recovery Strategies (787)


A more technical and tangible stage. The idea is
to figure out what the company ACTUALLY
needs to do to be able to recover the necessary
business processes in the event of a
catastrophe.
Determine the most cost-effective* recovery
mechanisms
Formally define the activities and actions that will
be implemented and carried out in response to a
disaster.
These strategies will be based on the 5 main
business considerations listed on the next slide

Phase 4: Recovery Strategies (787)


5 categories
Business Process Recovery
Facility Recovery
Supply and Technology Recovery
User Environment Recovery
Data Recovery
We will go into more detail on each of these
categories coming up.

Business Process Recovery (788)


A Business Process is a set of interrelated steps
linked through specific activities to accomplish a
specific task. For these processes the team
must know the components of the process,
including
Required roles
Required resources
Input and output mechanisms
Workflow steps
Required time for completion
How this process interacts with other processes

Facility Recovery (788)


Facility Recovery is concerned with the ability to
move processing operations to an alternate
facility in case of the failure of the main facility.
We can have multiple methods to deal with this,
including
Subscription services with service bureaus
Reciprocal Agreements
Redundant Sites
Let's look into each of these more

Facility Recovery (791)


Subscription services
A subscription service is a contract with a 3rd party
to provide access to a facility. There is generally
a monthly fee to retain the right to use the facility,
along with a large activation fee and an hourly fee
when actually using the facility. This is obviously
a short-term-only solution. There are 3 types of
subscription services, which we will talk about
more in the next slides
Hot Site
Warm Site
Cold Site

Hot Site (790)


Hot Site: a facility that is fully configured and
ready to operate in a few hours. The only
resources missing from a hot site are the actual
data and the actual employees.
Hardware and software MUST be fully
compatible or it's pointless
- Very Expensive
- Vendor may not have customer specific or proprietary
hardware/software
+ can allow for annual testing
+ ready within hours

Warm Site (790)


A facility that is usually partially configured with some
computing equipment, but not the actual hard core
hardware. I.e. a hot site without the expensive stuff.
Generally can be up in an acceptable time period.
May be better for customers with specific
hardware/software needs, customer will bring computing
hardware with them.
Most widely used model
+ cheaper
+ available for longer timeframe due to reduced costs
+ good if you have your own custom hardware/software
- takes longer to prepare
- actual yearly testing not generally possible

Cold Site (790)


Supplies the basic environment (AC, electrical,
plumbing etc), but NO actual computing
equipment. Can take a while to activate.
+ cheaper
+ available for longer timeframe due to reduced
costs
+ good if you have your own custom
hardware/software
- May take weeks to get activated and ready
- Cannot do yearly tests

Reciprocal Agreement (793)


A Reciprocal Agreement (RA), also called Mutual Aid, is when two
companies agree to help each other out in
the case of an emergency. Ultimately this
is not really practical for most businesses.
Can you guys tell me what the Pros and
Cons of this are? Can you tell me why this
is not really practical?

Redundant Sites (794)


Pretty much these are HOT sites, that are OWNED
by a company (rather than a service bureau).
This also may have live or slightly delayed data
backups and some staff.
- VERY EXPENSIVE (duplicate costs except for
personnel)
+ best solution if fast turnaround time and the ability to
recover all processing aspects are required

Multiple Processing Centers (794)


Another approach is, rather than having only one
center that facilitates a certain business function,
to split the work among multiple active centers
such that there is no single point of failure.
Solid approach
Good scalability for normal business growth
Just make sure that the other centers have more
resources than they individually need in case
they need to take on more work due to the
failure of another center.
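
The spare-capacity requirement above can be checked with a few lines of code. This is only a sketch under simple assumptions: capacity and load are expressed in the same arbitrary work units, work can be redistributed freely among surviving centers, and the center names and numbers are made up.

```python
from typing import Dict, Tuple


def survives_single_failure(centers: Dict[str, Tuple[float, float]]) -> bool:
    """centers maps name -> (capacity, current_load) in the same work units.

    Returns True if, for every possible single-center failure, the spare
    capacity of the remaining centers can absorb the failed center's load.
    """
    for failed, (_, failed_load) in centers.items():
        spare = sum(cap - load
                    for name, (cap, load) in centers.items()
                    if name != failed)
        if spare < failed_load:
            return False
    return True


# Invented numbers: each center runs well below its capacity.
balanced = {"east": (100, 50), "west": (100, 50), "south": (100, 50)}
# Here every center is nearly full, so neither can take over for the other.
tight = {"east": (100, 90), "west": (100, 90)}

print(survives_single_failure(balanced))  # True
print(survives_single_failure(tight))     # False
```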

Supply and Technology Recovery (795)


OK, so we have plans to recover our facilities and
our main processing requirements. But what
about the lower-level things?
Hardware Backups
Software Backups
Documentation
Human Resources
These need to be taken into
consideration too. We will briefly talk about them
in the next few slides

Hardware backups (796)


OK, so we have a space to process in, but unless we
have a hot site or redundant site, if our
building is destroyed where do we get the
servers from? What about the desktops that our
staff need? Do we have vendors to provide
these? How long will it take to get new equipment
from them? What happens if we have legacy
equipment? What do we do?
We need to take all of these questions into
consideration when planning.

Software Backups (797)


Like the hardware backups, but specifically
about software. How do we get copies of
the software, and how do we roll out installs?
What about licensing?
What about custom software that we had
created that we cannot just go out and buy
at the store?
Software escrow: what is this? Anyone?

Documentation (798)
OK, so we have the equipment and software. How
do we get it all rolled out and configured so
that it matches what the company had before?
Incorrect configurations COULD cause
compromises in integrity or confidentiality!
(how?)
Do we even know how our old network was configured?
Can we reproduce it?
An important concept for BCP that should be in
company policy is that all documentation should
be kept up to date and properly protected

Human Resources (799)


What happens if our backup facility is 250
miles away? How do we get people there?
What happens if the disaster was a natural
catastrophe and some important
employees are injured or worse? What do
we do now?
Executive Succession Planning: what is
this?

End User Environment (800)


How do we notify the users about a disaster and
the change of operating procedure?
Once there, we need to have people
on the ground directing issues pertaining to
employees. These people should be easily
identified.
We also need to be concerned with how to manage
other tasks that we might not have the resources
to do in the traditional manner (for example,
automated data processing or normal
communication methods). How do we handle
that? The BCP team needs to consider these
types of issues.

Data Backups (801)


How do we ensure we have data to load
back into our new offsite systems? Data
changes constantly. We need a solution
that makes sense and is cost effective
(this will vary business to business).
We will talk about traditional backup types
as well as electronic vaulting on the next
few slides.

Traditional Backups (802)


Traditional backups have some method of backing
up files to a removable medium. The first thing
to understand about backups is the archive bit.
Every time a file is altered, the archive bit is set
to notify the system that the file may need to be
backed up. Now let's talk about the 3 backup
types
Full
Differential
Incremental

Full Backup (802)


Simply put,
backup every file on the system!
Then clear the archive bit of each file
This must be done to some degree of regularity,
depending on the business needs.
+ everything gets backed up
+ if you do a full backup every day, you can restore
with only 1 restore operation
- Takes a long time, can be expensive to complete in
a timely manner

Differential (802)
Back up any file that has changed since the last full backup.
Steps are
Find any file where the archive bit is set
Back up the file
DO NOT clear the archive bit

This allows you to quickly restore data in the event of a
disaster in 2 operations. Simply
1. Restore the last full backup
2. Restore the last differential backup
(more)

Differential Pros/cons (802)


Pros
Faster than a full backup
Can do a full restore with 2 operations: restore
the last full backup, then restore the last differential
backup
Cons
Does not have all data on any tape, you still
need a full backup to do a complete restore
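
To make the archive-bit behaviour concrete, here is a small in-memory simulation of a full backup followed by two differentials. It is a sketch only: the dictionaries stand in for a file system and tapes, and all function and file names are hypothetical.

```python
def full_backup(files):
    """Copy every file, then clear every archive bit."""
    copied = {name: f["data"] for name, f in files.items()}
    for f in files.values():
        f["archive"] = False
    return copied


def differential_backup(files):
    """Copy files whose archive bit is set; leave the bit untouched."""
    return {name: f["data"] for name, f in files.items() if f["archive"]}


def write(files, name, data):
    """Changing a file sets its archive bit."""
    files[name] = {"data": data, "archive": True}


files = {}
write(files, "a.txt", "a1")
write(files, "b.txt", "b1")
sunday_full = full_backup(files)

write(files, "a.txt", "a2")
monday_diff = differential_backup(files)   # contains a.txt only
write(files, "b.txt", "b2")
tuesday_diff = differential_backup(files)  # contains a.txt and b.txt

# Restore in 2 operations: last full backup, then last differential.
restored = {**sunday_full, **tuesday_diff}
print(restored)
```

Because the bit is never cleared, each differential grows until the next full backup, which is why only the most recent one is needed for a restore.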

Incremental (802)
The idea is to back up any file that has
changed since the last full backup OR
the last incremental backup, whichever is more recent. Steps are
Find any file with the archive bit set
Back up that file
Clear the archive bit
(more)

Incremental Pros/Cons (802)


Pros
Fast to backup nightly
Cons
To restore requires many operations: restore the last
full backup, then restore every incremental backup
done since that full backup, in order. (restores are
slow)
If you lose any of the tapes (full or incremental)
you cannot truly restore all data.
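
The restore chain and the effect of a lost tape can be illustrated with a small in-memory simulation. As before this is a sketch, not a real backup tool: dictionaries stand in for the file system and tapes, and every name is hypothetical.

```python
def write(files, name, data):
    """Changing a file sets its archive bit."""
    files[name] = {"data": data, "archive": True}


def full_backup(files):
    """Copy every file, then clear every archive bit."""
    copied = {name: f["data"] for name, f in files.items()}
    for f in files.values():
        f["archive"] = False
    return copied


def incremental_backup(files):
    """Copy files whose archive bit is set, then clear the bit."""
    copied = {}
    for name, f in files.items():
        if f["archive"]:
            copied[name] = f["data"]
            f["archive"] = False
    return copied


def restore(full, incrementals):
    """Apply the full backup, then each incremental in the order taken."""
    state = dict(full)
    for tape in incrementals:
        state.update(tape)
    return state


files = {}
write(files, "a.txt", "a1")
write(files, "b.txt", "b1")
sunday_full = full_backup(files)
write(files, "a.txt", "a2")
monday_inc = incremental_backup(files)   # a.txt only
write(files, "b.txt", "b2")
tuesday_inc = incremental_backup(files)  # b.txt only

complete = restore(sunday_full, [monday_inc, tuesday_inc])
monday_tape_lost = restore(sunday_full, [tuesday_inc])
print(complete)          # both files at their latest version
print(monday_tape_lost)  # a.txt falls back to the stale Sunday copy
```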

Which backup is right for you


It depends on your needs.
Personally I believe in the following strategy
If you can do a full backup every night, do
so
If you cannot, then move to differential
If you cannot handle differentials, move to
incremental
REMEMBER, for all these to work you still
need a full backup periodically.*

Discussion of backups
Can you mix differential and incremental backups?
(Why or Why not?)
All backups should be stored both onsite and
offsite (why)
When storing offsite, would the next building over
be appropriate?
There should be a clear written process on how to
restore files (why)
Someone should periodically test the backups by
performing restores to a test system (why)

Discussion of Backups
In what situations would a full backup be
appropriate?
In what situations would a differential backup
be appropriate?
In what situations would an incremental
backup be appropriate?

Discussion of Backups
When choosing an offsite storage facility
think of the following
How fast can I get access to my data?
What are the hours of the facility?
What are the access control protections
the facility provides? (why do I care?)
Are there fire suppression systems?
Are there environmental controls?

Non-Backup Terms That Should Be Mentioned (804)


Disk mirroring / shadowing: copying data to
one or more hard drives such that a
system has multiple copies of data in
case of a drive failure
Disk duplexing: same as shadowing, but
using multiple disk controllers. (why?)

Electronic Vaulting (804)


Electronic Vaulting* is the idea of sending all
changes to a file to a remote site (using
non-backup methods). This usually is not
done real-time but in batches.
(example bank transactions might be copied
daily to another office)

Remote Journaling (805)


Remote journaling (RJ) is another method of transmitting data to an
offsite facility. However it is different from electronic vaulting.
It is done in real-time (What do I mean by that?)
Entire files are not copied, only changes (deltas)
to files. (also called transaction logs)
From the base files and the records of changes
you can recreate the current environment.
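
Recreating the current environment from base files plus a journal amounts to replaying the recorded changes in order. The sketch below assumes a very simple journal format of (operation, key, value) tuples; that format, the function name and the account data are all invented for illustration.

```python
def replay(base, journal):
    """Rebuild the current state by applying journal entries in order."""
    state = dict(base)  # leave the base copy untouched
    for op, key, value in journal:
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
        else:
            raise ValueError(f"unknown journal operation: {op}")
    return state


# Base copy held at the offsite facility.
base = {"acct-1": 100, "acct-2": 250}
# Deltas shipped as they happened (only the changes, not whole files).
journal = [
    ("set", "acct-1", 80),
    ("set", "acct-3", 20),
    ("delete", "acct-2", None),
]

current = replay(base, journal)
print(current)
```

Order matters: replaying the same entries in a different sequence can produce a different end state, so the journal has to preserve the original sequence.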

Tape Vaulting (806)


A type of backup, however rather than
backing up to a local device you back up
to a remote device.

Phase 4: Restoration Strategies (809)


Now that we have covered recovery strategies, we
need to look at a couple of recovery
concepts that we will need to understand
in the planning stage.

Phase 4: Restoration (809)


When planning we must also recognize that there
are 3 different teams in DR.
Damage Assessment team: assesses the damage.
Restoration team: responsible for getting the
alternate site into a working, functional
environment
Salvage team: responsible for starting the
process of recovering the original site and
moving back from the backup site. (cannot stay in the
backup site forever ;)
Let's look at these in the next slides

Phase 4: Recovery (809)


Damage Assessment
Determine cause of disaster
Determine potential for further damage
Identify affected business functions and assets
Identify resources that must be replaced
immediately
Estimate how long it will take to bring critical
functions online
Determine whether the BCP should be put into
operation

Phase 4: Recovery (809)


Restoration Team should be responsible
for getting the alternate site into a working
and functioning environment

Phase 4: Recovery (809)


Salvage Team: responsible for starting the
recovery of the original site.
When moving things back to the original
site, the most critical functions should be
moved LAST* (why?)
The least critical functions should be
moved first.
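
The move-back order is simply the reverse of the recovery priority. A minimal sketch, assuming each function carries a numeric priority where 1 means most critical; the function names and priorities are made up.

```python
def move_back_order(functions):
    """Return function names ordered least critical first.

    functions is a list of (name, priority) pairs, where priority 1 is the
    most critical. The least critical functions go back to the original
    site first, the most critical last.
    """
    ordered = sorted(functions, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ordered]


functions = [("payroll", 3), ("order processing", 1), ("intranet wiki", 5)]
print(move_back_order(functions))
```

One commonly given rationale is that the rebuilt original site is still unproven, so low-impact functions exercise it first while the critical ones keep running safely at the alternate site.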

End of Phase 4: Recovery

Phase 5: Plan Design and Development (814)


Now we need to actually come up with goals and
a plan for attaining these goals. These goals
must contain certain key information.
Responsibility: who are the individuals
responsible for what? What is expected of them,
and how will they be trained?
Authority: in times of crisis, who is in charge?
Priorities: what are the critical processes,
what are the priorities?
Implementation and Testing: how will we
implement our plans, and how will we test them?
(more)

Phase 5: Plan Design and Development (814)


Strategies
Copies of the plan need to be kept in one
or more locations. (why)
Plans must be in paper and electronic
format
Call trees should be implemented
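
A call tree can be modelled as a mapping from each person to the people they are responsible for calling. The sketch below walks such a tree level by level to show the order in which staff would be reached; the roles and the structure are hypothetical.

```python
from collections import deque


def notification_order(call_tree, root):
    """Return people in the order they are reached, level by level."""
    order = []
    seen = {root}
    queue = deque([root])
    while queue:
        person = queue.popleft()
        order.append(person)
        # Each person calls their own contacts once they have been notified.
        for contact in call_tree.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return order


# Invented example: the coordinator calls team leads, who call their staff.
call_tree = {
    "coordinator": ["it lead", "hr lead"],
    "it lead": ["sysadmin", "dba"],
    "hr lead": ["recruiter"],
}
print(notification_order(call_tree, "coordinator"))
```

The `seen` set keeps anyone listed under two callers from being counted twice, which also makes it easy to spot people who appear nowhere in the tree by comparing the result with the staff list.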

BCP: Phase 6 Testing (816)


OK so we have this great plan that we've spent millions of
hours and dollars creating. But does it work, or will it
sink and completely fail? Well, we should try testing it.
Testing it also allows us to see where the plan can be
improved, or if new changes in environment will require
the plan to be updated (what company doesn't change
and grow?)
Testing should be carried out at LEAST once a year.*
Any problems that occurred should be documented and
reported to management.*
So what are some testing methods?... Next slide

Checklist Test (818)


BCP is distributed to departments and
functional areas for review. The Managers
read over and indicate if anything is
missing or should be modified. (Manager
checks off that the plan is OK for their
department)

Structured Walk-Through (818)


Representatives from each department
come together AS A GROUP, they walk
through the plan and different scenarios
from beginning to end to make sure
nothing is left out.

Simulation Test (819)


A specific scenario is proposed. All required
employees come together, act as if
the event has happened and
start taking action to recover. The idea is
to see if any problems come up or if any
concerns were left out.

Parallel Test (819)


Some systems are moved to the alternate
site and processing takes place. The
results are compared to the real
processing to see if anything needs to
change.

Full Interruption test (819)


Most intrusive test. The original site is actually shut down
and processing is moved to the alternate site (really
needs to be a hot site). The recovery team fulfils its
obligation in preparing the systems and environment for
the alternate site.
This is a full blown drill
Requires tons of planning and co-ordination
These are risky and can cause damage if not managed
properly.
Senior management approval is required due to the risk
involved.*

Maintaining the Plan (819)


Now that we have the plan we need to maintain it!
Systems and processes become out of date and
need constant refresh. Why?
The BCP may not be integrated into the change
management process (it should be though!)
Infrastructure or environment changes (that
never changes)
Company re-organization, layoffs etc
Changes in hardware or software
Employee turnover
(more)

Maintaining the Plan (819)


We can help keep the plan updated by taking the
following actions
Make BCP planning part of every business
decision!
Insert BCP maintenance responsibilities into job
descriptions
Include maintenance in personnel evaluations
Perform internal audits that include DR and BCP
procedures
Test the plan yearly
