Foundation
User Guide
Printed and Published by:
Key Skills
George House
Princes Court
Beam Heath Way
Nantwich
Cheshire
CW5 6GD
The company has endeavoured to make sure that the information contained within this User Guide is correct at
the time of its release.
The information given here must not be taken as forming part of or establishing any contractual or other
commitment by Key Skills and no warranty or representation concerning the information is given.
All rights reserved. This publication, in part or whole, may not be reproduced, stored in a retrieval system, or
transmitted in any form or by any means – electronic, electrostatic, magnetic disc or tape, optical disk,
photocopying, recording or otherwise without the express written permission of the publishers, Key Skills.
Contents
Page
Foreword 4
Section 1
Hardware/Software Pre-requisites 5
Section 2
Installation Procedure 6
Section 3
Acronyms 84
Glossary of Terms 90
Foreword
Projects are essentially about change – and because managing change is an increasingly significant
fact of business life – project management is an essential key skill in today’s working environment.
Many people are involved in project work, either directly or in a supporting role, and yet they have
never received formal training in the basic techniques which can make the difference between a
successful project and an expensive failure.
For exactly the same reason, the introduction of computer-based project management tools can
lead to disappointing results. Training someone to simply operate a computerised project planning
tool does not make them a project manager – any more than teaching them to use a calculator
would make them an accountant!
Key Skills in Project Management (Fundamentals) is the first step in bridging this skills gap. For
many people it is all the training they will need to enable them to operate more effectively in a
project environment and to make more effective use of their planning software. Professional
project managers will find this course lays a solid foundation on which the other modules in the Key
Skills PM Portfolio will build to provide a career-enhancing programme of learning and
development.
Section 1: Hardware/Software Pre-requisites
For optimum performance, you should operate this multimedia course on a computer with the
following minimum specification:
Section 2: Installation Procedure
START.EXE will run the course directly from your CD-ROM drive and no runtime files will be copied
to your hard disk drive.
The first time you run the course you will be required to register it with Key Skills. Please follow the
accompanying registration instructions carefully.
Subject to bandwidth and licensing terms, this multimedia training course can be installed and
operated over a local area network or a corporate Intranet.
There are a number of ways in which installation and operation can be effected and you should
contact Key Skills Technical Support Section for advice.
Section 3: Operating the Software
To start the course, double click on the course icon and the program will commence, with music
and introductory title screen.
Once you have passed the title screen and copyright notices you will be asked to identify yourself
to the system.
If you are new to the course you must enter your name/identification and then confirm this to the
system. If you have used the course previously be sure to use the same name, otherwise your
bookmarks within the course will be invalid.
Once sign on is completed you will be presented with the main menu which looks like this:
Each topic is represented by one of the “panes” on the menu screen, for example:
Each of these lesson panes is divided into two distinct areas. If you click on the lesson title text
then you will be taken to the start of the corresponding lesson.
The left side of the pane is the bookmark area and a pink bar will appear in this area to show
whether you have partly or fully completed the corresponding lesson. By clicking in the bookmark
area you will be taken to your last point of study within the corresponding lesson.
Note: The bookmarking system is switched off as soon as you move around the course using
either the Index or the Contents buttons at the bottom of each page.
Throughout the course, the main user controls are located at the bottom of the screen, and their
functions are as shown below:
Newcomers to the course will gain most benefit from starting at the beginning of the first lesson
and working their way through, sequentially, to the end. However, the package is also a valuable
source of reference and it is possible to re-visit specific lessons, or parts of a lesson, at any time.
The Contents and Index facilities are particularly useful for browsing in this way.
Section 4: Course Notes
Lesson 1a Introduction
This transition point is also known as the implementation point, and it can vary depending on organisational structure and policy. For example, a development team might retain project responsibility until the end of a warranty period, at the end of which they hand over the completed project, and associated ownership, to service management staff.
ITIL defines a major process to handle the complex relationships which affect projects, and this is known as Application Management.
Application Management considers the whole ‘cradle to grave’ lifecycle of an application, considering issues from feasibility through productive life and final retirement of the application. It considers applications as ‘strategic resources’ that need to be managed throughout their life, understanding the implications that decisions made at one stage have on later stages.
And finally, Peopleware: this includes skill sets, details of training products, documentation of both products and services, working practices and general procedures.
To deliver effective services to business, all three infrastructure components should be managed and controlled efficiently.
The management of Hardware and Software is dealt with in a separate ITIL guidance volume called ‘ICT Infrastructure Management’.
Our focus in this course is the management of ‘Peopleware’, its documents and procedures, and how it relates to Service Support and Service Delivery.
What does ITIL regard as a Service?
We all encounter business services in our everyday lives. When placing an order for goods or services, for example, or when checking into a hotel, we are being offered a business service.
In most cases businesses are underpinned by IT services. The IT service consists of a set of related functions provided by IT systems in support of the business, and is seen by the customer as a coherent and self-contained entity.
A key phrase in the definition of IT services is ‘end to end’. Broadly speaking ‘end to end’ means that we deal with all aspects of the service, its documentation, its support, the application software, its networks, hardware and so on. Obvious examples of IT services might include e-mail, payroll and order processing. However, there are other less obvious IT services, and these could include a wide area network or a UNIX server, or a customer database forming part of a service support IT system.
The ITSMF’s ‘little ITIL’ book defines Service as:
‘An integrated composite that consists of a number of components, such as management processes, hardware, software, facilities and people, that provides a capability to satisfy a stated management need or objective.’
The core ITIL processes are made up of eleven disciplines.
Five of these disciplines relate to service delivery.
These are:
• Service Level Management
• IT Financial Management
• Availability Management
• Capacity Management
• IT Service Continuity Management
Day to day Service Delivery functions might consist of technical support, and pro-active long-term planning of services.
The remaining six disciplines make up the Service Support function.
These are:
• Service Desk
• Incident Management
• Problem Management
• Change Management
• Release Management
• Configuration Management
All six disciplines relate to the day-to-day maintenance of a quality service.
Ten of the eleven disciplines support the Process Management discipline, with the exception of one, and that’s the Service Desk function.
Service Desk is seen as a function. Every organisation will have this function in place, operating a Service Desk, employing service desk staff and managed by a service desk manager.
For example, Service Level Management deals with the provision of high quality services, provided at the right cost levels. Consequently it interacts frequently with IT Financial Management.
Interaction between other functional departments might be less frequent. For example, Capacity Management and IT Service Continuity Management might work together to develop a cost effective and workable strategy to handle a major disaster, such as a flood.
In this scenario, information on available capacity at a remote site or location would be provided by Capacity Management.
The pre-determined level of support required for on-going business function would be managed by IT Service Continuity Management.
These 11 disciplines and the relationship between them form the basis of this course, and are the subject of the ISEB and EXIN examinations, leading to certificates in Foundation IT Service Management.
We have discussed how ITIL’s flexibility allows easy integration into a recognised quality system, such as ISO9000.
We looked at the relationship between service management and the business organisation, and how ITIL defines Application Management as a major process designed to handle these complex relationships.
We looked at the ICT infrastructure and its three constituent components, Hardware, Software and Peopleware. We highlighted Peopleware, its documents and procedures as a primary focus of this course.
We defined ‘What a service is’ in IT terms, and examined some less obvious examples of ‘IT services’.
And finally we looked at the eleven disciplines which form the core ITIL processes, and the interactivity which exists between them within IT Service Management.
Lesson 2a Service Desk
Why Have A Service Desk?
The establishment and operation of an effective Service Desk is a relatively expensive proposition. So it is important to understand why such a facility might be needed and the benefits that it should provide.
The principle of a “single point of contact” that we have already mentioned is considered an essential element of ITIL Best Practice.
The users of our IT services and their managers are customers in every sense of the word.
Like all customers they would quickly become frustrated and unhappy if they were unable to find somebody who could help them when they had problems with the systems on which they depend.
So customer satisfaction and retention can also be listed as an important benefit.
Another guiding principle of ITIL is that IT should maintain a focus on the support of business goals. IT does not exist just to provide ICT components or technology just for the sheer joy of playing with new equipment.
It is there to help the organisation achieve its business objectives. A well-staffed and efficient Service Desk is a critical element in proving to the business that IT is listening and responding to their needs.
An efficient Service Desk can help to reduce the overall cost of ownership of the IT department, and it can do this in a number of ways.
The alternative to a Service Desk is for each group of users to have their own “super-user”, to whom they can turn when things go wrong.
ITIL strongly suggests that IT costs can be reduced by not requiring high levels of IT skills within the business community, and by making it obvious to all how support can be achieved very quickly via a centralised Service Desk.
Making better use of skilled and expensive IT staff can also reduce costs. Straightforward issues can be resolved immediately by the Service Desk, leaving skilled network technicians or database experts, for example, to concentrate only on the complex problems or concentrate on improving the quality of the infrastructure.
It will usually be the case that the users or customers are performing a valuable function for the organisation. So, any time that they are unable to operate at full efficiency as a result of a problem with the IT Services that they use will be both disruptive and costly. An effective Service Desk will significantly reduce the likelihood of such problems.
A further consequence of this will be that the IT users will in turn be able to offer a better level of service to the external customers of the business.
This factor becomes even more crucial in an e-business context where the lack of service will directly impact on end-customers and certainly lead to loss of business.
Finally, another major benefit that a Service Desk brings is its contribution to the principle of continuous improvement of the services offered by IT.
The Service Desk will keep records of types of enquiry, the issues that are raised, the particular services, or aspects of a service, that seem to cause most problems and so on.
Identifying the most commonly occurring problems and feeding this information back quickly to the IT Service Management structure is a critical aspect of the Service Desk.
In this way, the Service Desk is the thermometer by which we can monitor the health of the IT services that are being provided.
Additionally, the service desk can also operate as a “shop window” – adding value to the business by making users aware of facilities that they may not know exist – or how to make better use, in a business sense, of the facilities that they are already using.
Points of Contact
There is often some confusion about the terms “user” and “customer” – so far in this course we have used the words interchangeably and for many people they mean pretty much the same thing.
ITIL, however, does draw a distinction between the two terms.
A User – or End-User – is taken to mean the person who actually uses the product or service under discussion. A machine operator, for example.
A Customer is the person who negotiates for the provision of the product or service, what the specification should be, any changes that may be needed and possibly the payment arrangements.
It may well be that the User and the Customer are the same person. But in many cases, for operational systems, they will be different groups of people, with customers normally being managers, and users being the operators.
These definitions are relevant here because whilst the Service Desk is the main point of contact between the User and the IT service provider, the Service Level Management process is the main point of contact between the paying customer and the provider.
In both cases the key point of reference is the IT Service itself – as defined in the Service Level Agreements – which will contain statements about hours of availability, time to resolve issues, response times and so on.
The importance of this to the Service Desk is that they must be aware of what Service Level Agreements are in place and how these match up with the questions, complaints and issues that may be being raised by users.
It may well be, for example, that a user calls in complaining of a 2 second response time – when in fact the Service Level Agreement specifies that 95% of responses should be within 4 seconds.
Such an incident would be given a much lower priority than had the figures been reversed.
So, the general point is that Service Level Agreements provide the link between the Customer, User and Service Level Management relationships and that the Service Desk has a responsibility to act on behalf of the User within the IT infrastructure.
Service Desk as a Single Point
So staff within such an organisation could call the Service Desk if the lift broke down, or a light bulb in their area failed, or if they had a query on their pension arrangements.
This kind of Service Desk has the disadvantage of demanding a very wide range of skills to be available – which normally implies a referral system being used – which in turn reduces the chances of a problem being resolved directly and immediately at the desk.
Here, we’ll assume a Service Desk is the single point of contact for just Information and Communications Technology issues.
So, as the single contact point, the first duty of the Service Desk is to act as the IT user’s “friend” within the IT department.
This particularly relates to the role of the Service Desk in:
• Monitoring progress on incidents and queries.
• Reporting this progress back to the user.
• Chasing any experts that have been assigned responsibility for resolving an issue.
• Keeping an eye on any Service Level Agreements that may specify maximum acceptable response times for resolving user issues.
As the user’s friend, the Service Desk has the responsibility of communicating with the user, both reactively and proactively.
‘Reactively’ being in response to issues, problems and queries raised by the users and ‘proactively’ being where the Service Desk goes out to make users aware of issues that might affect them.
It is not uncommon, for example, for the Service Desk to publish regular electronic newsletters to the user community informing them of new facilities, changes to services and so on.
• Appropriate technology, such as automatic call distribution equipment and knowledge-based systems that assist in identifying solutions to problems.
• Enough technical competence to address users’ problems directly or to interface with technical experts if necessary.
In addition, the Service Desk must have all the necessary linkages with other ITIL disciplines.
For example, there must be continuous communication with the Problem Management process – particularly when a major problem has cropped up.
There will need to be liaison with Service Level Management so that potential breaches of Service Level Agreements can be recognised.
Configuration Management records will need to be readily accessible so that, for example, a caller’s IT equipment can be easily identified.
Conversely, the Availability Management process will be keen to look at Service Desk records of incidents for conducting their own analyses and as part of their role in improving service availability.
Service Desk Structure
A debate that always occurs early on in the implementation of a service desk is how the desk should be structured, from a geographical perspective.
There are a number of strategies that will usually be considered.
With local service desks, for example, each distinct site or region of the organisation has its own service desk – and hence can provide local expertise to solve local problems.
There are a couple of obvious disadvantages to this approach, such as duplication of resources and the maintenance of organisation-wide standards and consistency. Also, lessons learned in one area may not be passed on to the others.
Such problems can be minimised by the use of centralised logging of incidents and results and by establishing a central configuration management database that is accessible by all the local service desks.
The big advantage of this approach, which is local knowledge, will obviously become more important the more geographically and functionally dispersed the organisation’s sites become. In these situations, the issue of language alone may favour local service desks.
The opposite extreme of the local service desk is the central service desk, where all incidents and queries are reported to and handled by a single centralised structure.
Centralisation has the benefit of providing consolidation of management information and improves utilisation of resources – and therefore can reduce operating costs.
There are dangers, however, in that a perceived loss of local knowledge may tempt local sites to set up their own super-users or unofficial help desks.
Another major issue with this centralised approach is the cost of communications.
Particularly in an international context, careful planning will be needed, otherwise long-distance telephone calls could easily drive up the cost of providing the service to unacceptable levels.
The Virtual Service Desk is based on the concept that physical location is not relevant and that whilst the Service Desk may be perceived as a centralised point, it may actually consist of several local service desks.
As far as the local users are concerned they are contacting a local service desk – but in reality their calls may be automatically routed to the most appropriate desk, based on the proximity, time of day, staffing or whatever criteria apply.
This option is obviously much more demanding on the use of technology, particularly telephony and re-routing equipment, in order to ensure that the whole process appears transparent to the end user.
The logical extension of the virtual service desk is what is sometimes called the “follow the sun” option.
This is widely used by multi-national companies – or even, these days, by local companies who want to take advantage of cheaper labour rates in other parts of the world.
So a typical “follow the sun” strategy might consist of a service desk in Australia, operating between the hours of 6am and 6pm local time, and a second desk in London operating the same hours local time there.
The aim is to provide as close to 24 hour coverage as possible for users in each hemisphere with the European service desk coming on line just as the Australian one is closing down for the night – and vice versa.
So people in Europe requiring support during the night will have their calls automatically re-routed to Australia.
A major advantage of the “follow the sun” approach is that the local desk will tend to be handling local calls during the period of peak demand – so that overnight re-routing, and hence long-distance traffic, should be relatively minimal – but it’s there if needed.
Of course, “follow the sun” may well be more than two service desks, depending on the location of users, time differences and coverage required.
To make this work effectively it is imperative that information about incidents is replicated or shared between the different sites so that the European Desk, for example, can continue to support a user with a query that may have been raised with the Australian Desk a few hours earlier.
Although there are some complexities with this approach, it clearly has many advantages and it is becoming a very common arrangement for multi-national organisations offering 24 hour, 7 day a week coverage – particularly those in the e-commerce field.
Communicating with the Service Desk
It may even be possible to introduce a degree of self-service where users register and track their own incidents without the need for inter-personal communication with service desk staff.
Be careful with this one though. It can all too easily be used as an excuse for the service desk not playing its role in monitoring and processing incidents on behalf of the user as the user’s friend.
Also, be careful with telephone calls. If they are not handled properly it is possible that the user will hang up in frustration and not re-dial.
Hence the information that would have been gained about a particular incident or query will be lost. All that would be recorded is that a call had been dropped, which in turn will be used as a key measure of service desk performance.
Lost calls of this kind are often referred to as “fugitives”. There’s a problem out there that cannot be investigated because it hasn’t been recorded – and although the user could have been more persistent, the fault is with the service desk staff and/or their technology for not making it easier for them to report the incident.
Here, for example, in a generic rather than just ICT service desk, calls that cannot be directly handled by the service desk will be directed to experts in the relevant functional area.
The percentage of calls that get passed upwards will be determined by the skill levels and training of the service desk staff.
So functional escalation is the handing over of responsibility to a functionally more competent area, in order to tackle a particular issue.
Hierarchical escalation is where problems are passed up the management chain - either because they are very serious or need higher level authority to sanction the resources needed to provide a solution.
The first level of hierarchical escalation would normally be to the service desk manager, who is usually the owner of the incident management process.
More serious issues may then go to the problem manager, with a remit to call together the necessary specialists to resolve the incident as quickly as possible.
Very explicit parameters need to be established to govern hierarchical escalation, otherwise it is very easy for it to become the norm, rather than the exception, which would clearly be unacceptable.
Service Desk Capability
Related to the escalation procedures is the general debate about how skilled and capable of resolving issues the service desk staff should be.
Factors that are normally considered are the increased costs of employing more highly skilled staff against the improved service to the end-users that will almost certainly result.
Also this may be a dynamic situation with the optimum skill level changing over time.
Once things have bedded down it may be possible to relocate them to more productive areas.
So at the one end of the scale we may have an unskilled service desk, merely logging and routing calls – and at the other would be an expert desk capable of handling most, if not all, the conceivable issues at the first point of call.
In between these would be what is often called the skilled or semi-skilled service desk – and this is considered by many to be the optimal solution.
Achieving this optimal balance is an interesting and difficult task. As we have said, there are no hard and fast rules.
There is a school of thought that says a good target is to have about 70% of all issues resolved at the service desk, without further referral. But this will vary considerably depending on the service being offered and the maturity level of that service.
Whatever skill level is adopted, the use of diagnostic scripts will increase the rate of resolution at first call, as will access to knowledge databases, change schedules and so on.
Service Level Agreements must also be accessible so that work can be prioritised depending on the SLA clauses.
Regardless of the technical skills that are put in place on the Service Desk, all operators must have certain basic attributes to make them suitable for the job.
These will include:
• An articulate nature – in particular the ability to translate technical information into something that is meaningful to the business user. This can be particularly challenging when dealing with customers who are slow to catch on or who become frustrated, irate or even abusive.
• A good business perspective and understanding of what are the business critical services. This business culture is often helped by recruiting service desk staff from within the business itself.
Relevant technology can be categorised into two types, telephony and software.
Examples of telephony technology might be Automatic Call Distribution systems, which ensure that a bank of service desk operators are used in an optimal order and that work is smoothed out as evenly as possible.
Conference call facilities can be useful in allowing a second-line expert, for example, to be included in the conversation with the end-user.
Computer-Telephony Integration can achieve major gains in efficiency. An example of this would be the identification of an incoming caller based on their telephone number and the linkage of this with configuration management.
As with all business investments - the costs of introducing all of this kind of technology must be carefully weighed against the benefits that they bring in terms of service improvements and operational efficiency.
We have seen how the Service Desk’s role is to act as a single point of contact and the user’s friend in IT.
We have examined different strategies for structuring and resourcing a service desk and we have seen the skills and attributes that service desk staff must have if they are to operate effectively.
Finally we have seen some of the new technology that can be employed to improve the efficiency of operation of the service desk.
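As a rough illustration of the Service Level Agreement example earlier in this lesson (a user complaining of a 2 second response time when the agreement specifies that 95% of responses should be within 4 seconds), the check can be sketched as follows. The function name `sla_met`, the sample measurements and the way the target is expressed are assumptions made for the example, not an ITIL-defined calculation.

```python
def sla_met(response_times, limit_seconds=4.0, target_share=0.95):
    """Return True if at least target_share of responses are within the limit."""
    if not response_times:
        raise ValueError("no response times recorded")
    within = sum(1 for t in response_times if t <= limit_seconds)
    return within / len(response_times) >= target_share


# Hypothetical measurements in seconds.
mostly_fast = [2.0] * 19 + [6.5]        # 19 of 20 responses within 4 seconds
too_many_slow = [2.0] * 18 + [6.5, 7.0]  # 18 of 20 responses within 4 seconds

print(sla_met(mostly_fast))
print(sla_met(too_many_slow))
```

A complaint about a 2 second response sits comfortably inside such a target, which is why the Service Desk would give it a low priority; a pattern like the second sample would suggest a potential breach worth raising with Service Level Management.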
Lesson 2b Incident Management
Objectives
In this lesson we will be examining Incident Management, which is described in Chapter 5 of the Service Support book of the IT Infrastructure Library.
When you have completed this lesson you will be able to:
• Define the term Incident Management according to ITIL Best Practice.
This approach led to poor use of expensive resources – the IT experts – and to a failure to learn lessons from previous incidents. ITIL Best Practice processes aim to resolve both of these issues.
One of the main goals of Incident Management is to restore normal service as quickly as possible, with a minimum of disruption to the business.
This has to be balanced against the efficient use of resources – and the prioritisation of different incidents that can occur simultaneously.
Incident Management is more aimed at a “quick fix” or a workaround rather than a longer term structural resolution to any fault. The priority for Incident Management is recovery of service as quickly and painlessly as possible.
Problem Management is more about identifying the underlying cause of faults and finding ways of engineering out these faults in the longer term.
This can of course lead to some conflict between the two disciplines when Incident Management staff are driven to get a system back up and running quickly.
However, because the processes are essentially similar, many organisations include Requests for Change within the scope of Incident Management.
Automatically registered events, such as the failure of a disk drive or a network connection, are often regarded as part of normal operations. They are still included in the definition of Incidents though – albeit that the service to end-users may never be affected.
Whilst all this is going on there are the issues of ownership, monitoring, tracking and communication to be maintained.
Additionally, there will be constant updating of the status of the incident as it moves through the various points of its life-cycle.
All of these are proactive activities carried out by the incident management staff – which is usually the Service Desk, acting on the user’s behalf. It involves generating reports, keeping users informed and managing escalations.
ITIL standard practice guidance says that all these activities remain with the Service Desk and the use of tools to help with automatic status tracking is very important in the incident lifecycle.
Finally, remember that everything should be logged as an incident – even if it is a Service Request, i.e. a request for a standard operational item, such as a password reset.
If the Classification and Initial Support process determines that the incident is in fact a Service Request then the Service Request procedure will be invoked.
Because the request was raised as an incident, however, it will eventually have to be brought back into the incident lifecycle at incident closure, in order to achieve the close down of that request procedure.
In understanding the full lifecycle of an incident it is important to know what further records and processes may be generated as a result of an incident.
When an infrastructure fault is first reported it is recorded as an incident, either by the Service Desk or direct to the incident management process by automated support tools.
If a “known error” is generated then in most cases this will lead to a Request for Change – in order for the underlying fault to be corrected.
Unless, as we have just said, there are good reasons why we should just live with the problem for now because the cost of a short-term fix is not justified.
Once a Request for Change has been through the Change Management process as defined by ITIL, then this will lead to the release of a structural solution to the problem. This will be a permanent fix to the underlying fault, not just a work-around.
Whilst all this is going on, the Configuration Management Database should be being updated with information about the incident, any problems and their links to incidents, about any “known errors” and their links to problems, and about requests for change and their links to known errors.
So an integrated Configuration Management Database not only contains configuration item information but also related support records, such as incidents, problems, known errors, requests for change, and release records.
The absence of a Configuration Management Database will make it very difficult to harmonise separate incident recording, problem recording, and change recording systems.
We will be looking in more detail at the Configuration Management process in Lesson 3.
Assessing Priorities
Assessing the priority of an incident is a very important process that needs to be carried out early in the incident’s lifecycle, since it determines what effort is going to be put into its resolution.
skills to solve the fault is immediately available it may have to be put down the list a little.

Another factor affecting priority may be the existence of a specific statement in a Service Level Agreement that is threatened by the incident.

Impact, in this definition, is the measure of the effect of the incident on the business. This could be measured in terms of numbers of users affected or financial loss, for example. So it is important to work very closely with the business in order to understand the factors that are considered high or low impact.

Urgency concerns the time scale in which the incident needs to be resolved.

For example, a fault with a payroll system that occurs on the 2nd of the month may well be considered less urgent than the same fault occurring on the 20th.

These two factors together dominate the ITIL model for determining priority. So a high urgency does not always mean a high priority if the impact is considered to be relatively low. For something to be high priority both the impact and urgency must be high.

As we have already mentioned, Service Level Agreements can also influence priority.

Let’s say that Incident A occurs and that this is the fourth incident relating to a particular service in the current month.

and SLA threat, Problem Management staff must be informed so that they can provide extra support to the Service Desk team.

Benefits & Problems of Incident Management

The benefits of and potential difficulties with Incident Management are listed on Page 18 of the little ITIL book and in Section 5.4 of the Service Support Manual.

Summary

In this lesson we have been examining Chapter 5 of the Service Support Manual, Incident Management.

We have seen how Incident Management is defined, the scope of Incident Management and the differences between Incident Management and Problem Management, which is the subject of the next lesson.

We have followed the main stages through which an Incident passes during its lifecycle and looked at the records that must be kept and the need for an integrated Configuration Management Database.

We have also examined the different factors that must be considered in determining the priority of different incidents, which may be competing for limited resources.
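
The way impact and urgency combine into a priority, as described in this lesson, can be illustrated with a small sketch. The three-level scale and the "lower of the two factors" rule are assumptions chosen only to reproduce the statement that both impact and urgency must be high for a high priority; they are not taken from ITIL itself, and the influence of Service Level Agreements is left out.

```python
from enum import IntEnum


class Level(IntEnum):
    # Hypothetical three-point scale, for illustration only
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def priority(impact: Level, urgency: Level) -> Level:
    """Return a priority that is no higher than the lower of the two factors.

    Assumed rule: a high urgency with a low impact (or the reverse)
    does not produce a high priority.
    """
    return Level(min(impact, urgency))


# A payroll fault late in the month: high impact and high urgency
payroll_fault = priority(Level.HIGH, Level.HIGH)

# An urgent request that affects very few users: urgency alone is not enough
small_fault = priority(Level.LOW, Level.HIGH)
```

A real organisation would agree the scale and the combination rule with the business, as the lesson recommends.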
Lesson 2c Problem Management
supplier, then information gained from Problem Management would be very useful to the Contract Management team. They could use this to help the suppliers make improvements, or in evaluation or analysis of the software or supplied service. In some instances they could also revoke the contract.

So how do we define the responsibilities of staff working in Problem Management? These responsibilities can be broken down into a number of focused areas.

These are:
• Problem Control
• Error Control
• Assistance with handling major incidents
• Proactive prevention of problems
• Providing management information from problem data
• Conducting major problem reviews

Problem Control focuses on transforming Problems into Known Errors. It does this by identifying the root cause of the problem and providing a temporary workaround where possible. This process redefines a Problem as a Known Error.

Error Control focuses on resolving Known Errors under the control of the Change Management Process. The objective of Error Control is to be aware of errors, to monitor them, and to eliminate them when feasible and financially justifiable.

Error Control has become a common process in both the applications development, enhancement and maintenance environment and the live environment. Normally a service and its configuration items are introduced to the live environment with some Known Errors. It is important that these are recorded in a ‘Known Error Database’, so that when related incidents are reported in the live environment they can easily be identified.

Proactive Prevention of Problems, and Providing Management Information from Problem Data, include techniques such as trend analysis, targeting support action, and providing support to the organisation. Typically 80% of incidents are caused by 20% of the IT infrastructure components.

This Configuration Item information can prove useful when attempting to identify the underlying cause of incidents. The provision of management information from problem data to Availability Management, for example, can provide vital information on expected levels of availability, and as a consequence, influence statements made about availability in Service Level Agreements.

Ultimately, by redirecting the efforts of an organisation from reacting to large numbers of incidents to preventing future Incidents, you provide a better overall service to your customers and make better use of the IT support organisation’s resources.

Finally, conducting Major Problem Reviews. These reviews take place after a problem causing a major incident or multiple related incidents has been successfully resolved. It is the responsibility of the Problem Management process to review the problem, and to identify how to prevent it reoccurring in the future. Additionally, information from these reviews can identify weaknesses in problem management and incident management processes.

These review procedures form part of a ‘Service Improvement Programme’, a key task for any ITIL conformant organisation which aims to improve value and quality.

So let’s look at some problem management definitions in more detail. Firstly, the definition of a problem, which is ‘The unknown underlying cause of one or more incidents’.

We defined how Problem Control focuses on transforming Problems into Known Errors. A problem only exists from the point of identification to the point when we have found the reason for the problem occurring. Once this point is reached the Problem becomes a ‘Known Error’.

New Problem identification occurs when we are unable to find a match amongst the definitions of existing problems, or existing Known Error records. A Problem Record is then raised. One of the most effective Problem Management techniques is matching against a number of related incidents, and realising that they have a common underlying cause.

These multiple related incidents are of particular concern to Service Managers, as they can threaten reliability clauses within Service Level Agreements or Contracts. For example, an SLA might specify that in any rolling month there will be no more than two breaks in service provision, and the duration of these breaks will be no greater than two minutes. So any train of events causing us to approach these parameters is a major concern. Hence Problem Management plays a very important role in the ITIL Service Management structure, by providing early identification of problems, and communicating this information to relevant management areas.
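
The SLA example in this lesson (no more than two breaks in service in any rolling month, each lasting no more than two minutes) can be checked mechanically. The following is a minimal sketch under stated assumptions: a "rolling month" is approximated as 30 days, and the function and constant names are hypothetical rather than part of any ITIL tool.

```python
from datetime import datetime, timedelta

# Limits taken from the SLA example in the text
MAX_BREAKS = 2
MAX_BREAK_DURATION = timedelta(minutes=2)
# Assumption: a "rolling month" is approximated as the last 30 days
ROLLING_WINDOW = timedelta(days=30)


def recent_breaks(breaks, now):
    """Keep only the (start, duration) breaks that began within the rolling window."""
    return [(start, dur) for start, dur in breaks
            if now - ROLLING_WINDOW < start <= now]


def sla_breached(breaks, now):
    """True if there are too many breaks, or any single break lasted too long."""
    recent = recent_breaks(breaks, now)
    too_many = len(recent) > MAX_BREAKS
    too_long = any(dur > MAX_BREAK_DURATION for _, dur in recent)
    return too_many or too_long


def breaks_remaining(breaks, now):
    """How many more breaks can occur before the count limit is exceeded."""
    return max(0, MAX_BREAKS - len(recent_breaks(breaks, now)))
```

A result of zero from `breaks_remaining` is the kind of early warning that Problem Management would want to pass on to the relevant management areas before the SLA is actually breached.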
These are:
• Identification
• Recording
• Classification
• Investigation
• Diagnosis
• Review & Closure
Recording
Once a problem has been identified, a record is
created with a unique identifier, and a link is
generated to any associated records, such as
the incidents that caused it, and also to any
Known Errors to which it might relate.
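
As an illustration of the Recording step, a problem record might be modelled as follows. This is only a sketch: the field names and the `PRB-` identifier format are hypothetical, and a real Service Desk tool would store such records in its own database.

```python
from dataclasses import dataclass, field
from itertools import count

# Simple in-memory counter used to generate unique identifiers (illustrative only)
_next_number = count(1)


@dataclass
class ProblemRecord:
    summary: str
    # Links to the associated incident records
    incident_ids: list = field(default_factory=list)
    # Links to any Known Errors to which the problem might relate
    known_error_ids: list = field(default_factory=list)
    # Unique identifier, assigned automatically when the record is created
    problem_id: str = field(default_factory=lambda: f"PRB-{next(_next_number):05d}")


record = ProblemRecord(
    summary="Repeated loss of connection to the mail server",
    incident_ids=["INC-1041", "INC-1057"],
)
```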
resolution should be called to the review to determine:

• What was done right?
• What was done wrong?
• What could be done better next time?
• And finally, how can we prevent the Problem from happening again?

Problem closure is the last of the Problem Control activities and is often carried out automatically when a resolution to a Known Error is implemented. However, we should point out that an interim closure status can exist. For example, when a Known Error has been identified and a solution put in place, a status of ‘Closed pending Post Implementation Review’ could be assigned to it in either the Incident, Known Error or Problem records. ‘Closed pending PIR’ allows us to confirm the effectiveness of the solution prior to final closure.

For incidents, this may involve nothing more than a telephone call to the user to ensure that they are now content. For more serious Problems or Known Errors, a formal review may be required.

Finally, remember an important part of Problem Management is to continually monitor its own progress, and the progress of those technical support staff that are called in when problem diagnosis, investigation and resolution are necessary. This can be particularly important when problem resolution is ‘time constrained’ by a Service Level Agreement.

Impact describes how vulnerable the business might be. For example, life threatening, or merely a small inconvenience.

Urgency illustrates the time that is available to avert, or at least reduce, this impact.

A problem’s classification may well change as a consequence of the diagnosis activity. This first classification of a problem is described as the ‘initial classification’. For example, what at first appeared to be a problem with a network might actually be the result of a database problem. The problem is then reclassified. However, it is usual to retain both the initial and final classifications, so that resource allocation to problem areas can be improved.

Sources of Problem and Error Identification

We discussed earlier in this lesson how problem management works reactively to identify problems, by checking knowledge bases for records of problems, Known Errors, changes etc.

A proactive activity involves the analysis of past incidents, and the IT infrastructure as a whole. For example, analysis might identify that a pre-existing problem at one site might reoccur at another site which has a similar server, hardware and software configuration.

Also involved is the broader analysis of the IT infrastructure itself. The examination of over-complex relationships, or single points of failure, can identify any vulnerable points that are a potential threat to business.
Lesson 3a Configuration Management
ITIL places Service Level Management at the very top of our objectives because it represents service delivery’s ‘shop window’ to customers and users alike. It’s also a service to which guarantees are applied, in the form of Service Level Agreements.

Service Level Management is supported by several Support and Delivery processes, which amongst other things, enable Service Level Management to negotiate and comply with SLA’s. This whole support structure is underpinned by the configuration management process. ITIL guidance is explicit on this point and states that ‘without effective configuration management we are not likely to effectively implement the other ITIL processes, and this will lead us to a failure to deliver a quality service.’

Because configuration management’s remit is wider than pure asset management, we tend to refer to the information that Configuration Management maintains as Configuration Items or CI’s, rather than IT assets.

We have established that Configuration Management underpins all the Delivery and Support Processes, and it defines IT assets and services as Configuration Items. We’ve also established that it monitors the inter-relationships or linkages between CI’s. So how does Configuration Management store, manage and update this information? It does this by entering all this information into a Configuration Management database or CMDB.

A typical CMDB should contain information on:

• Hardware, Software, Peopleware, and related documentation.

• Services, and the relationship between Configuration Items.

• Incidents, problems and known errors.

• Changes and releases.

Records at the highest level contain:

• Information about the organisation’s hardware, including servers, workstations, communications equipment and networks.

• Information relating to Software, including operating systems, application or script software, or any custom designed software.

• Details about Peopleware, including information related to IT service staff and their skills.

And finally,

The CMDB is also the ideal place to hold incident records, problem records and known error records if they are held on separate systems. ITIL guidance suggests trying to link these databases, so that we can link a record to any related configuration items. By doing so, future searches on a particular CI will return information relating to outstanding incident, problem or known error records.

In the change and release section of the CMDB, we may hold requests for change, change records and so on. This information is used for tracking the progress of change and release records. A release record will contain information about a number of related CI’s,
which make up a new release, and will describe how to achieve a change defined in the change records.

A CMDB can offer great benefits to an organisation. However, the benefits might not be immediately obvious to senior management, who might suggest that a simple asset management system would be sufficient. However, asset management only addresses higher value issues in the infrastructure and doesn’t examine it to the same level of detail.

Perhaps more importantly, asset management systems wouldn’t contain the linkages to incidents, problems, or known errors, or to change and release management records, and critically wouldn’t document the relationships between CI’s and asset records that a CMDB would.

We briefly defined earlier in this lesson what constitutes a CI, and ITIL defines a Configuration Item as ‘any component of an IT Infrastructure, including a documentary item such as a Service Level Agreement or Request for Change, which is, or is to be, under the control of Configuration Management and therefore subject to formal Change Control’.

CI’s will vary in type, distinguishing between hardware, software and documentation, and in some circumstances, will sub-define lower level configuration item records. For example, the hardware type might be made up of workstations, servers, network equipment and so on.

Whatever the CI type, it will require a unique form of identification. Firstly, a unique identifier, which should comply with a predefined configuration policy. Also an ID type, which categorises the item into hardware, software, peopleware and so on. Other common CI attributes might include a manufacturer’s or developer’s ID, its location, purchase date etc.

In addition to the CMDB, Configuration Management has linkages to two other information repositories. These are the Definitive Software Library or DSL, and the Definitive Hardware Store or DHS.

The DSL is the safe storage area for trusted software, and is managed by the Release Management process.

The DHS houses spare parts for critical equipment, and replica configuration models in the IT infrastructure. For example, the DHS might contain a fully configured standard server and workstation.

Again, records relating to the contents of both the DSL and DHS are held in the Configuration Management Database.

Also worth noting here is the management of software licences. This has become a major issue for many organisations, and the repercussions of illegal software use can be severe, so it’s considered good practice for configuration management and release management to work jointly on this process. In an organisation that has fully implemented ITIL, the configuration management team would be expected to hold information about licences, what they contain and what they cover, as a CI in the CMDB. However, as with the DHS and DSL, the physical licences might be held in a separate repository.

ITIL suggests that Configuration Management is made up of five sub-processes.

These are:
• Planning
• Identification
• Control
• Status Accounting
• Verification

Planning is carried out at the beginning of any process to establish a configuration management plan, and should be revisited regularly.

The processes of Identification, Control, Status Accounting and Verification are ongoing.

Let’s look at each of these processes in a little more detail.

The first of the Configuration Management sub-processes is planning. ITIL suggests five key points which should be addressed in planning, and these are:

• Strategy, policy, scope and objectives

• Processes, procedures, guidelines and responsibilities

• The relationships with other ITIL processes

• The relationship with other parties carrying out Configuration Management

• And finally tools and other resource requirements

We start by defining a strategy. For example, an organisation might want to establish a Configuration Management system, but for its ‘live systems’ only.
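
The CI attributes described in this lesson (a unique identifier, a type, and attributes such as location and purchase date), together with the idea of linking incident, problem and known error records to a CI, can be sketched in a few lines. This is a toy in-memory model under assumed names; the class names, fields and identifier formats are hypothetical and do not describe any particular CMDB product.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    ci_id: str                 # unique identifier, following a predefined policy
    ci_type: str               # e.g. "hardware", "software", "peopleware"
    location: str = ""
    purchase_date: str = ""
    # IDs of linked incident, problem and known error records
    linked_records: list = field(default_factory=list)


class CMDB:
    """Toy in-memory stand-in for a Configuration Management Database."""

    def __init__(self):
        self._items = {}

    def register(self, ci):
        # Enforce the rule that every CI has a unique identifier
        if ci.ci_id in self._items:
            raise ValueError(f"duplicate CI identifier: {ci.ci_id}")
        self._items[ci.ci_id] = ci

    def link_record(self, ci_id, record_id):
        # Attach a support record (incident, problem, known error) to a CI
        self._items[ci_id].linked_records.append(record_id)

    def records_for(self, ci_id):
        # A search on a CI returns the support records linked to it
        return list(self._items[ci_id].linked_records)
```

Registering a server and linking an incident to it means that a later search on that server returns the incident, which is the linkage a simple asset register would not provide.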
information would show its linkage to its parent, and also a ‘used by’ relationship to other CI’s. It would not be helpful to lose this level of detail by incorporating details into the parent CI.

Documenting these linkages in the CMDB can have a huge impact on database size. Each new CI added might identify three or four linkages. It’s good practice to establish in advance the required levels of CI’s in the database, even if we don’t initially populate the database to this level. With most CMDB tools, it’s far easier to have empty elements in the database than to have to restructure the database at a later date.

Version Identification needs to address the full lifecycle of the Configuration Item, so, in addition to those items already in the live environment, items in development and awaiting release are also included. At the same time version numbers are assigned. These numbers should be monitored carefully. If, for example, the development department assigns its own version numbers, then it’s important that this information is transferred to the CMDB at the point of handover.

In defining the inter-relationships between CI’s, there are a number of typical ‘types’ which can be used. The most frequently used in ITIL good practice are Composition, Connection and Usage.

‘Composition’ is the simple parent-child relationship. A workstation being the parent, the monitor, keyboard or system box being the child.

‘Connection’ describes the relationship between hardware items, for example the relationship between a LAN and a server.

‘Usage’ describes the interdependency arising from applications’ use of a common software module, or the linkage from one category to the other.

Finally, having identified and documented information about CI’s, the items should be labelled. These labels might exist in electronic format, or might be printed labels which we apply to identify the relevant CI’s.

The third Configuration Management activity is Control. The control of configuration items consists of three sub-processes. These are: Register, Update and Archive. An additional function of the control process is to protect the integrity of configurations.

CI’s are registered as they fall into the remit of IT service management. If we receive new equipment from an external supplier, at the point of handover, we should establish that information received from the supplier is accurate. In many organisations this activity has a direct link with procurement.

There are many reasons for updating a configuration item’s status. For example, a change in the CI’s status from testing to ‘live’, a change of financial asset value, a change of ownership, or changes brought about by incidents, problems or known errors. All these
updates have to happen under the authority of the configuration management process.

Archiving decommissioned CI’s takes place when a component is no longer in use. The definition of what constitutes a redundant CI, decommissioning and timing details, would usually be specified in a predefined policy document.

Archiving involves the removal of CI’s from the CMDB and archiving onto secure storage, and not necessarily the destruction of the record.

The protection process safeguards against illegal changes to CI’s, and procedures are maintained so that the CMDB and the information it contains are secure. Protecting the integrity of the configurations includes security against theft, protection against unauthorised change or corruption, enforcing access control procedures, guarding against any environmental damage, protection against viruses, and making back-up copies of the CMDB information and storing these back-ups securely.

Configuration control scope must extend to ‘bought in’ CI’s, such as commercial ‘off the shelf’ software, sometimes known as ‘COTS’ packages. By definition this will involve software licence issues, and we will be examining this in more detail in the release management lesson.

Importantly, the protection procedures should be in place for the definitive software library and definitive hardware store.

The fourth Configuration Management activity is Status Accounting.

ITIL defines status accounting as: ‘The reporting of all current and historical data concerned with each CI throughout its lifecycle.’

Status accounting allows us to reveal a CI’s past status (what has happened to it up to this point?), its present status (what state is the CI in now?), and its future status (what plans are there for this CI in the future?).

This accounting procedure enables changes to CI’s and their records to be tracked, and changes in a CI’s status to be documented, for example the change from ‘live’ status to ‘withdrawn’. It can also help us establish ‘baselines’. By declaring a status of ‘trusted’ we save all the configuration items and relationships as a baseline. If we encounter problems at a later date, we can then retreat to this ‘baselined’ point. Status accounting can also be used to monitor organisational procedures, for instance, that a request for change on a configuration item was properly authorised.

The fifth and final configuration management activity is Verification.

The primary function of Verification, or verification and audit as it is sometimes known, is to establish that the information in the CMDB exactly matches the real life environment. Configuration management offers little benefit if the information that it provides is out of date or inaccurate.

This verification and audit procedure should be carried out regularly but randomly. Deliberate avoidance of the change and configuration management processes is most likely to be revealed by this ‘spot check’ approach. These audits involve checking the physical whereabouts of equipment, and installed software. In addition to the regular ‘spot checks’, verification and audit would usually be carried out at the following times:

• Before a new release, or before the preparation of a baseline.

• After a disaster, to establish that our records are accurate following a major failure in the IT infrastructure.

• Following detection of unauthorised changes to the infrastructure. A single unauthorised change might be concealing many others, with the result that the CMDB would not reflect the real life situation.

• And we would usually carry out an audit before the live implementation of a new Configuration Management database.

Carrying out a manual verification and audit can be a time consuming and expensive procedure. ITIL recommends the use, where possible, of automated verification tools. These tools are able to roam networks and servers, reporting on installed hardware and software. Interestingly, many manufacturers are building automated management functions into their PC’s.

It’s also worth remembering that some verification can be carried out by the service desk staff. During calls from users, service desk staff can ascertain what hardware and software are being used, and whether this matches current configuration item records.

Finally, it’s worth noting that in many large organisations, responsibility for the verification and audit process would rest with a Configuration Librarian.
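
The core of a verification and audit check, comparing what the CMDB records with what is actually found in the live environment, can be sketched as a simple comparison. This is a minimal illustration only: both inputs are assumed to be dictionaries mapping a CI identifier to a description of its configuration, and the function name and data layout are hypothetical rather than those of any real discovery tool.

```python
def audit(cmdb_records, discovered):
    """Compare CMDB records with the discovered real-life environment.

    Both arguments map a CI identifier to a description of its configuration
    (an assumed, simplified data layout).
    """
    recorded_ids = set(cmdb_records)
    found_ids = set(discovered)
    return {
        # In the CMDB but not found in the live environment
        "missing": sorted(recorded_ids - found_ids),
        # Found in the live environment but never registered
        "unrecorded": sorted(found_ids - recorded_ids),
        # Present in both, but the recorded details do not match
        "mismatched": sorted(
            ci for ci in recorded_ids & found_ids
            if cmdb_records[ci] != discovered[ci]
        ),
    }


recorded = {"PC-1": "Office 2000", "PC-2": "Office 2000", "PC-3": "Office 97"}
found = {"PC-1": "Office 2000", "PC-3": "Office 2000", "PC-4": "Office 97"}
report = audit(recorded, found)
```

Any entry in the report suggests either an unauthorised change or an out-of-date record, and would be followed up through the change and configuration management processes.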
The ultimate update authority always lies with the configuration management process, but this authority can be delegated in the case of incident and problem records. Configuration management also remains responsible for updating the CMDB during the change and release processes, often acting on behalf of the change and release management processes.

And finally we looked at the potential benefits and pitfalls when implementing configuration management.
Lesson 3b Change Management
Any change to the infrastructure involving software, hardware, services and so on, will result in changes to Configuration Items. As a consequence Change Management must work closely with Configuration Management. As we said earlier, part of Change Management’s responsibility is the analysis of any proposed change. To do this effectively it must understand what CI’s will be affected by the change, the way in which constituent CI’s are linked, and if linked, how they make up one or more services. So Configuration Management identifies CI’s which are likely to be affected, on behalf of Change Management.

By exchanging information with Capacity, Availability and Configuration Management, Change Management is able to ‘Assess the overall impact’ of the change. Once assessed we should be able to state:

The impact is manageable, the cost of change is reasonable, and business benefits are worthwhile. At this point Change Management ‘authorises the change’.

In many cases this authorisation is with the help of other experts who form a body known as the Change Advisory Board, and in some cases, where the change is a simple one, Change Management can be devolved. In these cases it is common for the Change Management process to be devolved to Problem Management, or even to operational staff.

Throughout the change management process, there is an ongoing update of information within the Configuration Management database. For example, a CI status can now be moved to ‘under change’, or a new CI is created if we replace one piece of software with another and so on.

And finally, when a change is ready for release to the wider user community, be it affecting software, hardware, documentation or related infrastructure components, it falls to Release Management to manage the actual physical implementation. Remember, however, that overall responsibility for any change remains in the hands of change management.

The trigger for the Change Management process is the receipt of a Request For Change or RFC.

ITIL defines a number of sources from which an RFC can be received. The most common and well documented are those that form part of the incident resolution lifecycle. For example, where a user identifies an incident and reports it to the service desk staff, who in turn generate an RFC. Or from Problem Management, which generates an RFC after investigation of multiple incidents has led to a known error and a proposed ‘structural’ resolution.

Another source of RFC’s is the need for the introduction of new or upgraded CI’s. For example, if your organisation has recently purchased new workstations, their installation, addition to the network, recognition by the server, and the provision of help and user documentation will all generate RFC’s.

We may have a ‘New or changed business requirement for an IT service’, often identified by the service level review process. Again this will generate a Request for Change, and be passed on to the Change Management Process.

An RFC might arise because of customer or user dissatisfaction with a current service. This may not have been reported via incident or problem management, and it might not be outside our current Service Level Agreements. However, it’s important, where financially viable, to meet customers’ requests.

Implementation of new or changed legislation might bring about an RFC. Particular examples include legislative changes relating to privacy, intellectual property rights, security and so on.

A major change in business requirements may generate a significant Request for Change. Such a request may have already passed through a conventional investment appraisal process, and enters the ITIL Service Management process for a second review. The role of Service Management is to ensure full impact analysis against effects on existing services, and on the infrastructure as a whole.

Typically, a request for change will contain such information as the sponsor, the requested date for implementation, an initial list of configuration items affected, services affected, the reason for change and initial costing information. The exact content will vary depending on the origins of the RFC.

One of the main responsibilities of the Change Management Process is to establish a ‘Change Advisory Board’ or CAB.

The role of the CAB is to consider RFC’s, and in the light of the business need make recommendations as to whether they should be accepted and implemented, or rejected. It also ensures that any RFC’s which don’t merit detailed consideration by the CAB are recorded. The CAB will also advise on the grouping of changes into ‘releases’ to minimise disruption to the organisation and maximise benefits.
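
The typical contents of a Request for Change listed in this lesson (sponsor, requested implementation date, affected configuration items, affected services, reason and initial costing) map naturally onto a simple record. The sketch below is illustrative only: the class, its field names and the `missing_fields` helper are hypothetical, and as the lesson notes, the exact content of an RFC varies with its origin.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class RequestForChange:
    sponsor: str
    requested_date: date            # requested date for implementation
    reason: str                     # the reason for change
    affected_cis: list = field(default_factory=list)       # initial list of CIs affected
    affected_services: list = field(default_factory=list)  # services affected
    initial_cost: float = 0.0       # initial costing information

    def missing_fields(self):
        """Hypothetical helper: list the typical fields that are still empty."""
        checks = {
            "sponsor": self.sponsor,
            "reason": self.reason,
            "affected_cis": self.affected_cis,
            "affected_services": self.affected_services,
        }
        return [name for name, value in checks.items() if not value]


rfc = RequestForChange(
    sponsor="Finance Department",
    requested_date=date(2024, 6, 1),
    reason="Replace payroll workstations",
    affected_cis=["HW-0101", "HW-0102"],
)
```

Here `rfc.missing_fields()` reports that the affected services have not yet been identified, which is the kind of gap an impact assessment would need to close before the RFC is considered.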
The definitions of minor, significant and major are set by individual organisations, and depend on the current status of the IT infrastructure and on the IT service management personnel’s current attitude to risk.

A ‘minor change’ categorisation would usually be authorised by the Change Manager, who will report their actions to the CAB after completion of the change. The aim here is to reduce the number of RFCs forwarded to the CAB by filtering out any low risk changes.

If the change is defined as either significant or major, then the CAB will have a significant role. In both cases, the first action is for the Change Manager to circulate RFCs either to the CAB or, in the case of a major change, to the company Board or other senior management members.

As we saw earlier in this lesson, the CAB’s role is to give advice, provide estimates on required resources and timescales, and put forward schedules for change based on priority and resource availability. The CAB will also perform detailed impact analysis, and this often requires input from ITSM specialists, for example the Capacity Manager.

Eventually implementation dates and a schedule are decided upon. This information is contained in a ‘forward schedule for change’, which is passed to the relevant service management staff, and to the business as a whole. If changes are likely to cause disruption to the business, then this will be formally documented in a ‘Projected Service Availability Report’.

Remember, not all RFCs considered by the CAB will be accepted. After investigation, the potential risk or financial implications might be considered too high, and outweigh any potential benefits the change might bring.

The CAB activities of estimating and scheduling may well be iterative, and the process continues until an approved change status is reached, or the RFC is rejected, in which case it might re-enter the process at the beginning. At the point of approval, the Configuration Manager updates the Configuration Management Database (CMDB).

The change has now reached the Change Building sub process. The Change Builder may actually consist of several groups of internal or external staff, who are involved in hardware, software, operating systems, documentation and so on. Change Builders are not normally permanent members of a Change Management Team, but are drawn from areas of technical expertise.

Note that a failure during the change building process will almost certainly result in the change returning to the CAB, possibly with a request to modify the scope of the change. It’s important that all changes have a back out plan, so that if an error occurs during implementation, the change can be reversed and the service restored. At this point the failed change will re-enter the process at the CAB level.

Once the change is complete it moves to an Independent Tester, where the change is tested and quality checks are carried out. If at this point a failure occurs, the change is returned to the Change Builder.

If the change is tested successfully it moves on to the Change Manager, who coordinates the implementation of the change.

Remember that the Change Manager has overall responsibility for the change, but that Release Management normally has control at a detailed physical implementation level.

Note that throughout the cycle of building and testing, and during implementation, the Configuration Management process is updating the status of change records. Typical statuses include: accepted, in build, under test and so on. A change record will typically contain details of the back out plan, when it was built, CAB recommendations and scheduled implementation dates. As a consequence, the change record is frequently changed.

It’s important to accurately manage the change record system within the CMDB, so that we can carry out traceability tests. Change records are usually linked to impacted infrastructure configuration item records, and also to any related incident, problem or known error records.

If at the point of live implementation the change fails, then the Change Builder instigates the back out plans. If, however, the change is implemented successfully, it’s important that the Change Manager reviews the change.

The review process can provide valuable information about our change management process, and can also identify vulnerable areas in the IT infrastructure. A successful review will trigger the ‘closed’ status, and the request for change or change record will be updated in the CMDB. Note the CAB itself might be involved in the review process. A failure at the review stage would identify shortcomings in the implemented change. This in turn would result in new requests for change entering the process.
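The build, test, implement and review route described in this lesson can be sketched as a small transition table. This is a minimal illustration only: the stage names (`build`, `test`, `implement`, `back_out`, `review`, `cab`, `closed`, `new_rfc`) and the outcome labels `ok` and `fail` are our own hypothetical labels, not ITIL terms, and a real process would carry far more information on each change record.

```python
# Hypothetical stage and outcome labels; only the routing follows the lesson text.
TRANSITIONS = {
    ("build", "ok"): "test",            # completed change goes to the Independent Tester
    ("build", "fail"): "cab",           # a failed build returns to the CAB
    ("test", "ok"): "implement",        # Change Manager coordinates implementation
    ("test", "fail"): "build",          # a failed test returns to the Change Builder
    ("implement", "ok"): "review",      # Change Manager reviews the change
    ("implement", "fail"): "back_out",  # the back out plan is invoked
    ("back_out", "ok"): "cab",          # the failed change re-enters at CAB level
    ("review", "ok"): "closed",         # successful review triggers 'closed'
    ("review", "fail"): "new_rfc",      # shortcomings lead to new requests for change
}


def next_stage(stage, outcome):
    """Return the stage that follows `stage` for the given outcome."""
    try:
        return TRANSITIONS[(stage, outcome)]
    except KeyError:
        raise ValueError(f"no transition for {stage!r} with outcome {outcome!r}")


def walk(outcomes, start="build"):
    """Follow a list of outcomes from `start` and return every stage visited."""
    path = [start]
    for outcome in outcomes:
        path.append(next_stage(path[-1], outcome))
    return path


if __name__ == "__main__":
    print(walk(["ok", "ok", "ok", "ok"]))  # a change that succeeds at every step
    print(walk(["ok", "fail"]))            # a change that fails testing
```

Keeping the routing in one table makes it easy to check that every failure has somewhere to go, which is the point the lesson makes about back out plans.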
The first action is for the Change Manager to call either a CAB meeting or, in an emergency situation, the CABEC. The aim of this meeting is to quickly evaluate the request for change, by assessing its impact, the resources required and its urgency. The meeting should establish whether its urgent status is justified. If the outcome suggests that the RFC status isn’t urgent, then it will be rejected, and will be dealt with as a standard RFC.

If, on the other hand, the RFC status is confirmed as urgent, then it passes on to the next process and into the hands of the Change Building Team. The Change Building Team then builds the change and, where technically possible, prepares a back out plan.

When the change is complete, as much testing as possible should be carried out. Completely untested changes should not be implemented if at all avoidable. In this case, the Change Manager then coordinates the implementation of the change into the live environment.

If the implemented change fails, the Change Manager implements the back out plan. If the change is successful, then the Change Manager firstly ensures that records are brought up to date, carries out testing in the live environment and, at a later date, reviews the change. If after the review the change is considered successful, then it is closed, and the Configuration Manager closes the RFC and updates the CMDB.

Let’s take a few steps back, and look again at the process, assuming this time we have time to test the change. This time our built change passes from the Change Builder to the Independent Tester, who carries out testing as quickly as possible. If tests are successful, then the change is forwarded to the Change Manager for coordination of implementation. If the change fails during testing, then it returns to the Change Builder process.

The Change Management process deals with Requests For Change from many areas of the organisation, and with different levels of authorisation. Where RFCs are frequent and repetitive, they can be dealt with via pre-existing and authorised processes. These processes are known as a ‘standard model for change’.

Standard models needn’t be solutions to simple changes; often complex operations can have standard models. In general, once an RFC is regularly repeated, we can create a standard model for that change.

We saw earlier in this lesson how the Change Manager examines RFCs and categorises them as either standard (using a standard change model), minor, significant or major. To assign one of these categories, the Change Manager examines the RFC, and considers the following:

Impact
The impact the request for change will have on the business, considering such factors as the number of users affected.

Novelty
Is the change familiar? Has it occurred before? Together, Impact and Novelty can provide us with some idea about the level of risk involved with the RFC. An RFC with high impact and high novelty is certainly a higher risk.

Devolved Authorisation
Has the responsibility for change been devolved from the CAB to the Change Manager? Or further devolved to, say, the Service Desk?

Standard Model
Can the request for change be dealt with via a standard model, with a pre-established implementation process?

So let’s add some content to our table. We’ll start with column 1.

This RFC is regarded as low impact to the business, and is a well known change, so the novelty is also low. Authorisation has been devolved to the Change Manager, and a standard model exists. This is a high frequency RFC.

Column 2 is slightly different. Again the RFC is regarded as low impact, but it hasn’t been done before, so its novelty is high and, as a consequence, no standard model exists. Again authorisation is devolved, and it’s categorised as a minor RFC. This type of RFC could act as a trigger to build a new standard model.

In our third example, the results are slightly different. Our RFC has a high degree of novelty, and no standard model exists. It will be forwarded to the CAB, so authorisation isn’t devolved to the Change Manager. This RFC falls into the significant category.

The RFC in our fourth example has a standard model. However, business impact is considered high, so devolution to the Change Manager won’t take place, and it must be examined by the CAB before the standard model processes are implemented. Hence this is regarded as a significant RFC.
As both the impact and novelty are high, the RFC in our fifth example must also be considered by the CAB. This is also a ‘significant’ RFC.

In example six, we are considering a change which has very high business impact. For example, changing from an ISDN based telephony system to ADSL. Changes of this magnitude would normally be authorised at a higher level than the CAB. It is categorised as a major RFC.

Finally, let’s examine a couple of examples which in general should be avoided.

Firstly, a change which is regarded as high impact, but which has devolved authority, is likely to be considered very risky.

Secondly, a change which has no standard model but is low novelty should, by definition, have a standard model in place, and shouldn’t be re-submitted to the CAB.

• The number of changes implemented during the measured period

• Number of changes backed out by reason code

• Number of Staff Training records up to date

• Cost per change versus estimated cost

• Number of urgent changes

By auditing the change management process we can check for compliance with procedures. In general a change management audit should investigate:

All new software releases
Checking that they have been through a proper authorisation process

Incident Records
Usually selected at random, and tracked through the change process
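Returning to the categorisation examples in this lesson, the way Impact, Novelty, Devolved Authorisation and Standard Model combine can be sketched in a few lines. This is a rough reading of the six worked examples rather than an ITIL rule set. The function name, the value labels (`"low"`, `"high"`, `"very_high"`) and the warning wording are assumptions made for illustration, as are the unstated impact in example three and the label "standard" for column 1.

```python
def categorise_rfc(impact, novelty, devolved, has_standard_model):
    """Return (category, warnings) for an RFC.

    impact: "low", "high" or "very_high" (hypothetical labels)
    novelty: "low" or "high"
    devolved: True if authorisation is devolved from the CAB
    has_standard_model: True if a standard model for change exists
    """
    warnings = []
    # The two combinations the lesson says should generally be avoided.
    if impact in ("high", "very_high") and devolved:
        warnings.append("high impact with devolved authority is very risky")
    if novelty == "low" and not has_standard_model:
        warnings.append("low novelty change should already have a standard model")

    if impact == "very_high":
        category = "major"        # authorised at a higher level than the CAB
    elif not devolved:
        category = "significant"  # must be considered by the CAB
    elif has_standard_model:
        category = "standard"     # handled through the standard model
    else:
        category = "minor"        # authorised by the Change Manager
    return category, warnings


if __name__ == "__main__":
    examples = [
        ("low", "low", True, True),          # column 1
        ("low", "high", True, False),        # column 2
        ("low", "high", False, False),       # example 3 (impact assumed low)
        ("high", "low", False, True),        # example 4
        ("high", "high", False, False),      # example 5
        ("very_high", "high", False, False), # example 6 (novelty assumed high)
    ]
    for args in examples:
        print(args, "->", categorise_rfc(*args))
```

Individual organisations define their own thresholds, so the ordering of the checks here is only one plausible choice.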
Lesson 3c Release Management
before, during and after the move to the ‘live’ environment.

Release Management also agrees the exact contents of any release and a detailed roll out plan with Change Management.

The Release Management process encompasses three defined areas of the organisation: the development area, its own area of pre-production, and finally the production area, or live environment.

The migration from one area to the next is only permitted subject to satisfactory results from reviews, tests and other appropriate quality checks.

Release Management has full responsibility for the pre-production environment, which contains both the Definitive Hardware Store, or DHS, and the Definitive Software Library, or DSL. Although we show the DHS and DSL within the pre-production area, it is important that they remain detached from the development, pre-production and live environments. Remember, it’s just as important to control a hardware change and release as it is to manage the software equivalent.

Independent testing might include customer acceptance testing, operational acceptance tests and so on. It may well be that significant customer acceptance testing has already been carried out. However, operational acceptance tests are very important: they ensure that anything that goes wrong in the live environment is supportable, maintainable and robust.

Also worth noting is that any back out plans which have been prepared should also be tested.

Part of Change Management’s role is to decide on the particular contents of the release, and it is very important that the Release Management team are fully aware of the decisions that have been made by other organisational elements.

Within the actual production environment we will have to deal with distribution, potential rebuild and implementation of software and hardware releases. There may be three separate stages: firstly to distribute software, secondly to build it or rebuild it in the live environment, and finally implementation.

Each of these three stages should be verified as accurate. For example, before we attempt implementation, we should be absolutely certain that a rebuild process has been achieved correctly.

Note that ITIL refers to specific steps called ‘Roll Out Management’, and this may take place after independent testing to manage in more detail the actual implementation stages that follow. Roll out management usually comes into play when we’re dealing with very large and complex implementations or ‘roll outs’.

Throughout this process it is very important to update the CMDB. Information on Release Records is held here, and any status changes to these records must be documented.
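The rule that a release may only migrate from one area to the next after satisfactory reviews, tests and quality checks can be sketched as a simple gate. This is a minimal sketch under our own assumptions: the function name `promote`, the check names in the usage example and the choice to refuse promotion when no checks are recorded are all hypothetical, not part of ITIL.

```python
# The three areas named in the lesson, in migration order.
ENVIRONMENTS = ["development", "pre-production", "production"]


def promote(current, checks):
    """Return (environment, failed_checks) after attempting a migration.

    checks maps a check name to True (satisfactory) or False.
    The release only moves to the next area if every check passed.
    """
    index = ENVIRONMENTS.index(current)  # raises ValueError for unknown areas
    if index == len(ENVIRONMENTS) - 1:
        raise ValueError("already in the live environment")
    if not checks:
        # Assumption: with nothing recorded we cannot show the results were satisfactory.
        return current, ["no checks recorded"]
    failed = sorted(name for name, passed in checks.items() if not passed)
    if failed:
        return current, failed
    return ENVIRONMENTS[index + 1], []


if __name__ == "__main__":
    print(promote("development", {"review": True, "tests": True}))
    print(promote("pre-production",
                  {"operational acceptance": False, "back out plan tested": True}))
```

Returning the list of failed checks, rather than a bare refusal, gives the release team something concrete to record against the Release Record.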
Definitive Software Library and the Definitive Hardware Store

Release Management has responsibility for two critical repositories. These are the Definitive Software Library, or DSL, and the Definitive Hardware Store, or DHS.

Information related to the contents of the DSL and the DHS is held in the Configuration Management Database, and responsibility for keeping these records up to date belongs to Configuration Management.

The DSL contains only trusted versions of software, for example software which has been developed from valid earlier versions via correct Change Management processes.

The DSL may consist of one disk containing all bought in and created software held in a single format. Commonly the DSL consists of separate disk volumes or servers containing software for individual environments. Additionally the DSL could contain other software media, such as diskettes, CDs and so on, which might be stored in a separate cabinet.

Software assets are particularly vulnerable to unintended loss or corruption, so it’s important to take very good care of the DSL, for example by employing adequate security and access controls. Appropriate protection against other threats, such as fire or flood, should also be in place. Backup copies of critical elements of the DSL would usually be kept, often at another location.

Finally, the DSL should be protected against virus infection by running regular virus checks on any item entering the library.

The Definitive Hardware Store should be protected in a similar way, and should have specific protection against physical removal. The contents of the DHS should be updated as quickly as possible to reflect the live environment.

Storing older versions of hardware can be useful: if the organisation encounters significant problems with new configurations and software, it is then possible to revert by cloning these older versions.

Remember, responsibility for maintaining the contents of the DSL and the DHS is shared between Release Management and Configuration Management.

One of the key activities of Release Management is deciding on the correct ‘release type’. Firstly it defines the ‘release unit’, which is defined as ‘that set of Configuration Items within the infrastructure which is normally released together’.

The general aim is to decide the most appropriate release unit level for each software item or type of software. This can be set at system, application suite, program, or module level. Different release units will exist in different parts of the infrastructure. For example, an organisation may decide that a normal release unit for its order processing service should always be at system level, and as such a change to a CI which forms part of that system will result in a full release for the whole of that system. The same organisation may decide that a more appropriate release unit for PC software should be at suite level, and so on.

Once the ‘release unit’ is defined, Release Management moves on to address the question of release type. Release types fall into three categories: full release, delta release and package release.

A full release is where all components of the release unit are built, tested, distributed and released together. For example, if the release unit is at program level, then the whole program would have to be rebuilt.

If it’s at suite level then the whole suite, which might include many applications, would have to be rebuilt. Consequently full releases are expensive to build, distribute and install. However, they do give confidence that all the elements of a service work together successfully. They are most appropriate for major changes, and are usually scheduled over longer periods of time.

A delta release involves distributing only the components that have changed since the last release. Consequently this is a less expensive option. Delta releases are most appropriate for fixes and urgent or emergency changes, and as such are the most frequent form of release.

To reduce the frequency of delta and full releases, and to provide longer periods of stability, ‘package releases’ can be used. A ‘package release’ might consist of groups of delta or full releases, or a combination of the two.

Defining release type involves deciding on a form of release identification. It’s normal to use a numbering structure, which applies to two or three levels. For example, a new Payroll System might be assigned a release Id of V:1.0. An additional minor release which involves changes to some of its applications
A Release Policy might also contain:

• Guidance on the level in the IT infrastructure to be controlled

• Details on release identification and numbering conventions

• A definition of major and minor releases, plus a policy on issuing emergency fixes

• Expected deliveries for each type of release

We mentioned earlier in the lesson that Release Management is responsible for the detailed planning of releases. Amongst other things, release planning involves:

• Gaining agreement on Release Content

• Producing a high level release schedule

• Planning resource requirements

Release planning is responsible for verifying that all of the hardware and software in use is as standard, and has been derived from the necessary Definitive Software Library and Definitive Hardware Store.

In addition the Release Planner develops a Release Quality Plan, to ensure all aspects of the release are quality managed, and produces a back-out plan.

Where a release is going to be particularly complex it may require a specific planning phase. To facilitate this, the Release Plan is extended to Rollout planning. This expands the Release Plan produced thus far, and adds details of the exact installation process developed and the agreed implementation plan.

A Big Bang approach involves all sites receiving all functionality simultaneously. The benefit of this approach is that it offers consistency of use across the organisation. However, achieving a simultaneous upgrade can be problematic.

In a phased approach all sites could receive some functionality at the same time, with more coming later. In a Pilot approach a single site receives all functionality ahead of other sites. Note however that combinations are possible, for example a ‘phased pilot’ approach.

Compliance with software licence agreements has become critical to businesses. Ensuring these obligations are met is the joint responsibility of Release and Configuration Management. For example, when moving software to the DSL, it is important to check that what has been purchased has arrived, that it has been virus checked, and that the licence agreement has been checked.

Remember, penalties for breaching the laws on software theft are applicable to any responsible officer of the company, including those at the highest level.

There are many legal precedents for holders of software intellectual property rights arriving unannounced at premises, and impounding any equipment which they believe contains unlicensed copies of their software.

Benefits & Problems

The benefits of and potential difficulties with Release Management are listed on Page 39 of the Little ITIL Book and in Section 9.4 of the Service Support Manual.
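As a brief aside on the full, delta and package release types described earlier in this lesson, the difference between them can be sketched in terms of which components of a release unit are shipped. The function names and the payroll component names below are hypothetical, chosen only to illustrate the three definitions.

```python
def full_release(release_unit):
    """A full release ships every component of the release unit."""
    return sorted(release_unit)


def delta_release(release_unit, changed):
    """A delta release ships only the components changed since the last release."""
    unknown = set(changed) - set(release_unit)
    if unknown:
        raise ValueError(f"not part of the release unit: {sorted(unknown)}")
    return sorted(set(changed))


def package_release(*releases):
    """A package release groups several full and/or delta releases together."""
    return [list(release) for release in releases]


if __name__ == "__main__":
    # Hypothetical release unit set at suite level.
    payroll_suite = {"payroll_calc", "payslip_print", "tax_tables"}
    full = full_release(payroll_suite)
    delta = delta_release(payroll_suite, ["tax_tables"])
    print(full)
    print(delta)
    print(package_release(full, delta))
```

The sketch makes the cost argument visible: the full release always carries the whole unit, while the delta carries only what changed.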
Summary
Lesson 4a Availability Management
In Service Level Agreements, and in clauses with suppliers through underpinning contracts, Availability is often expressed as a percentage: the percentage of the agreed service hours for which the component or service is available. This is often used as a measure of how good or bad the availability is.

To say that we require 99% availability of the service over a given period is a fairly common way of defining what is needed by the business.

So, customers negotiate the SLA availability clauses with the IT service through service level management processes and then, as we will be seeing in later lessons, service level management processes require underpinning support.

There are broadly two types of underpinning support, one through operational level agreements with internal suppliers, the other through underpinning contracts with external providers.

In the case of internal support, such as application support, hardware support and so on, we’ll expect to find statements in the OLA on availability, reliability and maintainability of the components that this group is responsible for.

When we are talking about underpinning contracts the word ‘serviceability’ is often used as a contractual term, and that is seen as covering availability, reliability and maintainability when applied to components supported by external suppliers.

The word Serviceability, in ITIL, is reserved for use where support is provided by external parties and will incorporate statements about availability, maintainability and reliability of their managed components and services. Again, measuring the way the third party suppliers are achieving availability would be of value to the organisation and should be part of the role of availability management.

So imagine that we have a timeline with time running from left to right.

Now for a particular component, let’s say that a failure occurs at time X1. This will be recorded in ITIL as an Incident.

There will then be a period of time that it takes to repair the faulty component. This is usually referred to as the Mean Time To Recover or MTTR.

Be very careful here as the R in this acronym can have a number of alternate meanings. We have defined it as “Recover”, but it is also commonly taken to mean “Respond”, “Repair” or “Restore”. Imagine, for example, that the failure is a crashed hard disk.

There will be a period of time that it takes to “Respond” to the incident, to get an engineer on site. Then there will be a further period during which the disk is being repaired or, more likely, replaced. Typically, it will then take some time to “Restore” the data to the point where normal business can be resumed.

In this course we will be using the term “Recover” to encompass all of this, and the Mean Time To Recover is the average length of time that all of this takes to achieve.

Be aware though, that it may be useful to understand these other measures as they are often captured by service management organisations to check on various aspects of the availability management process.

Once normal service has been recovered there will then be a, hopefully long, period of time before the component fails again at time X2.

The period of time between the fault being recovered and the next failure is known as the Mean Time Between Failure or MTBF.

Hence it is easy to see that the sum of the MTTR and MTBF will give what is called the Mean Time Between System Incidents or MTBSI.
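The relationship between MTTR, MTBF and MTBSI, and the idea of availability as a percentage of agreed service hours, can be illustrated with a short calculation. This is a sketch under stated assumptions: the function names and the figures in the usage example are ours, and treating availability over one failure cycle as MTBF divided by MTBSI follows from the definitions used in this lesson rather than from a formula quoted in the text.

```python
def mtbsi(mtbf_hours, mttr_hours):
    """Mean Time Between System Incidents: the sum of MTBF and MTTR."""
    return mtbf_hours + mttr_hours


def availability_from_means(mtbf_hours, mttr_hours):
    """Percentage availability over one failure cycle (assumed MTBF / MTBSI)."""
    return 100.0 * mtbf_hours / mtbsi(mtbf_hours, mttr_hours)


def availability_from_hours(agreed_service_hours, downtime_hours):
    """Percentage of the agreed service hours for which the service was available."""
    if agreed_service_hours <= 0:
        raise ValueError("agreed service hours must be positive")
    return 100.0 * (agreed_service_hours - downtime_hours) / agreed_service_hours


if __name__ == "__main__":
    # Illustrative figures only: 495 hours of uptime, then 5 hours to recover.
    print(mtbsi(495, 5))                     # hours between successive incidents
    print(availability_from_means(495, 5))   # percentage availability
    # A month with 720 agreed service hours and 36 hours of downtime.
    print(availability_from_hours(720, 36))
```

Both routes give a percentage that can be compared against a target such as the 99% figure mentioned in this lesson.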
We can now consider the relationships that exist between each of these three parameters and the terms Availability, Reliability and Maintainability that we have already discussed.

It is obvious from the diagram that a high Mean Time Between System Incidents implies high Reliability. If components don’t fail very often then the services which are based on them will be reliable services. So high MTBSI is obviously a good thing.

On the other hand, a low Mean Time To Recover is good news, since this implies a high Maintainability. This can be achieved not only by technical means but by having good support procedures within the IT service management team, so that there are no delays between an incident being detected and repair work starting.

As you might expect, a high Mean Time Between Failure is very desirable and directly equates to a high Availability.

So, typically we can see that if we want to achieve higher availability, then either increasing the Mean Time Between Failure or reducing the Mean Time To Recover, or a combination of the two, can achieve this.

All of these measures, MTBF, MTTR and MTBSI, can be applied at both the component and overall service level.

Typically, if we want to increase the overall availability either of a service or of an assembly of components, then this can be done either by increasing the reliability of each component or the resilience of the assembly, or by improving the maintainability and the procedural aspects.

If an e-mail service is dependent on two servers and each has an MTBF of 5000 hours, what will be the MTBF of the e-mail service?

Increasing the MTBSI and MTBF figures and reducing the MTTR will all cost money. There will be a limit as to how much we can spend to achieve high reliability and high resilience, and there will be a limit to how much we can spend to achieve instantaneous reporting and repair.

As we said at the start of this lesson, the business can have almost whatever availability it wants, provided that it is prepared to pay for it.

All businesses rely on their IT services, but some services, or parts of services, will be more important to the business than others.

For example, in an EPOS service, the critical requirement is that we are able to take payments. Other functions, such as automatic updating of stock levels, are important but not as vital as servicing the immediate customers. Therefore it may be necessary to aim for higher availability of the first part of the service than the second part.

ITIL refers to such business-critical functions as Vital Business Functions or VBFs.

The concept of Vital Business Functions is widely used in IT Service Continuity Management and Availability Management within ITIL, and is a way of highlighting the services for which the business must have almost 100% availability.

Understanding each Vital Business Function allows the Cost of Unavailability of a service to be measured and reported. Such costs may be incurred through revenue loss, or overtime payments and so on, as we discussed earlier.

Cost of Unavailability is a more effective way of reporting than percentage availability because it relates the true cost of the loss of service to the business directly.

It is important to report on trends and to agree on the measurement period. For example, “Service was available for more than 98% of the agreed service hours during the last month” may be very useful when we’re reporting against service levels in Service Level Agreements, which are often expressed in the same way.

Trends are very important in the whole of service management. Service improvement programmes, for example, set out to move things forward, and that relies on having some baseline against which to measure.

So, for example, we might want to say that we’ve moved forward in terms of the number of breaches of Availability Agreements from last year to this, with the number decreasing from 10 to 5, say.

Section 8.7.7 of the Service Delivery Manual uses what it calls an IT Availability Metrics Model (ITAMM) as a framework for deciding on the sort of reporting that needs to be done. Because it covers such a wide range, from details of component availability right through
to services, it is a basis for all reporting both internal and external.

It is beyond the scope of a Foundation course to understand much more about the ITAMM. The fact that it exists and is a basis for important reporting is what we need to know.

Responsibilities of Availability Management

Page 64 of the Little ITIL Book gives a useful listing of the responsibilities of the Availability Management process.

The first of these, concerning the optimisation of availability, is self evident and much of this lesson concerns that particular point.

The second point is about determining availability requirements in business terms.

It is very important that we are able to work with the service level manager and the customer so that their requirements for availability can be expressed in terms with which they feel comfortable.

They are often much more comfortable discussing business lost, or business downtime caused by loss of IT services, than they are with percentages and fractions.

Hence we must be able to gather these requirements in the relevant terms and translate them into meaningful technical terms for discussion with suppliers of underpinning services, both internal and external.

Conversely, if we are producing technical information about availability, MTBFs, MTBSIs and so on, it is our responsibility to help the service level manager to turn these figures back into meaningful business terms for the customer.

The third point, Predicting and Designing for expected levels of availability and security, implies that availability management staff are involved in the systems development process right from the very beginning.

It is an ITIL recommendation that Availability Management staff should be involved when the business case is being created for a new or extended service, and that they remain involved all the way through the analysis and design process.

The aim is to ensure that the needs of availability management, including maintainability and reliability, are built in along with security elements. This implies availability management staff having some familiarity with system development processes.

The Availability Plan should be a long-term plan for the proactive improvement of IT service availability within the imposed cost constraints.

A good plan should have goals, objectives and deliverables and should look at all the issues of people, processes, tools and techniques as well as looking at the technology.

In many ways the Availability Plan is analogous to the Capacity Plan and should take account of current levels of availability against the service level requirements, trends in terms of availability, new technological options and knowledge of the way the business is developing.

There is no absolute guideline on how far ahead the plan should look but, following the capacity management analogy, it would be reasonable to think in terms of one year at a time with a review at least every three months.

The fifth item on the list of responsibilities is all about the collection, analysis and maintenance of availability data. Monitoring the various availability parameters can generate a large amount of data, and because of this it is not unusual to find an Availability Management Database being created. This may be either as a separate entity or by adding extra information to the Configuration Management Database.

Item six is arguably one of the most important areas and defines the role of the availability manager.

This is all about monitoring service availability against the Service Level Agreements, for the benefit of the service level manager.

The performance of internal and external suppliers against the serviceability requirements in any underpinning contracts and the targets defined in the Operational Level Agreements must also be monitored as part of this process.

The final point refers to the need for the Availability Management process to be continually looking for improvements on a proactive basis. In other words, not waiting for targets to be threatened before taking action, but constantly reviewing current status and looking for cost effective ways of improving availability.

As with many other ITIL processes, this proactive work is critical but may be the last part of the process to be implemented.
There is an additional responsibility on the process owner, and that is to monitor the effectiveness and efficiency of the availability management processes.

This can often be done by looking at how many SLAs have been breached because of availability issues and looking at how many components have got measurement in place.

The Availability Management Process

Section 8.3 of the Service Delivery manual describes the Availability Management process in some detail.

The inputs to the process include:

The Availability Requirements of the business, which are critical.

A business impact assessment, so that the Vital Business Functions and the consequences of loss of availability are fully understood. This will help in determining priorities when setting up the Availability Management processes for the first time.

Part of the service level negotiation process will be to determine the availability, reliability and maintainability requirements from the business. Some of these will be for existing services while others will be for services that are in conception.

Incident and Problem data will also need to be examined. Part of the proactive work will be to investigate incidents and problems and to see which of those are caused by unavailable equipment and what the impact of these incidents or problems was on availability measures.

Configuration data will be very important since that will show the relationships between configuration items and the chain of configuration items that makes up a typical service.

This will enable us to look for sensible places where we might decide to replace equipment by higher quality equipment with a higher reliability.

Or, for other areas where we might decide to mitigate against a possible single point of failure, or SPOF in ITIL terms, by looking for alternative routing in a network or perhaps duplicating of discs or processors.

Remembering that one of the jobs of availability management is to ensure we achieve service levels in the area of availability, then we’ll be constantly looking at records of service level achievement or service level breaches or potential breaches.

Now let’s look at the key outputs from the process, which are:

• Availability and Recovery Design criteria for each new or enhanced IT Service. These are intended to help the development teams decide on how to achieve high availability.

• Details of the Availability techniques that will be deployed to provide additional Infrastructure resilience to prevent or minimise the impact of component failure to the IT Service

• Agreed targets of Availability, reliability and maintainability for the IT Infrastructure components that underpin the IT Services.

• Reporting of Availability, reliability and maintainability to reflect the business, User and IT support organisation perspectives

• The monitoring requirements for IT components to ensure that deviations in Availability, reliability and maintainability are detected and reported

• And finally, an Availability Plan for the proactive improvement of the IT Infrastructure.

Security

It can be argued that the most valuable assets of IT services are the data and the ability to process that data.

This is why security is such an important part of IT service management.

The basic logic behind managing these assets is:

• Make sure that access is denied to unauthorised people. In other words, maintain Confidentiality.

• Make sure that the assets are trustworthy. That is, maintain Integrity.

• And, make sure that assets are available to authorised people when they need them. Or, maintain Availability.

This may lead to some conflict and possible trade-offs. For example, high availability is not
necessarily good if it compromises confidentiality or integrity.

Within ITIL, availability aspects are the responsibility of availability management while the confidentiality and integrity issues are shared responsibilities with security management.

Within an organisation, it may well be that the whole responsibility for CIA is devolved to the availability management team. It is very important that such responsibilities are clarified.

Techniques for Availability Management

Contrast this with the value given by the simpler basic calculation, which would be only 90%.

It’s important to note that whichever way of calculating availability is chosen has to be agreed with the users before it can be used as the mechanism that we measure and report on.

Percentage availability may not always be the most useful measure from a business point of view.

Absolute figures of up-time and down-time over an agreed period might be more appropriate and may be more acceptable for the business.
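
As a minimal sketch of the basic calculation, the example below assumes the common definition of availability as agreed service time minus down-time, divided by agreed service time. The function name and the sample figures are illustrative assumptions and are not taken from this guide. It reports the absolute figures alongside the percentage, which shows how two very different outage patterns can produce the same percentage.

```python
def availability_percent(agreed_service_minutes, downtime_minutes):
    """Basic availability: share of the agreed service time that was delivered."""
    if agreed_service_minutes <= 0:
        raise ValueError("agreed service time must be positive")
    uptime = agreed_service_minutes - downtime_minutes
    return 100.0 * uptime / agreed_service_minutes


agreed = 1000            # hypothetical agreed service time, in minutes
many_short = [10] * 10   # ten interruptions of 10 minutes each
one_long = [100]         # a single interruption of 100 minutes

for label, outages in (("ten short outages", many_short),
                       ("one long outage", one_long)):
    pct = availability_percent(agreed, sum(outages))
    print(f"{label}: {pct:.1f}% available, "
          f"{len(outages)} interruption(s), {sum(outages)} minutes down")
```

Both patterns give the same percentage, so a report agreed with the business would probably need the number of interruptions and the minutes lost as well.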
of service each of 10 minutes duration may be more damaging than a single loss of service of 100 minutes for the same period of time.

The reporting requirement to cover such differences will need to be closely examined and agreed with the business.

In reporting and discussing availability with end users and customers, the main areas of interest will nearly always be based around services and not around components.

However, internal reporting for service improvement purposes and for supplier management mechanisms will often require reporting at the component level.

Calculating the Availability of Multiple CIs

There may also be some technical limitations in terms of how easy it is to switch from one component to another when one fails, but the general principle is one of significant improvement to assembly availability achieved in this way.

One difficulty in both cases is finding good values for A1 and A2.

Assuming they are hardware components, these could be derived from a combination of manufacturers’ engineering specifications (NOT from their sales literature), other similar installations and your own experience gained during testing or development.

Using a combination of those three sources will tend to give realistic values for the availability of individual components.
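
To illustrate why the values of A1 and A2 matter, the sketch below uses the standard reliability formulas for combining component availabilities, assuming that components fail independently: in series every component must be working, while in parallel any one of them is enough. The function names and the 90% figures are illustrative assumptions, not values from this guide.

```python
def series_availability(*availabilities):
    """All components must be up: multiply the individual availabilities."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel_availability(*availabilities):
    """Any one component is enough: 1 minus the chance that all are down."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down


a1, a2 = 0.90, 0.90  # hypothetical component availabilities
print(f"series:   {series_availability(a1, a2):.2%}")
print(f"parallel: {parallel_availability(a1, a2):.2%}")
```

With these assumed figures the series assembly is less available than either component on its own, while the parallel assembly is considerably more available, which is the general principle described above.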
So in the example shown, CI3 is a very good candidate for attention, such as replacement with a more reliable item or duplication by the addition of a parallel assembly as a replacement for the single component CI3.

More sophisticated information can be put in the CFIA, such as information that for service ‘B’ to run, either component 3 or component 4 needs to be there but not necessarily both.

This may require some extension to the notation, which is often home grown or company-specific and which is beyond the scope of this course.

Another useful technique is called ‘Fault Tree Analysis’ or FTA.

There are a couple of techniques that can help us here, and they are called System Outage Analysis (SOA) and Technical Observation Posts (T.O.P.).

SOA involves a detailed analysis of service interruptions. It is really a post-mortem about some of the more major incidents that have occurred in the infrastructure, trying to find some common underlying theme or cause for the availability losses.

It requires significant inter-disciplinary work between different teams to make this work and tends to be managed as a small project with a particular budget and reporting period.

Setting up a Technical Observation Post or T.O.P. is an expensive process because it involves bringing together a team of people to look at a service at a vulnerable period of its life.

If, for example, we know that on a monthly basis there are availability problems while assembling data for end-of-month financial work, then a Technical Observation Post might be set up to look at this particular process.

In effect the T.O.P. would be watching the process go wrong in order to more accurately understand what’s happening. This is particularly useful in cases where it proves difficult in test conditions to simulate the fault that is causing the loss of availability.

In this lesson we have been examining the Availability Management process.

Once you have completed this lesson you will be able to define Availability Management and describe how it relates to other ITSM components.

You will be able to recognise the main elements of the Availability lifecycle and understand the terms MTBF, MTTR and MTBSI.

You will appreciate the main responsibilities of the Availability Management process and be able to recognise several techniques which are of use in this area.
Lesson 4b Capacity Management
Once you have completed this lesson you will be able to:

• Define Capacity Management, and its three sub-processes of Business, Service and Resource Capacity Management

• Identify Capacity Management’s ongoing, ad hoc and regular activities

• Describe the contents of the Capacity Database and the Capacity Plan

What is Capacity Management?

In order that Service Level Agreements are met, it is critical that sufficient capacity is available at all times to meet the agreed business requirements.

Capacity Management ensures that IT processing and storage capacity provision matches the evolving demands of the business in a cost effective and timely manner. Of all the ITIL processes this can be regarded as one of the most proactive.

ITIL defines Capacity Management’s goal as:

‘To understand the future business requirements (the required service delivery), the organisation’s operation (the current service delivery), the IT infrastructure (the means of service delivery), and ensure that all current and future capacity and performance aspects of the business requirements are provided cost effectively.’

The Capacity Management process incorporates Performance Management, Capacity Planning, and monitoring and tuning activities. In a large organisation there may be many people working in a Capacity Management team under the leadership of a specialist. In smaller organisations it might be the role of a single individual who is supported by technical specialists from Networking, desktop and so on.

The Capacity Manager role requires excellent technical and business capabilities. The day-to-

The Capacity Management Process can be regarded as something of a balancing act. The organisation must provide enough capacity to meet justified business demands, balanced against the costs that the organisation can afford to pay.

There are two ‘laws’ associated with Capacity Management, which offer an insight into the demands placed on this process. The first is ‘Moore’s Law’, which suggests that processing capacity doubles every 12 to 18 months.

The second is a variation on ‘Parkinson’s Law’, which states that data expands to fit the space available for storage. This highlights a second ‘capacity’ problem, the one of supply and demand. As greater capacity becomes available users will make use of it.

There is continual pressure from the business and customers to increase capacity, but in doing so there are costs incurred to the business. Ultimately, a decision has to be made over whether the cost of capacity provision provides enough business benefit.

However, Capacity Management must justify the cost of any capacity increases. Broadly speaking the objective is to provide the:

• Right Capacity, enough but not too much
• At the right cost
• And critically, at the right time

In theory, if Capacity Management processes are running well, providing the right level of capacity at the right time, then they should be invisible to the business, and to most aspects of Service Level Management.

In any organisation, there can be a huge number of capacity elements to be managed, which could impact on business.

Those shown in the question represent just a few of the IT components, which Capacity Management must address.

Interestingly, people are not usually thought of in capacity terms, except where a shortage of people leads to other capacity problems. For
example, if we don’t have enough service desk staff to fulfill commitments made in Service Level Agreements.

As we mentioned earlier in this lesson, providing capacity to the business at the right time is critical. If capacity upgrades are too late then the infrastructure could fail. Failures might already be occurring, for example, through incidents and complaints reported to the service desk. Or internal monitoring tools might indicate that we are operating close to capacity.

Buying in extra capacity at short notice leaves little negotiating power with external suppliers, and as such, is likely to be very expensive. Conversely, upgrading the infrastructure to increase capacity, only to find it is under-used, could in itself lead to financial problems.

Capacity Management is also involved in the reduction of capacity or, as it is sometimes known, ‘managing shrinkage’. In any organisation the capacity of certain components is being reduced whilst the capacity of others is being increased. An example of this might be where a mainframe-based environment is gradually being replaced by a distributed service. The capacity requirements on the mainframe will be falling while the capacity requirements on the servers will be increasing rapidly.

Capacity Management Structure

Capacity Management consists of three inter-related sub-processes, each working at different levels in the organisational structure.

Finally, Resource Capacity Management concentrates on the underpinning technology resources that ‘enable’ business services. It also ensures that these resources, or Configuration Items, are not over-used. This sub-process is also responsible for monitoring future development and capacity of technical components, and reporting these findings back to the business, so that they can be integrated into future plans.

The Capacity Management process has a number of ongoing, iterative activities. These activities include monitoring, analysis, tuning and implementation, and are carried out in Resource Capacity Management and Service Capacity Management. They are not normally used in Business Capacity Management, except during business reporting. For example, to show, through analysis of data gathered through these activities, that transaction responses are slowing down.

The monitoring activity should include the monitoring of thresholds, and baselines or profiles of the normal operating levels. Thresholds and baselines are set from the analysis of previously recorded data; they are the ‘yardstick’ by which Capacity Management can measure utilisation of IT infrastructure configuration items. All thresholds should be set below the level at which a resource is over-utilised, or below the targets in an SLA. For example, a threshold might specify that the usage on any individual CPU does not exceed 80% for a sustained period of one hour. If these thresholds are exceeded, alarms should be raised and exception reports produced.
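
The threshold example above can be sketched as a simple check over monitoring data. This is a minimal illustration, assuming one CPU utilisation reading per minute; the function name, the sample readings and the report wording are hypothetical and do not describe any particular monitoring tool.

```python
def sustained_breaches(samples, threshold=80.0, min_duration=60):
    """Return (start_index, length) for each run of consecutive samples above
    the threshold that lasts at least min_duration samples."""
    breaches = []
    run_start = None
    for i, value in enumerate(samples):
        if value > threshold:
            if run_start is None:
                run_start = i          # a new run above the threshold begins
        else:
            if run_start is not None and i - run_start >= min_duration:
                breaches.append((run_start, i - run_start))
            run_start = None
    # A run may still be open when the data ends.
    if run_start is not None and len(samples) - run_start >= min_duration:
        breaches.append((run_start, len(samples) - run_start))
    return breaches


# Hypothetical one-minute readings: 30 quiet, 75 busy, 15 quiet, 20 busy.
readings = [40.0] * 30 + [92.0] * 75 + [55.0] * 15 + [95.0] * 20
for start, length in sustained_breaches(readings):
    print(f"exception: CPU above 80% for {length} minutes from minute {start}")
```

Only the 75-minute run is reported; the 20-minute burst is above the threshold but is not sustained for an hour, so it would not trigger an exception report under this rule.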
Tuning can improve service delivery without incurring costs associated with equipment purchase. However, using skilled resources will incur costs, particularly if they are sourced from outside the business.

Tuning at service level can ensure that services don’t clash at times of peak demand. Any excess demand can be controlled by Demand Management, an activity that we will look at later in this lesson, or by sharing capacity, in a multi-server environment, across several servers.

Importantly, tuning should be carried out initially in a test environment. Only when we are confident that the change will be a benefit to the business, should it be implemented through the conventional change management processes.

used in Business Capacity Management’s reporting activity.

Another on-going Capacity Management activity is Demand Management. The main objective of Demand Management is to influence the demand for computing resource and the use of that resource.

This activity can be carried out as a short-term requirement because there is insufficient current Capacity to support the work being run, or, as a deliberate policy of IT management, to limit the required IT capacity in the long-term.

Short-term demand management might be needed if there is a partial failure of a critical resource in the IT Infrastructure. Service provision might have to be modified until a replacement or fix is found.
However the CDB is unlikely to be a single database, and probably exists in several physical locations. We will look at the make up of the CDB later in this lesson.

Ad hoc activities

Modelling is an example of an ad hoc activity, which is used in all Capacity sub-processes. Modelling tries to predict the behaviour of components and services under a given volume of work, particularly at peak times, and tries to understand the way in which current service and resources are used, and the impact of that usage on the IT infrastructure. It attempts to predict the future from our knowledge of the past. In order to do this we establish a ‘baseline’ model.

The baseline model reflects accurately the performance that is being achieved. Once a baseline is created, predictive modelling can be done.

We can ask the ‘what if?’ questions about planned changes to the IT infrastructure. If the baseline model is accurate then the results of the predicted changes should be accurate.

The major modelling types used by Capacity Management are:

• Trend Analysis
• Analytical Modelling
• Discrete Simulation
• Benchmarking

These modelling techniques vary in complexity and consequently cost, with Trend Analysis at the top being the simplest and cheapest, whilst Benchmarking is the most complex and expensive. Let’s look briefly at each of these modelling types.

The Trend Analysis technique looks at various data over a period of time and attempts to draw a smooth curve through these figures, extrapolating the graph data forward into the future, as a way of predicting future trends.

Analytical Modelling uses mathematics to represent computer system behaviour. Typically a model is built using a software package, which can recreate a virtual version of a computer system. When the software is executed, ‘queuing theory’ is used to calculate response times, and if virtual response times are sufficiently close to those recorded in the ‘real life’ IT infrastructure, the model can be regarded as accurate.

Although Analytical Modelling requires less time and effort than other modelling types, typically the end results are less accurate.

Simulation modelling involves the modelling of discrete events, in other words what actually happens millisecond by millisecond, as a transaction passes from local PC through the local area network, to server and so on. This type of modelling can be very accurate in predicting the effect of changes, but it is time consuming, and therefore costly, as it can involve numbers of staff in producing physical event simulations.

However, Simulation Modelling can be cost justified in organisations with very large systems, where the cost and associated business implications are critical.

Finally, Benchmarking involves physically building a replica of part of the IT infrastructure and measuring such things as its response to a reduced workload, and extrapolating these results, to see how it would perform under the ‘real’ workload. Because Benchmarking involves the purchase of equipment, building software and simulating significant workloads, this is the most expensive modelling option; however, it does give the most accurate predictive figures.

Another ad hoc Capacity Management activity is Application Sizing. The primary objective of Application Sizing is to estimate the resource requirements to support a modified or new application, and to ensure that it meets its required service levels.

Application Sizing has a finite lifespan. It is initiated at the beginning of a new application, or when there is likely to be a major change to an existing one. Application Sizing is complete when the completed application is accepted into the operational environment.

This activity is performed together with colleagues in system and service development, to ensure that we are fully aware of the likely impact of services being developed, designed or purchased, before they are implemented.

This provides Capacity Management with important data on future resource requirements, and this can be integrated into the Capacity Plan, as well as providing valuable information for purchasing and the development team. How we make programming, database design and architecture design more resource efficient is also covered in the ‘Best Practice’ guidance.
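
As a minimal sketch of the two cheapest modelling types described in this lesson, the example below fits a straight line to past figures (the simplest form of Trend Analysis, where the text speaks more generally of a smooth curve) and uses the single-server M/M/1 queue as one simple instance of the ‘queuing theory’ behind Analytical Modelling. The choice of a straight line and of the M/M/1 model, the function names and all the figures are assumptions made for illustration only.

```python
def linear_trend(values):
    """Least-squares straight line through equally spaced observations.
    Returns (slope, intercept) with x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mm1_response_time(service_time, utilisation):
    """Mean response time of a single-server (M/M/1) queue: S / (1 - U)."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in [0, 1)")
    return service_time / (1.0 - utilisation)


# Hypothetical monthly disk usage in GB, extrapolated to period index 12.
usage = [100, 110, 120, 130, 140, 150]
slope, intercept = linear_trend(usage)
projected = intercept + slope * 12
print(f"trend: {slope:.1f} GB per month, projected at index 12: {projected:.0f} GB")

# Hypothetical 0.2 second service time at rising utilisation.
for u in (0.5, 0.8, 0.9):
    print(f"utilisation {u:.0%}: response {mm1_response_time(0.2, u):.2f} s")
```

The queuing figures show why utilisation thresholds are set well below 100%: in this model, response time rises sharply as utilisation approaches saturation.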
Finally, a ‘regular’ Capacity Management
activity is the production of a Capacity Plan,
which is typically created annually. Information
gained from the activities of monitoring, demand management, modelling and application sizing will contribute to the production of a Capacity Plan. We will be looking at the Capacity Plan in more detail later in this lesson.

Inputs and Outputs of the Capacity Management Process

To fully appreciate the scope of Capacity Management, we will spend the next few minutes looking at the major inputs and outputs to the process, and how these relate to the sub-processes of Business, Service and Resource Capacity Management.

Inputs to the BCM sub-process include the external suppliers of new technology, existing service levels and current SLAs, along with proposed future services and related SLRs. Other important inputs to BCM include the Business Plans, and any strategic plans together with IS and ICT plans. Finally, BCM requires the Capacity Plan as an input, if one exists.

The important inputs to the Service Capacity Management sub-process are the service levels and SLAs; current information from monitoring tools related to systems, networks and services; the service review results, including any issues raised; and incidents and problems related to capacity, and any SLA breaches.

RCM’s key inputs include incidents or problems related to a particular component, and monitoring information related to component utilisation. It is considered important to keep utilisation below certain industry standard levels for a component type.

Financial Plans and Budgets are a major input to all 3 sub-processes.

Outputs from the sub-processes include a Capacity Database, and baselines and thresholds information, which we looked at earlier in this lesson. Capacity reports will be produced by all three sub-processes, including trend, ad hoc and exception reports.

Other outputs include recommendations for SLAs and SLRs, as Capacity Management activity will turn initial SLRs into achievable and cost effective service level quality clauses. Charging and costing recommendations are also produced.

SCM and RCM will be suggesting ‘proactive changes’ and ‘Service Improvements’, to improve levels of capacity, or reduce costs, preferably both!

Carrying out ‘Effectiveness Reviews’ and creating ‘Audit Reports’ form a basis for checking that business benefits are being achieved, and the process users are following the ‘rules’.

Contents of the Capacity Management Database and the Capacity Plan

Although the Capacity Management Database is represented in the ITIL guidance as a single entity, it is unlikely to exist in this form in many organisations. The main reason for this is that much of the data held in a CDB is common to that in a fully integrated Configuration Management Database; therefore, there is an argument for making the CDB part of a ‘super’ integrated CMDB.

Software tools used by Capacity Management may have partial CDB functionality designed into them. If this information is accessible by other software, then a ‘virtual’ CDB can easily be created.

Remember the data contributors to the CDB are the key to its success. Input from the business includes the ‘business strategy’ and the business plan.

Service Management will provide information about SLAs and a full definition of the quality processes in place.

Data about manufacturers’ specifications for existing and new technology will be provided by the technical teams.

And finally, the IT Financial Management team will provide fiscal data. Additional financial information will be provided from the CMDB, in its role as a ‘super’ asset register.

The Capacity Plan

The Capacity Plan is a major output of the Capacity Management process. It has a standard structure and includes:

• Assumptions about levels of growth
• A Management Summary
• Business Scenarios
• A Summary of Existing Services, problems or issues with current services and current levels of utilisation
• A Resource Summary, which will show what has happened to particular components over the last year and since the last Capacity Plan
Lesson 5a Service Level Management
• Define Service Level Management according to ITIL best practice.

• Identify the core Service Level Management sub-processes and activities

ITIL defines its goal as:

“To maintain and gradually improve business aligned IT service quality, through a constant cycle of agreeing, monitoring, reporting and reviewing IT service achievements and through instigating actions to eradicate unacceptable levels of service.”

Service Level Management exists to ensure that service targets, such as availability of services, response times and so on, are agreed and documented in a way that the business understands.

It is also there to ensure service achievements are monitored and reviewed on a regular basis.

Service Level Agreements, which are managed through the Service Level Management Process, provide specific targets against which the performance of the IT provider can be judged.

The Service Level Management Process is responsible for ensuring Service Level Agreements and underlying Operational Level Agreements or underpinning contracts are met.

Such programmes are aimed at achieving cost-effective improvements to the services offered by the IT service provider, in a rapidly changing technical environment, without necessarily being driven by customer demand.

There are a number of ways that IT services can be provided, each with its merits and drawbacks.

In the simplest scenario there is just the external provider of the IT service and the customer organisation. Services will be provided on the basis of a contract between these two parties.

Whilst this has the benefit of simplicity, it is a risky strategy and one that generally leads to poor support for the users and poor value for money for the corporate customer.

The next approach is often said to involve an “intelligent customer” role. That is, somebody who negotiates on behalf of the customer with suppliers for service delivery. That customer has a Service Level Agreement with the Service Level Management process, and the service is underpinned by an ‘Underpinning Contract’ with the suppliers.

In this situation, the internal IT department adds little or no value. Such arrangements are common where an ‘off-the-shelf’ package solution is being provided by the supplier.
Customer level documents might be authorised by Department Heads, Finance, Planning, HR and so on. Individual Service Level Agreements would be authorised at the next management level down in each of these departments.

The general principle is that SLAs are authorised by paying customers on behalf of users in their part of the organisation.

So what exactly is an SLA?

Well, in structure SLAs are rather like contracts, but they are not in themselves legal documents. However, they can be included in a legal contract, particularly when establishing SLAs directly with external suppliers. In such cases an SLA would be included in the contract as a schedule.

An SLA which is used internally between departments has no legal weight; it’s simply a document that has a contractual structure to it.

For example, an internal software development team might have in place an OLA between themselves and Service Level Management. This OLA offers, amongst other things, a guaranteed response time to serious problems of no more than 2 hours.

In order to guarantee these service levels, the software development team might have an underpinning contract in place with their development software vendor, ensuring that problems can be resolved well within this 2 hour time frame.

A word of warning here: it’s critical that any commitments made in an OLA are directly supported by the underpinning contract. For example, committing to a 4 hour fix time in an OLA would be useless if our underpinning contract only commits our supplier to a 6 hour fix time!
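
The warning about unsupported OLA commitments can be sketched as a simple consistency check. This is a minimal illustration only; the `Commitment` class, the function name and the agreement names are hypothetical, and the 4 hour and 6 hour figures are the ones used in the example above.

```python
from dataclasses import dataclass


@dataclass
class Commitment:
    """A fix-time commitment, in hours, made in an OLA or an underpinning contract."""
    name: str
    fix_time_hours: float


def unsupported_olas(pairs):
    """Return the (OLA, contract) pairs where the supplier's contracted fix
    time is longer than the fix time promised in the OLA."""
    return [(ola, uc) for ola, uc in pairs
            if uc.fix_time_hours > ola.fix_time_hours]


# Hypothetical agreements; the first pair mirrors the 4 hour / 6 hour example.
pairs = [
    (Commitment("Development OLA", 4), Commitment("Vendor contract", 6)),
    (Commitment("Network OLA", 8), Commitment("Carrier contract", 4)),
]
for ola, uc in unsupported_olas(pairs):
    print(f"{ola.name}: promises {ola.fix_time_hours}h "
          f"but {uc.name} only commits to {uc.fix_time_hours}h")
```

A check of this kind could be run whenever an OLA or underpinning contract is changed, so that a mismatch is found before it is exposed by a real incident.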
In the last few pages we have been looking at those agreements and contracts, which form an important part of Service Level Management. But how do we establish which services are available for inclusion in these agreements and contracts, and which ones our customer or users would like?

Well, there are two other important documents in Service Level Management which can help us with this decision, and these are the ‘Service Catalogue’ and ‘Service Level Requirements’ or SLRs.

A Service Catalogue contains a list of all services used by each customer group. A Service Catalogue could be used internally by the service provider; for example, the Service Desk might use it to help them identify those customers entitled to a higher level of service. It can also be used externally as a marketing tool, providing a shop window, showing all the services on offer to the business. Commonly, organisations now make this available on their intranet as a form of advertising, and generating ‘buy in’ to the services.

Service Catalogues exist in a number of forms. They are often created as an internal document, listing existing services when Service Level Management is initially established. At a later stage, it might be published to potential customers, and the wider business as a whole, in a more ‘glossy’ format.

In order to establish their exact requirements, the customer develops a Service Level Requirement document. When doing so, the customer should be realistic about potential levels of service, and related costs. Remember this is not a wish list, and sensible advice should be offered from the Service Level Management team. There is no specific format for SLRs, and each organisation will document them in its own way.

It’s important to remember that these documents, along with SLAs, OLAs and UPCs, are all subject to the ITIL Change Management Process.

In the next few pages we will look in some detail at the Service Level Management sub-processes. These sub-processes can be grouped into 4 stages:

• Initial Generic
• Initial Per Service
• On-going Per Service
• On-going Generic

So let’s look at these 4 stages individually, and see how they fit together to form a complete Service Level Management process.

The first stage is Initial Generic. The first activity at this stage, assuming that a Service Level Management team is in place, is to build the initial Service Catalogue. As we mentioned on the previous page, this activity documents all currently available services, and which customers or users are using them. It also records whether they are formally documented in any SLAs, and whether it’s a service which needs to continue.

It isn’t possible to document every possible SLA clause in the catalogue; it’s more important to understand the scope of the catalogue, and the services within it, and also any major problems with services, and any suggested changes to them.

The second related sub-process is planning the SLA structure and establishing which SLAs we need to create. This activity involves prioritising the modification of pre-existing SLAs, in order to rework them into standard formats. Ask yourself: are there any new services being developed or purchased from a software provider that might provide a better starting point?

Assuming we’ve built the Service Catalogue, agreed the SLA structure, and prioritised the work, we can move onto the second stage of ‘Initial per-service’, and its related sub-processes where we address customer specific issues.

The first point is to establish Service Level Requirements or SLRs. Find out what users would really like from that service, and what customers are prepared to pay. We should try to establish SLRs by checking requirement documents that exist for new services in development.

It’s not uncommon for organisations to arrange training programmes for senior customers, to help them understand what SLRs are, how they should be specified and what is a realistic request in service level terms.

The second sub-process uses those SLRs to review the underpinning contracts and OLAs already in place with internal and external service providers. This might involve discussions about upgrading current statements on service level and provision.

Once we are happy with both our OLAs and UPCs we can create a draft SLA. The intention is to put actual metrics against various service
quality clauses, including fix times for problems, transaction response times and so on.
These statements should be supported by ITSM colleagues, such as Service Desk, Capacity,
Availability and Problem Management, amongst others.

When the draft SLA is available, agreement should be sought from customers and users that
it represents an adequate specification of service. This is a process of negotiation, and
might involve talking to external and internal suppliers about the cost of improving
service quality parameters to the customer. It might require several iterations of the
process before agreement can be reached. Usually, the cost of providing certain levels of
service becomes apparent to customers fairly quickly, resulting in more realistic
negotiations.

Once the agreement is formally signed, the SLA must be implemented. This involves
informing all parties constrained by the SLA that it is in place: for example, service
desk staff, third party suppliers, users and so on.

The third stage in the SLM process includes the on-going per service activities of
monitoring, reporting, and review and modification.

Monitoring involves using the technical tools available to those working in Service
Management to monitor the users’ important SLA clauses, such as response times for
enquiries at the Service Desk. SLM isn’t responsible for the technical implementation of
monitors; however, SLM takes responsibility for ensuring that the necessary monitors are
in place. Monitors can provide useful reporting information to IT and the business, and
we will be looking at reporting in more detail later in the lesson.

Review and modification takes place via service review meetings. These meetings are held
at regular intervals; weekly isn’t uncommon, but monthly is most likely. The objective of
these meetings is to produce short reports on the way the SLA is working, debate any
problems or issues, and discuss any changes to the SLA which might be needed. These
reports should be written in simple business language, and state whether we have met the
SLA or not, with descriptions of where we failed and explanations of how we are going to
prevent the failure occurring again. Remember, however, that any suggested changes to
SLAs should be authorised by the Change Management Process.

The fourth and final Service Level Management process stage is defined as ongoing
generic. It involves sub-processes which relate to SLAs and Service Level Management as a
whole. These processes include maintaining the Service Catalogue and updating it with new
services. Some organisations have automated document links from the Service Catalogue to
individual SLAs, so when an SLA is changed, that change is reflected in the catalogue.
Remember the Service Catalogue falls under the Change Management control process.

A further activity is to review the Service Level Management process itself. By
establishing Critical Success Factors (CSFs) we can measure performance. We can also set
KPIs, or Key Performance Indicators, for what is considered a successful service.

The final activity is to consider a Service Improvement Programme or SIP. Service Level
Management should look at all provided services and their associated quality requirements
to see how we can improve service levels without significant increases in cost to the
business. This proactive SLM activity involves talking with colleagues in Availability
and Capacity Management, and IT Infrastructure Management, to identify ways of improving
response times, and improving availability to the business. This activity uses SLA
contents as a trigger for service improvement.

Reporting on Service Level Achievements

We briefly mentioned the activity of reporting earlier in this lesson. Reporting can be
subdivided into either external or internal reporting.

Internal reporting involves monitoring service quality in SLAs and related OLAs and UPCs.
This detailed monitoring of service quality is normally set up by the Capacity and
Availability Management processes. They will be interested in all activity which affects
all service clauses, including breaks in service, time to repair, response time to users
and so on.

Monitoring OLAs and UPCs will help us understand why SLA breaches are occurring, and also
to identify future trends, and possible future SLA breaches. Remember you can’t control
things that you can’t monitor.

External reporting should be written in a simple and clear way. An exception report is a
typical example of external reporting, and it should simply point out when, where and why
SLA breaches or near breaches occurred. It should also explain how we intend to prevent
things from getting worse.
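As a rough illustration of the exception reporting described in this lesson, the sketch below compares monitored values against SLA targets and lists breaches and near breaches. The clause names, the targets, the 10% near-breach margin and the function name are all assumptions made for the example; none of them come from the ITIL guidance.

```python
from dataclasses import dataclass


@dataclass
class Clause:
    name: str        # hypothetical SLA clause, e.g. a response time
    target: float    # agreed maximum value (lower is better)
    measured: float  # value reported by the monitors


def exception_report(clauses, near_margin=0.10):
    """Return (breaches, near_breaches) as lists of clause names.

    A clause is breached when measured > target. It counts as a near
    breach when it is within near_margin (an assumed 10%) of the target.
    """
    breaches, near = [], []
    for c in clauses:
        if c.measured > c.target:
            breaches.append(c.name)
        elif c.measured >= c.target * (1 - near_margin):
            near.append(c.name)
    return breaches, near


clauses = [
    Clause("Service Desk response (minutes)", target=10, measured=12),
    Clause("Transaction response (seconds)", target=2.0, measured=1.9),
    Clause("Fix time (hours)", target=8, measured=4),
]
breaches, near = exception_report(clauses)
print("Breaches:", breaches)
print("Near breaches:", near)
```

A real exception report would also say when, where and why each breach occurred, and what is being done to prevent it getting worse; the sketch only covers the detection step.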
so that they are able to answer any questions about these incidents.

Review meetings can lead to suggestions for change. Remember, however, that they are not
the place where changes are authorised.

Benefits & Problems

The benefits of and potential difficulties with Service Level Management are listed on
Page 45 of the little ITIL book and in Section 4.2.1 of the Service Delivery Manual.
Lesson 5b Financial Management for IT Services
When you have completed this lesson you will have a broad appreciation of Financial
Management in an IT services context.

• You will be able to explain the main reasons why financial management is necessary and
you will be able to recognise the three main elements that define the scope of financial
management.

• You will be able to identify 6 types of cost that are commonly encountered and then
classify these into one of six accounting cost categories.

• Finally, you will be able to describe seven different charging policies that can be
applied to IT services.

For example, ITSM decision making will include evaluating suggested changes and
formulating business cases. This work might include calculating return on investment and
is done by the IT Services Financial Management team on behalf of the IT service
management group.

Financial forecasting is also a critical element of the decision making process and can
help avoid cost over-runs, or resource shortages.

Containing Costs

This includes those costs incurred internally and externally, through any supply
contracts we may have. It is very important that we know about ALL our costs and are able
to manage them.
Finally, the Recovery of Costs from the users of the resources is also an important
element in the Value for Money equation. Any decision to recover IT costs, either totally
or partially, is a high level business decision usually taken at Board level and is not
considered mandatory within ITIL.

The Scope of Financial Management for IT Services

IT financial management for services is normally considered as having three main areas,
each of which has a number of sub-processes.

Those areas are Budgeting, Accounting and Charging.

Budgeting is concerned with:

• Predicting the money needed to deliver the IT service;

• Seeking to secure that money from the business; and

• Monitoring and controlling IT spend against that budget over the given period.

Accounting, on the other hand, is the set of processes that allows the IT service
provider to demonstrate where, within IT, the money from that budget has gone.

Together, the budgeting and accounting processes identify all the costs incurred by IT
service management, and enable us to understand where that money is going, in terms of
business support.

If we decide to use charging for IT services, then what we are attempting to do is
recover money from the customers of the services.

These charges must be demonstrated to be equitable, between IT and the business.

As well as being equitable, charges must also bear some relationship to the costs. How
close that relationship is, is usually a matter of debate in organisations.

The closer we want the charges to relate to the cost to the IT organisation of service
provision, the more complex the charging process will be and the greater the overhead in
gathering the necessary data.

Once they’ve been agreed between the customers and the service level management team,
charges must be documented in the SLA for each service that is charged for.

There is a hierarchical dependency between Budgeting, Accounting and Charging which will
often develop as an organisation’s financial policies become more mature and increase in
both scope and complexity.

The starting point might be to introduce budgeting on a one year ahead basis, for
example. This would tell us how much IT is costing, but shed no light on how that figure
is arrived at. Also, at this stage, we have done nothing to recoup the costs from the
business.

Knowing in detail where the money and the resources it buys are being used only becomes
possible once we introduce the accounting processes.

The ability to recoup this money only becomes possible once we move into charging
processes.

The ITIL guidance is that we should implement, at the very least, budgeting and
accounting. Charging, as we have said, is optional as far as ITIL is concerned.

We should do this, almost certainly, before we attempt charging. It is theoretically
possible to charge without understanding who is using what resource, but this is unlikely
to be acceptable to the business.

Accounting without budgetary control makes little sense and in general charging without
accounting is not a good option.

Once accounting is in place, we have a vehicle for performing cost benefit analysis and
return on investment calculations.

IT financial management will expect such calculations to be done whenever there are
proposals for significant changes, or for the creation of new services.

The exact models for this will be dependent on standards for accounting within the
organisation.

Types of Cost

It is very useful when creating a budget to understand all of the resources that we have
by breaking them down into various cost types.

The suggested high level cost types that ITIL recommends are:

Hardware, such as computers, networking equipment, data storage devices and so on.

Software, which would include operating system software and applications.
People, in other words, salaries, taxes, expenses, benefits and other costs of
employment.

Accommodation, for example, offices, machine rooms, utilities, storage space and so on.

External Service, covering items which might be outsourced, such as development work,
ISPs, disaster recovery facilities and the like.

And finally Transfer, which is used to account for the cross-charges that can take place
between different parts of the business. For example, if it was necessary for an Excel
expert in Finance to give two days’ training to someone in Human Resources, because IT
lacked the resource to do this, then IT would expect a cross-charge for that person’s
time to come from the Finance Department.

A useful aide-memoire for these cost types is the acronym HAS PET – as you can see.

Cost Classification

Once the cost elements have been identified and their types understood, they will need to
be classified for accounting and financial purposes.

As a minimum, ITIL recommends that costs need to be classified as either Capital or
Operational costs.

Capital expenditure is assumed to increase the total value of the company, while
Operational expenditure does not.

So capital costs relate to outright purchases of fixed assets and may apply to
accommodation, computers, and workstations, for example.

Operational costs, on the other hand, can be thought of as day-to-day running costs. Once
money is spent on these it is no longer available to the company as an asset. Operational
costs include salaries, rental of equipment or buildings, and licences for software.

It is sometimes the case that organisations make capital purchases but want to represent
in their accounts the fact that this capital loses value over time.

So if £10,000, say, is spent on an item of equipment which is expected to last for three
years, the assets of the company will be immediately increased by £10,000.

But in each of the next three years £3,333 will be taken out of the operational
expenditure and the assets will be decreased by that amount, until at the end of three
years the asset value stands at zero and the full £10,000 has been recovered from
operational costs.

This is the process that accountants call depreciation.

Conversely, some companies try to roll up some operational costs and classify those as
capital, so that they too can be written off over a number of years. A good example of
this is software development.

A company may decide that if it has spent £100,000 on salaries to develop a software
application then, once it is completed, that application becomes an asset of the company
with a value of £100,000 – and then depreciate that asset over the years of its life.

Capitalisation and depreciation policies are very much a concern for the central
accounting functions and are in many respects governed by laws to prevent fraud and tax
evasion. ITIL suggests that we take advice from the main accounting section on the use of
depreciation.

Direct and Indirect Costs

Costs can further be classified into direct and indirect costs.

Direct costs are costs that are directly attributable to a customer or a group of
customers. For example, if we are asked to buy a package and a server for the use of
Human Resources only, then we could regard these package and server costs as being a
direct cost that can be ‘charged back’ to the HR function.

Indirect costs cannot be allocated simply to one customer or group. They are costs that
are shared amongst groups.

There are commonly two types of indirect cost. The first is absorbed costs, where the
costs can be apportioned across a number of different groups based on their respective
usage of the resource concerned.

The second is unabsorbed costs, where it is too difficult to determine who is using how
much of the resource and so the cost is allocated as a simple percentage uplift to all
costs – in other words an overhead.

An example of this might be the cost of the service desk where, rather than attempt to
work out which group was behind every call and how much time that took, we take the cost
of the service desk in total and distribute it across all of the customer groups, based
on their usage of other resources.
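The £10,000 depreciation example above can be worked through in a few lines. The sketch below assumes straight-line depreciation (an equal charge each year), which is what the example implies; the function name is made up for illustration, and a real policy should come from the main accounting section.

```python
def straight_line_schedule(cost, years):
    """Yield (year, charge, remaining_value) for straight-line depreciation.

    Works in whole pounds. The final year absorbs any rounding remainder
    so that the asset value ends at exactly zero.
    """
    annual = cost // years
    remaining = cost
    for year in range(1, years + 1):
        charge = remaining if year == years else annual
        remaining -= charge
        yield year, charge, remaining


schedule = list(straight_line_schedule(10_000, 3))
for year, charge, remaining in schedule:
    print(f"Year {year}: charge £{charge:,}, asset value £{remaining:,}")
```

Because £10,000 does not divide evenly by three, the sketch charges £3,333 in the first two years and £3,334 in the last, so the total written off matches the purchase price.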
Finally, there are fixed and variable costs. Fixed costs remain constant regardless of
usage, whereas variable costs increase in proportion to the usage made of a resource.

An example of a fixed cost might be a leased communication line – the price of which does
not change regardless of how much or how little it is used. On the other hand an ISDN
line might be an example of a variable cost, because it may be charged for on the basis
of the amount of traffic that uses it.

The concept of fixed and variable can also be applied to charging. But there are
potential pitfalls here.

If a service that is charged for on a fixed price basis is based on cost elements that
are variable, then if the workload increases dramatically the cost of providing the
service may end up being greater than the money being recouped.

The converse of this is also true – if charges are variable, but costs are fixed,
difficulties can arise if the volumes end up being less than predicted.

Section 5.3 of the Service Delivery manual – or Page 50 of the little ITIL book –
contains a useful illustration of how the different cost types and categories that we
have discussed can combine to build a cost model for arriving at the total cost figure
for a given customer.

It is worth spending some time in studying this cost model.

The decision on whether to implement charging, and if so on what basis, is not normally a
decision for IT financial management – those kinds of high level business decisions are
almost always made at very senior management levels within the business.

There are a number of general charging policies which are usually considered.

It is quite valid for the organisation to decide that they are not going to charge for IT
services.

One of the reasons for deciding on this policy might be that there are costs involved in
charging. There will need to be a mechanism for setting the charges, for sending out
bills and invoices and for resolving disputes. All of that requires gathering and
processing of data, and a mix of financial and IT skills in order to be effective.

What is important is that costs are understood and that there is budgetary control.
People are then aware of how much their business is spending on IT services, but they are
not charged.

The problem with a no-charging policy is that it does not provide a means of managing
customer expectations or manipulating demand.

If it is decided to charge for services, then “Cost Recovery”, attempting to get back
from the other business units just the cost of providing IT to them, is known as the
‘zero-balance’ policy.

Alternatively, a “Cost Plus” policy is where IT expects to recoup more than it spends,
perhaps as a mechanism for dealing with potential variation in demand over a number of
years, or possibly as a basis for funding investment in new infrastructure components,
which will be a benefit to the business as a whole.

It is also possible to subsidise the service and to go for a ‘Cost Minus’ policy. Here,
we are not attempting to recoup all of the costs from the individual business units but
do want to achieve some element of cost consciousness.

The degree of ‘subsidy’ from the business as a whole will be a high level management
decision.

‘Market rate’ charging uses an external cost comparison, where we see what external
providers would charge the business for the sort of services we’re offering and use that
figure as our charge. This is often a useful policy when outsourcing is being considered.

Some organisations allow their IT departments to sell their services externally to the
company, in other words they become a profit centre in their own right. This will tend to
militate in favour of market-rate pricing, and the business will need to decide how the
extra money generated will be used.

Finally we might decide on a negotiated “Fixed Price” policy, where the actual price we
charge
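To make the difference between the charging policies described above concrete, here is a small sketch that turns a service cost into a charge under each one. The function name, the policy labels used as keys, and the idea of expressing mark-up and subsidy as a single `rate` fraction are assumptions for illustration only; the actual figures are always a business decision.

```python
def charge_for(policy, cost, rate=0.0, market_price=None, agreed_price=None):
    """Return the amount charged to a business unit under a charging policy.

    cost         -- what the service costs IT to provide
    rate         -- assumed mark-up (Cost Plus) or subsidy (Cost Minus) fraction
    market_price -- what an external provider would charge (Market Rate)
    agreed_price -- price negotiated in advance (Fixed Price)
    """
    if policy == "no charging":
        return 0.0
    if policy == "cost recovery":  # the 'zero-balance' policy
        return cost
    if policy == "cost plus":
        return cost * (1 + rate)
    if policy == "cost minus":
        return cost * (1 - rate)
    if policy == "market rate":
        if market_price is None:
            raise ValueError("market rate charging needs market_price")
        return market_price
    if policy == "fixed price":
        if agreed_price is None:
            raise ValueError("fixed price charging needs agreed_price")
        return agreed_price
    raise ValueError(f"unknown policy: {policy}")


cost = 1000.0  # hypothetical annual cost of one service
examples = [
    ("no charging", {}),
    ("cost recovery", {}),
    ("cost plus", {"rate": 0.25}),
    ("cost minus", {"rate": 0.25}),
    ("market rate", {"market_price": 1400.0}),
    ("fixed price", {"agreed_price": 1100.0}),
]
for policy, kwargs in examples:
    print(f"{policy:>13}: £{charge_for(policy, cost, **kwargs):,.2f}")
```

Note that under Fixed Price the charge ignores the actual cost entirely, which is exactly the pitfall described earlier when the underlying costs are variable.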
Lesson 6a Continuity Management
Your answer might be: ‘Well, the business is insured, so the insurance company will sort
everything out.’

But what happens at start of business tomorrow morning? Firstly, and pretty obviously,
day-to-day business operations are going to stop: no office means no staff accommodation,
and no staff means no ongoing business activity. As a result you can’t service customer
accounts, take or despatch orders, collect payment and so on. It’s likely that you will
lose existing and new customers, sales and revenue. Ultimately the business could fail.

This all seems pretty unlikely, but if we consider some other scenarios, such as a
computer virus infecting the servers via email, or a disgruntled ex-employee deleting
critical data, these potential threats seem more likely, particularly when you consider
that statistics suggest that 80% of businesses that suffer a ‘disaster’ go out of
business within six months of it happening.

So how does a business prepare for such eventualities? Well, one very good way is to
implement Business Continuity Management or BCM.

IT Service Continuity Management or ITSCM focuses on the IT services that support the
business, and it’s this process which the ITIL guidance concentrates on. Remember,
however, that there is no point in making huge efforts to maintain IT services under
disaster conditions if the business has no Business Continuity Management process in
place.

So, if staff don’t know where they should go after a disaster, or the alternative office
location hasn’t any chairs or desks, then there’s little point in having an ITSCM process
in place. Put simply, it’s important that IT service management staff point out the
critical need for the ‘business’ to have a Business Continuity Plan.

Vital Business Functions – VBFs

Business Continuity Management and, by association, ITSCM are primarily concerned with
Vital Business Functions or VBFs.

VBFs are the critical parts or components of a service, and as such must be ‘reinstated’
as quickly as possible.

For example, your bank has a network of ATMs, which dispense cash and offer a selection
of other services, including printing or
displaying a balance. The bank might consider that the only VBF performed by the ATM is
the dispensing of cash, and not the other services.

The role of ITSCM is to identify the IT VBFs and services, and agree with the business
how quickly those VBFs and services need to be recovered.

Sometimes a service which is reinstated quickly might have components missing, or the
throughput performance of the network might be reduced. It’s important that agreement is
sought from the business that a ‘reduced service’ is better than no service at all.

Not all aspects of the IT services will require contingency plans in the event of a
disaster. The business may be prepared to live without certain aspects of the IT
infrastructure in the short term. So the focus of ITSCM is directed at the Vital Business
Functions, and a relevant amount of the available budget is assigned to each. The amount
of money assigned to a VBF is proportional to its business importance.

ITSCM has to have strong linkages with other ITIL disciplines, in particular Availability
Management and Service Level Management. For example, statements in SLAs should define
what service levels are likely to be available under ‘disaster’ as well as ‘normal’
operations.

Other linkages include:

Continuity lifecycle and its four stages, which are:

• Initiation
• Requirements and Strategy
• Implementation
• Operational Management

The initiation stage

The activities to be considered during the initiation process will depend on the level of
contingency facilities already in place within the organisation. Some parts of the
business may have already established individual continuity plans based on manual
workarounds, and IT may have developed contingency plans for systems perceived to be
critical. This can provide a worthwhile starting point for the process. However,
effective ITSCM is dependent on supporting vital business functions, and ensuring that
the available budget is applied in the most appropriate way.

The initiation process covers the whole of the organisation and consists of the following
activities:

• Policy Setting
• Specifying terms of reference and scope
• The allocation of resources
• Defining the project organisation and control structure
• And finally, agreeing the project and quality plans
An example of this might be a smoking ban, and the introduction of an automated sprinkler
system. Implementing counter measures can be very costly, so a business case might be
required to justify the level of investment.

Also during the implementation phase, contracts will be signed with third party standby
facility providers, if they are required.

The final stage, ‘Operational Management’, is responsible for educating all users and IT
about the service continuity processes, and specifically what will happen in the event of
a disaster.

Also remember that people will need to be trained in their ‘disaster recovery plan’
roles. For example, somebody will have to liaise with the press in a public relations
role, and training might be needed for this.

Risk Assessment and Counter Measures

There are a number of other approaches to assessing risk. Perhaps the simplest looks at
the probability of something occurring, and the impact if it did. This approach can be
represented in a matrix format as shown here.

The highest risk status is one with both a high probability and a high impact.
Conversely, a low impact and a low probability would mean a ‘low risk’ category. A
business response to this low level risk might be to ‘just deal with it if it happens.’

We mentioned the CCTA’s Risk Analysis and Management Method or CRAMM earlier. This
involves the identification of risks, any associated threats, vulnerabilities and
impacts, together with the subsequent implementation of cost justifiable counter
measures.

CRAMM is a very useful method for looking at threats that might affect the availability
of service, as it focuses on asset values. Assets could be hardware, software, people,
buildings, telecommunications and so on. It then examines the various threats that could
exist, and how vulnerable the assets are to these threats. The results can provide a
‘risk rating’ which is very useful to the business.

For example, we are generally aware there is a threat of flood. We might then find that
our mainframe computer systems are vulnerable to this threat, because they are housed in
a site which is below the water level of a tidal river. The asset would be significant in
terms of the computer equipment and the services based on it, and therefore this would
give us a ‘high risk’ rating. The business then has to take measures to deal with that
risk.

In order to do this type of risk analysis, it is useful to have a Service Catalogue
available. You’ll remember that the service catalogue featured in the lesson on Service
Level Management, and it contains a list of services available to customers or users. We
can use information from this document to help assess the risk levels on different IT
services.

Risk Analysis can also be applied at component level, by looking at Configuration Items,
and judging what risks they are subject to.

This analysis could identify a component failure risk in a particular service. We could
mitigate the risk by sharing it with another service that is made up of the same
components.

Any component within the IT infrastructure that has no backup capability, and can cause
impact to the business and users when it fails, is known as a SPOF, or Single Point Of
Failure. A particular concern of ITSCM is ‘Hidden SPOFs’. An example of a hidden SPOF
might be the point where multiple data cables enter an office via an underground duct. A
significant failure would occur if the cable were severed during building works.

Contingent Risk Countermeasures

ITIL suggests a number of possible options when dealing with a ‘disaster recovery’
situation.

The first option is ‘do nothing’. Surprisingly this can be a valid response, if the
business has decided that the complete loss of some service in a disaster is acceptable.
For example, the business might have insurance in place to cover any potential ‘loss of
business’.

‘Manual back up’ can be an effective interim measure until the IT service is resumed. Any
procedures should be well documented and understood. This is possibly the most unlikely
option suggested by the ITIL guidance. Would it be possible, for example, to go back to
manual ordering for a short period of time, rather than a computerised system?

The third option is a ‘Reciprocal Arrangement’, where organisations agree to back each
other up in an emergency. This is rarely used now except for off-site storage, as it
assumes that both organisations have enough spare capacity to fully support the other.
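The probability and impact matrix described above can be sketched in a few lines. Only the two corners are taken from the text (high probability with high impact is high risk; low with low is low risk). The three-point scale, the scoring rule for the cells in between, and the function name are assumptions made for this example.

```python
LEVELS = ("low", "medium", "high")


def risk_rating(probability, impact):
    """Combine probability and impact (each 'low', 'medium' or 'high').

    Scoring rule (an assumption for this sketch): add the two positions
    on the scale (0..4 in total) and band the result.
    """
    score = LEVELS.index(probability) + LEVELS.index(impact)
    if score >= 3:
        return "high"
    if score == 2:
        return "medium"
    return "low"


# Print the matrix with the highest probability on the top row.
for probability in reversed(LEVELS):
    row = [risk_rating(probability, impact) for impact in LEVELS]
    print(f"{probability:>6} probability: {row}")
```

A method such as CRAMM goes further than this, because it also weighs asset values, threats and vulnerabilities rather than two simple ratings.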
Intermediate Recovery is also known as ‘Warm Standby’ and is used where recovery is
needed within 24 to 72 hours. A ‘Warm Standby’ facility will have the required computer
equipment in place, but it wouldn’t be configured or loaded with current operational
software.

Immediate Recovery is also known as ‘Hot Standby’. This would normally involve the use of
an alternative site with continuous mirroring of the live environment and data. Recovery
could be almost instantaneous, but the general definition of immediate recovery is to
allow up to 24 hours for full recovery.

There are potential risks from having a ‘hot standby’ site in very close proximity to the
business’s main site. Although it reduces logistical and network issues, the whole site,
including the ‘hot standby’, could be at risk from a disaster. So combinations of these
options are sometimes used, and might include the use of a ‘hot standby’ third party site
for two or three days, while the internal intermediate site is configured. This would
reduce third party costs, but would involve moving site twice.

The most difficult and expensive test type is the full, unannounced test. This can be the
most effective way of finding flaws in the plan, but it’s the most disruptive to the
day-to-day business activities.

Key ITSCM Decisions

There are a number of critical decisions which must be made by the ITSCM process.

An important one is deciding on how many copies of the Continuity Plan we should have,
and where they are going to be kept. For example, it would be very risky to have just one
copy of the plan stored at the site it provides contingency for! Many organisations keep
plans at the alternative site, or a local bank. The IT Service Business Continuity
Manager may well keep copies at home.

Remember that all copies should be kept in ‘sync’ to reflect any changes to the
infrastructure or the plan.

Another key decision is how and when to invoke the contingency plan. How long are we
going to wait before we act after a major failure?
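The recovery time bands given above for hot and warm standby can be expressed as a simple lookup. This is only a sketch: the function name is invented, and anything slower than 72 hours is deliberately left to the other options rather than guessed at.

```python
def standby_option(recovery_hours):
    """Suggest a standby type from the time the business can wait.

    Bands follow the text: up to 24 hours is Immediate Recovery,
    24 to 72 hours is Intermediate Recovery.
    """
    if recovery_hours <= 0:
        raise ValueError("recovery_hours must be positive")
    if recovery_hours <= 24:
        return "Immediate Recovery (Hot Standby)"
    if recovery_hours <= 72:
        return "Intermediate Recovery (Warm Standby)"
    return "Neither standby type described here; consider the other options"


for hours in (4, 48, 120):
    print(f"{hours:>3} hours: {standby_option(hours)}")
```

In practice the choice is also driven by cost and by how important the Vital Business Function is, so a lookup like this would only be a starting point for discussion with the business.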
A comprehensive list of all third party infrastructure suppliers should also be drawn up,
including those for operational and recovery systems. It’s also important to tell them to
visit the back up site if they are called out.

Similarly, the details of third party contractors, particularly those who are providing
recovery services, should also be to hand. ITIL helps here, by providing a Pro Forma
disaster recovery plan that can be used as a basis for creating our own version. This pro
forma contains an annex for all the contact details.

And finally, ask the question: does our contingency supplier have its own contingency
plan in place? There have been several recent examples where third party recovery service
providers have been literally ‘deluged’ by demand. For example, serious flooding has
resulted in them receiving multiple requests for help, leaving them unable to fulfil
their contractual obligations.

We looked at the relationship between the two processes, and have seen how ITSCM defines
key IT activities as Vital Business Functions.

We saw how ITSCM links to other ITIL processes, and went on to look in some detail at the
Business Continuity Lifecycle.

We have seen some of the Risk Analysis techniques used in the Requirements and Strategy
stage of the lifecycle, and listed all of the ITIL Recovery options.

And finally, we looked at how to test IT Service Continuity Plans, and at some of the key
decisions required by the IT Service Continuity Management process.
Lesson 7a Passing the ITIL Foundation Exam
There are currently three examination levels and associated qualifications. The three
internationally recognised ITIL certificates are Foundation, Practitioner’s and
Manager’s.

This course only addresses the first of these, which is, as you would expect, the
entry-level qualification. It is a prerequisite for going on to take the more advanced
certificates.

The objective of the Foundation exam is to confirm a very broad-brush knowledge across
the whole of ITIL, and it therefore does not demand a very detailed knowledge within any
specific area.

In simple terms this is a test that you are broadly familiar with the contents of the
Service Delivery and Service Support manuals.

The Foundation questions can be categorised into three types:

• Those that you find really easy and can answer without too much thought. Do be careful,
though, with the exact semantics of some of the questions and make sure that you have
properly read the questions.

• Those that you probably know the answer to but the wording of the question needs some
digesting. There are a lot of “negative” type questions so do be careful over these.

• Those that, even though you understand the question, you are not 100% sure of the
answer.

A good strategy is therefore to do the exam paper in three passes. This is something that
you have not had the luxury of doing in the exam simulator.
When you are first presented with the paper, work your way through, answering all the questions where the right answer is immediately obvious to you. Avoid any temptation to deliberate too long over any question. If in doubt, move on to the next one.

This first pass will ensure that, in the unlikely event that you do run out of time, at least you will have answered all the easy questions. For anybody who has done the right level of background study and preparation, this alone will probably be enough to secure a pass.

Now go back to the beginning of the paper and start work on the second category of questions.

Now one last exercise.

ITIL is nothing if not full of acronyms, and many of the questions in the Foundation exam assume that you are familiar with most of them.

So it is worthwhile running through the list of acronyms given in the little ITIL books and the manuals themselves, and memorising the less obvious ones.

In the meantime try this little test. Use your mouse to drag and drop the right words into position to correctly interpret these acronyms.
Acronyms
CI Configuration Item
COP Code of Practice
DR Disaster Recovery
DT Down Time
GUI Graphical User Interface
HD Help Desk
ID Identification
IP Internet Protocol
IR Incident Report
IT Information Technology
JD Job Description
KE Known Error
KER Known Error Record
PC Personal Computer
PM Problem Management
PR Problem Record
QA Quality Assurance
QMS Quality Management System
RAG Red-Amber-Green
TP Transaction Processing
Glossary of Terms
Absorption costing A principle whereby fixed as well as variable costs are allotted to cost
units and total overheads are absorbed according to activity level. The
term may be applied where production costs only, or costs of all
functions are so allotted.
Action lists Defined actions, allocated to recovery teams and individuals, within a
phase of a plan. These are supported by reference data.
Alert phase The first phase of a business continuity plan in which initial emergency
procedures and damage assessments are activated.
Allocated cost A cost that can be directly identified with a business unit.
Apportioned cost A cost that is shared by a number of business units (an indirect cost).
This cost must be shared out between these units on an equitable basis.
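As a rough illustration of sharing a cost out "on an equitable basis", the sketch below apportions one indirect cost across business units in proportion to a chosen basis. The basis used here (headcount), the unit names and the figures are all assumptions made up for the example; the guide does not prescribe any particular basis.

```python
def apportion(total_cost, basis):
    """Share total_cost across business units in proportion to `basis`.

    `basis` maps a business unit name to its share measure
    (hypothetically headcount here; floor space or usage would also work).
    """
    total_basis = sum(basis.values())
    if total_basis <= 0:
        raise ValueError("the apportionment basis must sum to a positive value")
    return {unit: total_cost * share / total_basis for unit, share in basis.items()}


# Hypothetical example: a 12,000 network cost shared by headcount.
shares = apportion(12_000, {"sales": 30, "finance": 10, "operations": 20})
for unit, cost in shares.items():
    print(f"{unit}: {cost:.2f}")
```

An allocated cost, by contrast, needs no such calculation because the whole amount belongs to a single business unit.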
Bridge Equipment and techniques used to match circuits to each other ensuring
minimum transmission impairment.
Build The final stage in producing a usable configuration. The process involves
taking one or more input Configuration Items and processing them
(building them) to create one or more output Configuration Items e.g.
software compile and load.
Business recovery objective The desired time within which business processes should be recovered,
and the minimum staff, assets and services required within this time.
Business recovery plan framework A template business recovery plan (or set of plans) produced to allow
the structure and proposed contents to be agreed before the detailed
business recovery plan is produced.
Business recovery plans Documents describing the roles, responsibilities and actions necessary
to resume business processes following a business disruption.
Business recovery team A defined group of personnel with a defined role and subordinate range
of actions to facilitate recovery of a business function or process.
Business unit A segment of the business entity by which both revenues are received
and expenditure is caused or controlled, such revenues and
expenditure being used to evaluate segmental performance.
Capital Costs Typically those costs applying to the physical (substantial) assets of the
organisation. Traditionally this was the accommodation and machinery
necessary to produce the enterprise's product. Capital Costs are the
purchase or major enhancement of fixed assets, for example computer
equipment (building and plant) and are often also referred to as 'one-off' costs.
Capital investment appraisal The process of evaluating proposed investment in specific fixed assets
and the benefits to be obtained from their acquisition. The techniques
used in the evaluation can be summarised as non-discounting methods
(i.e. simple pay-back), return on capital employed and discounted cash
flow methods (i.e. yield, net present value and discounted pay-back).
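To make two of the appraisal techniques named above more concrete, here is a minimal sketch of simple pay-back and net present value. The project figures and the 10% discount rate are hypothetical, and the function names are illustrative only; real appraisals will follow the organisation's own finance rules.

```python
def net_present_value(rate, cash_flows):
    """Discount each cash flow to its present-day value and sum the results.

    cash_flows[0] is the initial outlay at time 0 (normally negative);
    cash_flows[t] is the net cash flow at the end of year t.
    """
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))


def simple_payback_years(cash_flows):
    """Return the first year in which the cumulative, undiscounted cash flow
    is no longer negative, or None if the outlay is never recovered."""
    cumulative = 0.0
    for year, cf in enumerate(cash_flows):
        cumulative += cf
        if cumulative >= 0:
            return year
    return None


# Hypothetical project: 10,000 outlay, then 4,000 a year for four years.
flows = [-10_000, 4_000, 4_000, 4_000, 4_000]
npv = net_present_value(0.10, flows)
payback = simple_payback_years(flows)
print(f"NPV at 10%: {npv:.2f}, pay-back in year {payback}")
```

A positive net present value suggests the project returns more than the discount rate; pay-back ignores the time value of money altogether, which is why it is classed as a non-discounting method.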
Change Advisory Board A group of people who can give expert advice to Change Management
on the implementation of Changes. This board is likely to be made up of
representatives from all areas within IT and representatives from
business units.
Change authority A group that is given the authority to approve Change, e.g. by the
project board. Sometimes referred to as the Configuration Board.
Change control The procedure to ensure that all Changes are controlled, including the
submission, analysis, decision making, approval, implementation and
post-implementation of the Change.
Change document Request for Change, Change control form, Change order, Change
record.
Change history Auditable information that records, for example, what was done, when it
was done, by whom and why.
Change log A log of Requests for Change raised during the project, showing
information on each Change, its evaluation, what decisions have been
made and its current status, e.g. Raised, Reviewed, Approved,
Implemented, Closed.
Change record A record containing details of which CIs are affected by an authorised
Change (planned or implemented) and how.
Closure When the Customer is satisfied that an Incident has been resolved.
Computer Aided Systems Engineering A software tool for programmers. It provides help in the planning,
analysis, design and documentation of computer software.
Configuration Management Database A database which contains all relevant details of each CI and details of
the important relationships between CIs.
Configuration Management plan A document setting out the organisation and procedures for the
Configuration Management of a specific product, project, system,
support group or service.
Cost effectiveness Ensuring that there is a proper balance between the quality of service
on the one side and expenditure on the other. Any investment that
increases the costs of providing IT services should always result in
enhancement to service quality or quantity.
Cost Management All the procedures, tasks and deliverables that are needed to fulfil an
organisation's costing and charging requirements.
Cost unit In the context of CSBC the cost unit is a functional cost unit which
establishes standard cost per workload element of activity, based on
calculated activity ratios converted to cost ratios.
Costing The process of identifying the costs of the business and of breaking
them down and relating them to the various activities of the
organisation.
Customer Owner of the service; usually the Customer has responsibility for the
cost of the service, either directly through charging or indirectly in
terms of demonstrable business need. It is the Customer who will define
the service requirements.
Data transfer time The length of time taken for a block or sector of data to be read from or
written to an I/O device, such as a disk or tape.
Definitive Software Library (DSL) The library in which the definitive authorised versions of all software CIs
are stored and protected. It is a physical library or storage repository
where master copies of software versions are placed. This one logical
storage area may in reality consist of one or more physical software
libraries or filestores. They should be separate from development and
test filestore areas. The DSL may also include a physical store to hold
master copies of bought-in software, e.g. fire-proof safe. Only
authorised software should be accepted into the DSL, strictly controlled
by Change and Release Management.
The DSL exists not directly because of the needs of the Configuration
Management process, but as a common base for the Release
Management and Configuration Management processes.
Delta Release A release that includes only those CIs within the Release unit that have
actually changed or are new since the last full or Delta Release. For
example, if the Release unit is the program, a Delta Release contains
only those modules that have changed, or are new, since the last full
release of the program or the last Delta Release of the modules - see
also 'Full Release'.
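The difference between a Delta Release and a Full Release can be sketched as a comparison of version identifiers. This is only an illustration: the module names and versions are invented, and a real Release Management tool would work from the CMDB and DSL rather than from two dictionaries.

```python
def delta_release(previous, current):
    """Return the modules that are new or changed since the previous release.

    Both arguments map a module name to its version identifier.
    """
    return {
        name: version
        for name, version in current.items()
        if previous.get(name) != version
    }


# Hypothetical Release unit: a program made up of four modules.
last_release = {"billing": "1.0", "reports": "1.0", "login": "1.2"}
candidate = {"billing": "1.1", "reports": "1.0", "login": "1.2", "audit": "1.0"}

delta = delta_release(last_release, candidate)  # changed or new modules only
full = dict(candidate)                          # a Full Release ships everything
print(sorted(delta), sorted(full))
```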
Dependency The reliance, either direct or indirect, of one process or activity upon
another.
Depreciation The loss in value of an asset due to its use and/or the passage of time.
The annual depreciation charge in accounts represents the amount of
capital assets used up in the accounting period. It is charged in the cost
accounts to ensure that the cost of capital equipment is reflected in the
unit costs of the services provided using the equipment. There are
various methods of calculating depreciation for the period, but the
Treasury usually recommends the use of current cost asset valuation as
the basis for the depreciation charge.
Differential charging Charging business customers different rates for the same work, typically
to dampen demand or to generate revenue for spare capacity. This can
also be used to encourage off-peak or night time running.
Direct cost A cost that is incurred for, and can be traced in full to a product,
service, cost centre or department. This is an allocated cost. Direct
costs are direct materials, direct wages and direct expenses.
Disaster recovery planning A series of processes that focus only upon the recovery processes,
principally in response to physical disasters, that are contained within
BCM.
Discounted cash flow An evaluation of the future net cash flows generated by a capital project
by discounting them to their present-day value. The two methods most
commonly used are:
Discounting The offering to business customers of reduced rates for the use of
off-peak resources (see also Surcharging).
Disk cache controller Memory that is used to store blocks of data that have been read from
the disk devices connected to them. If a subsequent I/O requires a
record that is still resident in the cache memory, it will be picked up
from there, thus saving another physical I/O.
Duplex (full and half) Full duplex line/channel allows simultaneous transmission in both
directions. Half duplex line/channel is capable of transmitting in both
directions, but only in one direction at a time.
Echoing A reflection of the transmitted signal from the receiving end, a visual
method of error detection in which the signal from the originating device
is looped back to that device so that it can be displayed.
Elements of cost The constituent parts of costs according to the factors upon which
expenditure is incurred viz., materials, labour and expenses.
E d U Th h th i d t d b i
External Target One of the measures, against which a delivered IT service is compared,
expressed in terms of the customer's business.
Forward Schedule of Changes Contains details of all the Changes approved for implementation and
their proposed implementation dates. It should be agreed with the
Customers and the business, Service Level Management, the Service
Desk and Availability Management. Once agreed, the Service Desk
should communicate to the User community at large any planned
additional downtime arising from implementing the Changes, using the
most effective methods available.
Full cost The total cost of all the resources used in supplying a service i.e. the
sum of the direct costs of producing the output, a proportional share of
overhead costs and any selling and distribution expenses. Both cash
costs and notional (non-cash) costs should be included, including the
cost of capital.
See also 'Total Cost of Ownership'
Full Release All components of the Release unit are built, tested, distributed and
implemented together - see also 'Delta Release'.
Gradual Recovery Previously called 'Cold stand-by', this is applicable to organisations that
do not need immediate restoration of business processes and can
function for a period of up to 72 hours, or longer, without a
re-establishment of full IT facilities. This may include the provision of
empty accommodation fully equipped with power, environmental
controls and local network cabling infrastructure, telecommunications
connections, and available in a disaster situation for an organisation to
install its own computer equipment.
Hard charging Descriptive of a situation where, within an organisation, actual funds are
transferred from the customer to the IT organisation in payment for the
delivery of IT services.
Hard fault The situation in a virtual memory system when the required page of
code or data, which a program was using, has been redeployed by the
operating system for some other purpose. This means that another
piece of memory must be found to accommodate the code or data, and
will involve physical reading/writing of pages to the page file.
Host A host computer comprises the central hardware and software resources
of a computer complex, e.g. CPU, memory, channels, disk and magnetic
tape I/O subsystems plus operating and applications software. The term
is used to denote all non-network items.
Immediate Recovery Previously called 'Hot stand-by', provides for the immediate restoration
of services following any irrecoverable incident. It is important to
distinguish between the previous definition of 'hot stand-by' and
'immediate recovery'. Hot stand-by typically referred to availability of
services within a short timescale such as 2 or 4 hours whereas
immediate recovery implies the instant availability of services.
Impact analysis The identification of critical business processes, and the potential
damage or loss that may be caused to the organisation resulting from a
disruption to those processes. Business impact analysis identifies:
· the form the loss or damage will take
· how that degree of damage or loss is likely to escalate with time following an incident
· the minimum staffing, facilities and services needed to enable business processes to continue to operate at a minimum acceptable level
· the time within which they should be recovered.
The time within which full recovery of the business processes is to be achieved is also identified.
Impact scenario Description of the type of impact on the business that could follow a
business disruption. Usually related to a business process and will
always refer to a period of time, e.g. customer services will be unable to
operate for two days.
Incident Any event which is not part of the standard operation of a service and
which causes, or may cause, an interruption to, or a reduction in, the
quality of that service.
Indirect cost A cost incurred in the course of making a product, providing a service or
running a cost centre or department, but which cannot be traced
directly and in full to the product, service or department, because it has
been incurred for a number of cost centres or cost units. These costs
are apportioned to cost centres/cost units. Indirect costs are also
referred to as overheads.
Internal target One of the measures against which supporting processes for the IT
service are compared. Usually expressed in technical terms relating
directly to the underpinning service being measured.
Invocation (of business recovery plans) Putting business recovery plans into operation after a business
disruption.
Known Error An Incident or Problem for which the root cause is known and for which
a temporary Work-around or a permanent alternative has been
identified. If a business case exists, an RFC will be raised, but, in any
event, it remains a Known Error unless it is permanently fixed by a
Change.
Latency The elapsed time from the moment when a seek was completed on a
disk device to the point when the required data is positioned under the
read/write heads. It is normally defined by manufacturers as being half
the disk rotation time.
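Since latency is normally quoted as half the disk rotation time, it can be worked out from the spindle speed. The sketch below assumes the speed is given in revolutions per minute; the 6,000 rpm figure is just a convenient example, not a value taken from this guide.

```python
def average_rotational_latency_ms(rpm):
    """Half of one full disk rotation, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    rotation_time_ms = 60_000 / rpm  # one rotation, in milliseconds
    return rotation_time_ms / 2


# Hypothetical disk spinning at 6,000 rpm: one rotation takes 10 ms.
latency = average_rotational_latency_ms(6000)
print(f"average latency: {latency} ms")
```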
Logical I/O A read or write request by a program. That request may, or may not,
necessitate a physical I/O. For example, on a read request the required
record may already be in a memory buffer and therefore a physical I/O
is not necessary.
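The distinction between logical and physical I/O can be illustrated with a toy read cache. This is a simplified sketch: the class name is hypothetical, the cache never evicts anything, and writes are ignored, so it should not be read as a model of any real disk cache controller.

```python
class CachedDisk:
    """Toy disk with a read cache that counts logical and physical reads."""

    def __init__(self, blocks):
        self._disk = dict(blocks)   # block id -> data held on the "disk"
        self._cache = {}            # blocks already read into memory
        self.logical_reads = 0
        self.physical_reads = 0

    def read(self, block_id):
        # Every request from the program is a logical I/O.
        self.logical_reads += 1
        if block_id not in self._cache:
            # Only a cache miss causes a physical I/O on the device.
            self.physical_reads += 1
            self._cache[block_id] = self._disk[block_id]
        return self._cache[block_id]


disk = CachedDisk({1: b"alpha", 2: b"beta"})
disk.read(1)
disk.read(1)  # served from the memory buffer, no physical I/O
disk.read(2)
print(disk.logical_reads, disk.physical_reads)
```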
Marginal Cost The cost of providing the service now, based upon the investment
already made.
Maturity level/Milestone The degree to which BCM activities and processes have become
standard business practice within an organisation.
Operational Costs Those costs resulting from the day-to-day running of the IT Services
section, e.g. staff costs, hardware maintenance and electricity, and
relating to repeating payments whose effects can be measured within a
short timeframe, usually less than the 12-month financial year.
Operational Level Agreement An internal agreement covering the delivery of services which support
the IT organisation in their delivery of services.
Packet assembly/disassembly device A device that permits terminals, which do not have an interface suitable
for direct connection to a packet switched network, to access such a
network. A PAD converts data to/from packets and handles call set-up
and addressing.
Page fault A program interruption that occurs when a page that is marked 'not in
real memory' is referred to by an active page.
Paging The I/O necessary to read and write to and from the paging disks: real
(not virtual) memory is needed to process data. With insufficient real
memory, the operating system writes old pages to disk, and reads new
pages from disk, so that the required data and instructions are in real
memory.
PD0005 Alternative title for the BSI publication 'A Code of Practice for IT Service
Management'.
Percentage utilisation The amount of time that a hardware device is busy over a given period
of time. For example, if the CPU is busy for 1800 seconds in a one hour
period, its utilisation is said to be 50%.
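The example above is simply busy time divided by elapsed time. A minimal sketch of that arithmetic follows; the function name is illustrative only.

```python
def percentage_utilisation(busy_seconds, period_seconds):
    """Busy time as a percentage of the measurement period."""
    if period_seconds <= 0:
        raise ValueError("the measurement period must be positive")
    return 100.0 * busy_seconds / period_seconds


# The CPU example from the definition: 1800 busy seconds in one hour.
cpu_utilisation = percentage_utilisation(1800, 3600)
print(f"{cpu_utilisation}%")
```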
Physical I/O A read or write request from a program has necessitated a physical read
or write operation on an I/O device.
Prime cost The total cost of direct materials, direct labour and direct expenses. The
term prime cost is commonly restricted to direct production costs only
and so does not customarily include direct costs of marketing or
research and development.
Process Control The process of planning and regulating, with the objective of performing
the process in an effective and efficient way.
Queuing time Queuing time is incurred when the device, which a program wishes to
use, is already busy. The program therefore has to wait in a queue to
obtain service from that device.
Reference data Information that supports the plans and action lists, such as names and
addresses or inventories, which is indexed within the plan.
Release A collection of new and/or changed CIs which are tested and introduced
into the live environment together.
Request for Change (RFC) Form, or screen, used to record details of a request for a change to any
CI within an infrastructure or to procedures and items associated with
the infrastructure.
Resource cost The amount of machine resource that a given task consumes. This
resource is usually expressed in seconds for the CPU or the number of
I/Os for a disk or tape device.
Resource profile The total resource costs that are consumed by an individual online
transaction, batch job or program. It is usually expressed in terms of
CPU seconds, number of I/Os and memory usage.
Resource unit costs Resource units may be calculated on a standard cost basis to identify
the expected (standard) cost for using a particular resource. Because
computer resources come in many shapes and forms, units have to be
established by logical groupings. Examples are:
a) CPU time or instructions
b) disk I/Os
c) print lines
d) communication transactions.
Resources The IT Services section needs to provide the customers with the
required services. The resources are typically computer and related
equipment, software, facilities or organisational (people).
Return to normal phase The phase within a business recovery plan which re-establishes normal
operations.
Risk Analysis The identification and assessment of the level (measure) of the risks
calculated from the assessed values of assets and the assessed levels of
threats to, and vulnerabilities of, those assets.
Seek Time Occurs when the disk read/write heads are not positioned on the
required track. It describes the elapsed time taken to move heads to the
right track.
Self-insurance A decision to bear the losses that could result from a disruption to the
business as opposed to taking insurance cover on the risk.
Service Desk The single point of contact within the IT organisation for users of IT
services.
Service Level Agreement Written agreement between a service provider and the Customer(s),
that documents agreed Service Levels for a Service.
Service Level Management The process of defining, agreeing, documenting and managing the levels
of customer IT service, that are required and cost justified.
Service quality plan The written plan and specification of internal targets designed to
guarantee the agreed service levels.
Soft fault The situation in a virtual memory system when the operating system
has detected that a page of code or data was due to be reused, i.e. it is
on a list of 'free' pages, but it is still actually in memory. It is now
rescued and put back into service.
Software Library A controlled collection of SCIs designated to keep those with like status
and type together and distinctly segregated, to aid in development,
operation and maintenance.
Software work unit Software work is a generic term devised to represent a common base on
which all calculations for workload usage and IT resource capacity are
then based. A unit of software work for I/O type equipment equals the
number of bytes transferred; and for central processors it is based on
the product of power and CPU-time.
Solid state devices Memory devices that are made to appear as if they are disk devices.
The advantages of such devices are that the service times are much
faster than real disks since there is no seek time or latency. The main
disadvantage is that they are much more expensive.
Specsheet Specifies in detail what the customer wants (external) and what
consequences this has for the service provider (internal) such as
required resources and skills.
Standard costing A technique which uses standards for costs and revenues for the
purposes of control through variance analysis.
Storage occupancy A defined measurement unit that is used for storage type equipment to
measure usage. The unit value equals the number of bytes stored.
Terminal I/O A read from, or a write to, an online device such as a VDU or remote
printer.
Tree structures In data structures, a series of connected nodes without cycles. One
node is termed the root and is the starting point of all paths, other
nodes termed leaves terminate the paths.
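A tree of this kind can be sketched in a few lines. The node names below are hypothetical (a service broken down into components), and the helper functions are illustrative rather than part of any ITIL tool.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def leaves(node):
    """Return the names of the nodes that terminate a path (no children)."""
    if not node.children:
        return [node.name]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


def paths(node, prefix=""):
    """Return every path from the root to a leaf as a '/'-joined string."""
    here = f"{prefix}/{node.name}" if prefix else node.name
    if not node.children:
        return [here]
    found = []
    for child in node.children:
        found.extend(paths(child, here))
    return found


# Hypothetical tree: the root is the starting point of all paths.
root = Node("service", [
    Node("hardware", [Node("server"), Node("disk")]),
    Node("software"),
])
print(leaves(root))
print(paths(root))
```

Because each node is created once and attached to a single parent, the structure has no cycles, which is what distinguishes a tree from a general network of nodes.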
Unit costs Costs distributed over individual component usage. For example, it can
be assumed that, if a box of paper with 1000 sheets costs £10, then
each sheet costs 1p. Similarly if a CPU costs £1m a year and it is used to
process 1,000 jobs that year, each job costs on average £1,000.
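Both examples in the definition are the same division: total cost over the number of units used. The short sketch below reproduces them; the function name is an illustrative assumption.

```python
def unit_cost(total_cost, units):
    """Average cost per unit of usage."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_cost / units


# A box of 1000 sheets at £10, worked in pence (£10 = 1000p).
pence_per_sheet = unit_cost(10 * 100, 1000)

# A CPU costing £1m a year that processes 1,000 jobs in that year.
pounds_per_job = unit_cost(1_000_000, 1_000)

print(pence_per_sheet, pounds_per_job)
```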
Utility cost centre (UCC) A cost centre for the provision of support services to other cost centres.
Variance analysis A variance is the difference between planned, budgeted or standard cost
and actual cost (or revenues). Variance analysis is an analysis of the
factors that have caused the difference between the pre-determined
standards and the actual results. Variances can be developed specifically
related to the operations carried out in addition to those mentioned
above.
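A variance calculation can be sketched as a per-item comparison of standard and actual cost. The cost elements and figures are invented for the example, and the sign convention (actual minus standard, so a positive figure means an overspend) is an assumption; some organisations report it the other way round.

```python
def cost_variances(standard, actual):
    """Return actual minus standard cost for each cost element.

    Assumed convention: positive = overspend, negative = underspend.
    """
    return {item: actual[item] - standard[item] for item in standard}


# Hypothetical annual figures for three cost elements.
standard = {"staff": 50_000, "hardware": 20_000, "electricity": 5_000}
actual = {"staff": 52_000, "hardware": 18_500, "electricity": 5_000}

variances = cost_variances(standard, actual)
total_variance = sum(variances.values())
print(variances, total_variance)
```

The calculation only shows where the differences lie; the analysis proper is the follow-up work of explaining what caused each one.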
Version Identifier A version number; version date; or version date and time stamp.
Virtual memory system A system that enhances the size of hard memory by adding an auxiliary
storage layer residing on the hard disk.
Vulnerability A weakness of the system and its assets, which could be exploited by
threats.
WORM (Device) Optical read only disks, standing for Write Once Read Many.