Incident Management Process Handbook

Griffith University
Version 1.7

Griffith University
Incident Management Process Handbook

Version History
Version No | Issue Date | Nature of Amendment | Editor
0.3 | July 2003 | First draft | Patrick Keogh (LucidIT)
0.4 | 8/3/2004 | Updates to priority and response / resolution table. Minor updates | John Scullen
0.5 | 22/3/2004 | Updates from quality review walkthrough | John Scullen
0.6 | 24/3/2004 | Updates suggested by ICTS-MT. Response definition adjusted and diagram included. Response times adjusted for priority 1 and 2. Hierarchical escalation modified to ICTS-MT specifications. | John Scullen
0.7 | 6/4/2004 | Minor updates resulting from final v0.6 distribution. | Merril Rogers
1.0 | 20/4/2004 | Minor updates from Geoff Dengate. Approved by Project Board | John Scullen
1.1 | 13/07/2006 | Revision | Sanja Tadic, Judy Bromage, Julie Aslett
1.2 | 02/10/2006 | Minor updates resulting from consultations with Product and Service Managers | Sanja Tadic, Julie Aslett
1.3 | 22/06/2007 | Minor update | Sanja Tadic
1.4 | 29/09/2008 | Minor update to document major outage procedure | Sanja Tadic, Naveen Sharma
1.5 | 30/07/2010 | Minor update to document procedure for incorrect group assignment. Update to Incident Management Quick Reference Guide Appendix A. Update to Incident Categories Appendix B. Update of name change from InfoServices to Library and IT Help. References to EITS changed to CTS. | Felicity Berends, Sanja Tadic
1.6 | 30/4/2013 | Minor updates of SDT name from LibraryandITHelp@Griffith to Service Desk tool. Minor corrections to role titles. Update to Quick Reference Guide (Appendix A) and Incident Categories spider chart (Appendix B) | Felicity Berends

Distribution
Ver # | Recipient | Date issued | Reason for distribution
0.4 | Andrew Bowness, Wendy Balachandran, Christine Schafer, Regina Obexer, Matt Maynard, Carol O'Faircheallaigh, Carolyn Plant, Karl Turnbull, Rowan Salt, Geoff Mitchell, Sandra Reis | 8/3/2004 | Initial release prior to quality review
0.6 | ICTS-MT, Chris Walker, Paul Jardine | 26/3/2004 | Circulated following suggested updates
0.7 | Project Board | 6/4/2004 | Approval
1.0 | ITIL Steering Committee | 21/4/2004 | FYI
1.1 | ITIL Steering Committee | 22/08/2006 | Approval
1.1 | INS Product and Service Managers (currently using LibraryandITHelp@Griffith) | 12/09/2006, 29/09/2006 | FYI
1.5 | INS Product and Service Managers (currently using LibraryandITHelp@Griffith) | 31/08/2010 | FYI
1.6 | INS Product and Service Managers (currently using the Service Desk tool) | 30/4/2013 | FYI


Print date: 31 August 2010
Filename: 335601873.doc
Author(s): Patrick Keogh (Lucid IT Pty Ltd), Wendy Balachandran, John Scullen, Sanja Tadic, Felicity Berends


Table of Contents
1 Introduction
1.1 Objective
1.2 Scope
1.3 Document Structure
2 The Incident Management Process
2.1 Objectives of Incident Management
2.2 Scope
2.3 Incident Classification
2.4 Process Description
2.5 Response and Resolution Times
2.6 Escalation
2.7 Priority 1 Incident Procedure
3 Roles and Responsibilities
3.1 Manager, Library and IT Help (Incident Management Process Owner)
3.2 Library and IT Help Team Leaders (Service Desk Manager)
3.3 First Tier (Service Desk) Support
3.4 Resolution Groups
4 Communication Framework
4.1 Communication Framework
4.2 Relationship with Other Processes
4.3 Incident Management & Problem Management
4.4 Incident Management & Configuration Management
4.5 Incident Management & Service Level Management
5 Performance Management
6 Management Reports
6.1 Management Reports
7 Process Review
Appendix A: Incident Management Quick Reference Card
Appendix B: Incident Categories
Appendix C: Abbreviations and Definitions
7.1 Abbreviations and Acronyms Used
7.2 Definition of Terms Used
Appendix D: Unsupported Functions


1 Introduction
The main goal of Incident Management is to restore normal service operation as
quickly as possible and minimise the adverse impact on business operations. The Service
Desk, as the first point of contact for Information Services clients, is the owner of the
Incident Management Process. The main objective of the Service Desk is to facilitate the
restoration of normal operational service with minimal business impact on the client
and within agreed service levels and business priorities.
These procedures are to be used by all Information Services staff handling Incidents
within the scope defined in section 2.2 Scope.

1.1 Objective
Information Services aims to achieve the following objectives with the Incident
Management Process:
- Capture information about an incident at the start of the process
- Clients have confidence in Information Services capability
- Consistent processes for clients
- Have clear procedures for clients on how to get help
- Better management of, and alignment with, client expectations
- A well defined scope of the Service Desk role which is clearly communicated
- Analysis of incidents which will contribute to a better understanding of the
  underlying issues

1.2 Scope
This Handbook documents the Incident Management Process including:
- The process flow
- Roles and responsibilities
- Communication framework
- Performance Management
- Management reports
- Checklists and definitions

1.3 Document Structure


This document describes the Incident Management process for Information Services.
The document is organised as follows:
- Section 2: The Incident Management Process. Provides a description of the
  Incident Management Process. Includes high level process description, classification, lead
  times, and escalation matrices.
- Section 3: Roles and Responsibilities. The different roles within the Incident
  Management Process are described. The responsibilities of each role are included.
- Section 4: Communication Framework. The communication aspects are
  explained in more detail. Included is a high level communication framework and the
  relationship between Incident Management and other processes.
- Section 5: Performance Management. The Key Performance Indicators (KPIs)
  for the Incident Management Process are described.
- Section 6: Management Reports. Information about the different management
  reports is provided.
- Section 7: Process Review. Information about reviewing the Incident
  Management process.
The following Appendices are included at the end of the document:
- Appendix A: Incident Management Quick Reference Card. Quick Reference Cards (QRC)
  for Service Desk and second and third tier support groups.
- Appendix B: Incident Categories. Overview of the different Incident Categories.
- Appendix C: Abbreviations and Definitions. Explanation of abbreviations and
  definitions used in this document.


2 The Incident Management Process


Incident Management is closely associated with Problem Management, providing
categorisation and reporting of all the Information Services related incidents that occur
within Griffith University, thus enabling root cause analysis.
The second and third tier support groups have a role within this process. The focus
must be on quick solutions and on minimising the time taken to restore service.
After resolution, a complete recording of all actions should be documented in the
Service Desk tool. This will facilitate faster response times for future incidents, and will
free up second and third tier support groups for more proactive problem solving.
This section provides information about the Incident Management process.

2.1 Objectives of Incident Management


The objectives of Incident Management are:
- To restore normal service operation as quickly as possible; and
- To minimise the adverse impact on business operations

2.2 Scope
An Incident is any event which is not part of the standard operation of the service and
which causes, or may cause, an interruption to, or a reduction of, the quality of the
service.
Only production systems, or systems connected to the production network, are covered
by the Incident Management process.
Excluded from the scope of this process are:
- Development activities
- Non-production activities (unless connected to the production network)
- All unsupported functions as specified in Appendix D.

2.3 Incident Classification


Classification of incidents is based on two aspects:
1. Priority of an incident: Relating to the severity of an incident; and
2. Category of an incident: Relating to the configuration item causing the incident to
occur


2.3.1 Incident Priority


The priority of an incident is determined by:
1. Impact: Impact of the incident on the business. The number of clients or
importance of system affected. The hierarchical position of the client is included in
this variable.
2. Urgency: How severely the client's work process is affected. This influences the
timeframe that is allowed to resolve the incident.
The Impact/Urgency matrix, shown below, determines the priority of the incident.

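[Impact/Urgency matrix: Urgency (Low, Medium, High) against Impact (Low, Medium, High). The priority values in the cells are not reproduced here.]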

The assessment methodology for the impact and the severity is explained in more
detail in the sections below.

2.3.1.1 Impact
Incidents will be placed into High, Medium and Low impact categories. The key factor
in measuring impact is the impact the incident has on the business. Each incident
will be reviewed on a case-by-case basis with appropriate impact assessment and
approval based on the following criteria.
Impact | Description
High | Whole organisation affected; site or multiple sites affected; multiple groups of clients affected; critical business process interrupted; or system-wide outages to Learning@Griffith, Staff portal, or Email
Medium | Group of clients, a Pro Vice Chancellor (PVC), or a member of the Vice Chancellor's (VC's) Office staff affected; non-critical business process interrupted
Low | One client affected (other than VC's Office or PVCs)


2.3.1.2 Urgency
Incidents will be placed into High, Medium and Low urgency categories. The key factor
in measuring urgency is how severely the client's work process is affected. This
influences the timeframe that is allowed to resolve the incident. Each incident will be
reviewed on a case-by-case basis with appropriate severity assessment and approval
based on the following criteria.
Urgency | Description
High | Process stopped; client(s) cannot work
Medium | Process affected; client(s) cannot use certain functions
Low | Process not affected; change request, new/extra/optimised function
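
The impact and urgency criteria above can be sketched as a first-pass classifier. This is an illustrative sketch only: the handbook calls for case-by-case assessment and approval, and the names `Level`, `ImpactFacts`, `classify_impact` and `classify_urgency`, as well as the yes/no inputs, are assumptions introduced here rather than part of the Service Desk tool.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


@dataclass
class ImpactFacts:
    """Hypothetical answers gathered when the incident is logged."""
    whole_organisation: bool = False
    sites_affected: int = 0                 # number of whole sites affected
    client_groups_affected: int = 0         # number of distinct client groups
    critical_process_interrupted: bool = False
    system_wide_outage: bool = False        # Learning@Griffith, Staff portal or Email
    pvc_or_vc_office_affected: bool = False
    non_critical_process_interrupted: bool = False


def classify_impact(f: ImpactFacts) -> Level:
    """Apply the High / Medium / Low impact criteria in that order."""
    if (f.whole_organisation
            or f.sites_affected >= 1
            or f.client_groups_affected >= 2
            or f.critical_process_interrupted
            or f.system_wide_outage):
        return Level.HIGH
    if (f.client_groups_affected == 1
            or f.pvc_or_vc_office_affected
            or f.non_critical_process_interrupted):
        return Level.MEDIUM
    return Level.LOW  # one client affected


def classify_urgency(process_stopped: bool, functions_unavailable: bool) -> Level:
    """Process stopped is High; some functions unusable is Medium; otherwise Low."""
    if process_stopped:
        return Level.HIGH
    if functions_unavailable:
        return Level.MEDIUM
    return Level.LOW
```

The resulting pair of levels would then be looked up in the Impact/Urgency matrix to obtain the priority; that lookup is deliberately left out because it depends on the matrix cell values.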

2.3.2 Incident Category


Incident categories have been established to:
- Assist with the correct assignment of incidents.
- Facilitate reporting on the incident and problem management process.
- Identify priority areas for proactive problem management to focus on.
Incident categories are described in Appendix B: Incident Categories.


2.4 Process Description


The aim of the Incident Management process is to provide a standardised, high quality
service to all clients who report incidents. This section provides an overview of the
Incident Management process as pictured in the process flow chart. The Incident
Management process is managed through the Service Desk tool.

2.4.1 Incident Detected

The process starts with the detection of an incident. An incident can originate:
- from a situation experienced by a client and reported to the Service Desk
- from a technical malfunction, detected by clients, Information Services staff or third
  party vendors.
Note: The incident can be communicated to the Service Desk via the telephone, digitally
or face-to-face.

Note: Service Desk tool is a Request for Information, Incident, Problem and Change
Management System.

[Process flow chart: 1. Incident Detected; 2. Acceptance, Recording and Classification; decision "Service Request?" (Activate Change Management process); decision "Priority 1?" (Activate Priority 1 Incident procedure); 3. Initial Support; 4. Investigation and Diagnosis; 5. Resolve Incident; decision "Incident Resolved?"; 6. Incident Escalation; 7. Verify Resolution and Incident Closure; 8. Monitoring and Tracking]

2.4.2 Acceptance, Recording and Classification


When the incident is reported to the Service Desk, a Service Desk staff member should
first determine whether it falls within the scope of Incident Management process. If
uncertain, this person should seek the advice of the Manager, Library and IT Help
(Incident Management Process Owner) or a Team Leader (Service Desk Manager) for
confirmation.
The Service Desk staff member is responsible for opening a new record in the Service
Desk tool and recording the incident details. The following is an example of information
that should be captured:
- Client details such as name, location, phone number, email
- Classification of the incident in terms of incident category and priority
- Detailed description of the incident and affected Configuration Items (CIs)


The Service Desk staff member classifies the incident according to impact and urgency
of the incident. Refer to the Impact/Urgency Matrix in 2.3.1 to help determine the
priority of the incident.
If an issue can be identified as a Request for Information (RFI) or a Request for
Change (RFC), the RFI & RFC process is activated.
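
The information captured at recording time can be sketched as a simple record type. This is a minimal illustration, not the Service Desk tool's data model: the class and field names, the `"open"` starting status and the validation rules are all assumptions made for the example. Only the list of captured details and the priority range of 1 to 5 come from this handbook.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Client:
    name: str
    location: str
    phone: str
    email: str


@dataclass
class IncidentRecord:
    client: Client
    category: str                 # incident category (see Appendix B)
    priority: int                 # 1 (highest) to 5 (lowest)
    description: str              # detailed description of the incident
    affected_cis: List[str] = field(default_factory=list)  # Configuration Items
    recorded_at: datetime = field(default_factory=datetime.now)
    status: str = "open"          # hypothetical initial status

    def __post_init__(self) -> None:
        # Reject records that miss the details the handbook asks for.
        if self.priority not in range(1, 6):
            raise ValueError("priority must be between 1 and 5")
        if not self.description.strip():
            raise ValueError("a detailed description is required")
```

A record would be created once the staff member has confirmed the incident is in scope, for example `IncidentRecord(client, "Email", 3, "Cannot send mail", ["mail-server"])`.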

2.4.3 Initial Support


The Service Desk provides the initial support to solve the incident and, based on time
and knowledge, determines whether the incident can be solved.
If the incident cannot be solved by Service Desk, an incident reference number and a
response time (see section 2.5 Response and Resolution Times) will be provided.
If the Service Desk recognises incidents with similar symptoms, which have been
recently recorded, then Incident Matching can occur. Incident Matching is where similar
incidents are grouped together to reflect that there may be a larger problem. This
linking assists with Problem Management activities and, when a solution is found, it can
easily be transferred to all the incidents grouped together. The Service Desk tool
parent/child facility is used to record these matches/incidents.
If the incident cannot be linked to an existing problem, the Service Desk staff should
search Service Desk tool for similar incidents or the Knowledge Base for a resolution.
If a resolution is identified it can be applied to the incident. If not, additional
investigation and diagnosis should be carried out for an incident resolution.
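
Incident Matching, as described above, can be sketched with a parent record that holds its matched incidents as children, so that one resolution is transferred to the whole group. The `Incident` class, its field names and the `"open"` status are hypothetical; the actual parent/child facility of the Service Desk tool may behave differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    ref: str
    symptoms: str
    status: str = "open"
    resolution: Optional[str] = None
    children: List["Incident"] = field(default_factory=list)

    def link_child(self, child: "Incident") -> None:
        """Group a similar incident under this (parent) incident."""
        self.children.append(child)

    def resolve(self, resolution: str) -> None:
        """Record the resolution here and on every linked child incident."""
        self.status = "resolved"
        self.resolution = resolution
        for child in self.children:
            child.resolve(resolution)
```

Resolving the parent then marks each linked incident as resolved with the same resolution text, which mirrors the statement that a solution "can easily be transferred to all the incidents grouped together".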

2.4.4 Investigation and Diagnosis


The Service Desk staff member should first make an attempt at analysing the incident,
in search of a solution. This person can use the following information:
- Own experience and knowledge
- Service Desk tool information
- Knowledge Base entries
- Procedure manuals & other relevant documentation
- Technical information from the Internet
- Knowledge and experience of colleagues
Any additional information acquired which will be useful for resolving the incident
should be recorded in the Service Desk tool.
If a solution cannot be provided by Service Desk, it is then escalated to the relevant
support groups (second, third or vendor) for investigation and diagnosis to find and
implement a solution/workaround for the incident. The relevant support group updates
the incident record with the solution/workaround.
Note: If a solution can be provided, the solution should be identified as a possible
entry for inclusion into the Knowledge Base, which will assist future diagnoses.

2.4.5 Resolve Incident


If a resolution of the incident can be found and implemented within SLA time by the
relevant second, third or vendor support groups, the incident will be given a status of
resolved. This will trigger an auto-generated email informing the client that their
incident has been resolved.
Part of this activity is updating Service Desk tool with the resolution information.
Accurate and complete recording of details is very important and is a requirement of all
parties involved in the Incident Management process. Quality information is critical for
future incident handling and restoration of normal business activities.

2.4.6 Incident Escalation


If an incident cannot be resolved by the Service Desk staff, the incident should be
transferred to second or third tier support groups. Service Desk remains responsible for
the incident until it is transferred to a second tier support group.
Transfer means involving second or third tier support groups in the resolution of the
incident. When incidents are transferred, the assigned owner of the incident is
responsible for keeping the client informed of progress. This is done by any
appropriate means: face to face, telephone, manual notification from Service Desk tool
or email. In addition, all the relevant service groups are responsible for ensuring all
activity relating to an incident is suitably annotated.
There may be occasions when an incident is escalated to an incorrect group. It is the
assigned group's responsibility to transfer the incident to the correct group and to provide
information to the analyst who initially assigned the incident incorrectly, to enable them
to assign incidents of this nature correctly in the future. If the incorrectly assigned group
does not know which group to assign the incident to, it is appropriate to
transfer the incident to Service Desk for further investigation.
When Priority 1 and 2 incidents are assigned or transferred, the analyst assigning or
transferring should make a phone call to advise the new assigned owner that a Priority 1
(or 2) incident has been assigned or transferred to them.
Depending on the priority, hierarchical escalation might take place as well. Hierarchical
escalation (awareness) means that higher levels of management are involved when
there is a threatened breach of service levels or additional authorisation is required for
incident resolution.
Explanation of incident notification and escalation process is given in section 2.6.1.
Note: Updating the Internal Notes does not generate an email from Service Desk tool.
Only confirmation, requestor communications, resolution and closure emails are sent to
the client.

2.4.7 Verify Resolution and Incident Closure


The client will be informed once the incident is resolved. The client has three business
days (Monday to Friday, excluding holidays) to confirm that the incident can be closed.
If the client does not agree that the incident has been resolved, it will be reopened and
the process will return to Investigation and Diagnosis.
If the client does not respond to the email requesting incident closure, the incident will
automatically be closed after three business days and the status changed to closed.
If the client responds after the incident closure, a new incident record will be created.
Incidents with status of closed should not be reopened. A new incident should be
created that refers to the original Incident record.
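
The three-business-day confirmation window can be worked out as in the sketch below. It assumes that counting starts on the business day after the incident is marked resolved and that holidays are supplied by the caller; the handbook states neither detail, and the function names are illustrative.

```python
from datetime import date, timedelta
from typing import AbstractSet

CONFIRMATION_WINDOW_DAYS = 3  # from the handbook: three business days


def add_business_days(start: date, days: int,
                      holidays: AbstractSet[date] = frozenset()) -> date:
    """Return the date `days` business days after `start` (Mon-Fri, no holidays)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5 and current not in holidays:
            remaining -= 1
    return current


def auto_close_date(resolved_on: date,
                    holidays: AbstractSet[date] = frozenset()) -> date:
    """Date on which an unconfirmed resolved incident would be closed."""
    return add_business_days(resolved_on, CONFIRMATION_WINDOW_DAYS, holidays)
```

For an incident resolved on a Thursday, the window runs over Friday, Monday and Tuesday, so the incident would close on the Tuesday unless one of those days is a holiday.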

2.4.8 Monitoring and Tracking


The end to end progress of the incident is monitored and communicated to the client
when necessary. The Service Desk tool is updated each time the status of the incident
changes. Clients can monitor the status of their Incidents via Service Desk tool
accessible from Staff portal.

2.5 Response and Resolution Times


Response Time is the elapsed time between when the incident is recorded and when
work commences on investigation, diagnosis and resolution of the incident. The
response time can be utilised to:
- Research a solution
- Mobilise a priority team
- Request further details from the client
- Advise action taken and provide an indication of the resolution time if required
Resolution Time is the target time for a resolution to an incident to be implemented.
Information Services aims to resolve at least 80% of incidents inside the resolution
times specified. This will allow for exceptional circumstances that cannot be met using
the standard times. Official closure of the incident is dependent on the approval of the
client. Solution time therefore does not include the time taken for the client to contact
Information Services to give approval, as this could happen some time later.

[Figure: incident timeline. Milestones: Incident; Detection & report to Service Desk; Start Repair / Diagnosis; Finish Repair; Recovery. Intervals: Detection time, Response time, Repair time, Recovery time.]


The following table provides an overview of response and resolution times for
incidents. The times listed below relate to response and resolution times of the support
groups involved during standard business hours, Mondays to Fridays. Time required by
external support service providers (non Information Services) or purchasing time
(should there be the need for the acquisition of parts and/or materials) is excluded.

Priority | Response Time | Resolution Time
Priority 1 | 30 minutes | 4 hours
Priority 2 | 1 hour | 8 hours
Priority 3 | 4 hours | 12 hours
Priority 4 | 1 day | 3 days
Priority 5 | 2 days | 5 days
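
The targets in the table above, together with the aim of resolving at least 80% of incidents inside the resolution time, can be expressed as a small lookup and compliance check. This is a sketch under stated assumptions: days are converted to hours using the 9-hour working day given in the note to Table A, elapsed hours are taken to be business hours already, and the names `SLA_TARGETS`, `resolved_in_time` and `compliance_rate` are invented for the example.

```python
from typing import Dict, Iterable, NamedTuple, Tuple

WORKING_DAY_HOURS = 9  # 1 working day = 9 hours (8am to 5pm)


class Target(NamedTuple):
    response_hours: float
    resolution_hours: float


# Response and resolution targets per priority, in business hours.
SLA_TARGETS: Dict[int, Target] = {
    1: Target(0.5, 4),
    2: Target(1, 8),
    3: Target(4, 12),
    4: Target(1 * WORKING_DAY_HOURS, 3 * WORKING_DAY_HOURS),
    5: Target(2 * WORKING_DAY_HOURS, 5 * WORKING_DAY_HOURS),
}

COMPLIANCE_GOAL = 0.80  # at least 80% resolved inside the resolution time


def resolved_in_time(priority: int, resolution_hours: float) -> bool:
    """True if the incident met the resolution target for its priority."""
    return resolution_hours <= SLA_TARGETS[priority].resolution_hours


def compliance_rate(incidents: Iterable[Tuple[int, float]]) -> float:
    """Share of (priority, resolution_hours) pairs resolved within target."""
    results = [resolved_in_time(p, h) for p, h in incidents]
    if not results:
        raise ValueError("no incidents to report on")
    return sum(results) / len(results)
```

A reporting period meets the goal when `compliance_rate(...) >= COMPLIANCE_GOAL`.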

2.6 Escalation
Escalation can take place in two ways:
1. Functional escalation: This is escalation to another support group in
order to solve the incident.
2. Hierarchical escalation: This is escalation in order to inform the right
(management) level within Information Services for communication purposes and
in order to free up the necessary resources to solve the incident.

[Diagram: vertical escalation corresponds to hierarchical escalation; horizontal escalation corresponds to functional escalation.]

2.6.1 Incident Notification and Escalation


Notification of Incidents occurs at defined times. Notification ensures that:
- The Business (including management) and clients are kept informed of the
  occurrence and progress towards resolution of an incident;
- Swift action is taken to resolve the incident;
- Management provide necessary resources to resolve the incident.


Notification is generated, depending on the priority, when:
- A higher priority incident is recorded in the Service Desk tool (certain
  management levels are notified);
- An incident is assigned or transferred to a group using the Service Desk tool;
- 75% of the target resolution time has elapsed since the Incident was recorded and updated;
- The SLA is breached.
Table A details when internal notification occurs for 75% resolution time elapsed
and SLA breaches for all priorities.
Table B details when additional internal and external notifications for each
priority occur.
Table A
Event & Priority | Notification sent to | Method
75% of target resolution time elapsed (1 = 3 hours; 2 = 6 hours; 3 = 9 hours; 4 = 2.25 days; 5 = 3.75 days) | The Analyst the Incident is assigned to; Group Manager of the assigned group; Category Owner of the Incident Category | Email generated by Service Desk tool
100% - SLA Breached (1 = 4 hours; 2 = 8 hours; 3 = 12 hours; 4 = 3 days; 5 = 5 days) | The Analyst the Incident is assigned to; Group Manager of the assigned group; Category Owner of the Incident Category | Email generated by Service Desk tool
Weekly | Report of all Incidents recorded in this period, including breaches, sent to Group Managers | Reports generated from Service Desk tool

Note: 1 working day = 9 hours, 8am to 5pm. The clock does not continue outside of
these hours for the purpose of escalation notification.
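
The Table A thresholds follow from the resolution targets: 75% of 4 hours is 3 hours, and 75% of 3 working days (27 hours) is 20.25 hours, or 2.25 days. The sketch below shows one way to run the escalation clock only between 8am and 5pm on weekdays and to decide which Table A event applies. It is an illustration, not the Service Desk tool's logic: holidays are ignored, datetimes are assumed to be naive local times, and the function names and event labels are made up for the example.

```python
from datetime import datetime, time, timedelta

DAY_START = time(8, 0)   # working day starts at 8am
DAY_END = time(17, 0)    # working day ends at 5pm
# Resolution targets in business hours (days converted at 9 hours each).
RESOLUTION_HOURS = {1: 4, 2: 8, 3: 12, 4: 27, 5: 45}


def business_hours_between(start: datetime, end: datetime) -> float:
    """Hours elapsed between start and end, counting only Mon-Fri, 8am-5pm."""
    total = timedelta()
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday to Friday
            window_start = max(start, datetime.combine(day, DAY_START))
            window_end = min(end, datetime.combine(day, DAY_END))
            if window_end > window_start:
                total += window_end - window_start
        day += timedelta(days=1)
    return total.total_seconds() / 3600


def notification_event(priority: int, recorded: datetime, now: datetime) -> str:
    """Return which Table A event, if any, applies at time `now`."""
    elapsed = business_hours_between(recorded, now)
    target = RESOLUTION_HOURS[priority]
    if elapsed >= target:
        return "sla_breached"
    if elapsed >= 0.75 * target:
        return "75_percent_elapsed"
    return "none"
```

For example, a Priority 1 incident recorded at 4pm on a Friday reaches the 75% mark at 10am on the following Monday, because only one hour of Friday and two hours of Monday count towards the clock.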


Table B
Priority
1&2

After
Immediately after
an incident has
been assigned to a
group or
transferred to
another group

Notification sent to:

Method

Group Manager of the


assigned group

Email generated by Service


Desk tool

Category Owner of the Incident


Category

SDT Announcement is
created to notify INS staff
using SDT and or Griffith
University Community

Incident Management Process


Owner (Manager Library and IT
Help)
Service Desk Manager (Library
and IT Help Management
Team)
The Analyst the Incident is
assigned to

30 minutes
(if Incident has not
been responded to
or appropriately
updated)

Product Manager, Team Leader


or Duty phone of the assigned
group

Analyst who assigns or


transfers the incident to
another support group
notifies the group via
telephone.

All INS Directors & Associate


Directors

Email generated by Service


Desk tool

PVC (INS)
Group Manager of the
assigned group
Category Owner of the Incident
Category
Incident Management Process
Owner (Manager Library and IT
Help)
Service Desk Manager (Library
and IT Help Management
Team)
The Analyst the Incident is
assigned to

Business/Clients

Library and IT Help phone


greeting may require
updating to include
information relating to the
Incident.
SDT Announcement is
created to notify INS staff
using SDT and or Griffith
University Community

1 hour (if Incident

The Analyst the Incident is

Email generated by Service


Priority

After
has not been
responded to or
appropriately
updated)

Notification sent to:


assigned to

Method
Desk tool

Team Leader of the assigned


group
Business/Clients

SDT Announcement is
updated with a progress
report of the Incident.
Update Library and IT Help
phone greeting to keep
clients informed on
progress of the incident.
Email from the relevant
Director/PVC (INS) might
be sent to all staff and all
students

Priority
1

After
Hourly (if Incident
has not been
responded to or
appropriately
updated)

Notification sent to:

Method

Team Leader of the assigned


group

Email generated by Service


Desk tool

Business/Clients

SDT Announcement is
updated with a progress
report of the Incident.
Update Library and IT Help
phone greeting to keep
clients informed on
progress of the incident
Update email from the
relevant Director/PVC
(INS) might be sent to all
staff and all students

Immediately after an
incident has been
assigned to a group
or transferred to
another group

Product Manager, Team


Leader or Duty phone of the
assigned group

Analyst who assigns or


transfers the incident to
another support group
notifies the group via
telephone.
SDT Announcement is
created to notify INS staff
using L&ITH@G and or
Griffith University
Community

30 minutes after an

Team Leader of the assigned

Email generated by Service


Priority

Notification sent to:

After
incident has been
assigned to a group
or transferred to
another group

group

Method
Desk tool

Incident Management Process
Owner
Service Desk Manager
(Library and IT Help
Management Team)

1 hour
(if Incident has not
been responded to
or appropriately
updated)

All INS Directors & Associate


Directors

Email generated by Service


Desk tool

PVC (INS)
Group Manager of the
assigned group
Category Owner of the
Incident Category
Incident Management
Process Owner (Manager
Library and IT Help)
Service Desk Manager
(Library and IT Help
Management Team)
The Analyst the Incident is
assigned to

2 hours
(if Incident has not
been responded to
or appropriately
updated)

3-5

Immediately after an
incident has been
assigned to a group
or transferred to
another group

Business/Clients

Library and IT Help phone


greeting may require
updating to include
information relating to the
Incident.

The Analyst the Incident is


assigned to

Email generated by Service


Desk tool

Team Leader of the assigned


group
All INS Directors & Associate
Directors
The Analyst the Incident is
assigned to

Email generated by Service


Desk tool

Group Manager of the


assigned group


2.7 Major Outage Procedure


A major outage is an incident that results in significant disruption to Griffith University
staff and students.
It impacts the majority of university enterprise systems and the majority or all of the
university's clients. It is important to note that although a major outage is classified as a
Priority 1 incident, not all Priority 1 incidents are necessarily major outages. Please refer
to 2.3 Incident Classification for further clarification.

2.7.1 Major Outage Procedure during Business Hours


Whenever a major outage occurs a Priority 1 incident will be recorded and escalated
immediately for:
- Functional escalation to the relevant specialist group (2nd or 3rd tier support) to
  resolve the Incident
- Hierarchical escalation to the Manager, Library and IT Help (Incident Management
  Process Owner) for awareness and relevant communication to the business
The Manager, Library and IT Help (Incident Management Process Owner) or relevant
director/associate director will coordinate the communication within Information
Services about the major outage and liaise with other product and service managers to
ensure that adequate resources will be made available to resolve the Priority 1 incident
as soon as possible.
If the incident has not been responded to or appropriately updated within 30 minutes,
the Manager, Library and IT Help (Incident Management Process Owner) or relevant
director/associate director will liaise with the responsible Product Service Manager/s to
ensure that a Priority Team which includes all technical specialists is mobilised. The
Priority Team is responsible for:
- Being a communication contact point
- Ongoing communication with the relevant business units/managers about the
  status of the incident (a status update will be provided every hour to the business)
- Resolution of the incident
- Continuous information, as the situation changes, to the PVC (Information
  Services), Director and Associate Directors and the Service Desk about the status of the
  incident
- A detailed report about the cause and resolution of the major outage after the
  resolution and closure of the incident.

2.7.2 Major Outage Procedure outside Business Hours


Recording, resolution and communication of major outages outside business hours is
the responsibility of the appropriate on call team. The on call team will advise the Product
and Service Manager responsible for that service, who will contact the relevant Associate
Director/Director. It is the responsibility of the Associate Director/Director to advise the
Pro Vice Chancellor, Information Services.
If a major outage is likely to continue into business hours, the Manager, Library and IT Help
(Incident Manager) needs to be advised to take over communication with clients and
Information Services staff with a Priority Team.

2.7.3 Roles and Responsibilities


Staff who will normally be involved in the resolution of a major outage, and in the
coordination of the resolution and communication processes, are:
Incident Manager
PVC, Director, Associate Director
Product Service Manager
Technical specialist
On Call Staff
Vendor staff

2.7.4 Major Outage Communication Guidelines


2.7.4.1 Purpose
The purpose of communication during a major outage is to immediately investigate and
confirm the impact and severity of the Incident. It should also confirm that the Incident
is a major outage and an emergency situation.

2.7.4.2 Frequency
The frequency of communication will be determined by the impact and severity of
the outage. Frequency of communication for Priority 1 and Priority 2 incidents is outlined
in section 2.6.1 Incident Notification and Escalation, Table B above.

2.7.4.3 Content
The content will depend on the audience. Communication with the staff who are tasked
with the resolution of the incident will be internally focussed, detailed and will contain
clear actions and timelines. Communication with clients will focus on the impact and
what is being done to minimise the impact and resolve the incident. It should not
contain technical descriptions or detailed information about internal processes.
The content will focus on:
The nature and extent of the outage
Assessment of the impact
High level overview of actions taken to resolve the incident
Estimated resolution time
Confirmation that the incident has been resolved

2.7.4.4 Communication channels


Every possible communication channel available at the time of the major outage should
be used to communicate with the clients and Information Services staff:
Email
Web page
Phone
Public Announcement System
Face to face (response team meetings, meetings with staff tasked with the
resolution of the Incident, meetings with Information Services staff impacted by the incident,
CTS Team Leaders informing staff in schools, Library staff informing clients in the Library,
etc.)
Printed notices
Notice boards
SMS message to key stakeholders advising them who to contact for more
information or updates


3 Roles and Responsibilities


3.1 Manager, Library and IT Help (Incident Management Process Owner)
The Manager, Library and IT Help (Incident Management Process Owner) has
responsibility for the Incident Management process.
The Manager, Library and IT Help (Incident Management Process Owner) has the
following responsibilities:
Monitors Incident Management process
Determines scope of the Incident Management process
Establishes Incident Management procedures
Establishes prioritisation and escalation criteria
Monitors incident escalations
Establishes links to other service management disciplines
Monitors trends and takes appropriate action
Produces high level management reports about Incident Management
Reviews Service Desk procedures
Organises reviews and audit of process
Initiates improvement programmes
Liaises with Library and IT Help Team Leaders (Service Desk Manager)
Liaises with Problem Manager
Produces information for clients
Liaises with Change Manager and Service Level Manager over proposed changes

3.2 Library and IT Help Campus Coordinators (Service Desk Manager)


All Library and IT Help Campus Coordinators (Service Desk Manager) have the
responsibility for the Service Desk tool and act as the line managers of Service Desk
staff.
The Library and IT Help Campus Coordinators (Service Desk Manager) have the
following responsibilities:
Monitor the quality of delivered services
Determine the organisation, structure and scope of the Service Desk in
consultation with the Manager, Library and IT Help (Incident Management Process Owner)
Manage Service Desk staff
Review staffing levels
Review skill requirements
Organise training
Coordinate management reporting about Service Desk function (includes process
reporting)

Are responsible for internal and external communication (clients, Information
Services, Service Desk)
Liaise with Product Managers
Liaise with the business
Promote Service Desk
Establish links to service management processes
Organise client satisfaction surveys (together with process owners)
Liaise with other support teams providing technical resources
Participate in service desk tool selection, tailoring and installation

3.3 First Tier (Service Desk) Support

Service Desk provides first tier support. First tier support is responsible for:
Incident and service request registration
Initial support and classification
Resolution and recovery of incidents
Escalation of incidents when necessary
Monitoring of the status and progress toward resolution of all open incidents
Following up on behalf of clients about progress towards resolution
Monitoring of response and resolution times
Closure of incidents (This process has been automated. Please refer to 2.4.7)
Keeping affected clients informed about progress
Quality checking of closed incidents
Identifying the need for and creating/editing of Knowledge Base documents to
assist with a timely response for future incidents
Proactive relationship management with the clients
Communication with clients about Information Services issues and service
requests (status updates)
Advising and assisting Information Services clients to make best use of services
provided by Library and IT Help
Encouraging use of self-help resources
Note: Library and IT Help is the Information Services product/service line that provides
first tier support for the majority of Information Services products and services.
For the purpose of the Incident Management Process, Library and IT Help is defined as the
Service Desk. However, all groups involved in the resolution of Incidents and using the
Service Desk tool have the responsibility to follow Service Desk processes. For
example, all groups using the Service Desk tool should record Incidents they detect and
those Incidents reported to them by other groups within Information Services that are
not currently using the Service Desk tool.

3.4 Resolution Groups (Second or Third Tier Support Groups)


These groups comprise technical specialists who hold strong relationships with other
areas within Information Services and have good skills in analysing incidents and
problems.
For Incident Management, resolution groups are of two types:
On site support
Technical support
with the following responsibilities:

3.4.1 On Site Support (Second or Third Tier Support Groups)


On site support is responsible for:
Resolution and recovery of incidents that need support at a physical location
Escalating incidents where necessary (Hierarchical escalation)
Escalation to another support group if necessary (Functional escalation)
Resolution and recovery of assigned Incidents
Monitoring the status and progress towards resolution of all open Incidents
assigned to their group
Communicating solutions and workarounds to Library and IT Help (Service
Desk/First Level Support) to assist in Incident classification, initial support and escalation
Promotion of the Library and IT Help (Service Desk/First level Support) and
Information Services on location (e.g. Informing clients of correct channels for reporting
incidents and how to obtain updates about outages and the status of requests)
Keeping clients informed about the status of the Incident assigned to their group
Accurate and complete recording of Incidents from and to other internal groups
within Information Services, using the Service Desk tool
Accurate and complete updating of activities and steps taken to resolve the
Incident.

3.4.2 Technical Support (Second or Third Tier Support Groups)


Technical support is responsible for:
Escalating service requests where necessary
Resolution and recovery of assigned incidents
Monitoring the status and progress toward resolution of all open incidents assigned
to their group
Monitoring tasks of servers and network components and applications
Keeping affected clients informed about progress
Escalation to another support group if necessary
Communicating solutions and workarounds to On Site Support and Library and IT
Help (Service Desk/First Tier Support)
Communicating incidents generated by monitoring tools to Library and IT Help
(Service Desk/First Tier Support)

Providing monitoring and diagnosis tools to Library and IT Help (Service Desk/First
Tier Support)
Creating knowledge base documents and providing relevant training to Service
Desk staff
Providing any other information to Library and IT Help (Service Desk/First Tier
Support) to assist in incident classification, initial support and escalation
Accurate and complete recording of Incidents from and to other internal groups
within Information Services, using the Service Desk tool
Accurate and complete updating of activities and steps taken to resolve the
Incident.
ARCI MATRIX

Procedural Activities

Incident submitted to Service Desk
Incident detection and recording
Incident process
Request for Information process
Give the client a reference number
Initial support and classification
Escalation to right support group
Communicate status updates to client
Investigation and diagnosis
Escalate using escalation procedure
Resolution and recovery
Client approval of solution
Closure

Defined Procedural Roles

1st Tier
2nd Tier
3rd Tier

R
R
R

R
R
R

R
R
R

A
A
A

I
I

Service Desk Manager
Incident Management Process Owner
Group Managers
Client

R,C

R,C

R,C

R,C

I,C

R,C

R,C

C,R

R
R
R

R,C
R
I

R,C
R
I

C,R
I
A

A
R
I

R
I

I
A
R

Explanation of Roles
Accountable

The person in this process who has the accountability for ensuring the overall
process is available, understood and performed correctly

Responsible

The person(s) who are expected to perform the prescribed activity, resolve and/or
escalate the related issues. Multiple levels within the matrix can do this

Consulted

The person(s) who are consulted before decisions are made or implementations
carried out

Informed

The person(s) who need to be informed about the prescribed activity


3.4.3 Product and Service Managers


Product and Service Managers are responsible for:
Monitoring the status and progress toward resolution of all open incidents assigned
to their group
Ensuring that adequate resources are available for efficient and effective resolution
of incidents assigned to their group
Monitoring of, and adherence to, response and resolution times of incidents
assigned to their groups
Monitoring client feedback for their product and service group and following up on
negative client feedback
Ensuring that adequate resources are available to resolve Priority 1 incidents
assigned to their groups as soon as possible
When a Priority 1 incident has not been resolved within 30 minutes, liaising with the
Manager, Library and IT Help (Incident Management Process Owner) about the situation and
mobilisation of a Priority Team, which can include technical specialists from all product and
service groups.


4 Communication Framework
This section consists of two parts:
1. Communication framework: describes the different communication lines between
Information Services and the business on an operational, tactical and
strategic level
2. Relationship with other processes: details the input and output flows between Incident
Management and the other processes (Problem, Change, Configuration and
Service Level Management).

4.1 Communication Framework


The Communication Framework provides a high level overview of the different
communication lines between Information Services and the business on an operational,
tactical and strategic level. The communication framework for the Incident
Management process is shown below.

[Figure: Communication framework for the Incident Management process. Levels: Strategic,
Tactical, Operational. Parties shown: Business Director, Business Manager, End User,
PVC (INS), DSD, Incident Problem Manager, Service Desk. Information exchanged includes:
Exception Reporting, KPIs, INS Strategic Plan, Griffith Strategic Plan, INS Budget, SLAs,
Business Needs, Service Levels, Service Catalogue, SLA Reporting, Request for Service
Outside SLA, Client Satisfaction Survey, Client Needs, Incident Reporting, Incidents,
Outages, Service Delivery News, Training.]


4.2 Relationship with Other Processes


The following picture provides an overview of how Incident Management fits into the
Service Support processes as described by ITIL.

[Figure: Incident Management in relation to the Service Desk, IT Operations, Problem
Management, Change Management, Configuration Management and Service Level Management.
Labelled flows: Incident, RFC.]

The main input and output flows between Incident Management and the other
processes that will be implemented at Griffith University (Problem, Change,
Configuration and Service Level Management) are detailed in this section.

4.3 Incident Management & Problem Management


4.3.1.1 Input
Information needed from the Problem Management process by Incident Management
includes:
Resolutions for Incidents
Workarounds for Incidents
Knowledge Base (known errors, existing resolutions, accepted workarounds)

4.3.1.2 Output
Information provided to Problem Management by Incident Management includes:
Incident details (affected systems, affected clients, classification, details of the
error)
History of occurred incidents
Proposed workarounds for incidents
Proposed solutions for incidents


4.3.2 Incident Management & Change Management


4.3.2.1 Input
Information needed from the Change Management process by Incident Management
includes:
Change schedule
Status update of scheduled changes
Result of implemented changes (history)

4.3.2.2 Output
Information provided to Change Management by Incident Management includes:
Accepted RFCs
Advice on incidents resulting from an implemented change (feedback)

4.4 Incident Management & Configuration Management


4.4.1.1 Input
Information needed from the Configuration Management process by Incident
Management includes:
Details of Configuration Items (CIs)
Relationships between CIs
Service levels for CIs
Service contact details

4.4.1.2 Output
Information provided to Configuration Management by Incident Management includes:
Errors or discrepancies in the Configuration Management Database (CMDB)
Relationship between incidents and CIs


4.5 Incident Management & Service Level Management


4.5.1.1 Input
Information needed from the Service Level Management process by Incident
Management includes:
Service Levels / KPIs
Business priority escalations
Service Catalogue
Client satisfaction / feedback about the Incident Management process
Communication about new services

4.5.1.2 Output
Information provided to Service Level Management by Incident Management includes:
Incidents outside Service Level Agreement (SLA)
Requests for service outside SLA (new ad hoc business requirements)
Client satisfaction
Exceptions to SLAs
Escalation of priority calls
Process information (management reporting)
KPI reporting


5 Performance Management
The following Key Performance Indicators (KPIs) have been set for the Incident
Management process:
80% of incidents responded to within SLA (response time)
80% of incidents resolved within SLA (resolution time)
100% of non-pending incidents must have an updated activity log less than 2 days old
90% of incidents to follow predefined Incident Management process
The following Key Performance Indicators (KPIs) have been set for the Service Desk:
95% of calls to be answered within 10 seconds
Client Satisfaction Survey to return average rating of 4 or higher (on a 5 point
scale)
100% of e-mailed incidents to be recorded within 24 business hours after email
received
80% of incidents solved at 1st tier
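
As an illustration of how several of these targets could be checked against incident
data, the following Python sketch computes four of the percentages and reports which
targets were missed. The record layout, field names and function names are assumptions
made for the example; they are not taken from the Service Desk tool. The sketch also
assumes that the resolution KPIs are measured over resolved incidents only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    # Hypothetical record layout; real Service Desk tool fields may differ.
    responded_within_sla: bool
    resolved_within_sla: bool
    resolved_tier: Optional[int]   # 1, 2 or 3; None while unresolved
    pending: bool
    last_activity_at: datetime


# KPI targets (percent) as listed in section 5.
TARGETS = {
    "responded_within_sla": 80.0,
    "resolved_within_sla": 80.0,
    "activity_log_current": 100.0,
    "resolved_at_first_tier": 80.0,
}


def percent(part, total):
    """Percentage of part in total; 0.0 when there is nothing to measure."""
    return 100.0 * part / total if total else 0.0


def kpi_summary(incidents, now):
    """Compute the KPI percentages for a list of Incident records."""
    resolved = [i for i in incidents if i.resolved_tier is not None]
    non_pending = [i for i in incidents if not i.pending]
    current = [i for i in non_pending
               if now - i.last_activity_at < timedelta(days=2)]
    return {
        "responded_within_sla": percent(
            sum(i.responded_within_sla for i in incidents), len(incidents)),
        "resolved_within_sla": percent(
            sum(i.resolved_within_sla for i in resolved), len(resolved)),
        "activity_log_current": percent(len(current), len(non_pending)),
        "resolved_at_first_tier": percent(
            sum(i.resolved_tier == 1 for i in resolved), len(resolved)),
    }


def missed_targets(summary):
    """Return the names of KPIs that fall below their target, sorted."""
    return sorted(name for name, target in TARGETS.items()
                  if summary[name] < target)
```

The call-answering and client satisfaction KPIs are left out because they depend on
telephony and survey data rather than incident records.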


6 Management Reports
In this section the management information provided by the Incident Management
process is specified.

6.1 Management Reports


Management reporting takes place on a daily, weekly and monthly basis about the
following subjects:
Daily

Weekly

Monthly

All Exceptions

All Exceptions

All Exceptions

Critical Issues

Incident Summary

Availability of services

Open Incident

Current Number of Users of Service Desk tool

Closed Incident

Implemented Improvements

New Incident

Incidents Summary Rolling Trend

Group Breakdown

Incident Reporting, Incidents by:

Priority 1 Incident

Status

SLA Exceptions

Category

Priority

Service Group

Incident Summary (Detailed KPIs)


Performance Against SLA
Priority 1 Incidents
Recommendations
Top 10 Service Desk tool users

Management reports are submitted:


On a daily basis to all staff involved in the Incident Management process
On a weekly basis to Information Services management
On a monthly basis to Information Services management, the Library and IT Help
Team Leaders (Service Desk Manager) and the Service Level Manager
These management reports are used to monitor the success of the Incident
Management process and to identify any problems with the process.


6.1.1 Incident Management Process Reports


The following reports will be available for the Incident Management process.

Metric: Total number of incidents recorded in the Service Desk tool (open and closed)
Metric Use: Gives an indication of the overall workload of the Information Services
staff.

Metric: Total number of incidents recorded per Information Services service group (open
and closed)
Metric Use: Gives a breakdown of incidents logged to show which departments
are requiring the most support. Further analysis can be carried out to
drill down into problem departments to identify key groups that need
more assistance than average.

Metric: Number of incidents recorded per category (open and closed)
Metric Use: Helps show which parts of the infrastructure are creating the most
incidents. Useful to identify areas that could require detailed analysis
to remove common problems.

Metric: Percentage and number of incidents resolved within service level times
Metric Use: An important measure that indicates the level of service that is being
provided to the clients of Information Services.

Metric: Percentage of incidents per priority code
Metric Use: Will show the workload per priority code and hence service level. This
data can be used to determine staffing levels, costs of services or the
review of the priority codes defined.

Metric: Percentage of incidents resolved at first, second and third tier
Metric Use: Shows where the support work is taking place. It can be a useful
metric especially for the Service Desk as they take work away from
second and third level, freeing them up for more pro-active tasks.

Metric: Number of incidents assigned per Information Services group/staff member
(open and closed)
Metric Use: Shows the amount of work that different staff members and groups are
processing.

Apart from the specific metrics above the following reports can be made available.
Results of client satisfaction surveys
A summary of any major outages, the actions taken to fix the outage and steps to
ensure that this will not occur again.
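
Several of the metrics in this section are percentage breakdowns of incidents by a single
attribute (priority code, category, resolving tier). A minimal Python sketch of such a
breakdown follows; it assumes incidents are available as dictionaries, and the field
names used ("priority", "tier") are hypothetical rather than actual Service Desk tool
fields.

```python
from collections import Counter


def percentage_breakdown(incidents, field):
    """Return the share of incidents per distinct value of `field`,
    as percentages rounded to one decimal place."""
    counts = Counter(incident[field] for incident in incidents)
    total = sum(counts.values())
    return {value: round(100.0 * count / total, 1)
            for value, count in counts.items()}


# Illustrative records only; not real incident data.
incidents = [
    {"priority": 1, "tier": 1},
    {"priority": 2, "tier": 1},
    {"priority": 2, "tier": 2},
    {"priority": 3, "tier": 1},
]

by_priority = percentage_breakdown(incidents, "priority")
by_tier = percentage_breakdown(incidents, "tier")
```

The same function could serve the per-category and per-group metrics by passing a
different field name, which keeps the reporting logic in one place.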


7 Process Review
In order to maintain continuous improvement of the Incident Management process, an
ongoing review is essential. A detailed Incident Management review should be
undertaken every six months.
These reviews should take place to ensure quality is maintained or improved. The
following steps should take place for each review.
Gather data and information from the Service Desk tool, Client Satisfaction
Surveys and input from Incident Management staff. Input for the process review should be
pro-actively sought from the Information Services staff by the Library and IT Help Manager
(Incident Management Process Owner).
Analyse the data, looking at areas such as client satisfaction, suggestions from
staff and process metrics. Some specific metrics to look at include those listed in Sections 5
"Performance Management" and 6 "Management Reports" and the following list.
Correct logging of incident data in terms of incident categories, priority and specific
information relating to the incident
Review of Incident Management reports such as performance against service
levels.
State of the Incident Management Process Handbook (Up to date or awaiting
review)
Problems identified with the process
From the above analysis identify improvement opportunities and put forward a
report detailing the suggestions. This report should be submitted to the Information Services
Management.
Following approval of the improvements documented in the above-mentioned report, a
Process Improvement Plan should be formulated.
Changes to the Incident Management process should be authorised by the
Change Management process. Once authorised, implementation of changes should occur.
Following an improvement project, a Post Implementation Review should be carried
out. This review should revisit the reports analysed earlier to confirm that the
improvements have made a positive change.


Appendix A: Incident Management Quick Reference Card


Appendix B: Incident Categories


Categories are updated on a continuous basis to reflect the categories in the Service Desk tool.


8 Appendix C: Abbreviations and Definitions


8.1 Abbreviations and Acronyms Used
Abbreviation

Definition

INS

Information Services

ITIL

Information Technology Infrastructure Library

RFC

Request For Change

SLA

Service Level Agreement

8.2 Definition of Terms Used


Term

Definition

Business Manager

A person authorised to make decisions on behalf of an organisational
unit concerning a service and its associated service levels.

Change

Any action, either physical or procedural, which modifies or impacts the
production environment.

Change Management

The management and control of changes to the production
environment, in order to minimise the impact of change-related
problems.

Change Manager

The person responsible for processing change requests, chairing
Change Advisory Board meetings, coordinating changes and reporting
change activity to management.

Classification

Determining the value of items by placing them in a certain order on
the basis of category, impact, and severity. It can be used to support
decisions concerning priorities.

Configuration Item (CI)

A component of an IT infrastructure. CIs may vary widely in
complexity, size and type, from an entire system (including all
hardware, software and documentation) to a single software module
or a minor hardware component.

Configuration Management

The process of identifying and defining the CIs in a system, recording
and reporting the status of CIs, and verifying the completeness and
correctness of CIs.

Incident

ITIL Definition: Any event that deviates from the standard and
expected operation of an IT system or service.
Further Description: An incident can be seen as a client requesting
help for something that is not working. For example, "I can't print" or
"I can't access the Internet". In any situation where something does not
work and the specific details are not known, it is an incident.

Incident Management

The process that has as primary focus to restore normal service
operation as quickly as possible and minimise the adverse impact on
business operations.

Incident Recording

The quality recording of incidents in such a way that other activities
and increased service provision are possible.


Incident Reporting

The reporting of incidents and requests by clients and/or support groups.

Information Services (Information Services)

Information Services encompasses the supporting technologies and
infrastructure on which the systems are run.

Information Technology Infrastructure Library (ITIL)

ITIL is a non-proprietary framework tailored to the operation of the IT
infrastructure, developed by the UK Office of Government Commerce.
It is a set of comprehensive, consistent and coherent codes of best
practice for IT Service Management.

Information Services

Element of Griffith University that encompasses the supporting
technologies and infrastructure on which the systems are run and
services provided.

Information Services infrastructure

The sum of an organisation's IT-related hardware, software, data
communication facilities, procedures, and people.

Information Services service

A described set of facilities, IT and non-IT, supported by the Information
Services (service provider) that fulfils one or more needs of the client and
that supports the client.

Key Performance Indicator (KPI)

Key Performance Indicators are clearly defined objectives with
measurable targets, set to judge process performance.

Known Error

ITIL Definition: The successful diagnosis of the root cause of a
Problem (i.e. the specific infrastructure component at fault has been
identified).
Further Description: A known error is logged when the specific root
cause is known for a group of problems/incidents or a single major
problem. The known error record will exactly define what has gone
wrong and the solution so that it does not happen again. Continuing
our example, a known error would be: "There is a fault with the
network card in the printer in department X that is causing the printing
problems." A Known Error is more defined than a Problem.

Problem

ITIL Definition: The unknown underlying cause of one or more
incidents. More specifically, a condition identified as a result of
multiple incidents that exhibit common symptoms, or of a single
significant incident indicative of a single error.
Further Description: A problem is a more specific definition of
something that has gone wrong. Quite often a number of similar
incidents are linked to a common problem. In the case where a
number of clients are not able to print, a problem will be defined
saying something like "there is a problem with the network in
department X causing printing problems". A Problem is more defined
than an Incident.

Problem Management

The process that has as primary focus to minimise the adverse impact
of Incidents and Problems on the business that are caused by errors
within the IT Infrastructure, and to prevent recurrence of Incidents
related to these errors.

Process

A connected series of actions, activities or operations performed with
the intent of satisfying a purpose to achieve a goal.

Release Management

The process that has as primary focus to securely control the physical
and logical storage, management, distribution and implementation of
all software assets, ensuring that only currently authorised and quality
checked versions of software are actually brought into use in the
production environment at minimal cost.


Request For Change (RFC)

A form or screen used to record details of a request for a change to
any component of an IT infrastructure.

Service Catalogue

Written statement of services, default service levels and options.

Service Desk

Information Services organisational unit that makes its services
accessible to clients. The Library and IT Help product service group is this
unit. All other Information Services groups (e.g. S3, NCS, CTS)
contribute to the Service Desk Incident Management Process. See
Section 3.3 "First Tier (Service Desk) Support".

Service Desk tool

Application which is used to record incidents, RFIs, changes and
problems (i.e. Service-Now.com).

Service Level

The expression of an aspect of a service in definitive and quantifiable
terms.

Service Level Agreement (SLA)

A formal agreement between the client(s) and the IT service provider
specifying service levels and the terms under which a service or a
package of services is provided to the client.

Service Level Management

The process of regular communication with the client to find out their
requirements and to offer new services and technologies.


Appendix D: Unsupported Functions


NB: There are no unsupported functions at the present time.
