Procedure Manual
TeliaSonera
The purpose of this document is to lay down the procedure for Incident
Management.
Document Control
0.1 Change History
Published / Revised Date | Version No. | Author | Section / Nature of Change
18th December 2012 | 0.1 | Santosh Chaudhari | Initial Draft
21st December 2012 | 0.2 | Santosh Chaudhari | Output of first workshop
Name | Role | Representing
Kerstin Wennberg | Business Solution Manager | Telia Sonera
Annelie Hasselblad | Business Solution Specialist | Telia Sonera
Hakan Pohjanen | Business Solution Specialist | Telia Sonera
Amit Pai | Service Manager | Capgemini India
Bertil Vilhelmsson | | Capgemini Sweden
Niklas Drewitz | Incident Manager | Capgemini Sweden
Anders Linden | Production Planner | Capgemini Sweden
Boris Ristov | | Capgemini Sweden
Hiten Vira | | Capgemini India
Name | Role | Representing
Kerstin Wennberg | Business Solution Manager | Telia Sonera
Annelie Hasselblad | Business Solution Specialist | Telia Sonera
Hakan Pohjanen | Business Solution Specialist | Telia Sonera
Amit Pai | Service Manager | Capgemini India
Table of contents
References / Definitions
Policies
Process Workflow
Procedure
Guidelines
Proposal Title |
2 References / Definitions
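The references and definitions used in this process manual are listed below.
Abbreviation | Description
AM | Application Management
CI | Configuration Item
IM | Incident Management
ITIL | Information Technology Infrastructure Library
KEDB | Known Error Database
KPI | Key Performance Indicator
KDB | Knowledge Database
OLA | Operational Level Agreement
A,B,C,D | Priority Levels
SLA | Service Level Agreement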
Terms | Definitions
Incident | Incident means (a) any event that is not part of the standard operation of a service and that causes, or may cause, an interruption to or a reduction in the quality of that service.
Major Incident |
Problem |
Incident Record |
Impact |
Priority | A calculated value based upon Impact and Urgency. This value can be used by the HAR AM team to establish the order in which incident tickets should be considered and to resolve scheduling conflicts, as well as to manage workload.
Known Error |
Work Around |
3 Policies
Policy #
Policy Statement
3.1
3.1
New: The initial state of an incident, in which the concerned person records the incident, enters the
information and generates a unique incident number. The stage is automated and no manual
intervention is required.
Dormant: This status stops the SLA clock. It is used in the following cases:
1) Prior to assigning an incident back to Telia System Support, in case more information is required
from the user to resolve the incident.
2) When the HAR AM team would like a third party or an internal support group within Telia to
extend support during the resolution process. The HAR AM team calls Telia System Support, provides
the Incident/Request number, and asks for support in transferring the record to the appropriate
resolver group. The status of the ticket is changed from Work in Progress to Dormant.
3) When a change is required for an Incident to be resolved. The status of the Incident remains
Dormant from the time the change is logged, through the Change approvals, until the next available
deployment window.
Resolved: This status must only be used by a resolver group after the fix has been applied and the
resolution validated. Resolved status does not mean closure of the Incident; the Originator receives a
communication when the status is modified to Resolved and is then contacted by the Telia System
Support team for confirmation. If the Originator denies resolution, the Incident is assigned back to
the HAR AM team.
Closed: An Incident is termed closed once the Originator is satisfied with the resolution and provides
a formal communication to the Telia System Support team for closure of the incident.
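The status descriptions above can be read as a small state machine. The following Python sketch is one possible reading, not the configuration of the Remedy tool. The transition table is an assumption inferred from the descriptions, since the manual does not enumerate the permitted moves; `WIP` is taken from the "Status of Incident=WIP" entries in the Procedure section; and the names `Status`, `ALLOWED`, `can_move` and `sla_clock_running` are illustrative only.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    WIP = "Work in Progress"
    DORMANT = "Dormant"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Assumed transition table: inferred from the status descriptions,
# not taken from an explicit list in the manual.
ALLOWED = {
    Status.NEW: {Status.WIP},
    Status.WIP: {Status.DORMANT, Status.RESOLVED},
    Status.DORMANT: {Status.WIP},
    # Back to WIP when the Originator denies the resolution.
    Status.RESOLVED: {Status.CLOSED, Status.WIP},
    Status.CLOSED: set(),
}


def can_move(current: Status, target: Status) -> bool:
    """Return True if the assumed table permits current -> target."""
    return target in ALLOWED[current]


def sla_clock_running(status: Status) -> bool:
    """Dormant stops the SLA clock (stated in the policy).

    Treating Resolved and Closed as stopped as well is an assumption.
    """
    return status in (Status.NEW, Status.WIP)
```

For example, `can_move(Status.WIP, Status.DORMANT)` is permitted under this reading, while `can_move(Status.NEW, Status.CLOSED)` is not.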
4 Process Workflow
Process Workflow Description
Role | Organization | Responsibility
Incident Initiator | Telia Users / Customer Operation team / Capgemini / 3rd Party | Incident Management process gets initiated by the Incident Initiator or through Event Management
Telia System Support team | Telia |
Incident Manager | Capgemini |
HAR AM team | Capgemini |
3rd Party | Internal to Telia / Support Vendors / Contractor |
frame
6 Procedure
The standard Incident Management lifecycle is broken into five stages:
Incident Closure
Each stage is denoted as an activity and comprises detailed tasks. These activities and associated
tasks are outlined below.
Tasks
Description of Incident
Responsible
Output
Customer Operation team / Telia Users
Customer Operation team / Telia Users
Updated incident record in case the originator is not satisfied with earlier resolution
Update in case the Telia System Support, Application Maintenance team provides inputs on an existing incident.
Customer Operation team / Telia Users
Telia System Support
Status of Incident=New
Description of Incident
Update an Incident record
Telia System Support (L1 Support)
Prioritized and categorized incident in the Remedy tool.
Telia System Support (L1 Support)
Resolution of the incident by Telia System Support based on self-analysis.
Status of Incident=WIP
Prioritized and categorized incident in the Remedy Tool
Reference of incident resolution in Knowledge Database
Status Update of Incident in Remedy tool.
Invoked Service Request process
Invoked Major Incident Process
Status of Incident=WIP
Updated Incident Record
Telia System Support (L1 Support)
Status update of incident in the Remedy tool.
Assign the Incident ticket to the resolver group
HAR AM team (L2 and L3 Support)
Incident acknowledgement in the Remedy system for response SLAs in accordance with guidelines
HAR AM team (L2 and L3 Support)
HAR AM team (L2 and L3 Support), Telia System Support team and Originator
Additional information is gathered as necessary
HAR AM team (L2 and L3 Support)
gathered.
Ticket Status = In Progress
HAR AM team (L2 and L3 Support)
3rd Party
Workaround/Solution identified.
Trigger for 3rd party support for resolution of incident, as needed.
Workaround / solution of the incident identified and updated in system
HAR AM team (L2 and L3 Support), Telia System Support
Workaround / solution are applied properly.
HAR AM team (L2 and L3 Support), Telia System Support team, Originator
Resolution of the incident communicated to Originator for acceptance.
Status of Incident=Resolved
Activity IM 5: Closure
Resolution of the incident provided to Telia System Support for communication to Originator
Acceptance of ticket solution communicated to Telia System Support team and HAR AM team
Incident update in the system as Closed based on acceptance.
Ticket is reassigned to HAR AM team if the Originator is not satisfied with the solution provided.
Telia System Support team
Updates in Knowledge Database with resolution of incidents
HAR AM team (L2 and L3 Support)
Updates in Knowledge Database with resolution of incidents
Box Ref No
Description
Customer Operations team/Telia Users
IM 1.4
R/A
IM 2.1
R/A
IM 2.2
Incident Diagnosis
C/I
R/A
C/I
IM 2.3
R/A
IM 2.4
A/C
IM 3.1
C/I
R/A
IM 3.2
C/I
IM 3.3
IM 3.4
C/I
IM 3.5
R/A
IM 4.1
C/I
R/A
IM 4.2
C/I
R/A
Telia System Support (L1 Support)
HAR AM team (L2 and L3 support)
3rd Party
Box Ref No
Description
Customer Operations team/Telia Users
IM 4.3
C/I
R/A
IM 5.1
R/A
IM 5.2
Incident Closure
C/I
R/A
IM 5.3
IM 5.4
Telia System Support (L1 Support)
HAR AM team (L2 and L3 support)
3rd Party
R/A
R/A
Priority Level | A | B | C | D
Response time | Immediate | 4 hours | 2 working days | Analyzed for possible further development
Resolution time | 2 hours | 8 hours | 5 working days |
Status every | 60 minutes | 2 hours | |
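As a rough illustration of how the resolution targets in the table above could be turned into due dates, the Python sketch below computes a deadline per priority level. It rests on several assumptions that the manual does not state: hour-based targets are counted as elapsed clock hours, the clock starts at acknowledgement, working days run Monday to Friday with public holidays ignored, and time spent in Dormant status is not subtracted. The names `RESOLUTION_TARGET`, `add_working_days` and `resolution_due` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Resolution targets as read from the table above. Level D lists no
# resolution time, so it is deliberately left out.
RESOLUTION_TARGET = {
    "A": ("hours", 2),
    "B": ("hours", 8),
    "C": ("working_days", 5),
}


def add_working_days(start: datetime, days: int) -> datetime:
    """Advance by whole working days, skipping Saturday and Sunday.

    Assumption: Monday to Friday are working days; holidays are ignored.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def resolution_due(priority: str, acknowledged: datetime) -> Optional[datetime]:
    """Return the resolution deadline, or None if no target is defined."""
    target = RESOLUTION_TARGET.get(priority)
    if target is None:
        return None
    unit, amount = target
    if unit == "hours":
        return acknowledged + timedelta(hours=amount)
    return add_working_days(acknowledged, amount)
```

Under these assumptions, a Priority Level C incident acknowledged on a Friday morning falls due on the following Friday at the same time, because the weekend is skipped.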
8.2 Verification
8.3 Templates:
Availability Management
Availability Management will use Incident Management data to determine the availability of IT
services and look at where the incident lifecycle can be improved.
Impact definitions
Change Management
Where a change is required to implement a workaround or resolution, it must be logged as an RFC and
progressed through Change Management. In turn, Incident Management is able to detect and resolve
Incidents that arise from failed changes.
Configuration Management
Configuration Management provides the data used to identify and progress Incidents. One of the
uses of the Configuration Management System (CMS) is to identify faulty equipment and to assess
the impact of an incident. It is also used to identify the Customers affected by potential problems. The
CMS also contains information about which categories of incident should be assigned to which
support group. In turn, Incident Management can maintain the status of faulty CIs. It can also assist
Configuration Management to audit the infrastructure when working to resolve an incident.
Knowledge Management
Data held in the Knowledge Management repository should be accessed when analyzing and diagnosing
Incidents. Details of known errors and their workarounds should be documented there to enable
quicker resolution of Incidents by either Telia System Support or the HAR AM team.
Problem Management
Problem Management must be carried out for recurring incidents and major incidents. Incidents are
often caused by underlying problems, which must be solved to prevent the incident from recurring.
Event Management
Event Management is the process that (automatically) monitors all events that occur through the IT
infrastructure. Some events will raise an Incident; Incident Management will then concentrate on
restoring the service as quickly as possible.
10 Guidelines
Appendix A Incident Classification
Tickets are analyzed by Telia System Support (L1 Support team) and by the HAR AM team (L2 and L3
support). The classification of an incident depends on impact and urgency. The priority levels are
A (critical), B (high), C (medium) and D (low). Incidents identified as Major or highly impacting are
classified as Priority Level A and are therefore handled by the Major Incident Management process.
Each incident is analyzed with regard to impact and urgency.
Impact is defined under ITIL as a measure of the business criticality of an Incident, often equal to the
extent of a distortion of agreed or expected service levels. As such, it can be assessed based on the
effect of an Incident on the Client's business operations. Impact may be assessed by taking into
account the number and business roles of the people affected or the business functions supported by the
systems affected.
Urgency is defined under ITIL as a measure of the business criticality of an Incident based on the
impact and on the business needs of the customer. As such, it can be assessed based on how quickly the
business of the Client will be affected by the loss of Service resulting from the Incident. A high-impact
Incident does not necessarily have an immediate impact. For example, an Incident affecting a system
that supports end-of-month processing (impact high) can be assessed as urgency low if it occurs early
in the monthly processing cycle, but as urgency high if it occurs near the end of the cycle. A system
that supports the staff dealing directly with the Client's customers, or that supports online, real-time
transactions, may always be assessed as high urgency, even if it is only of moderate impact.
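Deriving a priority from impact and urgency is commonly done with a lookup matrix. The sketch below shows the idea in Python. The matrix values are hypothetical: the manual states that priority depends on impact and urgency but does not publish the actual mapping, and the three-step scale (`high`, `medium`, `low`) and the names `PRIORITY_MATRIX` and `priority` are assumptions made for illustration.

```python
# Hypothetical mapping of (impact, urgency) to priority level A-D.
# The real matrix is not given in the manual.
PRIORITY_MATRIX = {
    ("high", "high"): "A",
    ("high", "medium"): "B",
    ("high", "low"): "C",
    ("medium", "high"): "B",
    ("medium", "medium"): "C",
    ("medium", "low"): "D",
    ("low", "high"): "C",
    ("low", "medium"): "D",
    ("low", "low"): "D",
}


def priority(impact: str, urgency: str) -> str:
    """Look up the priority level; raises KeyError for unknown values."""
    return PRIORITY_MATRIX[(impact.lower(), urgency.lower())]
```

With this assumed matrix, the end-of-month example works out as described: the same high-impact fault yields `priority("high", "low")` early in the cycle and `priority("high", "high")` near its end, so the ticket moves up as the deadline approaches.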
Priority A: An Incident will be assigned as Priority Level A, if the Incident is characterized by at least
one of the following:
(i) The Incident is one that has critical or significant impact through single or multiple or part IT
system failures.
(ii) The problems cause a complete loss of service or constitute a serious obstacle to essential
parts of the Telia business and the Telia customers that are affected by the Object.
(iii) The Incident is one that has a high impact on the operation of the affected Application or
other Service and that cannot be circumvented.
(iv) The Incident, because of the immediacy of its effect on critical business functions, requires a
Change be made on an immediate-response basis.
(v) Class A incidents include e.g. corrupt data, a critical function not being accessible, the system
hanging in an unidentifiable way, causing unacceptable or impossible delays for internal
resources in the Object or delays to internal answers, system crashes or repeated system
crashes during restart attempts.
Response time: Once a Priority Level A is acknowledged, it is required that the Support group starts work
immediately on fixing it and that normal Service is restored as soon as possible and within the shortest
Service Level of the affected Services.
Priority B: An Incident will be assigned as Priority Level B, if the Incident does not qualify for Priority
Level A but is characterized by at least one of the following:
(i) The Incident can materially affect the Client, causing a substantial impact.
(ii) No acceptable Work Around is available, but operation can take place to a limited extent in
the Object.
(iii) The Incident is one that has an impact where Services are highly degraded to Telia Users at
one or more primary Client locations.
Response time: Once a Priority Level B is acknowledged, it is required that the Support group starts work
as soon as possible on fixing it (without adversely affecting the resolution of any Priority Level A Incidents)
and that normal service is restored as soon as possible and within the shortest Service Level of the
affected Services.
Priority C: An Incident will be assigned as Priority Level C, if the Incident does not qualify for priority
Level A or B but is characterized by the following:
(i) The Incident does not materially affect the Client or does not cause a substantial impact,
but has the potential to do so if not resolved expeditiously.
(ii) The effect of the Incident is such that it does not require an immediate response.
(iii) The Incident is one that has an impact where services are degraded to Telia Users at a
single non-primary Client location.
Response time: Once a Priority Level C is acknowledged, it is required that the Support group starts work
as soon as possible on fixing it (without adversely affecting the resolution of any Priority Level A or Priority
Level B Incidents) and that normal Service is restored as soon as possible and within the shortest Service
Level of the affected Services.
Priority D: An Incident will be assigned as Priority Level D, if the Incident does not qualify for Priority
Level A, B or C but is characterized by any of the following:
(i) The Incident does not have an adverse impact on the business operations of Telia
because of either the nature of the fault or the small extent of the fault and an acceptable
work around is in place.
(ii) The effect of the Incident is such that it does not require immediate resolution.
(iii) The Incident is one that does not require immediate attention and no business critical
Services are degraded or failed.
(iv) The problems cause no loss in the operation of the Object and comprise minor incidents,
incorrect behaviour or are not included in the documentation/operations manual for the
Object.
Response time: Once a Priority Level D is acknowledged, it is required that the Support group schedules
the remediation work (without adversely affecting the resolution of any Priority Level A, Priority Level B or
Priority Level C Incidents) such that normal Service is restored within the shortest Service Level of the
affected Services.
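The four priority definitions form a cascade: an Incident gets the highest level for which it meets a criterion, and a lower level applies only when it does not qualify for any level above. A minimal Python sketch of that cascade follows. It assumes that meeting any single criterion is enough at every level; the text says so for A, B and D, while the wording for C ("characterized by the following") leaves this open. The names `LEVELS` and `classify` are illustrative.

```python
from typing import Iterable, Mapping, Optional

# Priority levels in descending order of severity.
LEVELS = ("A", "B", "C", "D")


def classify(criteria_met: Mapping[str, Iterable[bool]]) -> Optional[str]:
    """Return the highest level with at least one criterion met.

    criteria_met maps a level to one boolean per criterion (i), (ii), ...
    Assumption: a single matching criterion qualifies at every level.
    Returns None when no criterion matches at any level.
    """
    for level in LEVELS:
        if any(criteria_met.get(level, ())):
            return level
    return None
```

For instance, an Incident that meets none of the Level A criteria but has no acceptable workaround (Level B, criterion ii) is classified as B, even if it also matches a Level D criterion.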
Appendix C Escalation
Escalation is the mechanism that assists timely resolution of an Incident. It can take place during every
activity in the resolution process. Escalation leads to the necessary management attention. The
management will decide about additional measures, which will assist the resolution process or start
interim solutions. The Incident Manager (IM) is the central point of escalation; the escalation path
is local Incident Manager -> Service Manager -> Telia SPOC.
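The escalation path can be modelled as an ordered list that is walked one step at a time. The Python sketch below reads the path as three steps (local Incident Manager, then Service Manager, then Telia SPOC); this reading is an assumption, and the names `ESCALATION_PATH` and `next_escalation` are hypothetical.

```python
from typing import Optional

# Escalation path as read from the text, lowest level first.
ESCALATION_PATH = ["Local Incident Manager", "Service Manager", "Telia SPOC"]


def next_escalation(current: Optional[str]) -> Optional[str]:
    """Return the next contact on the path, or None at the top.

    Passing None means the incident has not been escalated yet.
    Unknown role names raise ValueError.
    """
    if current is None:
        return ESCALATION_PATH[0]
    index = ESCALATION_PATH.index(current)
    if index + 1 < len(ESCALATION_PATH):
        return ESCALATION_PATH[index + 1]
    return None
```

Returning `None` at the top of the path makes it explicit that the manual names no escalation level beyond the Telia SPOC.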
Incidents are to be escalated when they have not been resolved within the time frame appropriate to
the priority level of the Incident and the priority of the Telia User. The escalation procedures reflect
and describe the Incident, including:
Appendix E Reporting