
SGSN-MME

Troubleshooting
Introduction

Main Learning Objectives

Explain the architecture of the SGSN-MME
List and interpret the SGSN-MME logs and the related log files
Understand and solve Interface Faults
Know how to trace subscribers with the tools provided by the SGSN-MME
Identify Mobility and Session Management Faults for SGSN-MME (G & W)
Identify Mobility and Session Management Faults for SGSN-MME (L)
Understand the built-in Toolbox useful for troubleshooting
List and interpret the different restart levels
Explain the fault handling and CSR escalation

SGSN-MME
Troubleshooting
SGSN-MME Architecture

CONTENTS
Architecture
2010B (G/W/L) node layout
PIU roles
Subsystem structure of the SGSN-MME 2010B
Software Devices functions
Internal Traffic Flow

Hardware Families
MkIV (Solaris/VxWorks)
All boards are v3
GPBs are Solaris, IBxx are VxWorks

MkV (Linux/VxWorks)
IBENv4, FSBv4, some IBASv4 and the rest are v3
APs are Linux, DPs are VxWorks

MkVI (Linux/VxWorks)
All boards are v4
APs are Linux, DPs are VxWorks

MkVI+ (Linux/VxWorks)
New in 2010B
All boards are v4 except PEBv5
APs are Linux, DPs are VxWorks
New IBACv4 introduced

SGSN-MME Plug-in Units (PIUs)

Versions as listed on the slide, columns ordered MkVI+, MkVI, MkV, MkIV (some rows have fewer than four entries):
Power and Ethernet Board (PEB): v5, v4, v3, v3
General Processing Board (GPB): v3
File Server Board (FSB): v4, v4, v4, v3
Interface Board Ethernet (IBEN): v4, v4, v4
Interface Board E1/T1 (IBTE): v4, v4, v3, v3
Interface Board Narrowband SS7 (IBS7): v4, v4, v3, v3
Interface Board ATM Single Mode Fibre-optic (IBAS): v4, v3, v3
Interface Board for ATM with Ethernet Media Converter (IBAC): v4

PIUs Roles - APs


Application Processor (AP)
Used for node management, processing, and signaling.
Also referred to as Appl-C.

Node Controller Board (NCB)


Provides central support and functions, such as O&M, Hardware
and Software monitoring, etc
Also referred to as AP/C.
Active NCB & passive NCB.
NCB PIUs are GPB cards in MkIV hardware,
NCB PIUs are IBEN cards in MkV/VI/VI+ hardware

File Server Board (FSB)


Provides disk storage and boot services in MkV/VI/VI+ hardware.
Primary FSB & Secondary FSB.

PIUs Roles - DPs

Device Processor (DP)
Handles payload processing and SS7 signaling.
Also referred to as Appl-U.

IP Router
Routes IP signalling and user-plane traffic between external
interfaces and processing cards within the SGSN-MME.
This processor falls into the DP realm.
Router PIUs are IBAS cards.

SS7 Front End (SS7 FE)
Represents the low-level protocols of the SS7 stack, distributing
incoming traffic to the SS7 back ends or Network Management
Module (NMM).
This processor falls into the DP realm.
Narrowband SS7 FE PIUs are IBS7 cards.
Broadband SS7 FE PIUs are IBAS cards.

SGSN-MME 2010B (WG) Dual Access (DA) MkIV Hardware

SGSN-MME Triple Access (L/W/G) on MkVI Hardware

SGSN-MME (LTE) on MkVI+ Hardware

SGSN-MME logical architecture

SGSN-MME Software structure
GPRS application (GSM, WCDMA, LTE)
SGSN-MME application component
Common application component
Wireless Packet Platform (WPP)
DPE - Distributed Processing Environment
Middleware
SPARC Processors / PowerPC Processors

SGSN-MME Subsystems
[Figure: subsystem structure of the SGSN-MME. Under each subsystem the major additions for MME are noted.]
Access variants shown: SGSN-MME G, SGSN-MME W, SGSN-MME L
Subsystems shown: MPS, MTS, EPS, UPS, COS, MVS, GTS, SIS, NCS, SSS, MSS, CHS, CAS, SDS, GSS, LIS, SS7, CPS, OMS, CIS
Groups shown: Business Specific, Capella, WPP
MME additions shown: EMM, ESM, NAS, S1AP; MME-specific S6a application, GTPv2 and GW selection; configuration or dynamic data, e.g. eNodeB and TA handling; SCTP device, DIAMETER

Logical structure of the SGSN-MME
[Figure: layered block diagram. Top layer: GPRS Applications (SGSN-MME LTE, SGSN-MME WCDMA, SGSN-MME GSM). Further blocks: Business, Cappella, OMS, Routing, SS7, OTP, Database, WPP, ORB, Filter, Middleware, Link (ATM, FR, Eth), Web server, DPE. Bottom layer: processing and switching platform (Solaris/Sparc, Linux, VxWorks, PowerPC, Switch).]

Distributed Processor Environment (DPE)
[Figure: the DPE (Distributed Processing Environment) spans the boards, with the applications (Appl.) running on top of it. Boards shown: PEB v4, GPBv3, IBxxv4, FSBv4. Processors and operating systems shown: PPC with Linux or VxWorks.]

Software Devices
A Software Device is a logical representation
of a protocol stack or parts of a protocol stack
Different Device types are available
One Device normally handles several
connections, an example is the GTU device
which processes the GTP payload.

Device Types
Devices Common to GSM and WCDMA:
GTU: handles the GTP layer of the Iu/Gn interface. A GTU device
handles several individual subscribers.
Charging: forwards CDRs collected from the GTU device on to the
active NCB for storing.
SS7: A traffic forwarding device which keeps an association
between established SCCP/TCAP dialogs and individual subscribers.
LI: A traffic forwarding device that provides payload to Lawful
Intercept functions.

GSM Only Devices


FR: handles the Frame Relay part of the Gb stack. One FR device
handles multiple FR PVCs
BVC: handles the NS and BSSGP part of the Gb Stack. One BVC
device handles multiple NSEs and BVCs.
MS: Handles the LLC level of the Gb stack. One MS device handles
multiple connections/subscribers

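Relationship between Devices and Protocols in GSM
[Figure: protocol stacks per interface and the Devices that handle them]
Gb interface: SNDCP / LLC / BSSGP / NS / FR / E1 (Devices: MS, BVC, FR)
Gn interface: GTP / UDP / IP / ETH or ATM (Device: GTU)
Gr/Gd interfaces: TCAP / SCCP / MTP-3 / MTP-2 / MTP-1/E1 (Device: SS7)
WPP components shown: Packet Queue, SS7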

Relationship between Devices and Protocols in WCDMA
[Figure: protocol stacks per interface and the Devices that handle them]
Interfaces shown: IuU, Gn, Gr/Gd, IuC
Protocols shown: TCAP, SCCP, MTP3-B, SSCF, SSCOP, AAL5-CPCS, AAL5-SAR, AAL5, ATM, GTP, UDP, AH/ESP, IP, L2, L1
Devices shown: GTU, SS7
WPP components shown: Packet Queue, IP, SS7

Gb Interface Internal Handling (Gb/FR)
[Figure: a message from the BSC (Payload / LLC / BSSGP / NS / FR / E1) passes through the FR Device, the BVC Device and the MS Device inside the SGSN-MME]
1) The incoming message arrives on the IBTE.
2) The FR Device removes the low-layer stack and forwards the message to the BVC Device.
3) The BVC Device removes the NS and BSSGP layers, then forwards the message (Payload / LLC) to the MS Device through the internal backplane. The BVC and MS Devices could be on an IBTE/IBEN/IBAS PIU.

The FR Device can only be handled by an IBTE PIU, though the higher layers
may be processed by another IBxx. Therefore there could be two boards involved.

Gr Interface Internal Handling (SIGTRAN)
[Figure: a message from the HLR (MAP / TCAP / SCCP / M3UA / SCTP / IP / ETH / PHY, Dst IP: SGSN CN-SS7-1 Service IP) passes through the Router PIU, the SCTP FE and the SS7 BE to the AP. Board label shown: IBxx v4 PIU.]
1) Incoming message
2) Remove low-layer stack, forward to SCTP FE based on Dst IP through internal routing table
3) Remove SCTP layer, then forward to SS7 BE through internal backplane
4) Remove M3UA/SCCP/TCAP layer, then forward to AP through internal backplane

Gn-C Interface Internal Handling

TEID: Tunnel Endpoint Identifier
TEID identifies a GTP endpoint.
[Figure: a GTP-C message from the GGSN (GTP-C / UDP / IP / Ethernet, Dst IP: SGSN GTPC Service IP, carrying the Control Plane TEID used by GGSN) enters the SGSN-MME through the GNR and reaches the AP over the internal backplane. Port and board labels shown: STM Port, Ethernet, IBAS.]
1) Incoming message
2) Remove low-layer stack, forward to AP based on dst IP through internal routing table

Gn-U Interface Internal Handling
[Figure: payload from the Application passes through the GTU Device and, over the internal backplane, through the Gn Router PIU to the GGSN (GTP-U / UDP / IP / Ethernet). Labels shown: Data Plane TEID used by GGSN; Dst IP: GGSN GTP-C IP.]
1) Packets coming from the Gb or Iu-U interfaces are inserted into the
correct GTP-U tunnel for that particular subscriber and then
forwarded out on the Gn network via the Gn router.
2) Sent to GGSN via Gn Router PIU

The GTU device could be running on an IBTE/AS/EN board.

Example on TEID routing
[Figure: an incoming GTP-U packet (IP / UDP / GTP with TEID) arrives at the Router and is forwarded to one of the DPs 1.8, 1.9, 1.10, 1.11, 1.12. Example shown: TEID (16#C0321407) -> DP Index (420) -> DP (1.8)]
TEID in GTP header will be used for forwarding
DP index will be calculated from TEID
DP index will be used as key when finding DP
Packet will be forwarded to DP

DP Index to DP mapping (mag:slot)
Index    ..0   ..1   ..2   ..3   ..4   ..5   ..6   ..7   ..8   ..9
0-9      1:8   1:8   1:8   1:8   1:8   1:8   1:8   1:8   1:9   1:9
10-19    1:9   1:9   1:10  1:9   1:9   1:10  1:9   1:9   1:9   1:9
20-29    1:9   1:10  1:9   1:9   1:10  1:9   1:10  1:9   1:9   1:9
30-39    1:9   1:9   1:8   1:11  1:10  1:9   1:10  1:10  1:9   1:9
40-49    1:9   1:10  1:9   1:10  1:9   1:10  1:10  1:12  1:9   1:9
*
420-429  1:8   1:10  1:11  1:9   1:8   1:8   1:11  1:10  1:9   1:8
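The last lookup step of the routing example (DP index used as key to find the DP) can be sketched as a plain table lookup. This is only an illustration under stated assumptions, not the router's implementation: the names DP_INDEX_ROWS and lookup_dp are hypothetical, only the rows printed in the table are included, and the DP index is taken as given because the slide does not show how it is calculated from the TEID.

```python
# Rows copied from the "DP Index to DP mapping" table (values are mag:slot).
# Only the rows printed on the slide are included; the others are elided there.
DP_INDEX_ROWS = {
    0: "1:8 1:8 1:8 1:8 1:8 1:8 1:8 1:8 1:9 1:9".split(),
    10: "1:9 1:9 1:10 1:9 1:9 1:10 1:9 1:9 1:9 1:9".split(),
    20: "1:9 1:10 1:9 1:9 1:10 1:9 1:10 1:9 1:9 1:9".split(),
    30: "1:9 1:9 1:8 1:11 1:10 1:9 1:10 1:10 1:9 1:9".split(),
    40: "1:9 1:10 1:9 1:10 1:9 1:10 1:10 1:12 1:9 1:9".split(),
    420: "1:8 1:10 1:11 1:9 1:8 1:8 1:11 1:10 1:9 1:8".split(),
}


def lookup_dp(dp_index):
    """Return the mag:slot of the DP for dp_index, or None if the row is not listed."""
    row = DP_INDEX_ROWS.get(dp_index - dp_index % 10)
    return None if row is None else row[dp_index % 10]


print(lookup_dp(420))  # 1:8, the DP shown for DP Index 420 in the example
```

A DP index whose row is not printed on the slide returns None here rather than a guessed position.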

SGSN-MME
Troubleshooting
Log Files

CONTENTS

SGSN-MME logs and the related log files


Built-in and System logs
Content of the log files
Health Check

SGSN-MME Logs
The SGSN-MME provides a logging function for collecting
data in files.
Two types of logs
Built-in logs
Created and maintained by the WPP logging function.
These logs can be administered through the CLI.
Used for alarms, events, charging, PM and other informational
logs.
System logs
Created by SGSN-MME software rather than platform-level
software.
These logs are administered by VxWorks and Unix commands.
Used to collect SGSN-MME internal messages for fault finding
and troubleshooting.

Built-in Logs
The built-in logs are managed as circular logs, each consisting of
several log files.
The log files are stored in /tmp/OMS_LOGS, with the exception of
CDR files, which are stored in /tmp/OMS_CHARGING.
The log files are shared/distributed to the GPBs by the active
NCB.
The built-in logs can be configured with the Packet Exchange
Manager (PXM) and the Command Line Interface (CLI) if needed.
The contents of the log files can be analyzed either with PXM
(Log Viewer) or with Unix commands like cat, more, pg, etc.
The CLI command gsh list_logs lists all available built-in logs.

Built-in Log Directory Structure

/tmp
  OMS_LOGS
    fm_alarm
      tmp
      ready
    er_data_log
      tmp
      ready
    fm_event
      tmp
        file: fm_event.15
        file: fm_event.index
      ready
        file: fm_event.1
        file: fm_event.2
        ...
        file: fm_event.14
  OMS_CHARGING
    chsGtpPrimeLog
      tmp
      ready
    chsLog
      tmp
      ready

Built-in Logs (1/2)


Below are the SGSN-MME built-in logs to store all
important data and actions:
ADC: Automatic Device Configuration (ADC) log
AdmissionControlUsage: Events related to features and
capacity licenses
au_data_log: Failed MS authentications
chsLog: CDRs are collected in chsLog
chsGtpPrimeLog: In near-real-time charging, the CDRs are
grouped into GTP' PDUs
ebm: Event-Based Monitoring enables SGSN-MME to log
successful and unsuccessful events, formatted according to
the event types.

Built-in Logs (2/2)


er_data_log: Traffic event recording
fm_alarm: All occurred alarms and alarm clearings
fm_event: All occurred events are stored
Gf_IMEIcheck_log: All IMEI_CHECK failures
Gs_interface_log: Mobile status messages sent over the Gs
interface, for indicating errors
list_subscribers_result: Subscribers registered in the GSN
mmi_log: All activities on the machine-to-machine interface
mobility_event_log: All Attach Reject messages due to network
failure
OMS_SM_Log: Each action performed by the operator
Performance monitoring logs
session_event_log: All MS-initiated Activate PDP Context rejects
UE Tracer Log: Information on signaling messages for UE

Charging Logs
The charging files are stored on the separate
partition /tmp/OMS_CHARGING.
The CLI command gsh list_chs_logs lists all
available charging logs.
The SGSN-MME R2010B has 2 charging logs:
chsLog, which contains the Charging Data Records (CDRs) for
postpaid charging
chsGtpPrimeLog, which contains CDRs for near-real-time
charging that could not be transferred to the external charging
system due to connection failures

System Logs
All system logs are stored in /tmp/DPE_COMMONLOG/.. on the
active NCB and are shared/distributed to the PNCB by
High Availability Network File System (HA-NFS)
The logs are organized in the following way:
General system logs are stored directly in
/tmp/DPE_COMMONLOG/..
Board specific system logs are stored in separate directories for
each control board.
Old system logs are stored in
/tmp/DPE_COMMONLOG/../LogBackup

Please note, all paths are given for the active NCB!
It is assumed that the active NCB is located on the PIU
eqm01s14p2 and the passive NCB is located at
eqm01s13p2

Important System Logs


The following general system logs are available in
/tmp/DPE_COMMONLOG:
isp.log, which contains all EC loadings, small restarts, large
restarts and node restart events since the initial installation.

The following active NCB system logs are available in
/tmp/DPE_LOG:
stcompl.log contains small restart, large restart and node restart
complete messages
ss7trace.log contains startup and error messages of the SS7 stack

Unix Log Files


An SGSN-MME AP is in principle nothing but a Unix
workstation, which executes special programs.
A GPB is a SPARC processor running Sun Solaris.
An IBEN card is a PowerPC running Linux.

The Unix operating system contains a logging function
for Unix-specific events.
The syslog daemon writes kernel, error and other
messages to the log file /var/adm/messages

Alex Documentation for Logs


Alex contains documents describing how to interpret
the following log files:
Built-in Logs:
fm_alarm: Alarm logs
fm_event: Event logs
mobility_event_log: Attach Reject logs
session_event_log: PDP Context Reject logs
ADC: Automatic Device Configuration logs
er_data_log: IMSI Event Recording logs
chsLog: CDR logs
System Logs:
isp.log: In Service Performance log

Combined Log Directory Structure
Built-in Logs and System Logs

/tmp
  DPE_LOG
    ss7trace.log
  DPE_COMMONLOG
    isp.log
    NodeDump directory
    LogBackup directory
  OMS_LOGS
    er_data_log
      tmp
      ready
    fm_alarm
      tmp
      ready
    fm_event
      tmp
        file: fm_event.15
        file: fm_event.index
      ready
        file: fm_event.1
        file: fm_event.2
        ...
        file: fm_event.14
  OMS_CHARGING
    chsLog
      tmp
      ready
    chsGtpPrimeLog
      tmp
      ready

Unix tail Command for Log Files


The Unix command tail <filename> shows the end of a Unix
file (the tail end, that is).
Good for files that are very long with the most interesting info at
the bottom of the file, like log files.
The user can specify how many lines at the end of the file to
display. For example, to display the last 500 lines of the isp
log file, use the following command:
tail -500 /tmp/DPE_COMMONLOG/isp.log

The tail command can also be used to display information as it is
being written to the end of a file. (Provides a scrolling display
of logs as they are being written.)
tail -f <filename>

What to look for in isp.log?


The log file /tmp/DPE_COMMONLOG/isp.log is the most
important log file for troubleshooting
This log file gives an overview of the previous and current
status of an SGSN-MME
All important events are logged in isp.log:
Processing Module (PM) reboots
DP takeovers
AP Takeovers
Small Restarts
Large Restarts
Node Restarts
Number of attached subscribers

Example Contents of isp.log - Large Restart

2006-09-05 08:12:56;sau;;963700,heartbeat;cxs10127_2r12k08(7-00-00)
2006-09-05 08:12:56;pdp;;578200,heartbeat;cxs10127_2r12k08(7-00-00)
2006-09-05 09:14:13;large_restart;ncs;manual;cxs10127_2r12k08(7-00-00)
2006-09-05 09:14:15;sau;;964600,event;cxs10127_2r12k08(7-00-00)
2006-09-05 09:14:15;pdp;;579000,event;cxs10127_2r12k08(7-00-00)
2006-09-05 09:15:03;StartUpAfter_large_restart;;;cxs10127_2r12k08(7-00-00)
2006-09-05 09:15:03;features;;[mplmn,qosHsdpa,eqPlmns,li,edge,maxScaleUp=8,sgsnPool,imeiCheck,ciphering,gbIp,pfc,qos,rimTr,v42,vplmn_allocation,aace,srns,sau=1000000,pdp=1500000,ipsec,qosConv,dual,camel,nrr,qosStream,ipv6,pdp_home,secPdp,gs,security_function,pdp_visit,qosImsi,dtm,hComp,sms,gtpP,prioPay,adc];cxs10127_2r12k08(7-00-00)
2006-09-05 09:16:04;aborted_connections;;964600;cxs10127_2r12k08(7-00-00)
2006-09-05 09:16:04;lost_contexts;;579000;cxs10127_2r12k08(7-00-00)
2006-09-05 09:16:04;sau;;20200,event;cxs10127_2r12k08(7-00-00)
2006-09-05 09:16:04;pdp;;9200,event;cxs10127_2r12k08(7-00-00)
2006-09-05 09:26:04;sau;;34525,ramp_up;cxs10127_2r12k08(7-00-00)
2006-09-05 09:26:04;pdp;;15247,ramp_up;cxs10127_2r12k08(7-00-00)
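The records in this example are semicolon-separated, which makes them easy to split when working on a copy of the log offline. The sketch below is an assumption-laden illustration, not an Ericsson tool: the field names (time, event, source, details, software) are hypothetical labels chosen from what the sample lines appear to contain, and parse_isp_line is a made-up helper. It pulls out the sau counts around the large restart.

```python
from datetime import datetime

# A few records copied from the large restart example.
SAMPLE = """\
2006-09-05 08:12:56;sau;;963700,heartbeat;cxs10127_2r12k08(7-00-00)
2006-09-05 09:14:13;large_restart;ncs;manual;cxs10127_2r12k08(7-00-00)
2006-09-05 09:14:15;sau;;964600,event;cxs10127_2r12k08(7-00-00)
2006-09-05 09:16:04;sau;;20200,event;cxs10127_2r12k08(7-00-00)
2006-09-05 09:26:04;sau;;34525,ramp_up;cxs10127_2r12k08(7-00-00)
"""


def parse_isp_line(line):
    """Split one record into five fields; the field names are assumed, not official."""
    parts = line.strip().split(";", 4)
    if len(parts) != 5:
        return None  # not a complete record (for example a wrapped line)
    stamp, event, source, details, software = parts
    return {
        # Only the first 19 characters are parsed, so a trailing "UTC+0200" is ignored.
        "time": datetime.strptime(stamp[:19], "%Y-%m-%d %H:%M:%S"),
        "event": event,
        "source": source,
        "details": details,
        "software": software,
    }


records = [r for r in map(parse_isp_line, SAMPLE.splitlines()) if r]
sau_counts = [int(r["details"].split(",")[0]) for r in records if r["event"] == "sau"]
print(sau_counts)  # [963700, 964600, 20200, 34525]
```

The drop from 964600 to 20200 between the two event records brackets the large restart logged at 09:14:13.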

Example Contents of isp.log - AP Take Over

Loss of AP
2007-08-16 14:39:18 UTC+0200;pm_failure;fed_check;1.8.2.1;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:39:18 UTC+0200;hw_lost;ncl;1.8;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:39:20 UTC+0200;sau;;0,event;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:39:20 UTC+0200;pdp;;0,event;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:39:20 UTC+0200;AP_take_over;ncs;auto,1.8,loss;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:39:20 UTC+0200;AP_take_over_OK;ncs;auto,1.8,loss,first_index;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:39:23 UTC+0200;AP_take_over_OK;ncs;auto,1.8,loss,all_indices;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:39:27 UTC+0200;AP_take_over_OK;ncs;auto,1.8,loss,all replicas;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:39:27 UTC+0200;AP_take_over_OK;ncs;auto,1.8,loss,load_balance;CXS10127/4_R20C15(8-00-00)

Gain of AP
2007-08-16 14:40:04 UTC+0200;pm_detected;ncl;1.8.2.1;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:40:04 UTC+0200;clear_of_hw_alarm;ncl;1.8;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:40:04 UTC+0200;hw_detected;ncl;1.8,IBxxv4;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:40:10 UTC+0200;AP_start;ncs;1.8;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:40:17 UTC+0200;StartUpAfter_AP_start;;1.8;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:40:17 UTC+0200;AP_take_over;ncs;1.8,gain;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:40:17 UTC+0200;AP_take_over_OK;ncs;1.8,gain,first_index;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:40:20 UTC+0200;AP_take_over_OK;ncs;1.8,gain,all_indices;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:40:23 UTC+0200;AP_take_over_OK;ncs;1.8,gain,load_balance;CXS10127/4_R20C15(8-00-00)

Example Contents of isp.log - DP Take Over

Loss of DP
2007-08-16 14:20:29 UTC+0200;pm_failure;ncl;2.8.2.1;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:20:29 UTC+0200;hw_lost;ncl;2.8;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:20:30 UTC+0200;DP_take_over;ncs;auto,2.8,loss;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:20:32 UTC+0200;DP_take_over_OK;ncs;auto,2.8,loss;CXS10127/4_R20C15(8-00-00)

Gain of DP
2007-08-16 14:20:53 UTC+0200;pm_detected;ncl;2.8.2.1;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:20:53 UTC+0200;clear_of_hw_alarm;ncl;2.8;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:20:53 UTC+0200;hw_detected;ncl;2.8,IBTEv3;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:21:18 UTC+0200;DP_take_over;ncs;2.8,gain;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:21:19 UTC+0200;DP_take_over_OK;ncs;2.8,gain;CXS10127/4_R20C15(8-00-00)
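When reviewing DP takeovers it can help to see how long each one took, that is, the time between a DP_take_over record and its DP_take_over_OK record. The following is a minimal sketch on a copy of the four take-over records from this example, assuming that both records of a pair carry the same details field (as they do here). The helper takeover_durations is hypothetical, and AP takeovers are not covered because their OK records carry extra suffixes such as first_index.

```python
from datetime import datetime

# DP take-over records from the example (one record per line).
SAMPLE = [
    "2007-08-16 14:20:30 UTC+0200;DP_take_over;ncs;auto,2.8,loss;CXS10127/4_R20C15(8-00-00)",
    "2007-08-16 14:20:32 UTC+0200;DP_take_over_OK;ncs;auto,2.8,loss;CXS10127/4_R20C15(8-00-00)",
    "2007-08-16 14:21:18 UTC+0200;DP_take_over;ncs;2.8,gain;CXS10127/4_R20C15(8-00-00)",
    "2007-08-16 14:21:19 UTC+0200;DP_take_over_OK;ncs;2.8,gain;CXS10127/4_R20C15(8-00-00)",
]


def takeover_durations(lines):
    """Pair DP_take_over with DP_take_over_OK on the details field; return seconds per pair."""
    started = {}
    durations = []
    for line in lines:
        stamp, event, _source, details, _software = line.split(";", 4)
        when = datetime.strptime(stamp[:19], "%Y-%m-%d %H:%M:%S")
        if event == "DP_take_over":
            started[details] = when
        elif event == "DP_take_over_OK" and details in started:
            seconds = (when - started.pop(details)).total_seconds()
            durations.append((details, seconds))
    return durations


print(takeover_durations(SAMPLE))  # [('auto,2.8,loss', 2.0), ('2.8,gain', 1.0)]
```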

Finding AP and DP Takeovers in isp.log


The node_up tool does not explicitly display AP and DP
takeovers. It lists any restarts associated with the takeovers.
The following Unix command provides a quick method to find
information regarding AP and DP takeovers in the isp.log file:
cat /tmp/DPE_COMMONLOG/isp.log | grep take_over

Other search strings that can be helpful in finding information
in the isp.log file are listed below:
pm_restart
small_local_restart
small_restart
large_restart
node_restart
take_over (This string will match for AP or DP takeovers)
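To get a quick overview of how often each search string occurs, the same substring matching that grep performs can be applied to a copy of the log. This is a small sketch on sample lines taken from the examples in this chapter; grep_counts is a hypothetical helper, and like grep it matches anywhere in the line, so StartUpAfter_large_restart also counts as a large_restart hit.

```python
# Search strings listed above.
SEARCH_STRINGS = [
    "pm_restart",
    "small_local_restart",
    "small_restart",
    "large_restart",
    "node_restart",
    "take_over",
]

# Sample records taken from the isp.log examples in this chapter.
SAMPLE = [
    "2006-09-05 09:14:13;large_restart;ncs;manual;cxs10127_2r12k08(7-00-00)",
    "2006-09-05 09:15:03;StartUpAfter_large_restart;;;cxs10127_2r12k08(7-00-00)",
    "2007-08-16 14:20:30 UTC+0200;DP_take_over;ncs;auto,2.8,loss;CXS10127/4_R20C15(8-00-00)",
    "2007-08-16 14:20:32 UTC+0200;DP_take_over_OK;ncs;auto,2.8,loss;CXS10127/4_R20C15(8-00-00)",
    "2007-08-16 14:39:20 UTC+0200;AP_take_over;ncs;auto,1.8,loss;CXS10127/4_R20C15(8-00-00)",
    "2007-08-17 08:37:45 UTC+0200;StartUpAfter_node_restart;;;CXS10127/4_R20C15(8-00-00)",
]


def grep_counts(lines, patterns=SEARCH_STRINGS):
    """Count, per search string, the lines that contain it (plain substring match)."""
    return {p: sum(1 for line in lines if p in line) for p in patterns}


counts = grep_counts(SAMPLE)
print(counts["large_restart"], counts["node_restart"], counts["take_over"])  # 2 1 3
```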

node_up Summary of isp.log


The node_up tool displays a summary of restart information
from the isp.log file.
The format of the command is as shown below.
node_up [-h] [-d {all|from_date [to_date]}]
If no parameters are specified, the tool displays ISP summary
information since the last node startup.
Example commands:
node_up -d 2008-07-20 lists all restarts since 2008-07-20.
node_up -d all lists all restarts recorded in the ISP log file.
node_up is part of the SGSN Toolbox, not a CLI command.
Therefore it does not require gsh to run the command.

Example output of node_up -d all


=== root@eqm01s14p2 GPB ~ # node_up -d all
2007-08-16 14:10:45 UTC+0200;os_startup;;time_not_synched, eqm01s14p2;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:10:50 UTC+0200;aea;ncl;set AEA;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:11:41 UTC+0200;StartUpAfter_initial_start;ncl;cxp9011380_1r20c15_0_0;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:11:42 UTC+0200;StartUpAfter_node_restart;;;CXS10127/4_R20C15(8-00-00)
2007-08-16 14:12:43 UTC+0200;backup_ncb;ncl;1.19;CXS10127/4_R20C15(8-00-00)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
2007-08-17 08:36:30 UTC+0200;os_startup;;time_not_synched, eqm01s14p2;CXS10127/4_R20C15(8-00-00)
2007-08-17 08:36:36 UTC+0200;aea;ncl;set AEA;CXS10127/4_R20C15(8-00-00)
2007-08-17 08:37:44 UTC+0200;StartUpAfter_initial_start;ncl;cxp9011380_1r20c15_0_0_All;CXS10127/4_R20C15(8-00-00)
2007-08-17 08:37:45 UTC+0200;StartUpAfter_node_restart;;;CXS10127/4_R20C15(8-00-00)
2007-08-17 08:38:40 UTC+0200;backup_ncb;ncl;1.19;CXS10127/4_R20C15(8-00-00)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

What to look for in ss7trace.log?


The SS7 log files contain startup and error messages from
the SS7 stack
Error messages start with ****
Each error message is presented on one line
The last column in the line is the error code, which can be
decoded with the SGSN-MME toolbox tool tv_itu. An example
of the command and the result is shown below:
>> /tmp/DPE_SC/LoadUnits/ttx/bin/tv_ansi -e 11095
MTPL3: LINK OUT OF SERVICE
A DL_OOS_ind primitive was received by MTP-L3.

The tv_itu tool can also be used to decode the contents of
the SS7 log file, instead of just a single error code.
Examples are shown in the slides that follow.
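The two rules above (error messages start with ****, and the last column is the error code) are enough to collect the codes from a copy of the log before decoding them with the toolbox. The sketch below is illustrative only: error_codes is a hypothetical helper, and the **** sample line is an assumed single-line form pieced together from the slide material, so the real column layout may differ.

```python
# Sample lines; the "****" line is an assumed one-line reconstruction of an
# error message, the other lines are ordinary trace lines.
SAMPLE = [
    "SENT: 2009 Feb 18 12:30:13:183 0:11025268",
    "Sender: MGMT:0",
    "Receiver: OAM:0",
    "**** MTPL3:0 M3LinkMxDL.c 4093 12 0 11095",
]


def error_codes(lines):
    """Return the last column of every line that starts with '****', as integers."""
    codes = []
    for line in lines:
        if line.startswith("****"):
            last_column = line.split()[-1]
            if last_column.isdigit():
                codes.append(int(last_column))
    return codes


print(error_codes(SAMPLE))  # [11095]
```

Each code found this way still has to be decoded on the node with the -e option of the toolbox command shown above.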

Contents of ss7trace.log
SENT: 2009 Feb 18 12:30:13:183
0:11025268
Sender:
MGMT:0
Receiver: OAM:0
Primitive: 20
Size:
12
14, 6, 0, 4, 7, 0,7F, 9, 7, 6, B,AB,
2009 Feb 18 12:30:13:183
**** MTPL3:0 M3LinkMxDL.c

0:11025269
4093
2

RECEIVED: 2009 Feb 18 12:30:13:183


Sender:
MTPL3:0
Receiver: MGMT:0
Primitive: 7
MD Size: 1
2,
Size:
10
7, 4, 7, 0,7F, 9, 7, 6, C,AB,

8
0:11025270

12

11095

Information after using tv_itu

SENT: 2009 Feb 18 12:30:13:183 0:11025268
Sender: MGMT:0
Receiver: OAM:0
Primitive: 20
14, 6, 0, 4, 7, 0,7F, 9, 7, 6, B,AB,

Module ID: MTP Layer 3
Length Of Alarm Status: 7
Alarm Id: Link Out of Service (DL_OOS_ind received)
Hardware Selection Number (HSN): 0
Signalling Data Link (SDL): 127
Mtpl2 Error Code: Spare
**** MTPL3:0 M3LinkMxDL.c 4093 12 0 11095
MTPL3: LINK OUT OF SERVICE
A DL_OOS_ind primitive was received by MTP-L3.

OMS_SM_Log - User Activity


When troubleshooting issues, it can be helpful to
know if the system configuration has been recently
changed.
The built-in log file OMS_SM_Log records
configuration activity performed by users on the
SGSN-MME.
The log file is stored in the following directory:
/tmp/OMS_LOGS/OMS_SM_Log

Example Contents of OMS_SM_Log


Date:2009/07/14, Time:18:27:11, User:sysadm, Role:ConfigRole,
cmObjMI_ObjectManager_impl::deleteInstance, ["ip_service_address",
{sn, "GbIP"},
{ip, "10.42.85.71"}]
Date:2009/07/14, Time:21:45:14, User:sysadm, Role:ConfigRole,
cmObjMI_ObjectManager_impl::modifyInstance, ["ip_interface",
{ifn,
"ETH_2_12_1_101"}]
Date:2009/07/14, Time:21:45:20, User:sysadm, Role:ConfigRole,
cmObjMI_ObjectManager_impl::getInstance, ["ip_interface",
{ifn,
"ETH_2_12_1_101"}]
Date:2009/07/14, Time:21:46:15, User:sysadm, Role:ConfigRole,
cmObjMI_ObjectManager_impl::createInstance, ["inbound_pf_policy",
{ifp,
"ETH_2_11_1_101"}]

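When checking whether the configuration was changed recently, the read-only entries only add noise. The sketch below extracts time, action and object type from entries like the ones in this example and drops getInstance. It is a hedged illustration: the regular expression is fitted to these four sample entries only, ENTRY is a hypothetical name, and treating getInstance as read-only is an assumption based on its name.

```python
import re

# The four entries from the example, each joined onto one line.
SAMPLE = '''
Date:2009/07/14, Time:18:27:11, User:sysadm, Role:ConfigRole, cmObjMI_ObjectManager_impl::deleteInstance, ["ip_service_address", {sn, "GbIP"}, {ip, "10.42.85.71"}]
Date:2009/07/14, Time:21:45:14, User:sysadm, Role:ConfigRole, cmObjMI_ObjectManager_impl::modifyInstance, ["ip_interface", {ifn, "ETH_2_12_1_101"}]
Date:2009/07/14, Time:21:45:20, User:sysadm, Role:ConfigRole, cmObjMI_ObjectManager_impl::getInstance, ["ip_interface", {ifn, "ETH_2_12_1_101"}]
Date:2009/07/14, Time:21:46:15, User:sysadm, Role:ConfigRole, cmObjMI_ObjectManager_impl::createInstance, ["inbound_pf_policy", {ifp, "ETH_2_11_1_101"}]
'''

# \s* between fields tolerates entries that are wrapped over several lines.
ENTRY = re.compile(
    r'Date:(?P<date>[\d/]+), Time:(?P<time>[\d:]+), User:(?P<user>\w+), '
    r'Role:(?P<role>\w+),\s*\w+::(?P<action>\w+),\s*\["(?P<object>\w+)"'
)

changes = [
    (m["time"], m["action"], m["object"])
    for m in ENTRY.finditer(SAMPLE)
    if m["action"] != "getInstance"  # assumed to be a read-only operation
]
for change in changes:
    print(change)
```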
Alarms and Events
[Figure: severity levels. Levels shown: Indeterminate, Critical, Major, Minor, Warning. Events are informative notifications; alarms are fault indications.]

Fault Management in SGSN-MME

SGSN-MME

Alarms and Events, Lists and Logs


Alarm List: Currently active alarms in the SGSN-MME
Use CLI command gsh list_alarms to see the currently active
alarms
View in PXM, OSS, or other network manager node

Event list: Latest events


Use CLI command gsh list_events to see recent events
View in PXM, OSS or other network manager node

Alarm Log: Log of current and past Alarms


Look at log file in /tmp/OMS_LOGS/fm_alarm

Event Log: Log of current and past Events


Look at log file in /tmp/OMS_LOGS/fm_event

Contents of fm_alarm

1 dpeHardwareFailure
2009-01-10 08:54:57 1.18.2.1
major
equipment
55131 Hardware error on element 1.18.2.1 detected by DPE.

ethLinkDown
2009-01-10 08:54:57 2.15.2.1-ethBlock-X
major
communications
55133 Ethernet port 2.15.2.1:0 has lost link.

atmLossOfSignal
2009-01-10 08:54:57 1.4.2.1-atmBlock-X
major
communications
55134 The SDH/SONET interface for ATM port 1.4.2.1 on Equipment 1 has lost receive signal

pcmE1T1LossOfSignal
2009-01-10 08:54:57 2.3.2.1-pcmBlock-X
major
communications
55139 Interface 2.3.2.1 PCM port 1 has lost receive signal.

10 pcmE1T1LossOfFrame
2009-01-10 08:54:57 2.3.2.1-pcmBlock-X
major
communications
551310 E1/T1 Interface 2.3.2.1 PCM port 1 has lost frame synchronization

11 ss7Mtpl1LossOfSignal
2009-01-10 08:55:00 ss7MTPL1 1.5
major
communications
551411 Loss of signal is detected on PCM trunk A.

12 ss7Mtpl1LossOfFrame
2009-01-10 08:55:00 ss7MTPL1 1.5
major
communications
551412 Loss of signal is detected on PCM trunk A.

16 ss7Mtpl3LkOutOfServ
2009-01-10 08:55:08 ss7M3 1.5
major
equipment
551416 Signaling link on EqPos 1.5, Trunk A and Timeslot 1 is out of service. Status is 20.

Contents of fm_event
event; ss7SccpRmtSSNStatChange; processing; indeterminate; 2009-01-06 09:53:16; Status change in remote subsystem occurred at SPC 461298. Status of SSN 142 is 2. Affected NodeID is 0, with Local SPC 461183.; {31848240}; 'ss7SCCP ';
event; ranRncRestarted; communications; major; 2009-01-06 10:00:09; RNC Initiated Reset received from RNC=RNC01; {31878248}; ups_SgsnTapp_rancl;
event; gtpGSNrestarted; communications; indeterminate; 2009-01-06 14:01:34; An updated GTP restart counter is received on the gtpc path = (eqm01s0dp2)172.20.105.65:34209<->10.0.46.2 (Connection between this node and external node); {32858343}; gtpResetIndicationReceived;
event; dpeEquipmentBlocked; equipment; minor; 2009-01-07 10:08:23; Element 2.12 has been blocked.; {37798546}; '2.12';
event; dpeReducedCapacity; equipment; indeterminate; 2009-01-07 10:08:23; The node has reduced capacity. There are blocked PIUs.; {37798547}; 'NCL';
event; nocNodeRestart; processing; critical; 2009-01-10 08:55:06; A Node restart is in progress.; {55141}; startUpOngoing;
event; ss7Mtpl3LkInServ; communications; indeterminate; 2009-01-10 08:55:09; Signaling link on EqPos 1.3, Trunk A and Timeslot 1 is in service.; {55142}; 'ss7M3 1.3';
event; ss7Mtpl3LkInServ; communications; indeterminate; 2009-01-10 08:55:15; Signaling link on EqPos 1.2, Port 0, VPI 1 and VCI 301 is in service.; {55143}; 'ss7M3 1.2';
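Because each fm_event record is a semicolon-separated line, the records can be filtered by severity on a copy of the log. The sketch below is an illustration under assumptions: the field names in FIELDS are hypothetical labels inferred from the sample records, parse_event is a made-up helper, and the split would misbehave if a message text itself contained a semicolon.

```python
# Four records from the fm_event example, one record per line.
SAMPLE = [
    "event; ranRncRestarted; communications; major; 2009-01-06 10:00:09; RNC Initiated Reset received from RNC=RNC01; {31878248}; ups_SgsnTapp_rancl;",
    "event; dpeEquipmentBlocked; equipment; minor; 2009-01-07 10:08:23; Element 2.12 has been blocked.; {37798546}; '2.12';",
    "event; nocNodeRestart; processing; critical; 2009-01-10 08:55:06; A Node restart is in progress.; {55141}; startUpOngoing;",
    "event; ss7Mtpl3LkInServ; communications; indeterminate; 2009-01-10 08:55:09; Signaling link on EqPos 1.3, Trunk A and Timeslot 1 is in service.; {55142}; 'ss7M3 1.3';",
]

# Assumed field labels; the log itself does not name its columns.
FIELDS = ("kind", "name", "type", "severity", "time", "text", "id", "source")


def parse_event(line):
    """Split one record on ';' and map the values onto the assumed field labels."""
    values = [value.strip() for value in line.rstrip().rstrip(";").split(";")]
    return dict(zip(FIELDS, values))


events = [parse_event(line) for line in SAMPLE]
severe = [e["name"] for e in events if e["severity"] in ("critical", "major")]
print(severe)  # ['ranRncRestarted', 'nocNodeRestart']
```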

Alarm Handling
The Alex library contains a document for every
alarm defined on the SGSN-MME.
These documents provide information on possible
causes for the alarm, and resolution actions that
can be used to resolve the fault.
Most alarms will clear automatically when the fault
condition is resolved.
It is possible to manually clear alarms using the
following CLI command:
gsh clear_alarms <fault_id>

Alex Documents for Alarms and Events


SGSN-MME Operation and Maintenance Alarm Handling
SGSN-MME Operation and Maintenance Alarm and Event
Descriptions Alarm Descriptions
This directory contains documents describing each individual
alarm, resolution actions, etc...

SGSN-MME Operation and Maintenance Alarm and Event


Descriptions Event Descriptions
This directory contains documents describing each individual
event, causes and consequences of the event, etc...

What is logged by EBM?


EBM logs successful, unsuccessful, abort and
ignore events for Attach, Activate PDP context,
RAU, ISRAU, Deactivate PDP context, Detach and
Service Request events. (New in 2010B)
The following parameters are logged (depending
on type of event):
EVENT_RESULT, ATTACH_TYPE, RAT, CAUSE_CODE,
SUB_CAUSE_CODE, MCC, MNC, LAC, RAC, CI, SAC,
IMSI, PTMSI, IMEISV, HLR, Transferred_PDP,
Dropped_PDP, APN, GGSN

How is EBM configured?


The modify_ebm_event CLI
command controls which
event types are logged in
the Event-Based Statistics log.
Usage:
modify_ebm_event -en EventName

The get_ebm_log CLI command
shows which event types
are logged in the Event-Based
Monitoring log.

Supported Event:
Attach
Activation of PDP context
RAU
ISRAU
Deactivation of PDP context
UE Handover

How does the event logging work?


A new log file is published once every Report
Period (RP).
In SGSN-MME 2010B, it is possible to configure
the RP with the CLI command modify_ebm_log. The default is
15 minutes; valid values are 1, 5, 15, 30 and
60 minutes.
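The report period directly determines how many EBM log files are published per day, which is worth knowing before shortening it. A minimal sketch of that arithmetic follows; the constant and function names are hypothetical and the code does not talk to the node.

```python
# Valid report periods in minutes and the default, as stated above.
VALID_REPORT_PERIODS = (1, 5, 15, 30, 60)
DEFAULT_REPORT_PERIOD = 15


def files_per_day(report_period=DEFAULT_REPORT_PERIOD):
    """Return how many log files are published in 24 hours for a given RP."""
    if report_period not in VALID_REPORT_PERIODS:
        raise ValueError(f"invalid report period: {report_period} minutes")
    return 24 * 60 // report_period


print(files_per_day())    # 96 files per day with the default RP of 15 minutes
print(files_per_day(1))   # 1440
print(files_per_day(60))  # 24
```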

SGSN-MME Health Check - General


The procedure for performing a Health Check on the SGSN-MME
is specified in Alex. The lists which follow provide a
summary of the Alex procedures.
Check alarms and events using the following commands:
gsh list_alarms
gsh list_events

Check KPIs
pdc_kpi.pl

Check for software faults, hardware faults and recent restarts
in the ISP log file, at the following location:
/tmp/DPE_COMMONLOG/isp.log

Check interfaces for GSM, WCDMA and LTE. See following
slides

Interface Health Check - GSM


If using Gb over IP, check the status of the remote IP terminal by
running the following command for each NSE defined on the SGSN-MME:
gsh get_nse <nsei>

Check the status of all NSVCs (connections between SGSN-MME and
BSCs) by running the following command for each NSVC defined:
gsh get_nsvc <nsvci>

Check the status of all BVCs (logical connections between SGSN-MME
and Cells) by running the following command for each BSC defined:
gsh list_bvcs -bsc <bscName>

Check the status of the SS7 signaling links by running the following
command:
gsh action_ss7_sys_statlinks

Check the reachability of remote SS7 signaling points of a remote
SAP by running the following command for each SAP:
gsh action_ss7_sccp_remote_sap_statspc -dpc <dpc> -ssn <ssn>
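Since most of these checks are repeated per NSE, NSVC, BSC and SAP, it can be convenient to generate the full command list up front. The sketch below only builds the command strings from the templates above and does not execute anything; the function name and all identifier values in the usage line (101, 1011, BSC01 and so on) are made-up examples.

```python
def gsm_health_check_commands(nseis, nsvcis, bsc_names, remote_saps):
    """Build the GSM interface check commands, one per configured object."""
    commands = []
    commands += [f"gsh get_nse {nsei}" for nsei in nseis]
    commands += [f"gsh get_nsvc {nsvci}" for nsvci in nsvcis]
    commands += [f"gsh list_bvcs -bsc {name}" for name in bsc_names]
    commands.append("gsh action_ss7_sys_statlinks")
    commands += [
        f"gsh action_ss7_sccp_remote_sap_statspc -dpc {dpc} -ssn {ssn}"
        for dpc, ssn in remote_saps
    ]
    return commands


# Hypothetical identifiers, for illustration only.
for command in gsm_health_check_commands([101], [1011, 1012], ["BSC01"], [(1234, 149)]):
    print(command)
```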

Interface Health Check - WCDMA


Check status of each RNC by running the command shown below for
each RNC defined on the SGSN-MME. Check that the status is set to
In Service. (Use the command gsh list_rncs to get a list of all
RNCs defined on the SGSN-MME.)
gsh get_rnc <RncName>

Check status of all SS7 signaling links by running the command shown
below. Check that the status is set to In Service.
gsh action_ss7_sys_statlinks

Check the reachability of remote SS7 signaling points of a remote
SAP by running the command shown below for each SAP. Check that
the status is set to Allowed.
gsh action_ss7_sccp_remote_sap_statspc -dpc <dpc> -ssn <ssn>

Note that the signaling connection to all RNCs is an SS7-based
interface, so checking the status of SS7 links and SAPs provides
information about connectivity between the SGSN-MME and the RNCs.

Interface Health Check - LTE


Run the following series of commands to request the status of an SCTP
association:
gsh show_sctp_epl -eqp EquipmentPosition
gsh show_sctp_assl -eqp EquipmentPosition -epid SctpEndPointId
gsh show_sctp_assstat -eqp EquipmentPosition -aid AssocId
Check that the Association State is set to Established. Check that the SRTT
value is reasonable. For more information, see the show_sctp_assstat CLI
command.

To view the eNodeB auto-configuration data, run the following command:
gsh show_mme_enodeb
Check that the state of each connection towards the eNodeBs is set to
connected. For more information, see the show_mme_enodeb CLI command.
The SCTP information is only displayed when the eNodeB is connected.

To display all tracking areas supported by the eNodeBs that have been
auto-configured in the MME, run the following command:
gsh show_mme_ta

Alex Documentation for Health Check


For further information regarding the Health Check
procedure, refer to the following Alex document:
SGSN-MME Operation and Maintenance Health Check

Example printout of the node_check -c command (1/2)
=== root@eqm01s14p2 ANCB log/LogBackup # node_check -c
For a description of all options use /tmp/DPE_SC/LoadUnits/ttx/bin/node_check -h
Checking if node has started completely (via isp.log) ... OK
GSN STATUS
Date : 2006-09-05 10:23
Node type : sgsnwg
Node name : SGSN200
Uptime : 15:15
Last OS startup : 2006-09-04 19:09:33
Last node startup : 2006-09-04 19:15:09
Current Software Configuration : cxr1010225_4r2a03_pa10
Small local restarts : 0
Small restarts : 0
Large restarts : 0
CM restarts : 0
PM Reboots : 0
Number of nodedumps : 1 (!!!)
Erlang crash dumps in / : 0
Erlang crash dumps in /tmp/DPE_LOG : 0
Number of DIED proc in ncl.log : 0
Number of "CrashHandler" in app.log : 0
Number of NCS crashes since reload : 0
Number of NCS messages since reload : 3
Timeframe of NCS messages : 2006-09-04 19:13:29 - 2006-09-05 09:49:58

Example printout of the node_check -c command (2/2)
Number of dyn worker crashes since reload  : 2 (!!!)
Timeframe of dyn worker crashes            : 2006-09-05 10:13:17 - 2006-09-05 10:23:57
Number of dyn worker messages since reload : 393
Timeframe of dyn worker messages           : 2006-09-05 10:13:17 - 2006-09-05 10:24:08

Connectivity check
PEB check : OK
GPB check : OK
nodePdcJob does not exist! It must be created with pdc_setup.sh.
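
When the node_check -c printout is saved to a file, the counters that need attention can be picked out automatically. The following is a minimal sketch, assuming the printout uses one "name : value" pair per line and that the "(!!!)" marker seen in the example is how the tool flags a counter; the function name is hypothetical and not part of the SGSN-MME toolbox.

```python
# Lines copied from the example printout; one "name : value" pair per line is assumed.
SAMPLE = """\
Small restarts : 0
Number of nodedumps : 1 (!!!)
Number of dyn worker crashes since reload : 2 (!!!)
PEB check : OK
"""

MARKER = "(!!!)"


def flagged_items(printout):
    """Return {name: value} for every line carrying the (!!!) marker."""
    flagged = {}
    for line in printout.splitlines():
        # Split on the first colon only; timestamps in values contain colons too.
        name, sep, value = line.partition(":")
        if sep and MARKER in value:
            flagged[name.strip()] = value.replace(MARKER, "").strip()
    return flagged


if __name__ == "__main__":
    for name, value in flagged_items(SAMPLE).items():
        print(f"{name}: {value}")
```

Each flagged counter is a starting point for the log checks described in the earlier chapters, for example the nodedump and dyn worker crash counters in the example above.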

SGSN-MME
Troubleshooting
Interface Faults

Objectives
Upon the completion of this chapter, the student will be able to:

Understand and solve Interface faults

Troubleshoot the SS7/IP/Frame Relay Interfaces

Explain procedures for configuration troubleshooting

Troubleshooting Procedures
This chapter provides an overview of the procedures for troubleshooting
different types of faults on the SGSN-MME.
Additional Information on troubleshooting procedures can be found in the
Alex library at the following location:
SGSN-MME Operation and Maintenance Fault Management
Troubleshooting

The troubleshooting tools presented in the previous chapter will be used
to perform various steps of the troubleshooting procedures.
The first recommended step of any troubleshooting procedure is to
perform the SGSN-MME Health Check as described in the previous
chapter.

SS7-based Interface Problem

Associate Link-Level Alarm to Linkset

Method 1: Match Status Codes

Below is an alarm for a narrowband link that is out of service.

46 ss7Mtpl3LkOutOfServ 2009-06-11 00:34:55 ss7M3 1.5 major equipment 9473846
Signaling link on EqPos 1.5, Trunk A and Timeslot 2 is out of service. Status is 20.

The Troubleshooting guide in Alex provides information on the status codes
reported in SS7 alarms. Status 20 on an IBS7 board is defined as
"Signaling link alignment or proving failure".

The command gsh action_ss7_sys_statlinks displays the status of all SS7
links on the system. Example output for a single link is shown:
=== sysadm@eqm01s14p2 ANCB ~ # gsh action_ss7_sys_statlinks
NodeID    0
OPC       461183
SLC       0
LinksetNo 22
State     Aligning M3 links

The linkset ID is shown in the field LinksetNo.

Associate Link-Level Alarm to Linkset

Method 2: Trace SS7 Configuration

Below is an alarm printout for a narrowband signaling link out of service.

ss7Mtpl3LkOutOfServ 2009-06-11 00:34:55 ss7M3 1.5 major equipment 9473846
Signaling link on EqPos 1.5, Trunk A and Timeslot 2 is out of service. Status is 20.

To find the linkset with which the out-of-service (OOS) link is associated,
use this command, which lists all the narrowband links defined on the SGSN-MME:
gsh list_ss7_mtpl3_link_nb -eqp \* -trunk \* -ts \*

Example output is shown below:

=== sysadm@eqm01s14p2 ANCB ~ # gsh list_ss7_mtpl3_link_nb -eqp \* -trunk \* -ts \*
ps Class             Identifiers                       | eqp trunk ts
----------------------------------------------------------------------
A  ss7_mtpl3_link_nb -net net1 -nid 0 -lsid 1  -slc 0  | 1.3 A     1
A  ss7_mtpl3_link_nb -net net1 -nid 0 -lsid 21 -slc 0  | 1.5 B     1
A  ss7_mtpl3_link_nb -net net1 -nid 0 -lsid 22 -slc 0  | 1.5 A     2

From the output shown, find the ss7_mtpl3_link_nb that corresponds to
the eqp, trunk and timeslot from the alarm. The link that corresponds
to equipment 1.5, Trunk A, Timeslot 2 is the third one in the list above,
with lsid 22. The combination of net, nid and lsid is used to get the
status of this linkset.
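
The lookup in Method 2 can be sketched in a few lines of Python. This is only an illustration: it assumes the alarm text and the list_ss7_mtpl3_link_nb output have been saved as text with one link per line (as printed on the node), and the function names are hypothetical.

```python
import re

# Alarm text and link rows copied from the examples in this chapter.
ALARM_TEXT = "Signaling link on EqPos 1.5, Trunk A and Timeslot 2 is out of service."
LINK_LISTING = """\
A ss7_mtpl3_link_nb -net net1 -nid 0 -lsid 1 -slc 0 | 1.3 A 1
A ss7_mtpl3_link_nb -net net1 -nid 0 -lsid 21 -slc 0 | 1.5 B 1
A ss7_mtpl3_link_nb -net net1 -nid 0 -lsid 22 -slc 0 | 1.5 A 2
"""

ALARM = re.compile(r"EqPos ([\d.]+), Trunk (\w+) and Timeslot (\d+)")
ROW = re.compile(
    r"ss7_mtpl3_link_nb\s+-net\s+(?P<net>\S+)\s+-nid\s+(?P<nid>\d+)\s+"
    r"-lsid\s+(?P<lsid>\d+)\s+-slc\s+(?P<slc>\d+)\s*\|\s*"
    r"(?P<eqp>\S+)\s+(?P<trunk>\S+)\s+(?P<ts>\d+)"
)


def alarm_position(alarm_text):
    """Extract (eqp, trunk, timeslot) from the alarm text, or None."""
    m = ALARM.search(alarm_text)
    return (m.group(1), m.group(2), int(m.group(3))) if m else None


def find_linkset(listing, eqp, trunk, ts):
    """Return (net, nid, lsid) of the link at eqp/trunk/ts, or None."""
    for line in listing.splitlines():
        m = ROW.search(line)
        if m and (m["eqp"], m["trunk"], int(m["ts"])) == (eqp, trunk, ts):
            return m["net"], int(m["nid"]), int(m["lsid"])
    return None


if __name__ == "__main__":
    print(find_linkset(LINK_LISTING, *alarm_position(ALARM_TEXT)))
```

The returned net, nid and lsid values are the ones needed for the linkset status command in the next step.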

View Linkset Status


To find the status of the linkset, use the command below.
Replace the x placeholders in the command with the linkset
information retrieved in the previous step.
gsh action_ss7_mtpl3_linkset_status -net xx -nid 0 -lsid x

The status returned will indicate how many links within the
linkset are in service (OK) and the total number of links
defined in the linkset.
Example output is shown below:
# gsh action_ss7_mtpl3_linkset_status -net net1 -nid 0 -lsid 22
NodeID 0
OPC 951
LinksetNo 22
NumberOfLinksInSetOK 0
TotalNumberOfLinksInSet 2
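
The two counters in the printout can be reduced to a simple verdict. The sketch below assumes only the field names shown in the example (NumberOf...InSetOK and TotalNumberOf...InSet); the verdict labels "down", "degraded" and "ok" are illustrative names, not SGSN-MME terminology.

```python
import re

# Printout copied from the linkset status example.
SAMPLE = """\
NodeID 0
OPC 951
LinksetNo 22
NumberOfLinksInSetOK 0
TotalNumberOfLinksInSet 2
"""


def set_health(printout):
    """Return (ok, total, verdict) for a linkset or routeset status printout."""
    ok = int(re.search(r"NumberOf\w+InSetOK\s+(\d+)", printout).group(1))
    total = int(re.search(r"TotalNumberOf\w+InSet\s+(\d+)", printout).group(1))
    if ok == 0:
        verdict = "down"        # no member in service
    elif ok < total:
        verdict = "degraded"    # some members out of service
    else:
        verdict = "ok"          # all members in service
    return ok, total, verdict


if __name__ == "__main__":
    print(set_health(SAMPLE))
```

Because the routeset status printout uses the same naming pattern (NumberOfRoutesInSetOK, TotalNumberOfRoutesInSet), the same function can be applied to it.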

View Routeset Status


First find the destination point code served by the linkset. Use the
following command to list all linksets and their related DPCs. Find the
linkset in the list.
gsh list_ss7_mtpl3_linkset -dpc \*

To get the status of the Routeset to that point code, use the command
below. Replace the x placeholders in the command with information from
the previous steps.
gsh action_ss7_mtpl3_routeset_rst -net xx -nid 0 -dpc xx

The status returned will indicate if all routes to the destination are out of
service, or only some routes. Example output is shown below:
==sysadm@eqm01s14p2 ~ # gsh action_ss7_mtpl3_routeset_rst -net net1 -nid 0 -dpc 825
NodeID 0
OPC 951
RoutesetNo 825
NumberOfRoutesInSetOK 0
TotalNumberOfRoutesInSet 2

View Status of Remote DPC and SAP


To find the remote SAPs associated with the destination point code, use the
command below. Find all SAPs defined for the remote point code in the list.
gsh list_ss7_sccp_remote_sap

For each remote SAP, check whether the SGSN-MME can communicate with the
remote Point Code and with the remote SAP, using the commands below.
Replace the x placeholders in the commands with information from the
previous steps.
gsh action_ss7_sccp_remote_sap_statspc -net xx -nid 0 -dpc xx -ssn x
gsh action_ss7_sccp_remote_sap_statssnspc -net xx -nid 0 -dpc xx -ssn x

Example output is shown below:


# gsh action_ss7_sccp_remote_sap_statspc -net net1 -nid 0 -dpc 825 -ssn 6
NodeID 0
OPC 951
DPC 825
DPC Status Prohibited
CongestionLevel 0
# gsh action_ss7_sccp_remote_sap_statssnspc -net net1 -nid 0 -dpc 825 -ssn 6
NodeID 0
OPC 951
DPC 825
SSN 6
SSN Status Prohibited
CongestionLevel 0
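
As noted in the health check, the expected status is Allowed. A small sketch for checking saved printouts follows; it assumes the "<name> Status <value>" line layout shown in the example, and the function names are hypothetical.

```python
# Status lines copied from the two example printouts above.
SAMPLE = """\
NodeID 0
OPC 951
DPC 825
DPC Status Prohibited
CongestionLevel 0
SSN 6
SSN Status Prohibited
"""


def status_fields(printout):
    """Collect '<name> Status <value>' lines into a dictionary."""
    result = {}
    for line in printout.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1] == "Status":
            result[parts[0]] = parts[2]
    return result


def not_allowed(printout):
    """Return the names (DPC, SSN) whose status is anything other than Allowed."""
    return [name for name, value in status_fields(printout).items()
            if value != "Allowed"]


if __name__ == "__main__":
    print(not_allowed(SAMPLE))
```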

Check the SS7 Logs


SS7 error messages are logged to the file
/tmp/DPE_LOG/ss7trace.log
Use the tool tv_ansi/itu to translate specific error
messages, or an entire log file, into a human-readable
format.

Check SS7 Configuration


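Configuration Summary Table

Narrowband          | Broadband               | Sigtran
--------------------+-------------------------+-------------------------------
MTP L2              | SAAL Link               | SCTP
ss7_mtpl2_link      | atm_vc, ss7_saal_link   | ip_service, ip_service_address,
(Uses trunk and ts) | (Uses ATM PVC, VPI/VCI) | ss7_sctp_end_point
--------------------+-------------------------+-------------------------------
MTP L3 Link         | MTP L3 Link             | SCTP Association
ss7_mtpl3_link_nb   | ss7_mtpl3_link_bb       | ss7_m3ua_remote_ipaddress
--------------------+-------------------------+-------------------------------
MTP L3 Linkset      | MTP L3 Linkset          | M3UA Association
ss7_mtpl3_linkset   | ss7_mtpl3_linkset       | ss7_m3ua_association
--------------------+-------------------------+-------------------------------
MTP L3 Route        | MTP L3 Route            | M3UA Route
ss7_mtpl3_route     | ss7_mtpl3_route         | ss7_m3ua_route

MTP L3 Routeset: ss7_mtpl3_routeset

Remote Point Codes and Remote SAPs: ss7_sccp_remote_point, ss7_sccp_remote_sap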

IP-based Interface Problem

Review of SGSN-MME IP Service Structure


An IP Service on the SGSN-MME consists of the following entities:
An IP Service Address
An internal SGSN-MME VPN
SGSN-MME Router Instances
IP Interfaces (either ATM or Ethernet)

Check Alarms
IP interfaces can utilize either ATM or Ethernet connections, so the following alarms
may be relevant:

ethAutoNegFailed
ethLinkDown
atmConfigurationMismatch
atmLBCellsMissing
atmLineAlarmIndicationSignal
atmLineRemoteDefectIndication
atmLossOfFrame
atmLossOfPointer
atmLossOfSignal
atmPathAlarmIndicationSignal
atmPathRemoteDefectIndication
atmVCAlarmIndicationSignal
atmVCRemoteDefectIndication
atmVPAlarmIndicationSignal
atmVPRemoteDefectIndication

The Gn-C, Gn-U and Iu-U interfaces utilize GTP protocol, so the following GTP failure
alarms are relevant for those interfaces:
gtpPathFailureControlPlane
gtpPathFailureUserPlane
gtpGgsnBlacklisted

gtpGSNrestarted
gtpHangingPdpContextInGgsnDeleted
gtpServiceNotConfigured

Example Output from dig Tool
(dig is not an SGSN tool and is not available on your node)

=== sysadm@eqm01s14p2 ANCB ~ # dig ipmm2.mnc020.mcc440.gprs
; <<>> DiG 8.3 <<>> ipmm2.mnc020.mcc440.gprs
;; res options: init recurs defnam dnsrch
;; got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUERY SECTION:
;;      ipmm2.mnc020.mcc440.gprs, type = A, class = IN
;; ANSWER SECTION:
ipmm2.mnc020.mcc440.gprs.   1D IN A    10.0.46.1
;; AUTHORITY SECTION:
gprs.                       1D IN NS   gprsevdns.miscnet.stp.
;; ADDITIONAL SECTION:
gprsevdns.miscnet.stp.      1D IN A    138.85.81.189
;; Total query time: 15 msec
;; FROM: eqm01s14p2 to SERVER: default -- 169.254.4.2
;; WHEN: Tue Jun 30 13:08:15 2009
;; MSG SIZE sent: 42 rcvd: 109

The first line of the printout is the dig command. The address in the
ANSWER SECTION is the IP address of the GGSN.

Check Routing Tables for Gn Interface


Use the CLI commands shown below to display the routes that
are defined for each Gn Router Instance.
Replace the x placeholders in the command with the equipment
position of each Gn router instance, and use the network name
of the Gn VPN.
Note that the network (nw) name for the Gn VPN may not be the
same on all SGSN-MME nodes.
gsh list_router_instance
gsh show_router_instance_ip_route -eqp x.x -nw Gn

Verify that at least one Gn Router instance has a route defined
to the IP address of the SGSN-MME or GGSN in question.
Use gsh list_ip_service_address to list the IP service
addresses defined on the SGSN-MME. Verify that the Gn
Router instances have routes defined to the IP service addresses
defined for the Gn-GTP-C service and Gn-GTP-U service.

Example Display of Routing Tables


=== sysadm@eqm01s14p2 ANCB ~ # gsh list_router_instance
ps Class           Identifiers        |
---------------------------------------
A  router_instance -eqp 1.6  -nw Gom
A  router_instance -eqp 1.7  -nw Gom
A  router_instance -eqp 2.11 -nw Iu
A  router_instance -eqp 2.12 -nw Iu
A  router_instance -eqp 2.14 -nw Gn
A  router_instance -eqp 2.15 -nw Gn

In the routing table example below, 10.0.46.1 is the IP address of a GGSN
in the Gn network, 172.20.105.65 and 10.0.70.136 are the IP addresses of
the Gn services, and the gw column holds the IP address of the next hop.

=== sysadm@eqm01s14p2 ANCB ~ # gsh show_router_instance_ip_route -eqp 2.14 -nw Gn
ps Class    Identifiers                          | ifn        use  gw         mask
-------------------------------------------------------------------------------------------
A  ip_route -eqp 2.14 -nw Gn -dip 10.0.203.0     | ETH_2_14_0 0    10.0.70.1  255.255.255.0
A  ip_route -eqp 2.14 -nw Gn -dip 10.0.40.0      | ETH_2_14_0 0    10.0.70.1  255.255.255.128
A  ip_route -eqp 2.14 -nw Gn -dip 10.0.46.1      | ETH_2_14_0 0    10.0.70.1  255.255.255.255
A  ip_route -eqp 2.14 -nw Gn -dip 10.0.46.2      | ETH_2_14_0 0    10.0.70.1  255.255.255.255
A  ip_route -eqp 2.14 -nw Gn -dip 10.0.60.0      | ETH_2_14_0 0    10.0.70.1  255.255.255.128
A  ip_route -eqp 2.14 -nw Gn -dip 10.0.60.128    | ETH_2_14_0 0    10.0.70.1  255.255.255.255
A  ip_route -eqp 2.14 -nw Gn -dip 10.0.70.0      | ETH_2_14_0 0    10.0.70.2  255.255.255.128
A  ip_route -eqp 2.14 -nw Gn -dip 10.0.70.136    | 3016 eq: 2.14              255.255.255.255
A  ip_route -eqp 2.14 -nw Gn -dip 172.2.6.0      | ETH_2_14_0 0    10.0.70.1  255.255.255.0
A  ip_route -eqp 2.14 -nw Gn -dip 172.20.105.49  | ETH_2_14_0 0    10.0.70.1  255.255.255.255
A  ip_route -eqp 2.14 -nw Gn -dip 172.20.105.65  | 616 eq: 2.14               255.255.255.255
A  ip_route -eqp 2.14 -nw Gn -dip 172.20.106.41  | ETH_2_14_0 0    10.0.70.1  255.255.255.255
A  ip_route -eqp 2.14 -nw Gn -dip 172.20.106.49  | ETH_2_14_0 0    10.0.70.1  255.255.255.255
=== sysadm@eqm01s14p2 ANCB ~ # gsh list_ip_service_address | grep Gn
A  ip_service_address -sn Gn-GTP-C -ip 172.20.105.65
A  ip_service_address -sn Gn-GTP-U -ip 10.0.70.136
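
Checking whether a router instance has a route to a given address means testing the address against each dip/mask pair. The sketch below does this with the Python standard library. The route list is copied by hand from a few rows of the example table, and the function name is hypothetical.

```python
import ipaddress

# (dip, mask) pairs copied from rows of the example routing table.
ROUTES = [
    ("10.0.203.0", "255.255.255.0"),
    ("10.0.40.0", "255.255.255.128"),
    ("10.0.46.1", "255.255.255.255"),
    ("172.20.105.65", "255.255.255.255"),
]


def matching_routes(dest, routes):
    """Return the networks that contain dest, most specific (longest mask) first."""
    addr = ipaddress.ip_address(dest)
    nets = [ipaddress.ip_network(f"{dip}/{mask}") for dip, mask in routes]
    hits = [net for net in nets if addr in net]
    return sorted(hits, key=lambda net: net.prefixlen, reverse=True)


if __name__ == "__main__":
    # 10.0.46.1 is the GGSN address resolved in the dig example.
    print(matching_routes("10.0.46.1", ROUTES))
    # An address with no covering route yields an empty list.
    print(matching_routes("192.0.2.1", ROUTES))
```

An empty result for the GGSN address, or for a Gn service address, points to a missing route on that router instance.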

Check Routing for Iu-U Interface


Finding Destination IP from Interface Definition

Typically the Iu-U interface uses direct ATM connections to the
RNCs for IP connectivity. (Ethernet connections and routed IP
networks are also supported for Iu-U.)
For direct connections using IP over ATM, the destination IP
addresses of the RNCs' packet data routers are specified in the
interface provisioning for the ATM links.
Use the command shown below to display the ATM
interface provisioning for the Iu-U interfaces, and the
remote IP addresses defined for those interfaces.
gsh list_ip_interface -ip \* -rip \* -nw \*

Replace * with your own values to filter the list. An attribute given
as \* is added as a column in the output.

Example Display of Interface Info


=== sysadm@eqm01s14p2 ANCB ~ # gsh list_ip_interface -ip \* -rip \* -nw \*
ps Class        Identifiers            | ip            rip           nw
-------------------------------------------------------------------------------
A  ip_interface -ifn ATM_2_11_0_1_101  | 172.16.0.194  172.16.0.193  Iu
A  ip_interface -ifn ATM_2_11_0_1_202  | 172.16.0.198  172.16.0.197  Iu
A  ip_interface -ifn ATM_2_11_0_1_302  | 10.0.0.54     10.0.0.53     Iu
A  ip_interface -ifn ATM_2_11_1_0_500  | 172.26.240.50 172.26.240.51 Iu
A  ip_interface -ifn ATM_2_12_1_2_64   | 10.0.0.69     10.0.0.70     Iu
A  ip_interface -ifn ATM_2_12_1_2_65   | 10.0.0.73     10.0.0.74     Iu
A  ip_interface -ifn ATM_2_12_1_2_66   | 10.0.0.77     10.0.0.78     Iu
A  ip_interface -ifn ATM_2_12_1_2_67   | 10.0.0.81     10.0.0.82     Iu
A  ip_interface -ifn ATM_2_12_1_2_68   | 10.0.0.85     10.0.0.86     Iu
A  ip_interface -ifn ATM_2_12_1_2_69   | 10.0.0.89     10.0.0.90     Iu
A  ip_interface -ifn ATM_2_12_1_2_70   | 10.0.0.93     10.0.0.94     Iu
A  ip_interface -ifn ATM_2_12_1_2_71   | 10.0.0.97     10.0.0.98     Iu
A  ip_interface -ifn ETH_1_6_0         | 10.0.72.2     NULL          Gom
A  ip_interface -ifn ETH_1_7_0         | 10.0.72.3     NULL          Gom
A  ip_interface -ifn ETH_2_14_0        | 10.0.70.2     NULL          Gn
A  ip_interface -ifn ETH_2_15_0        | 10.0.70.3     NULL          Gn

Check Routing Tables for Iu-U Interface


Use the CLI commands shown below to display the routes
that are defined in each Iu Router instance.
Replace the x placeholders in the command with the equipment
position of each Iu router instance, and use the network name
of the Iu VPN.
Note that the network (nw) name for the Iu VPN may not be the
same on all SGSN-MME nodes.
gsh list_router_instance
gsh show_router_instance_ip_route -eqp xx -nw Iu

Verify that the Iu Router instances have routes defined to
the IP addresses specified in the interface definitions.
Use gsh list_ip_service_address to list the IP service
addresses defined on the SGSN-MME. Verify that the Iu
Router instances have routes defined to the IP service
address defined for the Iu-GTP-U service.

Check Packet Filter Definitions


Use the commands shown below to list all inbound packet filter
policies, and inbound packet filter rules.
Verify that a policy and rule are defined for the interface in
question.
If no rule is defined for an interface, the default behavior is to block
all packets.
Are there any packet filter rules defined that could be blocking
packets and causing the problem?
gsh list_inbound_pf_policy
gsh list_inbound_pf_rule -r \*

Repeat the same procedure for outbound packet filters


gsh list_outbound_pf_policy
gsh list_outbound_pf_rule -r \*

Replace * with your own value (permit or deny) to filter the list. An
attribute given as \* is added as a column in the output.

Example Display of Packet Filter Rules


=== sysadm@eqm01s14p2 ANCB ~ # gsh list_inbound_pf_rule -r \*
ps Class           Identifiers                  | r
---------------------------------------------------------------
A  inbound_pf_rule -ifp ATM_2_11_0_1_101 -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_11_0_1_202 -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_11_0_1_302 -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_11_1_0_500 -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_12_1_2_64  -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_12_1_2_65  -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_12_1_2_66  -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_12_1_2_67  -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_12_1_2_68  -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_12_1_2_69  -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_12_1_2_70  -fr 1  | permit
A  inbound_pf_rule -ifp ATM_2_12_1_2_71  -fr 1  | permit
A  inbound_pf_rule -ifp ETH_1_6_0        -fr 1  | permit
A  inbound_pf_rule -ifp ETH_1_7_0        -fr 1  | permit
A  inbound_pf_rule -ifp ETH_2_14_0       -fr 1  | permit
A  inbound_pf_rule -ifp ETH_2_15_0       -fr 1  | permit
A  inbound_pf_rule -ifp inbound_deny_all -fr 1  | deny
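
Since an interface without a rule blocks all packets by default, it is useful to compare the interface list against the rule list. The sketch below assumes, as in the example printout, that each policy (-ifp) is named after the interface it applies to; on a node where policies are named differently the comparison would need the policy-to-interface mapping instead. The interface ETH_3_1_0 in the usage example is hypothetical.

```python
# Rows copied from the example packet filter printout, one rule per line.
RULE_LISTING = """\
A inbound_pf_rule -ifp ETH_2_14_0 -fr 1 | permit
A inbound_pf_rule -ifp ETH_2_15_0 -fr 1 | permit
A inbound_pf_rule -ifp inbound_deny_all -fr 1 | deny
"""


def interfaces_without_rule(interfaces, rule_listing):
    """Return interface names that have no rule under a same-named policy."""
    covered = set()
    for line in rule_listing.splitlines():
        tokens = line.split()
        if "-ifp" in tokens:
            covered.add(tokens[tokens.index("-ifp") + 1])
    return sorted(set(interfaces) - covered)


if __name__ == "__main__":
    # ETH_3_1_0 is a hypothetical interface used only to show a miss.
    print(interfaces_without_rule(
        ["ETH_2_14_0", "ETH_2_15_0", "ETH_3_1_0"], RULE_LISTING))
```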

Gb over Frame Relay Interface Problem

Gb Review: PVCs, NSEs and NSVCIs

NSE - Network Services Entity, used to represent peer entities in a
BSC and SGSN-MME. Used for both GbFR and GbIP.

[Figure: GbFR between the SGSN-MME (NSEI = 1) and the BSC (NSEI = 1),
carried on NSVCI 101, NSVCI 102 and NSVCI 103]

Example: The volume of data being exchanged between a BSC and an
SGSN-MME requires 3 E1/T1 trunks.

For each trunk, an E1/T1 fraction will be defined to bind together the
32/24 timeslots into one carrier to be used by Frame Relay. (A total of
3 E1/T1 fractions.)

For each trunk, a Frame Relay PVC will be defined. (A total of 3 FR PVCs.)

For each trunk, an NSVC will be defined. (A total of 3 NSVCs.)

These NSVCs will be associated with each other by their association
with a single NSE.

GbIP does not use E1/T1 trunks, FR PVCs or NSVCs for connectivity.
Connectivity between SGSN-MME and BSC is achieved through pairs of IP
endpoints associated with the peer NSEs.

Gb Review: BVCs
A BSSGP Virtual Connection (BVC) represents a logical connection between the SGSN-MME and the BTS/Cell.
BVCs for a cell are auto-configured on the BSC and SGSN-MME when GPRS is
activated on the cell.
BVCs are used in GbFR and GbIP.

Check Alarms
The alarms listed below are relevant for Gb Interface problems. Consult
the fault tracing directions in Alex for these alarms. The list contains
GbFR and GbIP Alarms. Unmarked alarms apply only to GbFR.

gbipNseAvailabilityDecreased
gbipNseUnavailable
pcmE1T1LossOfSignal
pcmE1T1LossOfFrame
frPvcDown
nsNsAliveFailed
(Both)
nsNsBlockRetriesExceeded
nsNSBlockWrongNsvci
nsNsResetRetriesExceeded
nsNsResetWrongNsei
nsNsResetWrongNsvci
nsNsStatusReceived
nsNsUnblockBlockReceived
nsNsUnblockDeadNsvc
nsNsUnblockRetriesExceeded
nsNsUnblockWrongNsvci
bssgpNsCongestion
bssgpBvcResetRetriesExceeded

gbipNseConfigProcedureFailed

(GbIP only) gbipNseNsStatusReceived

gbipNseSizeprocedureFailed
pcmE1T1AlarmIndicationSignal
pcmE1T1LossOfFrame
pcmE1T1LossOfSignal
pcmE1T1RemoteAlarmIndication
pcmE1T1RemoteLoopbackActivated

(Both)
(Both)

GbFR: Connectivity Check - NSVC Status


Use the command gsh list_nsvcs to display a list of all
NSVCs defined on the SGSN-MME.
Use the command gsh get_nsvc to display the provisioning
data and status of a specific NSVC. Example output is
shown below.
=== sysadm@eqm01s14p2 ANCB log/eqm01s14p2 # gsh list_nsvcs
300
301
303
=== sysadm@eqm01s14p2 ANCB log/eqm01s14p2 # gsh get_nsvc 300
NS-VCI                         : 300
NSEI                           : 2
Board                          : NE/Magazine 2/Slot 2
Equipment Identifier (M,S,P,F) : 2,2,1,1
E1/T1 trunk                    : 1
Fraction                       : 0
DLCI                           : 30
Blocking State                 : deblocked
Operational State              : alive

GbFR: NSVC Block/Unblock Procedure

The Block/Unblock procedure can be initiated from either side (BSC or SGSN-MME).
The Block procedure is a graceful shutdown of the NSVC. The side initiating the block
must continue to accept packets until the block-ack message is received from the peer.
If the block/unblock timer expires before the block-ack is received from the peer, the block
procedure is started again, with a provisioned limit on the number of retries.
Use the CLI commands gsh block_nsvc and gsh deblock_nsvc.

GbFR: NSVC Reset Procedure

The NSVC Reset procedure can be initiated from either side (BSC or SGSN-MME).
The NSVC Reset is used when setting up a new NSVC, or after an NSVC failure.
The NSVC Reset blocks the NSVC, then initiates the NSVC Test procedure.
At the end of the NSVC Reset procedure, the NSVC will be in blocked/alive state.
Use the CLI command gsh reset_nsvc.

GbFR and GbIP: NS Test Procedure

The NS Test procedure is used to verify end-to-end communication with a peer NSE. For GbFR, it is
used to test individual NSVCs. For GbIP, it is used to test paths between multiple IP endpoints.

The NS Test procedure can be initiated from either side (BSC or SGSN-MME).

For GbFR, the NS Test procedure is initiated after the NSVC Reset procedure. For GbIP, the NS
Test procedure is initiated after the auto-config procedures are complete. For both GbIP and GbFR, the
test procedure is repeated periodically to monitor communication between peer NSEs.

GbIP Connectivity Check


BSCs are not provisioned when using GbIP. To list the BSCs that are
present, use the list_nses -a CLI command. To get more information
about the BSC, use the list_ra CLI command to find the routing area.
NSVCs are not provisioned when using GbIP. To see if connectivity to
an NSE is up, use the get_nse <nse_number> CLI command. It will
display the local IP endpoints, the remote IP endpoints and the
connectivity status of each remote IP endpoint as shown below.
> gsh get_nse 30801
NSEI                 : 30801
Local IP-end-points  :
172.29.64.34:34916, SW=1, DW=0
172.29.64.34:2158, SW=0, DW=1
Remote IP-end-points :
172.29.76.108:45000, SW=42, DW=42, status=ok
172.29.76.107:45000, SW=42, DW=42, status=ok
172.29.76.106:45000, SW=42, DW=42, status=ok
172.29.76.105:45000, SW=42, DW=42, status=ok
172.29.76.104:45000, SW=42, DW=42, status=ok
172.29.76.103:45000, SW=42, DW=42, status=ok
172.29.76.102:45000, SW=42, DW=42, status=ok
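
With many remote IP endpoints per NSE, scanning the status column by eye is error-prone. The sketch below extracts the remote endpoints and their status from a saved get_nse printout. It relies only on the line layout shown in the example. Status values other than "ok" are not listed in this material, so the code simply treats anything else as not ok. The function names are hypothetical.

```python
import re

# Lines copied from the example get_nse printout.
SAMPLE = """\
Local IP-end-points :
172.29.64.34:34916, SW=1, DW=0
Remote IP-end-points :
172.29.76.108:45000, SW=42, DW=42, status=ok
172.29.76.107:45000, SW=42, DW=42, status=ok
172.29.76.106:45000, SW=42, DW=42, status=ok
"""

# Only remote endpoints carry a status field, so local ones never match.
ENDPOINT = re.compile(
    r"(\d+\.\d+\.\d+\.\d+):(\d+), SW=(\d+), DW=(\d+), status=(\w+)"
)


def remote_endpoints(printout):
    """Return {'ip:port': status} for every remote IP endpoint line."""
    return {f"{ip}:{port}": status
            for ip, port, _sw, _dw, status in ENDPOINT.findall(printout)}


def endpoints_not_ok(printout):
    """Return the remote endpoints whose status is anything other than ok."""
    return [ep for ep, status in remote_endpoints(printout).items()
            if status != "ok"]


if __name__ == "__main__":
    print(len(remote_endpoints(SAMPLE)), "remote endpoints,",
          len(endpoints_not_ok(SAMPLE)), "not ok")
```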

Check Cell: BVC Status


BVC status commands are identical for GbFR and GbIP
Use one of the commands below to list the status of
cells/BVCs associated with a BSC:
gsh list_bvcs -bsc <bsc_name>
gsh list_bvcs -nse <nsei>

Sample output is shown below:


=== sysadm@eqm01s14p2 ANCB log/eqm01s14p2 # gsh list_bvcs -bsc BSC7
PTP BVC [NSEI-BVCI]  Cell             Operational State  Blocking State  BSC Name
2-1019               440-20-601-1-1   available          deblocked       BSC7
2-1021               311-03-620-1-2   available          blocked         BSC7
2-1022               311-03-620-1-1   available          blocked         BSC7
2-1024               311-03-601-1-23  available          blocked         BSC7
2-1025               311-03-601-1-24  available          deblocked       BSC7

BVC Actions: Reset Procedure

A BVC reset can be initiated by the SGSN-MME or the BSC.
The resetting side must continue to accept PDUs until a reset-ACK is received.
After a successful reset, the SGSN-MME will assume that all BVCs are unblocked. The
BSC will initiate blocking for any BVCs that should be blocked. (BVC blocking can only
be initiated by the BSC.)
Use the reset_bvc CLI command.

Check interface, SCTP


The SCTP associations can be checked with the command:
/tmp/DPE_SC/LoadUnits/ttx/int/bin/sctp_status
SCTP Status
--------------------------------------------------------------------------------------
sn     eqp  aid epid AssocState  RemotePort RemoteIP     PathStatus    SRTT
--------------------------------------------------------------------------------------
S1-MME 1.10 1   2    ESTABLISHED 36422      10.75.16.42  ACTIVE|ACTIVE 14|14
S1-MME 1.10 2   2    ESTABLISHED 36422      10.75.16.72  ACTIVE|ACTIVE 14|14
S1-MME 1.10 5   2    ESTABLISHED 45346      10.90.10.43  ACTIVE|ACTIVE 18|10
S1-MME 1.11 15  2    ESTABLISHED 58661      10.90.10.45  ACTIVE|ACTIVE 10|10
S6a    1.11 23  3    ESTABLISHED 3868       10.42.82.241 ACTIVE|ACTIVE 97|70
--------------------------------------------------------------------------------------
SCTP EQ Distribution
------------------------------------------------------------------------
sn     no epid LocalPort IPaddress                   1.10 1.11 Total
------------------------------------------------------------------------
S1-MME 1  2    36412     10.64.193.75|10.64.193.76   3    1    4
S6a    2  3    3868      10.64.193.139|10.64.193.140      1    1
------------------------------------------------------------------------

Check interface, enodeb


Show all eNodeBs
gsh show_mme_enodeb
ps Class  Identifiers | name      ens          date                eqp  aid
--------------------------------------------------------------------------------------
A  enodeb -eni 107001 | kienb7001 connected    2010-09-01,11:09:18 1.10 7
A  enodeb -eni 107005 | kienb7005 connected    2010-09-01,08:50:52 1.11 11
A  enodeb -eni 107008 | kienb7008 disconnected 2010-08-31,11:47:20 0.0  NULL

date: When the eNodeB was connected/disconnected
eqp: SCTP board
aid: SCTP Association Id
Use eqp+aid to show detailed SCTP association info,
e.g. eNodeB IP address:
gsh show_sctp_assstat -eqp 1.11 -aid 2
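
The step from the eNodeB list to the per-association command can be sketched as follows. The code assumes the show_mme_enodeb printout has one eNodeB per line with the five columns shown (name, ens, date, eqp, aid). It only builds the argument list for gsh show_sctp_assstat and does not run anything. The function names are hypothetical.

```python
# Rows copied from the example show_mme_enodeb printout, one eNodeB per line.
SAMPLE = """\
A enodeb -eni 107001 | kienb7001 connected 2010-09-01,11:09:18 1.10 7
A enodeb -eni 107005 | kienb7005 connected 2010-09-01,08:50:52 1.11 11
A enodeb -eni 107008 | kienb7008 disconnected 2010-08-31,11:47:20 0.0 NULL
"""


def parse_enodebs(listing):
    """Parse eNodeB rows into dictionaries; lines in another layout are skipped."""
    rows = []
    for line in listing.splitlines():
        left, sep, right = line.partition("|")
        ids, cols = left.split(), right.split()
        if not sep or "-eni" not in ids or len(cols) != 5:
            continue
        name, state, date, eqp, aid = cols
        rows.append({"eni": ids[ids.index("-eni") + 1], "name": name,
                     "state": state, "date": date, "eqp": eqp, "aid": aid})
    return rows


def assstat_args(row):
    """Build the show_sctp_assstat arguments for a connected eNodeB, else None."""
    if row["state"] != "connected":
        return None  # no SCTP association exists for a disconnected eNodeB
    return ["gsh", "show_sctp_assstat", "-eqp", row["eqp"], "-aid", row["aid"]]


if __name__ == "__main__":
    for row in parse_enodebs(SAMPLE):
        args = assstat_args(row)
        print(row["name"], "->", " ".join(args) if args else "disconnected")
```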

Check interface, enodeb


/tmp/DPE_SC/LoadUnits/ttx/int/bin/sctp_status -f S1-MME
SCTP Status
------------------------------------------------------------------------------------------------------------------
sn     eqp  aid epid eni    name      AssocState  RemotePort RemoteIP    PathStatus    SRTT
------------------------------------------------------------------------------------------------------------------
S1-MME 1.10 1   2    200528 lienb0528 ESTABLISHED 36422      10.75.16.42 ACTIVE|ACTIVE 14|14
S1-MME 1.10 2   2    200543 lienb0543 ESTABLISHED 36422      10.75.16.72 ACTIVE|ACTIVE 14|14
S1-MME 1.10 5   2    2enb_youlabdallas06_2 ESTABLISHED 45346 10.90.10.43 ACTIVE|ACTIVE 18|10
S1-MME 1.11 15  2    4enb_youlabdallas06_4 ESTABLISHED 58661 10.90.10.45 ACTIVE|ACTIVE 10|10
------------------------------------------------------------------------------------------------------------------
SCTP EQ Distribution
------------------------------------------------------------------------
sn     no epid LocalPort IPaddress                   1.10 1.11 Total
------------------------------------------------------------------------
S1-MME 1  2    36412     10.64.193.75|10.64.193.76   3    1    4
S6a    2  3    3868      10.64.193.139|10.64.193.140 -    1    1
------------------------------------------------------------------------
Aug-27 10:26 2010

The eni and name columns are extracted from the gsh show_mme_enodeb command.

Check interface, tracking areas

Show all TAs in MME
gsh show_mme_ta
ps Class Identifiers                   |
-------------------------------------------
A  ta    -mcc 240 -mnc 099 -tac 100
The MME is informed of the Tracking Areas connected to each eNodeB in the S1-Setup message.

Show all TAs in one eNodeB
gsh show_enodeb_supported_ta -eni 1
ps Class        Identifiers                         |
-------------------------------------------------------------
A  supported_ta -eni 1 -mcc 240 -mnc 099 -tac 100

Show all eNodeBs that handle a TA
gsh show_ta_supporting_enodeb -mcc 240 -mnc 099 -tac 100
ps Class             Identifiers                         |
--------------------------------------------------------------------
A  supporting_enodeb -mcc 240 -mnc 099 -tac 100 -eni 1
A  supporting_enodeb -mcc 240 -mnc 099 -tac 100 -eni 2
A  supporting_enodeb -mcc 240 -mnc 099 -tac 100 -eni 3

Several eNodeBs can support the same Tracking Area.
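
The many-to-one relation between eNodeBs and a Tracking Area can be made explicit by grouping the supporting_enodeb rows. The sketch below assumes one row per line, in the layout of the example, and keeps MCC, MNC and TAC as strings so that leading zeros (MNC 099) are preserved. The function name is hypothetical.

```python
from collections import defaultdict

# Rows copied from the example show_ta_supporting_enodeb printout.
SAMPLE = """\
A supporting_enodeb -mcc 240 -mnc 099 -tac 100 -eni 1
A supporting_enodeb -mcc 240 -mnc 099 -tac 100 -eni 2
A supporting_enodeb -mcc 240 -mnc 099 -tac 100 -eni 3
"""


def enodebs_per_ta(listing):
    """Group eNodeB ids by Tracking Area, keyed by (mcc, mnc, tac)."""
    result = defaultdict(list)
    for line in listing.splitlines():
        tokens = line.split()
        if "supporting_enodeb" not in tokens:
            continue
        # After "A supporting_enodeb" the tokens alternate: option, value.
        opts = dict(zip(tokens[2::2], tokens[3::2]))
        result[(opts["-mcc"], opts["-mnc"], opts["-tac"])].append(opts["-eni"])
    return dict(result)


if __name__ == "__main__":
    for ta, enis in enodebs_per_ta(SAMPLE).items():
        print(ta, "->", enis)
```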

Connectivity Check
It is possible to use the ping and traceroute commands,
on MkV and MkVI, to check IP connectivity towards
external nodes such as the S-GW.
Ping and traceroute are run on the active NCB.

IP Connectivity Check towards S-GW


MME uses the service S11-GTP-C when communicating
with S-GWs.
Show S11-GTP-C to IP address mapping.
=== sysadm@eqm01s13p2 ANCB ~ # gsh list_ip_service_distribution -sn S11-GTP-C
ps Class                   Identifiers                                       |
-----------------------------------------------------------------------------------------
A  ip_service_distribution -sn S11-GTP-C -nw S11 -eqp 1.12 -ip 10.152.254.2
A  ip_service_distribution -sn S11-GTP-C -nw S11 -eqp 1.13 -ip 10.152.254.2
A  ip_service_distribution -sn S11-GTP-C -nw S11 -eqp 1.14 -ip 10.152.254.2
A  ip_service_distribution -sn S11-GTP-C -nw S11 -eqp 1.15 -ip 10.152.254.2
A  ip_service_distribution -sn S11-GTP-C -nw S11 -eqp 1.16 -ip 10.152.254.2

10.152.254.2 must be used as the source IP address when
pinging the S-GW. This IP address must be configured
on the NCB and associated with the S11 network.

Create IP Address on NCB


The IP address 10.152.254.2 will be assigned to the service
Filterlog that is hosted on the NCB.
This is a trick to get the necessary IP address created on the
NCB.
=== sysadm@eqm01s13p2 ANCB ~ # gsh create_ip_service -sn Filterlog -nw S11
=== sysadm@eqm01s13p2 ANCB ~ # gsh create_ip_service_address -sn Filterlog -ip 10.152.254.2
=== sysadm@eqm01s13p2 ANCB ~ # gsh check_config
=== sysadm@eqm01s13p2 ANCB ~ # gsh activate_config_pending
=== sysadm@eqm01s13p2 ANCB ~ # ifconfig -a
*
fe0:1     Link encap:Point-to-Point Protocol
          inet addr:10.7.1.52 P-t-P:10.7.1.52 Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MTU:1500 Metric:1
fe0:2     Link encap:Point-to-Point Protocol
          inet addr:10.152.254.2 P-t-P:10.152.254.2 Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MTU:1500 Metric:1

Important note
Reconfiguring IP services on the SGSN can break an IP interface.
Several IP services use the same IP address on the active NCB:
A ip_service_distribution -sn CDR-FTP   -nw Gn -eqp 0.0 -ip 10.10.11.193
A ip_service_distribution -sn DNS       -nw Gn -eqp 0.0 -ip 10.10.11.193
A ip_service_distribution -sn Filterlog -nw Gn -eqp 0.0 -ip 10.10.11.193

The IP address 10.10.11.193 is bound to the interface fe0:1 on the active
NCB. The interface fe0:1 breaks when, for example, the Filterlog
configuration is deleted. Once it is broken, no more outbound traffic is
possible.

Find IP Address for S-GW


The S-GW IP address may be found in the DNS cache on
the active NCB.
=== sysadm@eqm01s13p2 ANCB ~ # /tmp/DPE_SC/Tools/rndc -c /tmp/DPE_SC/ApplicationData/dnsApp/rndc.conf dumpdb
=== sysadm@eqm01s13p2 ANCB ~ # grep sgw /tmp/DPE_ROOT/SiteSpecificData/ApplicationSpecific/dnsApp/named_dump.db
sgw.eth1.gw1.gbg.net.epc.mnc099.mcc240.3gppnetwork.org. 595 A 10.152.32.17
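
If the dump file has been copied off the node, the same lookup as the grep can be done in Python. This is a sketch that assumes the record layout seen in the example line (name, TTL, record type, address, separated by whitespace); other record layouts in the dump are simply skipped. The function name is hypothetical.

```python
# Record copied from the example grep result.
SAMPLE = """\
sgw.eth1.gw1.gbg.net.epc.mnc099.mcc240.3gppnetwork.org. 595 A 10.152.32.17
"""


def a_records(dump_text, needle):
    """Return (name, ttl, address) for A records whose name contains needle."""
    found = []
    for line in dump_text.splitlines():
        fields = line.split()
        # Expect at least: name, TTL, "A", address; the type sits before the address.
        if len(fields) >= 4 and fields[-2] == "A" and needle in fields[0]:
            if fields[1].isdigit():
                found.append((fields[0], int(fields[1]), fields[-1]))
    return found


if __name__ == "__main__":
    for name, ttl, address in a_records(SAMPLE, "sgw"):
        print(address, name, f"(TTL {ttl})")
```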

Ping and Traceroute towards S-GW


=== sysadm@eqm01s13p2 ANCB ~ # ping -I 10.152.254.2 10.152.32.17
PING 10.152.32.17 (10.152.32.17) from 10.152.254.2 : 56(84) bytes of data.
64 bytes from 10.152.32.17: icmp_seq=1 ttl=62 time=0.155 ms
64 bytes from 10.152.32.17: icmp_seq=2 ttl=62 time=0.139 ms
64 bytes from 10.152.32.17: icmp_seq=3 ttl=62 time=0.136 ms
64 bytes from 10.152.32.17: icmp_seq=4 ttl=62 time=0.144 ms
64 bytes from 10.152.32.17: icmp_seq=5 ttl=62 time=0.144 ms
64 bytes from 10.152.32.17: icmp_seq=6 ttl=62 time=0.135 ms
64 bytes from 10.152.32.17: icmp_seq=7 ttl=62 time=0.139 ms
64 bytes from 10.152.32.17: icmp_seq=8 ttl=62 time=0.142 ms

Note the difference in how the source IP address is specified:
-I option for ping
-s option for traceroute

The -I flag for traceroute means that ICMP is used for the probes
instead of UDP.

=== sysadm@eqm01s13p2 ANCB ~ # traceroute -I -s 10.152.254.2 10.152.32.17
traceroute to 10.152.32.17 (10.152.32.17), 30 hops max, 38 byte packets
1 * * *
2 10.152.16.10 (10.152.16.10) 0.419 ms 0.298 ms 0.281 ms
3 10.152.32.17 (10.152.32.17) 0.136 ms 0.116 ms 0.117 ms
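
Because -I means different things for the two tools, it is easy to mix the options up. The sketch below builds both command lines from the same source and target addresses, using only the options shown in the examples above. The function names are hypothetical, and the code only builds the argument lists without running them.

```python
def ping_argv(source_ip, target_ip):
    """ping takes the source address with -I."""
    return ["ping", "-I", source_ip, target_ip]


def traceroute_argv(source_ip, target_ip, icmp=True):
    """traceroute takes the source address with -s; -I selects ICMP probes."""
    argv = ["traceroute"]
    if icmp:
        argv.append("-I")
    return argv + ["-s", source_ip, target_ip]


if __name__ == "__main__":
    source, sgw = "10.152.254.2", "10.152.32.17"  # addresses from the example
    print(" ".join(ping_argv(source, sgw)))
    print(" ".join(traceroute_argv(source, sgw)))
```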

SGSN-MME
Troubleshooting
Subscriber Tracing

Objectives

Upon the completion of this chapter, the student will be able to:
Trace Subscribers using commands and log files
Understand and use Integrated Traffic Capture (ITC) on
supported interfaces
Understand the concept of capturing traffic from each
interface
Describe the capture process, storage, filters, limitations and
improvements
Initiate the ITC and read the files

gsh list_subscribers
The gsh list_subscribers CLI command lists all, or a subset of all,
subscribers that are currently registered in the SGSN-MME. The output
is sent to a built-in log.
The log is stored in /tmp/OMS_LOGS/list_subscribers_result.
The file name is list_subscribers_result.*

The gsh list_subscribers CLI command is capable of using filters,
such as the IMSI, MSISDN, and IMEI. It is also capable of sorting the
results by the IMSI, MSISDN number, or IMEI.
The command can take a long time to run if there are a lot of
subscribers. It can be stopped using the -abort option.
Also, the function may be aborted by the SGSN-MME due to system
overload. This will be indicated with a message in the result log file.
Syntax is as follows:
gsh list_subscribers [[-imsi ImsiPfx | -msisdn MsisdnPfx | -imei ImeiPfx] [-sort SortBy]] | [-abort]
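
The bracket notation in the syntax line says that at most one filter may be given, that -sort is optional, and that -abort stands alone. A sketch of a helper that enforces those rules when building the command is shown below. The function name is hypothetical, and the valid values for SortBy are not listed here, so the value is passed through unchanged.

```python
def list_subscribers_argv(imsi=None, msisdn=None, imei=None, sort=None, abort=False):
    """Build the argument list for gsh list_subscribers according to the syntax line."""
    filters = [("-imsi", imsi), ("-msisdn", msisdn), ("-imei", imei)]
    chosen = [(flag, value) for flag, value in filters if value is not None]
    if len(chosen) > 1:
        raise ValueError("use at most one of imsi, msisdn, imei")
    if abort:
        if chosen or sort is not None:
            raise ValueError("-abort cannot be combined with other options")
        return ["gsh", "list_subscribers", "-abort"]
    argv = ["gsh", "list_subscribers"]
    for flag, value in chosen:
        argv += [flag, str(value)]
    if sort is not None:
        argv += ["-sort", str(sort)]  # SortBy value is passed through as given
    return argv


if __name__ == "__main__":
    print(" ".join(list_subscribers_argv(imsi="440")))
    print(" ".join(list_subscribers_argv(abort=True)))
```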

gsh list_subscribers output


------------------------------------------------------------
RESULT OF LIST SUBSCRIBERS
------------------------------------------------------------
Time: 2009-02-19 11:46:24
Input: list_subscribers
User: sysadm
------------------------------------------------------------
SUBSCRIBER DETAILS
IMSI             MSISDN       IMEI
------------------------------------------------------------
311030675001156  12146751156  unknown
311030675001152  12146751152  unknown
311030675001180  12146751180  unknown
311030675001153  12146751153  unknown
311030675001186  12146751186  unknown
311030675001171  12146751171  unknown
311030675001117  12146751117  unknown
440200675001215  12146751215  unknown
311030675001100  12146751100  unknown
311030675001158  12146751158  unknown
311030675001107  12146751107  unknown
311030675001104  12146751104  unknown
311030675001102  12146751102  unknown
------------------------------------------------------------
SUBSCRIBER STATISTICS
Total number of printed subscribers    : 13
Total number of registered subscribers : 13
------------------------------------------------------------
Time: 2009-02-19 11:46:25
END

gsh list_subscribers with filter (imsi, msisdn or imei)

=== sysadm@eqm01s14p2 ANCB list_subscribers_result/ready # gsh list_subscribers -imsi 440
LIST_SUBSCRIBERS EXECUTION STARTED
THE RESULT WILL BE WRITTEN TO FILE
=== sysadm@eqm01s14p2 ANCB list_subscribers_result/ready # cat list_subscribers_result.7
------------------------------------------------------------
RESULT OF LIST SUBSCRIBERS
------------------------------------------------------------
Time: 2009-02-19 12:33:20
Input: list_subscribers -imsi 440
User: sysadm
------------------------------------------------------------
SUBSCRIBER DETAILS
IMSI              MSISDN        IMEI
------------------------------------------------------------
440200675001215   12146751215   unknown
------------------------------------------------------------
SUBSCRIBER STATISTICS
Total number of printed subscribers    : 1
Total number of registered subscribers : 13
------------------------------------------------------------
Time: 2009-02-19 12:33:20
END

gsh get_subscriber
Get information about a specific subscriber.
Specify one of the following identities for the subscriber:
imsi
msisdn
imei
ptmsi
tlli

Use the -dl option (1 or 2) to get additional information on a subscriber.
Example: gsh get_subscriber -msisdn 12146751116 -dl

Example Output gsh get_subscriber


=== sysadm@eqm01s14p2 ANCB ~ # gsh get_subscriber -imsi 440200675001215
Subscriber Data
----------------------------------------------------------------------
IMSI                       : 440200675001215
Mobile Subscriber ISDN No. : 12146751215
IMEI                       : Information not available
Roaming Status             : Home
HLR Address                : 12146264444
Home PLMN APN Operator Id  : mnc020.mcc440.gprs
Subscribed Teleservices    : No SMS
Network Access Mode        : Packet/Circuit Switched
Radio Access Technology    : UMTS
Mobility Management State  : PMM-DETACHED
Paging Proceed Flag        :
Routing Area [RAI]         :
P-TMSI                     : 3765012293 (#E0698745)
MSC/VLR Address            : Not Gs connected
Location Confirmed in HLR  : true
Data Confirmed by HLR      : true

Example Output get_subscriber (-dl option) (1/4)

=== sysadm@eqm01s14p2 ANCB ~ # gsh get_subscriber -imsi 440200675001215 -a
Subscriber Data
----------------------------------------------------------------------
IMSI                             : 440200675001205
Mobile Subscriber ISDN No.       : 12146751205
IMEI                             : Information not available
Roaming Status                   : Home
HLR Address                      : 12146264444
Home PLMN APN Operator Id        : mnc020.mcc440.gprs
Subscribed Teleservices          : No SMS
Network Access Mode              : Packet/Circuit Switched
Radio Access Technology          : UMTS
Mobility Management State        : PMM-IDLE
Paging Proceed Flag              : Set
Routing Area [RAI]               : 440-20-30-30
P-TMSI                           : 3790942036 (#E1F52F54)
MSC/VLR Address                  : Not Gs connected
Location Confirmed in HLR        : true
Data Confirmed by HLR            : true
Charging Characteristics         : #0000
Charging Characteristics Profile : 0

Example Output get_subscriber (-dl option) (2/4)

Subscribed PDP
----------------------------------------------------------------------
Id                                      : 1
Type                                    : IPv4
Address                                 : Dynamic
Quality of service                      :
allocation/retention priority           : level1
delay class                             : class1
reliability class                       : Unack: GTP,LLC. Ack: RLC. Protected data
peak throughput (octet/s)               : up to 8000
precedence class                        : high priority
mean throughput (octet/h)               : best effort
traffic class                           : interactive
delivery order                          : no
delivery of erroneous SDU               : no
maximum SDU size (octets)               : 1500
maximum bit rate for uplink (kbps)      : 64
maximum bit rate for downlink (kbps)    : 64
residual BER                            : 1E-5
SDU error ratio                         : 1E-4
transfer delay (ms)                     : 0
traffic handling priority               : level1
guaranteed bit rate for uplink (kbps)   : 0
guaranteed bit rate for downlink (kbps) : 0
VPLMN allowed                           : false
APN                                     : *
PDP Charging Characteristics            :
PDP Charging Characteristics Profile    :

Example Output get_subscriber (-dl option) (3/4)

Active PDP
----------------------------------------------------------------------
Id                                      : 1
NSAPI                                   : 5
Type requested                          : IPv4
Address requested                       : Dynamic
APN requested                           : ipmm2
Addressing nature                       : Dynamic
Address in use                          : 192.168.253.140
APN in use                              : ipmm2.mnc020.mcc440.gprs
GGSN in use                             : 10.0.46.2
Quality of service requested            :
allocation/retention priority           :
delay class                             : 0
reliability class                       : Unack: GTP,LLC. Ack: RLC. Protected Data
peak throughput (octet/s)               : 0
precedence class                        : 0
mean throughput (octet/h)               : best effort
traffic class                           : 0
delivery order                          : 0
delivery of erroneous SDU               : 0
maximum SDU size (octets)               : 0
maximum bit rate for uplink (kbps)      : 0
maximum bit rate for downlink (kbps)    : 0
residual BER                            : 0
SDU error ratio                         : 0
transfer delay (ms)                     : 0
traffic handling priority               : 0
guaranteed bit rate for uplink (kbps)   : 0
guaranteed bit rate for downlink (kbps) : 0

Example Output get_subscriber (-dl option) (4/4)

Quality of service negotiated           :
allocation/retention priority           : level1
delay class                             : class1
reliability class                       : Unack: GTP,LLC. Ack: RLC. Protected data
peak throughput (octet/s)               : up to 8000
precedence class                        : high priority
mean throughput (octet/h)               : best effort
traffic class                           : interactive
delivery order                          : no
delivery of erroneous SDU               : no
maximum SDU size (octets)               : 1500
maximum bit rate for uplink (kbps)      : 64
maximum bit rate for downlink (kbps)    : 64
residual BER                            : 1E-5
SDU error ratio                         : 1E-4
transfer delay (ms)                     : 1000
traffic handling priority               : level1
guaranteed bit rate for uplink (kbps)   : 32
guaranteed bit rate for downlink (kbps) : 64

eci tool: Connection Information


The eci tool in the SGSN-MME toolbox provides connection
information. The following information is available from
the eci tool:
stats    Displays connection statistics for SGSN-MME or GGSN
dist     Displays distribution of connections over APs and DPs
list     Prints a list of subscribers in SGSN-MME. ** See note below
details  Prints connection details for a given subscriber

This command is issued at the UNIX prompt (it is not part of
the gsh shell). The format of the command is as follows:
eci stats
* WARNING: This tool may cause heavy CPU load. Do not run it during high traffic
or during start or restart of the node. Use it only for troubleshooting purposes.
** Note: Instead of using eci list, use the CLI command gsh list_subscribers. The CLI command
protects against heavy system load from the tool and sends its output to a log file.

eci stats for GSM & WCDMA


This is eci version 1.1.0 operating on an SGSN-MME '10B-00-00' (WG).
SGSN-MME-G connection statistics:
-  828206 SGSN-MME-G connections active   ( 45.85 %).
-  549393 SGSN-MME-G connections attached ( 30.41 %).
-  415349 SGSN-MME-G connections idle     ( 22.99 %).
-   13512 SGSN-MME-G connections unstable (  0.97 %)*.
---------------------------------------------
  1806460 SGSN-MME-G connections in total.

SGSN-MME-W connection statistics:
-  355167 SGSN-MME-W connections active   ( 45.62 %).
-  243193 SGSN-MME-W connections attached ( 31.23 %).
-  173553 SGSN-MME-W connections idle     ( 22.29 %).
-    6681 SGSN-MME-W connections unstable (  1.10 %)*.
---------------------------------------------
   778594 SGSN-MME-W connections in total.

eci stats for LTE


This is eci version 1.1.0 operating on an SGSN-MME '10B-01-00' (L).
MME connection statistics:
-  593379 MME connections active (registered)   ( 94.47 %).
-   34380 MME connections idle (deregistered)   (  5.47 %).
-     337 MME connections unstable              (  0.06 %)*.
---------------------------------------------
   628096 MME connections in total.

**NOTE:
State active   = EMM-REGISTERED, both ECM-IDLE and ECM-CONNECTED!
State idle     = EMM-DEREGISTERED (do not confuse with ECM-IDLE!)
State unstable means that signaling is ongoing for the UE
State attached is not used for LTE, only for GSM and WCDMA

eci dist for GSM & WCDMA


This is eci version 1.1.0 operating on an SGSN-MME '10B-00-00' (WG).
Distribution of SGSN-MME-G connections over GPBs:
GPB       active  attached  idle   unstable  total           replica
--------------------------------------------------------------------------------
1.10.2.1  42016   27571     20598  698       90883 ( 5.0 %)  64961 ( 4.7 %)
1.12.2.1  42115   28075     20101  658       90949 ( 5.0 %)  67434 ( 4.9 %)
1.13.2.1  41925   27829     20259  722       90735 ( 5.0 %)  67950 ( 4.9 %)
1.14.2.1  42121   27897     20062  733       90813 ( 5.0 %)  67472 ( 4.9 %)
1.15.2.1  42117   27766     20330  703       90916 ( 5.0 %)  67967 ( 4.9 %)
1.16.2.1  42032   27709     20161  668       90570 ( 5.0 %)  68212 ( 4.9 %)
*
Distribution of SGSN-MME-W connections over GPBs:
GPB       active  attached  idle   unstable  total           replica
--------------------------------------------------------------------------------
1.10.2.1  18131   12186     8416   314       39047 ( 5.0 %)  28353 ( 4.7 %)
1.12.2.1  17977   12240     8338   354       38909 ( 5.0 %)  29565 ( 4.9 %)
1.13.2.1  18037   12362     8517   344       39260 ( 5.0 %)  29413 ( 4.9 %)
1.14.2.1  18085   12306     8399   352       39142 ( 5.0 %)  29900 ( 4.9 %)
1.15.2.1  17926   12150     8581   333       38990 ( 5.0 %)  29486 ( 4.9 %)
1.16.2.1  18085   12551     8410   350       39396 ( 5.1 %)  29737 ( 4.9 %)
*

Check that the attached and active connections are evenly distributed over the APs.
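
The evenness check can be done by eye, but it is also easy to script once the per-board counts have been copied out of the printout. The following Python sketch is an illustration only. The function name uneven_boards and the tolerance values are assumptions, not SGSN-MME limits. The figures are the "total" and "replica" columns of the six SGSN-MME-G boards listed in the example above.

```python
from statistics import median


def uneven_boards(counts, tolerance=0.10):
    """Return boards whose count differs from the median by more than tolerance."""
    typical = median(counts.values())
    return [board for board, count in counts.items()
            if abs(count - typical) > tolerance * typical]


# Columns copied from the SGSN-MME-G part of the eci dist example.
TOTAL = {
    "1.10.2.1": 90883, "1.12.2.1": 90949, "1.13.2.1": 90735,
    "1.14.2.1": 90813, "1.15.2.1": 90916, "1.16.2.1": 90570,
}
REPLICA = {
    "1.10.2.1": 64961, "1.12.2.1": 67434, "1.13.2.1": 67950,
    "1.14.2.1": 67472, "1.15.2.1": 67967, "1.16.2.1": 68212,
}

print(uneven_boards(TOTAL))                    # boards outside +/-10 % of the median
print(uneven_boards(REPLICA, tolerance=0.03))  # stricter, hypothetical 3 % limit
```

The median is used instead of the mean so that a single board with very few connections does not shift the reference value.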

eci dist for LTE


This is eci version 1.1.0 operating on an SGSN-MME '10B-01-00' (L).
Distribution of MME connections over GPBs:
GPB       active        idle            unstable  total           replica
          (registered)  (deregistered)
----------------------------------------------------------------------------------
1.12.2.1  54655         3247            34        57936 ( 9.2 %)  55710 ( 9.4 %)
1.13.2.1  53110         3621            44        56775 ( 9.0 %)  57980 ( 9.7 %)
1.14.2.1  53357         3120            34        56511 ( 9.0 %)  56604 ( 9.5 %)
1.15.2.1  53811         3128            44        56983 ( 9.1 %)  57201 ( 9.6 %)
1.16.2.1  53112         371             33        53516 ( 8.5 %)  25351 ( 4.3 %)
1.19.2.1  0             0               0         0     ( 0.0 %)  0     ( 0.0 %)
1.20.2.1  0             0               0         0     ( 0.0 %)  0     ( 0.0 %)
1.6.2.1   53503         3012            30        56545 ( 9.0 %)  57883 ( 9.7 %)
1.7.2.1   54934         3165            36        58135 ( 9.3 %)  56809 ( 9.6 %)
2.13.2.1  53925         3940            38        57903 ( 9.2 %)  56486 ( 9.5 %)
2.3.2.1   53660         4269            35        57964 ( 9.2 %)  56790 ( 9.5 %)
2.4.2.1   54857         3057            37        57951 ( 9.2 %)  56816 ( 9.6 %)
2.5.2.1   54418         3755            50        58223 ( 9.3 %)  57168 ( 9.6 %)
----------------------------------------------------------------------------------
Sum       593342        34685           415       628442          594798

Subscriber Event Recording Overview


The Subscriber Event Recording feature provides the capability to
record mobility management and session management events for a
specific subscriber on the SGSN-MME.
The following types of events can be included in the trace:

Attach events
Detach events
Cell updates
Routing Area Updates
PDP Context Activation/Deactivation/Update
Service Request
SW Errors

The events recorded are sent to the built-in log file named
er_data_log. The log files are located in the following directory:
/tmp/OMS_LOGS/er_data_log

Event Recording Session Commands


An event recording is defined for a single user. The user can be specified
by IMSI or by MSISDN.
Multiple event recordings can be active at the same time, so the SGSN-MME
can be tracing multiple users simultaneously.
An event recording session is started, modified, displayed, and deleted by
using the following CLI commands:

gsh create_event_rec_session to start an event recording for a subscriber
gsh set_event_rec_session to modify an event recording
gsh get_event_rec_session to display the attributes of an event recording
gsh list_event_rec_sessions to display all subscribers with active recording
gsh delete_event_rec_session to delete a recording

Either the IMSI or an MSISDN of the subscriber can be used in the create
command to start the event recording session.
An example of creating, viewing and modifying an event recording session
is shown in the following slides.

Example Commands
# gsh create_event_rec_session -imsi 440200675001206 exampleSession -att -det -cu -rau -pdpcu -sr -swe
# gsh list_event_rec_sessions
IMSI,440200675001206
# gsh get_event_rec_session -imsi 440200675001206
Subscriber Identity : IMSI,440200675001206
GPRS Attach         : true
GPRS Detach         : true
Cell Update         : true
Routing Area Update : true
PDP Context Update  : true
Service Request     : true
SW Error            : true
Session Identifier  : exampleSession
# gsh set_event_rec_session -imsi 440200675001206 -cu false
# gsh get_event_rec_session -imsi 440200675001206
Subscriber Identity : IMSI,440200675001206
GPRS Attach         : true
GPRS Detach         : true
Cell Update         : false
Routing Area Update : true
PDP Context Update  : true
Service Request     : true
SW Error            : true
Session Identifier  : exampleSession
# gsh delete_event_rec_session -imsi 440200675001206

Event Recording: Example Log File (1/3)


2009-07-04 15:51:39 exampleSession Event name: ra_update_completed ; Event details: Routing Area
Update type periodic, intra_rau ; Cause value: - ; IMSI: 440200675001206 ; MSISDN: 12146751206 ;
NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA
2009-07-04 16:25:39 exampleSession Event name: ms_not_reachable ; Event details: - ; Cause value: - ;
IMSI: 440200675001206 ; MSISDN: 12146751206 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access
Type: WCDMA
2009-07-04 16:32:46 exampleSession Event name: deactivate_pdp_failed ; Event details: Deactivation
trigger ggsn ; Cause value: - ; IMSI: 440200675001206 ; MSISDN: 12146751206 ; NSAPI: 5 ;RAI: 440-20-30-30 ;
CGI: - ; Radio Access Type: WCDMA.
2009-07-04 17:25:39 exampleSession Event name: detach ; Event details: Detach type -, implicit ; Cause
value: - ; IMSI: 440200675001206 ; MSISDN: 12146751206 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio
Access Type: WCDMA
2009-07-05 07:22:08 exampleSession Event name: attach_completed ; Event details: Attach type
gprs_attach ; Cause value: - ; IMSI: 440200675001206 ; MSISDN: 12146751206 ; NSAPI: - ; RAI: 440-20-30-30 ;
CGI: - ; Radio Access Type: WCDMA
2009-07-05 07:52:10 exampleSession Event name: ra_update_completed ; Event details: Routing Area
Update type periodic, intra_rau ; Cause value: - ; IMSI: 440200675001206 ; MSISDN: 12146751206 ;
NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA
2009-07-05 08:22:11 exampleSession Event name: ra_update_completed ; Event details: Routing Area
Update type periodic, intra_rau ; Cause value: - ; IMSI: 440200675001206 ; MSISDN: 12146751206 ;
NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA

Event Recording: Example Log File (2/3)


2009-07-05 08:28:07 exampleSession Event name: service_request ; Event details: Service type signalling ; Cause
value: - ; IMSI: 440200675001206 ; MSISDN: 12146751206 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access
Type: WCDMA
2009-07-05 08:28:08 exampleSession Event name: activate_pdp ; Event details: - ; Cause value: - ; IMSI:
440200675001206 ; MSISDN: 12146751206 ; NSAPI: 5 ;RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA.
2009-07-05 08:32:20 exampleSession Event name: service_request ; Event details: Service type data ; Cause
value: - ; IMSI: 440200675001206 ; MSISDN: 12146751206 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access
Type: WCDMA

Event Recording: Example Log File (3/3)


2008-04-30 16:26:56 test Event name: attach_completed ; Event details: Attach type gprs_attach ; Cause value: - ;
IMSI: 311030675001131 ; MSISDN: 12146751131 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA
2008-04-30 16:27:57 test Event name: activate_pdp_failed ; Event details: - ; Cause value: #38 (network_failure) ;
IMSI: 311030675001131 ; MSISDN: 12146751131 ; NSAPI: 5 ;RAI: 440-20-30-30 ; CGI: - ; Radio Access Type:
WCDMA.
2008-04-30 16:57:58 test Event name: ra_update_completed ; Event details: Routing Area Update type periodic,
intra_rau ; Cause value: - ; IMSI: 311030675001131 ; MSISDN: 12146751131 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ;
Radio Access Type: WCDMA
2008-04-30 17:27:59 test Event name: ra_update_completed ; Event details: Routing Area Update type periodic,
intra_rau ; Cause value: - ; IMSI: 311030675001131 ; MSISDN: 12146751131 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ;
Radio Access Type: WCDMA
2008-04-30 17:31:00 test Event name: detach ; Event details: Detach type gprs_detach, ms_initiated ; Cause value: - ;
IMSI: 311030675001131 ; MSISDN: 12146751131 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA
2008-05-01 09:46:04 test Event name: attach_completed ; Event details: Attach type gprs_attach ; Cause value: - ;
IMSI: 311030675001131 ; MSISDN: 12146751131 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA
2008-05-01 09:46:23 test Event name: activate_pdp ; Event details: - ; Cause value: - ; IMSI: 311030675001131 ;
MSISDN: 12146751131 ; NSAPI: 5 ;RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA.
2008-05-01 10:20:44 test Event name: deactivate_pdp ; Event details: Deactivation trigger ms ; Cause value: - ; IMSI:
311030675001131 ; MSISDN: 12146751131 ; NSAPI: 5 ;RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA.
2008-05-01 10:20:44 test Event name: detach ; Event details: Detach type gprs_detach, ms_initiated ; Cause value: - ;
IMSI: 311030675001131 ; MSISDN: 12146751131 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; Radio Access Type: WCDMA
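
The log entries have a regular "key: value ; key: value" structure, which makes them straightforward to post-process off the node. The following Python sketch is an illustration only. It assumes that each entry has been joined into a single line (the slides wrap them) and that the first three space-separated tokens are date, time and session identifier. The function name parse_event is hypothetical.

```python
def parse_event(line):
    """Split one er_data_log entry into a dictionary of its fields."""
    date, time, session, rest = line.strip().split(" ", 3)
    record = {"timestamp": f"{date} {time}", "session": session}
    for field in rest.split(";"):
        key, _, value = field.partition(":")
        # Some entries end with a full stop after the last value.
        record[key.strip()] = value.strip().rstrip(".")
    return record


# Two entries taken from the example log file, joined into single lines.
LINES = [
    "2008-04-30 16:26:56 test Event name: attach_completed ; Event details: "
    "Attach type gprs_attach ; Cause value: - ; IMSI: 311030675001131 ; "
    "MSISDN: 12146751131 ; NSAPI: - ; RAI: 440-20-30-30 ; CGI: - ; "
    "Radio Access Type: WCDMA",
    "2008-04-30 16:27:57 test Event name: activate_pdp_failed ; Event details: - ; "
    "Cause value: #38 (network_failure) ; IMSI: 311030675001131 ; "
    "MSISDN: 12146751131 ; NSAPI: 5 ;RAI: 440-20-30-30 ; CGI: - ; "
    "Radio Access Type: WCDMA.",
]

events = [parse_event(line) for line in LINES]
failed = [e for e in events if e["Event name"].endswith("_failed")]
for event in failed:
    print(event["timestamp"], event["Event name"], event["Cause value"])
```

Filtering on event names that end in "_failed" and printing the cause value gives a quick list of the failures recorded for the traced subscriber.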

Cell and UE Trace


Cell trace and UE trace are only available for SGSN-MME (L).
Cell trace gathers subscribers' permanent IDs, that is, IMSI and
IMEISV, and maps them to the temporary identifiers that
have been traced on the S1-MME interface.
The SGSN-MME performs the Cell Trace Mapping of permanent
and temporary IDs.
This information can be streamed to the OSS or some other
management system for post-processing.
It is recommended to stream the events in real time to a
post-processing system instead of logging to file, because logging
to file may result in heavy load on the SGSN-MME.
The number of Cell Trace Mapping events that are logged
depends on the amount of signaling traffic. This determines
the size of the generated log file and the transfer rate of the
event data stream.

Cell Trace Mapping Overview

Cell Trace Log Event Parameters

UE Trace
UE Tracer provides detailed information at call level about the
selected UE.
UE Tracer logs NAS signaling messages sent on the S1-MME
interface over S1-AP
Supports simultaneous tracing of NAS messages for a maximum of
256 sets of UE
Using the trace information, it is possible to perform the following
actions on the operator networks:
Network troubleshooting
Network analysis and optimization
Take corrective or preventive actions based on accurate and detailed
information.

Generates a log file in the eXtensible Markup Language (XML)
format that can be retrieved by an external system for post-processing.

UE Trace Session
The time interval between activation and
deactivation of UE Tracer is called a trace session.
When the UE is in an active mode and there is
signaling activity between the UE and the node,
logging starts.
Logging stops when the UE is in an idle mode.
The time interval when signaling is logged, is called
a trace recording session.
There may be several trace recording sessions
within a trace session depending on UE activity
See example on next slide

UE Trace Session

UE Cli Commands
create_ue_trace
delete_ue_trace
get_ue_trace
list_ue_trace
modify_ue_trace

UE Cli Example 1
This example configures a UE Trace session to be
initiated in the eNodeB, using IMSI 012345 as UeId.
create_ue_trace -id 012345 -ref 0099009900990099 -ent enodeb

UE Cli Example 2
This example displays the parameters of a specified
UE Trace session. The value of the cause parameter
indicates that the trace was not started successfully
in the eNodeB at the latest initiation from the MME.
get_ue_trace -id 012345
Output

Parameter  Active Data                                 Planned Data
-----------------------------------------------------------
timestamp  20081212134335                              _
planState  _                                           _
type       imsi                                        _
ref        0099009900990099                            _
depth      maximum                                     _
ifl        ALL                                         _
ip         NULL                                        _
imsi       123456789012345                             _
imei       123456789012345                             _
isv        1234567890123456                            _
sti        1-400                                       _
ent        enodeb                                      _
tfs        2008-12-11,19:45:00                         _
cause      not-enough-user-plane-processing-resources  _

Integrated Traffic Capture Overview


Integrated Traffic Capture (ITC) is a built-in traffic capture tool.
ITC captures the payload data that is being transferred by
subscribers, as opposed to the Subscriber Event Recording tool,
which records the signaling events of subscribers.

ITC can be used on the Gb interface and on the GTP-U, GTP-C and SCTP
protocols.
Data stored by ITC is saved in PCAP format, so it can be
viewed and analyzed using commonly available IP protocol
analysis tools such as tcpdump and Wireshark.
For LTE, there is a special Ericsson-developed Wireshark build
that decodes some EPS protocols that are missing or incomplete
in the official release of Wireshark.

Where the Packets are Captured: Gb over Frame Relay
[Figure. Labels: GnR, processor hop, GTU Device, MS Device, BVC Device, FR Device, processor hop; Gb ITC Capture Function]

Where the Packets are Captured: Gb over IP
[Figure. Labels: GnR, processor hop, GTU Device, MS Device, BVC Device, GbR, processor hop; Gb ITC Capture Function]

Where the Packets are Captured: Gn/Iu-U
[Figure. Labels: IuR, processor hop, GTU Device, processor hop, GnR; GTP-U ITC Capture Function]

ITC GTP-U: Type of problems


GTP-U path failures (GTP-ECHO).
PDP context deactivations due to Error Indications.
Corrupt packets.
QoS policing problems.
Feature test: 3GDT, less IP fragmentation.
End-user performance: TCP retransmissions, TCP
round-trip times, etc.

[Figure: RNC, SGSN-MME, GGSN]

ITC Gb: Type of problems


SGSN-MME and BSC interactions.
- Packet Flow Contexts
- BVC (cells) establishments
- Flow Control : BVC, MS and PFC.
- NS ALIVE
- Gb SNS procedures
SGSN-MME and MS interactions.
- Retransmissions
- Faulty messages
- Packet loss

[Figure: BSC, SGSN-MME]

ITC GTP-C: Type of problems


GTP-C Path Failures (GTP-ECHO)
Signaling problems between GSNs.
- Lost PDP Contexts due to GGSN initiated Delete
PDP Context Request.
- Failed Update PDP Context Request.
- Failed Create PDP Context Request
- Failed Inter SGSN-MME Routing Area Updates
Feature verification: 3GDT, HomeZone charging,
etc.

ITC SCTP: Type of problems


SCTP association problems.
Problems on RANAP level between SGSN-MME and RNC.
- IuC Handling
- Security Commands
- RAB Handling
- Paging
Problems on RIL3 level between SGSN-MME and UEs.
- Faulty messages
- Signaling problems
SGSN-MME and Node-B integration.

Characteristics
10 MB capture buffer (RAM) per DP and interface.
Circular capture buffers.
Licensed feature.
Capture and Filters survive Small Restart and Large
Restart.

Characteristics File Storage


The capture buffers are stored to files when ITC is
stopped and saved.
Files are saved in directories named:
/tmp/DPE_COMMONLOG/ITC_<INTERFACE>/ITC_<INTERFACE>-<DATE><TIME>
Max number of directories = 3 per interface.
There is one file per payload DP:
ITC_<INTERFACE>_<hostname>.pcap

Capture Time
The capture-all approach is often not practical on
SGSN-MMEs with a lot of payload, because the
capture buffers wrap around quite fast.
Suitable filters are necessary to be able to capture
over longer time frames.
A shorter snap length can be used to increase the
capture time.

Example Capture Times, GTP-U


In these examples, it is assumed that a snap length
of 100 bytes is used and that a single subscriber is
traced on the DP:
If the traced TCP connection has an average
throughput of 1 Mbps, the capture buffer wraps after
approximately 13 minutes.
If the traced TCP connection has an average
throughput of 50 kbps, the capture buffer wraps
after approximately 4 hours.
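
The relationship between throughput, snap length and capture time can be approximated with simple arithmetic. The following Python sketch is a rough model under stated assumptions, not the calculation behind the figures above: it assumes a 10 MB (10 x 1024 x 1024 bytes) buffer, an average packet size of 1500 bytes, and 16 bytes of per-packet record header as in the classic PCAP file format. It ignores acknowledgement packets in the reverse direction. The function name wrap_time_seconds is hypothetical.

```python
PCAP_RECORD_HEADER = 16  # bytes per stored packet in the classic pcap file format


def wrap_time_seconds(throughput_bps, buffer_bytes=10 * 1024 * 1024,
                      snap_len=100, avg_packet_bytes=1500):
    """Estimate how long a circular capture buffer lasts before it wraps."""
    packets_per_second = throughput_bps / 8 / avg_packet_bytes
    # Each packet is truncated to the snap length before it is stored.
    stored_per_packet = min(snap_len, avg_packet_bytes) + PCAP_RECORD_HEADER
    return buffer_bytes / (packets_per_second * stored_per_packet)


for rate in (1_000_000, 50_000):
    minutes = wrap_time_seconds(rate) / 60
    print(f"{rate} bit/s: buffer wraps after about {minutes:.0f} minutes")
```

Under these assumptions the model gives roughly 18 minutes at 1 Mbps and roughly 6 hours at 50 kbps, the same order of magnitude as the example figures. The difference is expected, because the traffic mix assumed for the example figures is not stated. The model does show the two levers that matter: capture time scales inversely with throughput, and a shorter snap length extends it.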

Gb Filter Options
GbFR can be included/excluded.
GbIP can be included/excluded.
A filter is either an include filter or an exclude filter. The
default is to include everything. This setting is independent of
the GbFR and GbIP settings.
NS-PDU types can be used in the filter.
BSSGP-PDU types can be used in the filter.
A list of NS-PDU types and a list of BSSGP-PDU
types can be used simultaneously.

Gb Filter Capacity

NSEI list: Max = 32
NSVCI list: Max = 32
NSEI,BVCI pair (cell) list: Max = 32

The NSEI, NSVCI and NSEI,BVCI lists are mutually exclusive. Hence,
only one list type can be specified at a time.
Default snap length = 250 octets

GTP-U Filter
Gn and/or Iu-U
GTP-ECHO
Per subscriber identified by IMSI. Max 32.
The default snap length is 100 octets.

Work Flow
Create Capture Filter - CLI
Start Trace - CLI
Status - CLI
Stop Trace - CLI
Save Capture Files - CLI
Get Filter - CLI
Delete Capture Filter - CLI
Transfer Capture Files
Merge of Capture Files (optional)
Analysis of Capture Files

Gb CLI Commands
create_itc_filter_gb
start_itc_gb
get_itc_status_gb
stop_itc_gb
save_itc_file_gb
get_itc_filter_gb
delete_itc_filter_gb

GTP-U CLI Commands


create_itc_filter_gtpu
start_itc_gtpu
get_itc_status_gtpu
stop_itc_gtpu
save_itc_file_gtpu
get_itc_filter_gtpu
delete_itc_filter_gtpu

ITC Status Command


gsh get_itc_status_gtpu
Equipment   Status   Captured Bytes  Buffer Wrapped
-----------------------------------------------------------
eqm02s10p2  started  0               false
eqm02s11p2  started  0               false
eqm02s0ap2  started  2430            false
eqm02s0dp2  started  0               false
eqm02s03p2  started  0               false
eqm02s04p2  started  0               false
eqm02s08p2  started  0               false
eqm02s02p2  started  0               false
eqm02s05p2  started  0               false
eqm02s07p2  started  0               false
eqm02s06p2  started  0               false
eqm02s09p2  started  0               false
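
When a single subscriber is traced, only the DP that handles that subscriber is expected to show captured bytes, as in the printout above. The following Python sketch shows one way to pick that board out of a saved status printout. It is an illustration only and assumes one board per line with four whitespace-separated columns (equipment, status, captured bytes, buffer wrapped). The function name parse_itc_status is hypothetical.

```python
def parse_itc_status(text):
    """Parse a saved ITC status printout into one dictionary per board."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly four columns; header and ruler lines do not.
        if len(parts) == 4 and parts[2].isdigit() and parts[3] in ("true", "false"):
            rows.append({
                "equipment": parts[0],
                "status": parts[1],
                "captured_bytes": int(parts[2]),
                "wrapped": parts[3] == "true",
            })
    return rows


# Excerpt shaped like the status example in this chapter.
SAMPLE = """\
Equipment   Status   Captured Bytes  Buffer Wrapped
-----------------------------------------------------------
eqm02s11p2  started  0               false
eqm02s0ap2  started  2430            false
eqm02s0dp2  started  0               false
"""

rows = parse_itc_status(SAMPLE)
capturing = [r["equipment"] for r in rows if r["captured_bytes"] > 0]
wrapped = [r["equipment"] for r in rows if r["wrapped"]]
print("capturing:", capturing, "wrapped:", wrapped)
```

A board that reports a wrapped buffer has already overwritten its oldest packets, so it is worth checking this column before stopping and saving a long capture.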

List PDU Types on Node


cgl -pdu_types
PDU types for BSSGP layer (3GPP TS 08.18 v8.6.0)
--
0x00  DL-UNITDATA
0x01  UL-UNITDATA
0x02  RA-CAPABILITY
0x03  PTM-UNITDATA
0x06  PAGING PS
0x07  PAGING CS
...
PDU types for NS layer (ETSI TS 08.16 v8.0.0)
--
0x00  NS-UNITDATA
0x02  NS-RESET
0x03  NS-RESET-ACK
0x04  NS-BLOCK
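
The printout above shows the PDU types in hexadecimal, while the Gb filter examples later in this chapter give them as decimal numbers (for example 10 and 11 for NS-ALIVE and NS-ALIVE-ACK). The following Python sketch converts between the two notations. It is an illustration only; the helper names are hypothetical, and the table holds only the NS PDU types used in this chapter's filter examples.

```python
# Decimal NS PDU type values as used in the Gb filter examples of this chapter.
NS_PDU_TYPES = {
    "NS-ALIVE": 10,
    "NS-ALIVE-ACK": 11,
    "SNS-ACK": 12,
    "SNS-ADD": 13,
    "SNS-CHANGEWEIGHT": 14,
    "SNS-DELETE": 17,
}


def as_hex(value):
    """Format a decimal PDU type in the 0xNN notation of the printout."""
    return f"0x{value:02x}"


def from_hex(text):
    """Convert a 0xNN PDU type from the printout to a decimal number."""
    return int(text, 16)


for name, value in NS_PDU_TYPES.items():
    print(f"{as_hex(value)}  {value:>3}  {name}")
```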

Utilities
Merge PCAP Files
Mergecap is part of the Wireshark installation.
mergecap -w ITC_merged.pcap *

Disable Ciphering for Subscriber

When troubleshooting a specific subscriber, disable
ciphering. Only applicable for ITC Gb.
gsh add_ms_noclist -imsi 240900003000000

Filtering in Wireshark
Messages can be filtered out in Wireshark. To remove
SCTP Heartbeat and SCTP Heartbeat Ack:
(!(sctp.chunk_type == 5)) && !(sctp.chunk_type == 4)

GTP-C/SCTP Commands
create_itc_job
-jn ItcJobName
create_itc_filter_gtpc
-jn ItcJobName
-nw IpNetworkName
[ -rip ItcRemoteIpAddress
-mask ItcRemoteIpMask
-sl ItcSnapLength ]
create_itc_filter_ip
-jn ItcJobName
-nw IpNetworkName
-proto ItcIpProtocol
[ -rip ItcRemoteIpAddress
-mask ItcRemoteIpMask
-sl ItcSnapLength ]

Action Commands
action_itc_job_start
-jn ItcJobName
action_itc_job_stop
-jn ItcJobName
action_itc_job_save
-jn ItcJobName

Show Command
show_itc_job_capture_status
-jn ItcJobName
A capture_status -jn itc_all_gtpc_sctp -eqp 2.11 -bn gtpc | stopped yes 8492106
A capture_status -jn itc_all_gtpc_sctp -eqp 2.11 -bn ip   | stopped yes 9582520
A capture_status -jn itc_all_gtpc_sctp -eqp 2.12 -bn gtpc | stopped yes 8703472
A capture_status -jn itc_all_gtpc_sctp -eqp 2.12 -bn ip   | stopped yes 9083344
A capture_status -jn itc_all_gtpc_sctp -eqp 2.14 -bn gtpc | stopped yes 8858348
A capture_status -jn itc_all_gtpc_sctp -eqp 2.14 -bn ip   | stopped yes 10111732
A capture_status -jn itc_all_gtpc_sctp -eqp 2.15 -bn gtpc | stopped yes 8425324
A capture_status -jn itc_all_gtpc_sctp -eqp 2.15 -bn ip   | stopped yes 9054620

List Commands
list_itc_job
A itc_job -jn itc_job1
A itc_job -jn itc_job2
list_itc_filter_gtpc
A itc_filter_gtpc -jn itc_job1 -nw Gn
A itc_filter_gtpc -jn itc_job2 -nw Gn
list_itc_filter_ip
A itc_filter_ip -jn itc_job1 -nw SS7-Iu-1
A itc_filter_ip -jn itc_job1 -nw SS7-Iu-2
A itc_filter_ip -jn itc_job2 -nw SS7-Iu-1
A itc_filter_ip -jn itc_job2 -nw SS7-Iu-2

Get Commands
get_itc_job
-jn ItcJobName
timestamp  20080511142950                    _
planState  _                                 _
js         saved                             _
path       /tmp/DPE_COMMONLOG/ITC_itc_job1/  _

get_itc_filter_gtpc
-jn ItcJobName
-nw IpNetworkName
timestamp  20080511200919   _
planState  _                _
rip        10.10.10.1       _
mask       255.255.255.255  _
sl         65535            _

Get Commands
get_itc_filter_ip
-jn ItcJobName
-nw IpNetworkName
timestamp  20080511200919   _
planState  _                _
proto      sctp             _
rip        20.20.20.1       _
mask       255.255.255.255  _
sl         65535            _

Modify Commands
modify_itc_filter_gtpc
-jn ItcJobName
-nw IpNetworkName
[-rip ItcRemoteIpAddress]
[-mask ItcRemoteIpMask]
[-sl ItcSnapLength]
modify_itc_filter_ip
-jn ItcJobName
-nw IpNetworkName
[-proto ItcIpProtocol]
[-rip ItcRemoteIpAddress]
[-mask ItcRemoteIpMask]
[-sl ItcSnapLength]

Delete Commands
delete_itc_filter_gtpc
-jn ItcJobName
-nw IpNetworkName
delete_itc_filter_ip
-jn ItcJobName
-nw IpNetworkName
delete_itc_job
-jn ItcJobName

GTP-U Filter Example


Capture traffic for a specific subscriber on the Gn
interface.
Example Command:
gsh create_itc_filter_gtpu -gn true -iuu false -gtpecho false -imsi 240900000000000

Gb Filter Example 1
Purpose: Troubleshoot a GbIP connectivity-related
problem.
Method: Include NS PDUs for a specific BSC
NS PDU-Types:
10  NS-ALIVE
11  NS-ALIVE-ACK

Example Command:
gsh create_itc_filter_gb -include true -nspdu 10 11 -nsei 500

Gb Filter Example 2
Purpose: Troubleshoot a GbIP SNS-related problem.
Method: Include NS PDUs for a specific BSC
NS PDU-Types:
12  SNS-ACK
13  SNS-ADD
14  SNS-CHANGEWEIGHT
17  SNS-DELETE

Example Command:
gsh create_itc_filter_gb -include true -nspdu 12 13 14 17 -nsei 500

Gb Filter Example 3
Purpose: Troubleshoot a cell-related problem.
Method: Include BSSGP PDUs for a specific BSC
BSSGP PDU-Types:
34  BVC-RESET
35  BVC-RESET-ACK
38  FLOW-CONTROL-BVC
39  FLOW-CONTROL-BVC-ACK

Example Command:
gsh create_itc_filter_gb -include true -bssgppdu 34 35 38 39 -nsei 500

Gb Filter Example 4
Purpose: Troubleshoot a Packet Flow Context problem.
Method: Include BSSGP PDUs for a specific BSC
BSSGP PDU-Types:
80  DOWNLOAD-BSS-PFC
81  CREATE-BSS-PFC
82  CREATE-BSS-PFC-ACK
83  CREATE-BSS-PFC-NACK
84  MODIFY-BSS-PFC
85  MODIFY-BSS-PFC-ACK
86  DELETE-BSS-PFC
87  DELETE-BSS-PFC-ACK

Example Command:
gsh create_itc_filter_gb -include true -bssgppdu 80 81 82 83 84 85 86 87 -nsei 503

Gb Filter Example 5
Purpose: Troubleshoot a flow-control problem.
Method: Include BSSGP PDUs for several BSCs
BSSGP PDU-Types:
38  FLOW-CONTROL-BVC
39  FLOW-CONTROL-BVC-ACK
40  FLOW-CONTROL-MS
41  FLOW-CONTROL-MS-ACK

Example Command:
gsh create_itc_filter_gb -include true -bssgppdu 38 39 40 41 -nsei 500 501 502 503 504

Gb Filter Example 6
Purpose: Capture all traffic on an NSEI.
Method: Include all PDUs for a specific BSC
Example Command:
gsh create_itc_filter_gb -include true -nsei 500

Specific mobiles can be filtered out by TLLI or IMSI
in Wireshark/Ethereal with the following filters:
bssgp.tlli==0x796121ec
bssgp.imsi==240900003000000

Gb Filter Example 7
Purpose: Capture all traffic on NSVCIs.
Method: Include all PDUs for specific NSVCIs
Example Command:
gsh create_itc_filter_gb -include true -nsvci 100 110

Gb Filter Example 8
Purpose: Capture all traffic on NSEI,BVCI pairs
(cells).
Method: Include all PDUs for specific NSEI/BVCI
pairs.
Example Command:
gsh create_itc_filter_gb -include true -nseibvci
{ -nsei 500 -bvci 1000 } { -nsei 500 -bvci 1001 }
{ -nsei 500 -bvci 1002 } { -nsei 500 -bvci 1003 }
{ -nsei 500 -bvci 1004 } { -nsei 500 -bvci 1005 }

Gb Filter Example 9
Purpose: Capture all traffic except NS-ALIVE,
NS-ALIVE-ACK, FLOW-CONTROL-BVC, FLOW-CONTROL-BVC-ACK.
Method: Exclude NS and BSSGP PDUs
Example Command:
gsh create_itc_filter_gb -include false -nspdu 10 11 -bssgppdu 38 39

Gb Filter Example 10
Purpose: Capture traffic on all NSEIs except
500, 501, 502, 503, 504.
Method: Exclude all traffic on specific BSCs
Example Command:
gsh create_itc_filter_gb -include false -nsei 500 501 502 503 504

GTP-C Example
Capture all GTP-C traffic on the Gn network.
gsh create_itc_filter_gtpc -jn itc_job1 -nw Gn
Capture GTP-C traffic on the Gn network. Filter on remote IP addresses.
gsh create_itc_filter_gtpc -jn itc_job2 -nw Gn -rip 10.10.10.1 -mask 255.255.255.0
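
The effect of the remote IP address and mask can be checked before the filter is created. The following Python sketch assumes that ITC applies the mask as a conventional IPv4 netmask, that is, a remote address matches when it falls inside the network formed by -rip and -mask. That interpretation is an assumption, and the function name matches_filter is hypothetical.

```python
import ipaddress


def matches_filter(remote_ip, rip, mask):
    """Return True if remote_ip falls inside the network given by rip and mask."""
    # strict=False accepts a host address such as 10.10.10.1 with a /24 mask.
    network = ipaddress.ip_network(f"{rip}/{mask}", strict=False)
    return ipaddress.ip_address(remote_ip) in network


# Values from the GTP-C example: any GSN in 10.10.10.0/24 would match.
for candidate in ("10.10.10.1", "10.10.10.200", "10.10.11.1"):
    print(candidate, matches_filter(candidate, "10.10.10.1", "255.255.255.0"))
```

With a mask of 255.255.255.255, as used in the SCTP example, only the single address given with -rip would match.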

SCTP Example
Capture all SCTP traffic on the SS7-Iu-1 and SS7-Iu-2 networks.
gsh create_itc_filter_ip -jn itc_job3 -nw SS7-Iu-1 -proto sctp
gsh create_itc_filter_ip -jn itc_job3 -nw SS7-Iu-2 -proto sctp
Capture SCTP traffic on the SS7-Iu-1 and SS7-Iu-2 networks. Filter on
remote IP addresses.
gsh create_itc_filter_ip -jn itc_job4 -nw SS7-Iu-1 -proto sctp -rip 20.20.20.1 -mask 255.255.255.255
gsh create_itc_filter_ip -jn itc_job4 -nw SS7-Iu-2 -proto sctp -rip 30.30.30.1 -mask 255.255.255.255

GTP-C + SCTP Example


Use both GTP-C and SCTP filters in the same ITC job.
gsh create_itc_filter_gtpc -jn itc_job5 -nw Gn
gsh create_itc_filter_ip -jn itc_job5 -nw SS7-Iu-1 -proto sctp
gsh create_itc_filter_ip -jn itc_job5 -nw SS7-Iu-2 -proto sctp

SGSN-MME
Troubleshooting
Mobility and Session Management (GSM and WCDMA)

Objectives
Upon completion of this chapter the student will be able to:
Identify Mobility and Session Management Faults
Trace and log mobility and session events with the use of SGSN-MME tools
Identify different reasons for attach and PDP failures
Analyse Cause Codes for problem resolution

Attach Failure Flowchart

Troubleshooting Instructions for Attach Failure
1. Attach to the GPRS network. Use a protocol analyzer and an MS. Also check for information and cause code 17 in the log.
2. In the log, check for a response from the SGSN-MME on the attach request.
If there is a response: if the attach request is rejected, examine the cause code with which the request is rejected. Use the protocol analyzer to retrieve the cause code.
If there is no response: proceed with troubleshooting the interfaces, for Gb over Frame Relay and Gb over IP.

High Attach Failure Rate Flowchart (figure in two parts, 1/2 and 2/2)

MM Alarms
admAttachCapacityReached
admAttachLicenseApproaching
admAttachHardLicenseExceeded
admAttachSoftLicenseExceeded
nwcCoopRaExist

Event Recording
The event_rec_session is used to record specified events during a subscriber's session. The following events can be recorded:
GPRS Attach: Attach Completed and Attach Reject
GPRS Detach: Detach, MS Not Reachable and MS Leaves Node
Cell Update: Cell Update
Routing Area Update: Routing Area Update Completed, Routing Area Update Reject and MS Activity
Service Request: Service Request and Service Reject
SW Error: Connection Restart and MS GMM Status

Create Event Recording Session
To create an event recording session, use the following command:
create_event_rec_session
To delete an event recording session for a subscriber, use the following command:
delete_event_rec_session -imsi 05345671121
It is also possible to get, set and list event recording sessions.

Mobility Event Log (GSM)


Time : 2010-11-15 13:55:05
Node : e_Erlang__Global_pm1_17_2_1@eqm01s11p2
GMM Cause : Network Failure (#17)
Details : Unexpected response from external node
Attach : ptmsi_type, gprs_attach
IMSI : N/A
PTMSI : 3839955752
RA New : 12302100022005
RA Old : 12302100022005
Cell ID : 5039
HLR addr : 0017404699998034145390

Mobility Event Log (WCDMA)


Time : 2010-11-15 13:55:05
Node : e_Erlang__Global_pm1_16_2_1@eqm01s10p2
GMM Cause : Network Failure (#17)
Details : Authentication failure
Attach : ptmsi_type, gprs_attach
IMSI : N/A
PTMSI : 3870608502
RA New : 12302101000035
RA Old : 12302101000034
HLR addr : 0017404699998004052600
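
Both printouts above use one "Key : value" pair per line, with the cause number in parentheses. A minimal Python sketch for turning one record into a dictionary follows. The function names are hypothetical, and the sample is a shortened copy of the WCDMA printout. It assumes the fields are separated by " : " as shown.

```python
import re

SAMPLE = """\
Time : 2010-11-15 13:55:05
Node : e_Erlang__Global_pm1_16_2_1@eqm01s10p2
GMM Cause : Network Failure (#17)
Details : Authentication failure
IMSI : N/A
PTMSI : 3870608502
"""


def parse_event(text):
    """Split the 'Key : value' lines of one event record into a dict."""
    record = {}
    for line in text.splitlines():
        # Split on the first " : " only, so timestamps keep their colons.
        key, sep, value = line.partition(" : ")
        if sep:
            record[key.strip()] = value.strip()
    return record


def cause_number(record, field="GMM Cause"):
    """Return the numeric cause from a value such as 'Network Failure (#17)'."""
    match = re.search(r"\(#(\d+)\)", record.get(field, ""))
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    event = parse_event(SAMPLE)
    print(cause_number(event), event["Details"])
```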

Subscriber Details
See Chapter 4 - Subscriber Tracing for a detailed explanation of the tracing tools.
list_subscribers [[-imsi ImsiPfx | -msisdn MsisdnPfx | -imei
get_subscriber [-dl DetailLevel] -imsi Imsi | -msisdn Msisdn | -imei Imei | -ptmsi Ptmsi | -tlli Tlli
ECi Tool
Integrated Traffic Capture (ITC)
Event Based Monitoring (EBM)

MM Cause Codes 1/7

Decimal number: 2
Name: IMSI unknown in HLR
Description: Only GSM: The MSC/VLR rejects a Combined PS/CS Attach procedure. The MSC/VLR rejects due to non CS subscription.
Action: Only GSM: Check the CS subscription.

Decimal number: 3
Name: Illegal MS
Description: Only GSM: Occurs when the MS requests a Combined CS/PS Attach or RA Update and the location updating towards MSC/VLR is rejected.
Action: Only GSM: Check why MSC/VLR treats the MS as illegal.

Decimal number: 6
Name: Illegal ME
Description: Check IMEI towards EIR results in a blacklisted Mobile Equipment (ME).
Action: Check with operator why the ME is blacklisted.

Decimal number: 7
Name: General Packet Radio Service (GPRS) services not allowed
Description: There is no GPRS subscription in the Home Location Register (HLR) for this particular IMSI.
Action: Check the configuration in the HLR.

MM Cause Codes 2/7

Decimal number: 8
Name: GPRS services and non-GPRS services not allowed
Description: The IMSI is unknown in the HLR.
Action: Check the configuration in the HLR. Also, check the IMSI number series and Global Title (GT) rule configuration in the SGSN-MME.

Decimal number: 9
Name: MS identity cannot be derived by the network
Description: The SGSN-MME verifies that the old RAI is not defined as Cooperating Routing Area (RA). The SGSN-MME cannot derive the MS's identity from the P-TMSI in case of inter-SGSN-MME RA update.
Action: Check if the RAI of the old SGSN-MME is to be configured as Cooperating RA.

Decimal number: 10
Name: Implicitly detached
Description: The MS must reattach. This usually takes place when an unattached MS sends a RAU request to the SGSN-MME, or when a UL payload is received for an unattached MS.
Action: None

MM Cause Codes 3/7

Decimal number: 11
Name: PLMN not allowed
Description: The SGSN-MME is configured with Roaming Restrictions.
Action: Check the Roaming Restrictions configuration in the SGSN-MME, if this IMSI should be able to roam in this location area.

Decimal number: 12
Name: Location area not allowed
Description: Only WCDMA: The MS requests a Combined CS/PS RA Update and the location updating towards MSC/VLR is rejected in a location area where the MS, by subscription, is not allowed to operate.
Action: Only WCDMA: Check if the MS is allowed to operate CS services in that location area.

Decimal number: 13
Name: Roaming not allowed in this Location Area
Description: The SGSN-MME is configured with Roaming Restrictions.
Action: Check the Roaming Restrictions configuration in the SGSN-MME, if this IMSI should be able to roam in this location area.

MM Cause Codes 4/7

Decimal number: 14
Name: GPRS services not allowed in this PLMN
Description: There is no IMSI series configured in the SGSN-MME that matches this subscriber's IMSI.
Action: Check the IMSI series configuration in the SGSN-MME.
Description: The HLR returns "Roaming not allowed" in Update GPRS Location Response.
Action: Check the configuration in the HLR.
Description: The MS has indicated that ciphering is not supported, but the Gb_UncipheredMode node property does not allow an unciphered connection.
Action: Change the Gb_UncipheredMode node property, if unciphered connections shall be allowed.
Description: The SGSN-MME is configured with Roaming Restrictions.
Action: Check the Roaming Restrictions configuration in the SGSN-MME, if this IMSI should be able to roam in this location area.

MM Cause Codes 5/7

Decimal number: 15
Name: No suitable cells in location area
Description: The SGSN-MME is configured with Roaming Restrictions.
Action: Check the Roaming Restrictions configuration in the SGSN-MME, if this IMSI should be able to roam in this location area.
Description: The MS is not allowed to attach in the current Location Area (LA).
Action: Check the configuration of the LA/Routing Area (RA).

Decimal number: 16
Name: MSC temporarily not reachable
Description: Only GSM: The MS requests a Combined CS/PS RA Update and the location updating towards MSC/VLR is rejected.
Action: Only GSM: Check if MSC/VLR is down.

Decimal number: 17
Name: Network failure
Description: The HLR does not respond to the SGSN-MME messages, or the SGSN-MME cannot send messages to the HLR.
Action: Check the Gr interface. Also, check the IMSI number series, GT rule, and SS7 routing configuration.

MM Cause Codes 6/7

Decimal number: 17 (continued)
Name: Network failure
Action: See the mobility event logs for more information on this cause code.

Decimal number: 22
Name: Congestion
Description: The capacity license SAU Attach Limit has been reached.
Action: Check the capacity license SAU Attach Limit, and compare with the number of attached subscribers.
Description: The system defined SAU hard limit of the SGSN-MME has been reached.
Action: Check system defined SAU hard limit and compare with the number of attached subscribers.
Description: The processing load on the SGSN-MME is too high.
Action: Investigate traffic load and check if Central Processing Unit (CPU)-demanding features are turned on.

Decimal number: 95
Name: Semantically incorrect message
Description: The SGSN-MME regards the Attach Request message as incorrect.
Action: Use a protocol analyzer to look for protocol errors in the message sent by the MS.

MM Cause Codes 7/7

Decimal number: 96 (1)
Name: Invalid mandatory information
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 97
Name: Message type non-existent or not implemented
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 99
Name: Information element non-existent or not implemented
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 100
Name: Conditional IE error
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 111
Name: Protocol error, unspecified
Description: Only WCDMA: This occurs when the Radio Network Controller (RNC) sends "Security Mode Reject" to the SGSN-MME as an answer to "Security Mode Command".
Action: Only WCDMA: Check the Iu-C interface. Check the RNC configuration in the SGSN-MME.
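
Several rows in the MM cause code tables only say "See cause code 95", so a lookup has to follow that cross-reference. The sketch below shows one way to do it in Python with a small subset of the codes. The dictionary contents are copied from the tables, while the names MM_CAUSES, SEE_95 and first_action are hypothetical.

```python
# Subset of the MM cause code tables: code -> (name, first listed action).
MM_CAUSES = {
    11: ("PLMN not allowed",
         "Check the Roaming Restrictions configuration in the SGSN-MME."),
    14: ("GPRS services not allowed in this PLMN",
         "Check the IMSI series configuration in the SGSN-MME."),
    22: ("Congestion",
         "Check the capacity license SAU Attach Limit."),
    95: ("Semantically incorrect message",
         "Use a protocol analyzer to look for protocol errors in the "
         "message sent by the MS."),
}

# Codes whose table rows only say "See cause code 95".
SEE_95 = {96, 97, 99, 100}


def first_action(code):
    """Return the first listed action for a cause code, or None if unknown."""
    if code in SEE_95:
        code = 95  # follow the cross-reference used in the tables
    entry = MM_CAUSES.get(code)
    return entry[1] if entry else None


if __name__ == "__main__":
    print(first_action(97))
```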

PDP Context Activation Failure Flowchart
In most cases, the top two PDP context activation reject codes are CC33 and CC27.

Session Management Alarms


admContextCapacityReached
admContextLicenseApproaching
admContextHardLicenseExceeded
admContextSoftLicenseExceeded
gtpGgsnBlacklisted

Session Event on SGSN-MME


All MS-initiated Activate PDP Context rejects due to missing or unknown APN, unknown PDP address, requested service option not subscribed, or network failure are stored in the session event log files.
The session event logs collect reject cause codes #27, #28, #33 (with the optional feature Misconfigured MT Identification), and #38.
From the session event log we can analyze the failure reason and the subscriber's behavior.

Example: Session Event Log with CC27
Missing or Unknown APN
===== SESSION EVENT (W): MS INITIATED ACTIVATE REQUEST =====
Time : 2010-10-12 18:49:32
Node : e_Erlang__Global_pm1_18_2_1@eqm01s12p2
IMSI : 240990605007129
SM Cause : Missing or unknown APN (#27)
MSISDN : 99945600102
Details : Missing or unknown APN (#219)
GGSN Addr.: 10.16.102.129
APN Req. : ttcn129.com
APN Sub. : ttcn129.com
APN Used : ttnc129.com.mnc099.mcc240.gprs

Example: Session Event Log with CC28
Unknown PDP address or PDP type
===== SESSION EVENT (W): MS INITIATED ACTIVATE REQUEST =====
Time : 2010-10-12 18:49:34
Node : e_Erlang__Global_pm1_16_2_1@eqm01s10p2
IMSI : 240990605007130
MSISDN : 99945600102
SM Cause : Unknown PDP address or PDP type (#28)
Details : Unknown PDP address or PDP type (#220)
GGSN Addr.: 10.16.102.129
PDP Type : IETF IPv4
PDP Addr. : (dynamic)
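
To see which reject causes dominate, the "SM Cause" lines of a session event log can be tallied. The Python sketch below is illustrative. The sample text is a shortened copy of the two printouts above, and the function name count_sm_causes is hypothetical. It assumes one "SM Cause : ... (#NN)" line per record and ignores the internal number on the Details line.

```python
import re
from collections import Counter

SAMPLE_LOG = """\
===== SESSION EVENT (W): MS INITIATED ACTIVATE REQUEST =====
Time : 2010-10-12 18:49:32
IMSI : 240990605007129
SM Cause : Missing or unknown APN (#27)
Details : Missing or unknown APN (#219)
===== SESSION EVENT (W): MS INITIATED ACTIVATE REQUEST =====
Time : 2010-10-12 18:49:34
IMSI : 240990605007130
SM Cause : Unknown PDP address or PDP type (#28)
Details : Unknown PDP address or PDP type (#220)
"""


def count_sm_causes(log_text):
    """Count SM cause numbers found in 'SM Cause : ... (#NN)' lines."""
    causes = re.findall(
        r"^SM Cause\s*:.*\(#(\d+)\)\s*$", log_text, flags=re.MULTILINE
    )
    return Counter(int(c) for c in causes)


if __name__ == "__main__":
    for cause, hits in count_sm_causes(SAMPLE_LOG).most_common():
        print(f"CC{cause}: {hits}")
```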

Solution for CC 27
The APN Redirection feature can be used to resolve this issue when it is caused by the subscriber.


Configuration For APN Redirection
Step 1: Activate the APN Redirection feature
- gsh modify_feature -name apn_redirection -state on
Step 2: Configure the default APN for the GSM network and the UMTS network separately
- gsh set_nodeprop Gn_DefaultAPNGSMNetwork -val eetest
- gsh set_nodeprop Gn_DefaultAPNUMTSNetwork -val eetest

Configuration For APN Redirection (cont.)
1. The MS sends an Activate PDP Context Request to the SGSN-MME without an APN. If APN Redirection is disabled, the SGSN-MME normally sends an Activate PDP Context Reject message with CC27 to the MS.
2. With APN Redirection enabled, the SGSN-MME instead initiates a Create PDP Context Request message to the GGSN to continue the session.
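
The two cases can be summarized as a small decision function. This is a simplified Python sketch of the behaviour described on the slide, not the node's implementation. The function name select_apn and the returned tuples are hypothetical, and DNS resolution and subscription checks are ignored.

```python
CC_MISSING_OR_UNKNOWN_APN = 27


def select_apn(requested_apn, redirection_enabled, default_apn):
    """Return ("create_pdp_context", apn) or ("reject", cause_code).

    Simplified model: a request without an APN is rejected with CC27
    unless APN Redirection is on and a default APN is configured.
    """
    if requested_apn:
        return ("create_pdp_context", requested_apn)
    if redirection_enabled and default_apn:
        return ("create_pdp_context", default_apn)
    return ("reject", CC_MISSING_OR_UNKNOWN_APN)


if __name__ == "__main__":
    # "eetest" is the default APN value used in the configuration example.
    print(select_apn("", True, "eetest"))
    print(select_apn("", False, "eetest"))
```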

SM Cause Codes 1/5

Decimal number: 25
Name: LLC or SNDCP failure
Description: Indicates that a PDP context is deactivated because of a LLC or SNDCP failure. For example, if the SM receives a SNSM-STATUS request message with cause "DM received" or "invalid XID response".
Action: Use a protocol analyzer to look for protocol errors in the message sent by the MS.

Decimal number: 26
Name: Insufficient resources
Description: The capacity license PDP Context Limit has been reached.
Action: Check the capacity license PDP Context Limit, and compare with the number of activated PDP Contexts.
Description: The system defined PDP Context hard limit has been reached.
Action: Check system defined PDP Context hard limit and compare with the number of activated PDP Contexts.
Description: All dynamic IP addresses in GGSN are occupied.
Action: Check the GGSN status.
Description: Only SGSN-MME (W): Radio Access Bearer (RAB) Assignment is rejected by the RNC.
Action: Only SGSN-MME (W): Check the RNC status.

SM Cause Codes 2/5

Decimal number: 27
Name: Missing or unknown APN
Description: The APN is not included in the DNS.
Action: Check the configuration in the DNS.
Description: No response from the DNS.
Action: Check the configuration in the SGSN-MME and the configuration of the interface on which DNS is used.
Action: See the session event logs for information about this cause code.

Decimal number: 28
Name: Unknown PDP address or PDP type
Description: Indicates that the requested service was rejected by the external Packet Data Network (PDN), because the PDP address or type could not be recognized.
Action: See the session event logs for information about this cause code.

Decimal number: 29
Name: User authentication failed
Description: Indicates that the requested service was rejected by the external PDN due to a failed user authentication.
Action: Check that the MS sends valid Protocol Configuration Options in the Activate PDP Context Request message. Check the configuration of RADIUS/Dynamic Host Configuration Protocol (DHCP) servers in the external PDN.

SM Cause Codes 3/5

Decimal number: 30
Name: Activation rejected by GGSN
Description: Setting up a secondary PDP context when the primary context is set up using GTPv0.
Action: Secondary PDP context is not supported if GTPv0 is used.

Decimal number: 31
Name: Activation rejected, unspecified
Description: Most probable reason: the attach procedure was unsuccessful.
Action: Troubleshoot the attach sequence.

Decimal number: 32
Name: Service option not supported
Description: Most probable reason: the Activate PDP Context Request requests a non-supported PDP type.
Action: Check the PDP type in the Activate PDP Context Request message.

Decimal number: 33
Name: Requested service option not subscribed
Description: Activation denied since the requested values sent in the Activate PDP Context Request do not match the values stored in the HLR. Example: An MS requests a static IP address but the subscription is for dynamic.
Action: Check subscriber data in the HLR and the requested values sent in the Activate PDP Context Request.
Description: If Misconfigured MT Identification is activated: The MS is logged in the session event log.
Action: If Misconfigured MT Identification is activated: Check the session event log to identify the MSs using incorrect information when requesting a PDP context activation.

SM Cause Codes 4/5

Decimal number: 36
Name: Regular PDP context deactivation
Description: Indicates a regular MS- or network-initiated PDP context deactivation.
Action: No action.

Decimal number: 38
Name: Network failure
Description: No Create PDP Context Response is received from the GGSN.
Action: Check the status of the GGSN and the Gn interface.
Description: Only SGSN-MME (W): No RAB Assignment Response is received within the TRABAssgt timeout.
Action: Only SGSN-MME (W): Check the status of the RNC. If RAB Assignment is sent from the RNC, check or increase the node property Iu_TRABassgt.
Action: See the session event logs for information about this cause code.

Decimal number: 41, 42, 44, 45, 46
Name: TFT and IP Packet Filter errors
Description: The GGSN regards the TFT and IP Packet Filters as incorrect.
Action: Use a protocol analyzer to look for protocol errors in the message sent by the MS and the SGSN-MME to the GGSN.

SM Cause Codes 5/5

Decimal number: 43
Name: Unknown PDP context
Description: The primary PDP context is not active when trying to activate a secondary PDP context.
Action: Troubleshoot the activation of the primary PDP context.

Decimal number: 95
Name: Semantically incorrect message
Description: The SGSN-MME regards the Attach Request message as incorrect.
Action: Use a protocol analyzer to look for protocol errors in the message sent by the MS.

Decimal number: 96
Name: Invalid mandatory information
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 97
Name: Message type non-existent or not implemented
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 99
Name: Information element non-existent or not implemented
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 100
Name: Conditional IE error
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 111
Name: Protocol error, unspecified
Description: Only SGSN-MME (W): This occurs when the RNC sends a Security Mode Reject message to the SGSN-MME as an answer to Security Mode Command.
Action: Only SGSN-MME (W): Check the Iu interface. Check the RNC configuration.
Description: See cause code 95.
Action: See cause code 95.

SGSN-MME Configuration Issue 1/4
Case 1: Missing or Incorrect IMSINS Configuration
- delete_imsins -imsi ImsiNumberSeries
- create_imsins -imsi xxxxx

SGSN-MME Configuration Issue 2/4
Case 2: Missing or incorrect Gn or Gom interface configuration
Figure: PDP context activation signalling between MS, BSC, SGSN-MME, DNS and GGSN (LLC connection, GTP-C and GTP-U over the IP network on the Gn or Gp interface). Steps shown in the figure:
1. PDP context activate request
3. DNS Query (APN)
4. DNS Query Response
5. Create PDP context request
6. Create PDP context response
7. PDP context activate accept
If either step 4 or step 6 fails, a response is sent to the MS with SM cause code 38, Network Failure.

SGSN-MME Configuration Issue 3/4
Case 3: Capacity License Limit
When the attach limit is reached in the SGSN-MME, the SGSN-MME responds to the MS with an Attach Reject with MM CC22 Congestion.
- gsh set_nodeprop "attach_limit" 100 (default is 100K)

SGSN-MME Configuration Issue 4/4
Case 4: Capacity License Limit
When the context limit is reached in the SGSN-MME, the SGSN-MME responds to the MS with an Activate PDP Context Reject message with SM CC26 Insufficient resources.
- gsh set_nodeprop "context_limit" 5 (default is 5K)
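
For both capacity cases, the recommended check is to compare the configured limit with the current number of attached subscribers or active PDP contexts. A minimal Python sketch of that comparison follows. The function name usage_report is hypothetical, and it assumes the node property value is expressed in thousands, as the "100 (default is 100K)" notation suggests.

```python
def usage_report(current, limit_k):
    """Compare a current count with a limit given in thousands.

    Assumption: limit_k = 100 means 100 000 subscribers or contexts.
    """
    limit = limit_k * 1000
    return {
        "limit": limit,
        "remaining": max(limit - current, 0),
        "used_pct": round(100.0 * current / limit, 1),
        "limit_reached": current >= limit,
    }


if __name__ == "__main__":
    # Hypothetical figures: 95 000 attached subscribers, attach_limit 100.
    print(usage_report(95000, 100))
```

When limit_reached is true for attaches, new attach requests would be expected to fail with MM CC22; for PDP contexts the corresponding reject is SM CC26.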

SGSN-MME
Troubleshooting
Mobility and Session Management for LTE

Objectives
Upon completion of this chapter the student will be able to:
Identify Mobility and Session Management Faults in the Evolved Packet System (EPS)
Trace and log mobility and session events with the use of SGSN-MME tools
Identify different reasons for attach and PDN Connection failures
Analyse Cause Codes for problem resolution

Attach Failure

High Attach Failure Rate Flowchart (figure in two parts, 1/2 and 2/2)

mobility_event_log
All attach reject messages that occur due to network failure, that is GPRS Mobility Management or EPS Mobility Management cause code #17, are stored in this log.
The maximum log file size is 1 Mb and the maximum log index is 255. The wrap time is 2 hours, and the log file is deleted after 5 days.
The mobility event log file is stored in the /tmp/OMS_LOGS/mobility_event_log/ready folder.
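
The wrap time and retention figures determine how far back a fault can still be investigated. The Python sketch below works through that arithmetic. The names are hypothetical, and it assumes the five-day retention is counted from the time of the event, which the slide does not state explicitly.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=5)   # log files are deleted after 5 days
WRAP_TIME = timedelta(hours=2)  # a new log file is started every 2 hours


def still_retained(event_time, now):
    """Return True if a log from event_time should still be on the node."""
    return now - event_time < RETENTION


def time_wraps_in_retention():
    """Number of time-based wraps that fit in the retention period."""
    return RETENTION // WRAP_TIME


if __name__ == "__main__":
    event = datetime(2010, 6, 17, 11, 3, 29)
    print(still_retained(event, datetime(2010, 6, 20)))
    print(time_wraps_in_retention())
```

Collect the log files before the retention period expires if an attach failure needs later analysis.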

mobility_event_log printout for EPS
The following is an example of a mobility_event_log file printout (for EPS).
======== MOBILITY EVENT (E): ATTACH REJECT =========
Time : 2010-06-17 11:03:29
Node : e_Erlang__Global_pm1_2_2_1@selnc497
EMM Cause : Network Failure (#17)
Details : Timeout when communicating with external node
Attach : Guti Type, Initial Attach
IMSI : 12345600100
MTMSI : 3237579264
TA New : 123-456-12
TA Old : 000-00-0
HSS addr : hss1.ericsson.com

Subscriber Details
See Chapter 4 - Subscriber Tracing for a detailed explanation of the tracing tools.
list_subscribers [[-imsi ImsiPfx | -msisdn MsisdnPfx | -imei
get_subscriber [-dl DetailLevel] -imsi Imsi | -msisdn Msisdn | -imei Imei | -ptmsi Ptmsi | -tlli Tlli
ECi Tool
Integrated Traffic Capture (ITC)
Event Based Monitoring (EBM)

EPS MM Cause Codes 1/5

Decimal number: 7
Name: EPS services not allowed
Description: All 3GPP defined Radio Access Technology (RAT) types, that is GERAN, UTRAN, GAN, I-HSPA-E, and E-UTRAN, are restricted for this IMSI.
Action: Check the access restriction configuration in the HSS.

Decimal number: 8
Name: EPS services and non-EPS services not allowed
Description: The IMSI is unknown in the home network.
Action: Check the configuration in the HSS.

Decimal number: 9
Name: UE identity cannot be derived by the network
Description: The network failed to validate the identity of the UE due to an integrity check failure of the received message.
Action: None

Decimal number: 10
Name: Implicitly detached
Description: The UE must reattach. Usually takes place when an unattached UE sends a Tracking Area Update (TAU) request to the MME.
Action: None

Decimal number: 11
Name: PLMN not allowed
Description: The MME is configured with Access Restrictions.
Action: Check the Access Restrictions configuration in the MME, to see if this IMSI number series shall have access.

EPS MM Cause Codes 2/5

Decimal number: 13
Name: Roaming not allowed in this tracking area
Description: The MME is configured with Access Restrictions.
Action: Check the Access Restrictions configuration in the MME, to see if this IMSI number series shall have access.

Decimal number: 14
Name: EPS services not allowed in this PLMN
Description: There is no IMSI series configured in the MME that matches this subscriber's IMSI.
Action: Check the IMSI series configuration in the MME.
Description: The HSS returns Roaming not allowed in the Update Location Response.
Action: Check the configuration in the HSS.
Description: There is no common integrity or ciphering algorithm for the UE and the MME.
Action: Check the algorithm configuration in the MME.

Decimal number: 15
Name: No suitable cells in tracking area
Description: There is no EPS subscription for the IMSI in the HSS.
Action: Check the configuration in the HSS.
Description: The IMSI is unknown in the HSS.
Action: Check the configuration in the HSS.
Description: The E-UTRAN RAT type which is used by the UE is not allowed for this IMSI. The UE may allow access through another 3GPP defined RAT type.
Action: Check the access restriction configuration in the HSS or access restriction in MME to see if this IMSI number series shall have access.

EPS MM Cause Codes 3/5

Decimal number: 16
Name: MSC temporarily not reachable
Description: This cause is sent to the UE if it requests a combined EPS/IMSI attach or a combined TA/LA update and the MSC is temporarily not reachable from the MME over the SGs interface.
Action: None

Decimal number: 17
Name: Network failure
Description: The HSS does not respond to the MME messages, or the MME cannot send messages to the HSS.
Action: Check the S6a interface.
Description: This occurs when the HSS during authentication sends an empty response to the MME.
Action: Check the configuration in the HSS. See the mobility event logs for more information on this cause code.
Description: Unexpected Diameter messages or unexpected result codes are received from the HSS.
Action: Check the configuration in the HSS.

Decimal number: 18
Name: CS domain not available
Description: This EMM cause is sent to the UE if the MME cannot service a UE-generated request because the CS domain is not available.
Action: None

Decimal number: 19
Name: ESM failure
Description: See Session Management.

Decimal number: 20
Name: MAC failure
Description: The USIM detects that the Message Authentication Code (MAC) in the Authentication Request message is not fresh.
Action: Check the configuration in the HSS.

EPS MM Cause Codes 4/5

Decimal number: 21
Name: Synch failure
Description: The USIM detects that the Sequence Number (SQN) in the Authentication Request message is out of range.
Action: Check the configuration in the HSS.

Decimal number: 22
Name: Congestion
Description: Indicates congestion in the network. The congestion could be the result of no channel being available, or of the facility being busy or congested.
Action: Check the MME node capacity.

Decimal number: 23
Name: UE security capabilities mismatch
Description: The UE detects that the UE security capability does not match the one sent back by the network.
Action: Check the algorithm configuration in the UE and the MME.

Decimal number: 24
Name: Security mode rejected, unspecified
Description: The Security Mode command is rejected by the UE. This can be the result of the temporary UE indicated in the nonceUE IE not matching the one sent back by the network, or a result of unspecified reasons.
Action: None

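EPS MM Cause Codes 5/5

Decimal number: 26
Name: Non-EPS authentication unacceptable
Description: Indicates that the separation bit in the AMF field of AUTN is set to 0 in the Authentication Request message.
Action: Check the configuration in the HSS.

Decimal number: 95
Name: Semantically incorrect message
Description: The MME regards the Non Access Stratum (NAS) message from the UE as incorrect.
Action: Use a protocol analyzer to look for protocol errors in the message sent by the MS.

Decimal number: 96
Name: Invalid mandatory information
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 97
Name: Message type non-existent or not implemented
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 98
Name: Message type not compatible with protocol state
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 99
Name: Information element non-existent or not implemented
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 100
Name: Conditional IE error
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 101
Name: Message not compatible with protocol state
Description: A message has been received that is incompatible with the protocol state, or a STATUS message has been received indicating an incompatible call state.
Action: See cause code 95.

Decimal number: 111
Name: Protocol error, unspecified
Description: An optional parameter in NAS is faulty.
Action: See cause code 95.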

EPS Bearer Activation Fault Flowchart

session_event_log
A session_event_log file is a system-generated file stored in the /tmp/OMS_LOGS/session_event_log/ready directory.
The log file can be viewed using UNIX commands like all other built-in logs.
The maximum log file size is 1 Mb and the maximum log index is 255. The wrap time is two hours, and the log file is deleted after five days.
UE-initiated activate default bearer contexts that are rejected due to a missing or unknown APN or network failure are stored in this log file. See the following examples.

session_event_log Network Failure #38
The following is an example of a session_event_log file printout for network failure:
=SESSION EVENT (E): ATTACH INITIATED DEFAULT BEARER REQUEST ===
Time : 2010-04-30 11:31:06
Node : e_Erlang__Global_pm1_2_2_1@selnc497
IMSI : 12345600116
MSISDN : 99945600116
SM Cause : Network Failure (#38)
Details : Timeout in SGW
Message : create_session_request
eNodeB Id : 2
PDN Addr. :
APN Used : www.ericsson.com.mnc456.mcc123.gprs
SGW Addr. : 10.0.2.51

session_event_log Unknown APN #27
The following is an example of a session_event_log file printout for missing and unknown APN:
= SESSION EVENT (E): ATTACH INITIATED DEFAULT BEARER REQUEST =
Time : 2010-04-30 13:23:18
Node : e_Erlang__Global_pm1_3_2_1@selnc497
IMSI : 12345600149
MSISDN : 99945600149
SM Cause : Missing or unknown APN (#27)
Details : Gateway Selection error
eNodeB Id : 1
APN Req. : www.ericsson.com
APN Sub. : www.ericsson.com
APN Used : Undefined

EPS SM Cause Codes 1/6

Decimal number: 26
Name: Insufficient resources
Description: The service was rejected by the SGW due to causes concerning resource, including PDN address and memory.
Action: Check the SGW status.

Decimal number: 27
Name: Unknown or missing APN
Description: The requested service was rejected by the external PDN, because the Access Point Name (APN) is missing.
Action: See the session event logs for information about this cause code.

Decimal number: 28
Name: Unknown PDN address or PDN type
Description: The requested service was rejected by the external PDN, because the PDN address or type could not be recognized.
Action: See the session event logs for information about this cause code.

Decimal number: 29
Name: User authentication failed
Description: Is used by the network to indicate that the requested service was rejected by the external PDN due to a failed user authentication.

Decimal number: 30
Name: Request rejected by SGW or PDN GW (PGW)
Description: The requested service, operation, or the request for a resource was rejected by the SGW or PGW.
Action: Check the SGW and PGW.

Decimal number: 31
Name: Request rejected, unspecified
Description: The requested service was rejected by the SGW due to GPRS Tunneling Protocol (GTP) causes that are not covered by other ESM reject causes.
Action: Troubleshoot the attach sequence.

EPS SM Cause Codes 2/6

Decimal number: 32
Name: Service option not supported
Description: Is used by the network when the UE requests a service that is not supported by the PLMN.

Decimal number: 33
Name: Requested service option not subscribed
Description: Indicates that the UE requests a service option for which it has no subscription.
Action: Check the configuration in the HSS.

Decimal number: 35
Name: PTI already in use
Description: The PTI included by the UE in the procedure request is already in use in another active UE-requested procedure for this UE.

Decimal number: 36
Name: Regular deactivation
Description: Indicates a regular UE- or network-initiated release of EPS bearer resources.
Action: None

Decimal number: 38
Name: Network failure
Description: No Create Bearer Response is received from the SGW.
Action: See the session event logs for information about this cause code.
Description: Reject causes received due to GTP message format or system failure.
Action: See the session event logs for information about this cause code.

Decimal number: 41
Name: Semantic error in the TFT operation
Description: The requested service was rejected due to a semantic error in the Traffic Flow Template (TFT) operation included in the request.
Action: Check the configuration in the UE or PGW.

None

EPS SM Cause Codes 3/6

Decimal number: 42
Name: Syntactical error in the TFT operation
Description: The requested service was rejected due to a syntactical error in the TFT operation included in the request.
Action: Check the configuration in the UE or PGW.

Decimal number: 43
Name: Invalid EPS bearer identity
Description: The EPS bearer identity value provided to the network or UE is not a valid value for the received message.
Action: None
Description: The EPS bearer context identified by the linked EPS bearer identity IE in the request is not active.
Action: None

Decimal number: 44
Name: Semantic errors in packet filter(s)
Description: The requested service was rejected due to one or more semantic errors in the packet filters of the TFT included in the request.
Action: Check the configuration in the UE or PGW.

Decimal number: 45
Name: Syntactical error in packet filter(s)
Description: The requested service was rejected due to one or more syntactical errors in packet filters of the TFT included in the request.
Action: Check the configuration in the UE or PGW.

Decimal number: 49
Name: Last PDN disconnection not allowed
Description: The UE-requested PDN Disconnection procedure is not allowed on the last remaining PDN connection.
Action: None

EPS SM Cause Codes 4/6

Decimal number: 50
Name: PDN type IPv4 only allowed
Description: Is used by the network to indicate that the PDN connectivity requested by the UE for both IPv4 and IPv6 is accepted with the restriction that only IPv4 is allowed due to limitations in the subscription or PGW configuration.
Action: Check the subscription in the HSS or the PGW configuration.
Description: Is used by the network to indicate that the PDN connectivity requested by the UE for IPv6 is rejected because only IPv4 is allowed due to limitations in the subscription or PGW configuration.
Action: Check the subscription in the HSS or the PGW configuration.

Decimal number: 51
Name: PDN type IPv6 only allowed
Description: Is used by the network to indicate that the PDN connectivity requested by the UE for both IPv4 and IPv6 is accepted with the restriction that only IPv6 is allowed due to limitations in the subscription or PGW configuration.
Action: Check the subscription in the HSS or the PGW configuration.
Description: Is used by the network to indicate that the PDN connectivity requested by the UE for IPv4 is rejected because only IPv6 is allowed due to limitations in the subscription or PGW configuration.
Action: Check the subscription in the HSS or the PGW configuration.

EPS SM Cause Codes 5/6

Decimal number: 52
Name: Single address bearers only allowed
Description: The requested PDN connectivity is accepted with the restriction that only single IP version bearers are allowed.
Action: Check the configuration in the HSS or PGW. Check the configuration of the Dual Address Bearer Flag (DAF) in the MME by using the get_ne CLI command.

Decimal number: 53
Name: ESM information not received
Description: The MME rejects the attach request since no valid ESM Information Response has been received from the UE.
Action: Use a protocol analyzer to look for protocol errors in the message sent by the UE.

Decimal number: 54
Name: PDN connection does not exist
Description: During handover from a non-3GPP access network the MME does not have any information about the requested PDN connection.
Action: None

Decimal number: 55
Name: Multiple PDN connections for a given APN not allowed
Description: The PDN Connectivity procedure was rejected because multiple PDN connections for the specified APN are not allowed.
Action: Check for the configured protocol at S5 or S8 in the MME. The S5 interface is checked using the gsh get_plmn CLI command and the S8 interface is checked using the gsh get_imsins CLI command. (1)

Decimal number: 81
Name: Invalid PTI value
Description: Is used by the network or the UE to indicate that the Procedure Transaction Identity (PTI) provided to it is unassigned or reserved.
Action: None

Decimal number: 95
Name: Semantically incorrect message
Description: The MME regards the NAS message from the UE as incorrect.
Action: Use a protocol analyzer to look for protocol errors in the message sent by the UE.

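EPS SM Cause Codes 6/6

Decimal number: 96
Name: Invalid mandatory information
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 97
Name: Message type non-existent or not implemented
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 98
Name: Message type not compatible with protocol state
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 99
Name: Information element non-existent or not implemented
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 100
Name: Conditional IE error
Description: See cause code 95.
Action: See cause code 95.

Decimal number: 101
Name: Message not compatible with protocol state
Description: A message has been received that is incompatible with the protocol state, or a STATUS message has been received indicating an incompatible call state.
Action: See cause code 95.

Decimal number: 111
Name: Protocol error, unspecified
Description: An optional parameter in NAS is faulty.
Action: See cause code 95.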

SGSN-MME
Troubleshooting
Toolbox Commands

Objectives
Upon the completion of this chapter, the student will be able to:
Understand the built-in tools useful for troubleshooting
List the different tools which are part of the toolbox, and use these to determine and isolate faults
Determine the tools available in SGSN-MME

Introduction
The toolbox contains scripts and small programs which help the operator during their daily work.
Some of the tools can be used to find and isolate problems on the SGSN-MME.
The tools run outside the gsh shell, typically at the Unix prompt. Most of the tools provide online help with the -h option.
Use only the toolbox commands listed in the Alex documentation. Other tools may exist in the toolbox, but are intended for use by Ericsson support personnel only.

check_config.sh
Name: check_config.sh
Description: The script prints all configuration data for the SGSN-MME.
Output: To screen or redirect to text file. See example.
Usage: check_config.sh > <filename>
Example: check_config.sh > /tmp/DPE_LOG/config.txt
NOTE: Running this command causes heavy CPU load. Run only during low traffic hours. This script can take 30 minutes or longer to execute.

node_check
Name: node_check
Description: This command creates a status overview since the last startup. It creates one profiling performance monitoring job for the most important non-indexed counters, and can collect relevant logs from the current runtime.
Output: To screen, or the logging option sends logs to /tmp/DPE_COMMONLOG/node_check
Usage: node_check [-c|-v] [-l|-o|-z] [-p] [-s] [-h]
Example: node_check -c
NOTE: Only root users can run this command. This command causes heavy CPU load. Run only during low traffic hours.

Example output of node_check -c
=== root@eqm01s14p2 ANCB log/LogBackup # node_check -c
For a description of all options use /tmp/DPE_SC/LoadUnits/ttx/bin/node_check -h
Checking if node has started completely (via isp.log) ... OK
GSN STATUS
Date : 2010-09-05 10:23
Node type : sgsnwg
Node name : SGSN200
Uptime : 15:15
Last OS startup : 2010-09-04 19:09:33
Last node startup : 2010-09-04 19:15:09
Current Software Configuration : cxr1010225_4r2a03_pa10
Small local restarts : 0
Small restarts : 0
Large restarts : 0
CM restarts : 0
PM Reboots : 0
Number of nodedumps : 1 (!!!)
Erlang crash dumps in : 0
Number of DIED proc in ncl.log : 0
Number of "CrashHandler" in app.log : 0
Number of NCS crashes since reload : 0
Number of NCS messages since reload : 3
Timeframe of NCS messages : 2010-09-04 19:13:29 - 2010-09-05 09:49:58

Example output of node_check -c (contd)


Number of dyn worker crashes since reload  : 2 (!!!)
Timeframe of dyn worker crashes            : 2010-09-05 10:13:17 - 2010-09-05 10:23:57
Number of dyn worker messages since reload : 393
Timeframe of dyn worker messages           : 2010-09-05 10:13:17 - 2010-09-05 10:24:08
Connectivity check
PEB check                                  : OK
GPB check                                  : OK
nodePdcJob does not exist! It must be created with pdc_setup.sh.

Note: If nodePdcJob does exist, the node_check command
also collects and displays counters.

clear_dns
Description:
The clear_dns CLI command is used to
clean up DNS cache in the SGSN-MME.
Usage:
clear_dns [-h]

-h  The option -h (help) displays extensive
command information.
Example: clear_dns

Note:
Only root users, and users with the security
management role SysAdmRole, can run the clear_dns
command.

list2get
Description: The list2get CLI command takes the output from
OBM list commands and runs the corresponding get
command, if there is one, for each element in the list.
Otherwise, it just prints the result of the list command.
Usage:
list2get [-h]
-h  The option -h (help) displays extensive command
information.

Examples:
Take the output from OBM list commands and run the
corresponding get command:
list_ip_interface | list2get
You can also filter a subset:
list_ip_interface | grep ETH_2_14_1 | list2get

getAll_ip_if
Description: The getAll_ip_if command shows the traffic state,
the speed, and the errors for each IP interface. This
command monitors the state via a PM job.
Usage:
getAll_ip_if [-i <interval>] [-f <filter>] [-h] [-u]
-h  HELP. Displays extensive command information.
-i  INTERVAL. Specifies the interval between two consecutive
executions. The interval is specified in seconds.
-f  FILTER. Shows particular Interface_Name, Net, Eq or IP addresses.
Use commas for multiple filters.
-u  UPDATE. Updates the PM job.



Note:
The PM job is created if it does not already exist.
Do not execute this command again within 60 seconds; the
result will be invalid.

getAll_ip_if ctd.
Example:
Show the traffic state of the Gom and Gn interfaces,
with an interval of 60 seconds:
getAll_ip_if -f Gom,Gn -i 60

pm_job_monitor
Description: The pm_job_monitor CLI command monitors a specified
type of counter and prints the values at a given interval. This command
monitors the state via a PM job.
Usage:
pm_job_monitor <owner>|-type <owner> [-i <interval>] [-fi <index1,index2,...>] [-fc <counter1,counter2,...>] [-u] [-t]

-type  OWNER. Bundles more than one type if their names match the
criterion. For example, atm will bundle atmal5, atml, atmpl, and atmtcl. Use
the value NOA for counters with no owner.
-i  INTERVAL. Specifies the interval between two consecutive executions. The
interval is specified in seconds.
-fi  FILTER INDEX. Shows the indexes that match the criterion (and their
counters).
-fc  FILTER COUNTERS. Shows the counters that match the criterion (and their
indexes).
-u  UPDATE. Updates the PM job. Update is required to discover new indexes.
-t  TRANSPOSE. Inverts the table from Counter/Index to Index/Counter.
Where type = {ss7 pm ospf ipsec ip if icmp gre filter eth bgp atmtcl atmpl
atml atmal5 SYS. SMS. SM. SEC. QoS. MM. ISYSC. IRATHO. HHO. CAM. NOA}

Note:

PM job is created if it does not already exist.

pm_job_monitor ctd.
Example:
Monitor the gsnCpuUsage counter, indexes 1.19 and 1.20,
with an interval of 10 seconds:
pm_job_monitor SYS -i 10 -fi 1.19,1.20 -fc gsnCpuUsage

dump_dns
Description:
The dump_dns CLI command is used
to dump the DNS cache on the active NCB. The result
is saved into the following file:
/tmp/DPE_ROOT/SiteSpecificData/ApplicationSpecific/dnsApp/named_dump.db

Usage:

dump_dns [-h]
-h  HELP. Displays extensive command information.

Note:
Only root users and users with the SysAdmRole security
management role can execute this command.

dump_dns ctd.
Example:
Run the dump_dns command:
dump_dns

Output:
Done, the result is put into:
/tmp/DPE_ROOT/SiteSpecificData/ApplicationSpecific/dnsApp/named_dump.db

node_up
Name:
node_up
Description:
This command prints ISP summary
information for a specified period of time.
If no parameters are specified, it provides
the ISP summary since the last node startup.
Output:
To screen or redirect to file.
Usage:
node_up [-h] [-d {all|from_date [to_date]}]
-h option for help
-d option for user to specify dates

Example output of node_up


=== root@eqm01s14p2 GPB ~ # node_up
2010-08-17 08:36:30 UTC+0200;os_startup;;time_not_synched, eqm01s14p2;CXS10127/4_R20C15(8-00-00)
2010-08-17 08:36:36 UTC+0200;aea;ncl;set AEA;CXS10127/4_R20C15(8-00-00)
2010-08-17 08:37:44 UTC+0200;StartUpAfter_initial_start;ncl;cxp9011380_1r20c15_0_0_All;CXS10127/4_R20C15(8-00-00)
2010-08-17 08:37:45 UTC+0200;StartUpAfter_node_restart;;;CXS10127/4_R20C15(8-00-00)
2010-08-17 08:38:40 UTC+0200;backup_ncb;ncl;1.19;CXS10127/4_R20C15(8-00-00)
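
The node_up output is a semicolon-separated record per line, which makes it easy to post-process offline. The following is a minimal sketch, not part of the toolbox: the field names (event, source, info, release) are assumptions inferred from the sample output above, and each record is assumed to sit on a single line.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class IspEvent:
    timestamp: datetime
    event: str      # e.g. os_startup, backup_ncb (field name is an assumption)
    source: str     # e.g. ncl, may be empty
    info: str       # free-text detail, may be empty
    release: str    # software release string


def parse_isp_line(line: str) -> IspEvent:
    """Parse one node_up line: '<date> <time> UTC+hhmm;event;source;info;release'."""
    stamp, event, source, info, release = line.strip().split(";", 4)
    date_part, time_part, zone = stamp.split()
    # zone looks like 'UTC+0200' or 'UTC-0600'
    sign = 1 if zone[3] == "+" else -1
    offset = timedelta(hours=int(zone[4:6]), minutes=int(zone[6:8])) * sign
    ts = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
    return IspEvent(ts.replace(tzinfo=timezone(offset)), event, source, info, release)


if __name__ == "__main__":
    sample = "2010-08-17 08:38:40 UTC+0200;backup_ncb;ncl;1.19;CXS10127/4_R20C15(8-00-00)"
    print(parse_isp_line(sample))
```

Redirecting node_up to a file and feeding it line by line through parse_isp_line gives a list of events that can be filtered by date or event type.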

Example output of node_up -d all


=== root@eqm01s14p2 GPB ~ # node_up -d all
2010-08-16 14:10:45 UTC+0200;os_startup;;time_not_synched, eqm01s14p2;CXS10127/4_R20C15(8-00-00)
2010-08-16 14:10:50 UTC+0200;aea;ncl;set AEA;CXS10127/4_R20C15(8-00-00)
2010-08-16 14:11:41 UTC+0200;StartUpAfter_initial_start;ncl;cxp9011380_1r20c15_0_0;CXS10127/4_R20C15(8-00-00)
2010-08-16 14:11:42 UTC+0200;StartUpAfter_node_restart;;;CXS10127/4_R20C15(8-00-00)
2010-08-16 14:12:43 UTC+0200;backup_ncb;ncl;1.19;CXS10127/4_R20C15(8-00-00)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
2010-08-17 08:36:30 UTC+0200;os_startup;;time_not_synched, eqm01s14p2;CXS10127/4_R20C15(8-00-00)
2010-08-17 08:36:36 UTC+0200;aea;ncl;set AEA;CXS10127/4_R20C15(8-00-00)
2010-08-17 08:37:44 UTC+0200;StartUpAfter_initial_start;ncl;cxp9011380_1r20c15_0_0_All;CXS10127/4_R20C15(8-00-00)
2010-08-17 08:37:45 UTC+0200;StartUpAfter_node_restart;;;CXS10127/4_R20C15(8-00-00)
2010-08-17 08:38:40 UTC+0200;backup_ncb;ncl;1.19;CXS10127/4_R20C15(8-00-00)
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

listSCs
Name: listSCs
Description: Tool to list the software configurations of the
SGSN-MME.

Output: To screen, or redirect to file.


Usage: listSCs

Example output of listSCs


EP_cxr1010225_4r8a06_S70FP00CP07EP607071, 2010-01-30 10:12:47, InstalledCompleted
20071218R7FP00CP07Final, 2010-02-03 16:25:43, CheckpointCompleted
preR8OSPFclean, 2010-02-04 08:22:18, CheckpointCompleted
dallastoolconfig, 2010-02-04 09:48:39, CheckpointCompleted
enableHsdpa, 2010-02-05 11:19:21, CheckpointCompleted
JapanImsiOk, 2010-02-07 14:52:58, CheckpointCompleted
afterCraChange, 2010-02-08 07:19:41, CheckpointCompleted
afterAddingJapanImsin, 2010-02-09 08:45:26, CheckpointCompleted
afterJapanMccChange, 2010-03-11 13:34:27, CheckpointCompleted
afterPlmnChange, 2010-03-15 17:04:51, CheckpointCompleted
cxp9011380_1r20k27_0_0, 2010-04-30 13:12:47, Obsolete
cxp9011380_1r20k27_0_0_merged, 2010-04-30 13:13:08, InstalledCompleted
300408R8postUpgrade, 2010-04-30 15:12:35, CheckpointCompleted
R8License, 2010-05-18 14:20:35, CheckpointCompleted
20100718SC, 2010-07-16 09:24:15, CheckpointCompleted
20100718SC1, 2010-07-16 10:10:14, CheckpointCompleted
20100718SC2, 2010-07-16 11:47:23, CheckpointCompleted (Permanent)
20100718SC3, 2010-07-16 11:51:01, CheckpointCompleted
PreR8FP01CP01, 2010-07-21 08:53:47, CheckpointActive (Next,LastActivated,LastBooted)
cxp9011380_1r21b06_0_0, 2010-07-21 14:09:02, Obsolete
cxp9011380_1r21b06_0_0_merged, 2010-07-21 14:09:25, InstalledCompleted
Gb, 2010-07-21 15:13:12, CheckpointCompleted
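
When the list of software configurations is long, the active checkpoint can be picked out with a small script. This is only a sketch under assumptions: it treats each entry as one line of the form "name, timestamp, state (flags)", as seen in the sample output, and the dictionary keys are illustrative names, not part of listSCs.

```python
import re

# name, timestamp, state, optional "(flag1,flag2,...)"
ENTRY = re.compile(
    r"^(?P<name>[^,]+),\s*(?P<time>[\d-]+ [\d:]+),\s*(?P<state>\w+)"
    r"\s*(?:\((?P<flags>[^)]*)\))?\s*$"
)


def parse_sc(line: str):
    """Return a dict for one listSCs entry, or None if the line does not match."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    flags = [f.strip() for f in (m.group("flags") or "").split(",") if f.strip()]
    return {
        "name": m.group("name").strip(),
        "time": m.group("time"),
        "state": m.group("state"),
        "flags": flags,
    }


def active_sc(lines):
    """Return the first entry whose state is CheckpointActive, if any."""
    for line in lines:
        entry = parse_sc(line)
        if entry and entry["state"] == "CheckpointActive":
            return entry
    return None


if __name__ == "__main__":
    output = [
        "20100718SC3, 2010-07-16 11:51:01, CheckpointCompleted",
        "PreR8FP01CP01, 2010-07-21 08:53:47, CheckpointActive (Next,LastActivated,LastBooted)",
        "Gb, 2010-07-21 15:13:12, CheckpointCompleted",
    ]
    print(active_sc(output))
```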

tv_ansi, tv_itu
Name:
tv_ansi, tv_itu
Description:
Tool to decode trace messages of the
SS7 stack into a human readable format.
Output:
To screen, or redirect to a file.
Usage:
tv_ansi [-options] <tracefilename>
Example:
tv_ansi /tmp/DPE_LOG/ss7trace.log

deasn9
Name:
deasn9
Description:
Tool to decode CDR files into human
readable format. For more information, see the user's
guide:
/tmp/DPE_SC/LoadUnits/ttx/lib/deasn9_user_guide.txt

Output:
To screen, or redirect to file
Usage:
deasn9 [-d] [-b] [-a appname] <input_filename>
Example: deasn9 -b chsLog.99

deasn9 -b /charging/chsLog/ready/chsLog.3 > chsLog3.decoded

Example output of deasn9


sgsnPDPRecord
recordType                  18'D
servedIMSI                  311030675001101F'TBCD
servedMSISDN                1912146751101F'TBCD
sgsnAddress
  iPBinV4Address            AC146941'H
chargingID                  511000024'D
ggsnAddressUsed
  iPBinV4Address            0A002E02'H
apnSelectionMode            1'D
pdpType                     F121'H
servedPDPAddress
  iPAddress
    iPBinV4Address          C0A8FD86'H
chargingCharacteristics     0000'H
chChSelectionMode           1'D
dynamicAddressFlag          1'D
msNetworkCapability         E5C0'H
nodeID                      "SGSN7"'S

Example output of deasn9 (contd)


accessPointNameNI           "^ipmm2"'S
accessPointNameOI           "^mnc020^mcc440^gprs"'S
recordOpeningTime           0902171424032D0600'H
duration                    3600'D
causeForRecClosing          17'D
listOfTrafficVolumes
  changeOfCharCondition
    qosRequested            0003001F0000000000000000'H
    qosNegotiated           011B511F7396405674731040'H
    dataVolumeGPRSUplink    1525'D
    dataVolumeGPRSDownlink  591'D
    changeCondition         2'D
    changeTime              0902171524032D0600'H
routingArea                 01'H
locationAreaCode            0259'H
cellIdentifier              0001'H
recordSequenceNumber        1'D
localSequenceNumber         2616'D
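
Several values in the decoded CDR are still hexadecimal ('H) and need one more step to be readable. The sketch below converts iPBinV4Address values to dotted notation and decodes the time fields. The timestamp layout (BCD YYMMDDhhmmss, an ASCII sign octet, then BCD hhmm offset) is an assumption based on the common 3GPP charging timestamp encoding and should be checked against the deasn9 user guide; the function names are illustrative.

```python
import ipaddress
from datetime import datetime, timedelta, timezone


def hex_to_ipv4(value: str) -> str:
    """Convert an iPBinV4Address value such as AC146941'H to dotted notation."""
    digits = value.split("'")[0]
    return str(ipaddress.IPv4Address(bytes.fromhex(digits)))


def decode_timestamp(value: str) -> datetime:
    """Decode a 9-octet time value such as 0902171424032D0600'H.

    Assumed layout: YY MM DD hh mm ss (BCD), sign ('+' or '-' in ASCII),
    hh mm of the UTC offset (BCD).
    """
    digits = value.split("'")[0]
    raw = bytes.fromhex(digits)
    yy, mo, dd, hh, mi, ss = (int(digits[i:i + 2]) for i in range(0, 12, 2))
    sign = 1 if chr(raw[6]) == "+" else -1
    offset = timedelta(hours=int(digits[14:16]), minutes=int(digits[16:18])) * sign
    return datetime(2000 + yy, mo, dd, hh, mi, ss, tzinfo=timezone(offset))


if __name__ == "__main__":
    print(hex_to_ipv4("AC146941'H"))
    opened = decode_timestamp("0902171424032D0600'H")
    changed = decode_timestamp("0902171524032D0600'H")
    print(opened.isoformat(), (changed - opened).total_seconds())
```

Under this assumed layout, the difference between changeTime and recordOpeningTime in the sample is 3600 seconds, which matches the duration field.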

re_activate_pdp.pl
Description:
The re_activate_pdp.pl script enables reactivation of PDP
contexts for the IMSIs in the specified file. In the file, one IMSI
number is defined per line.
The GGSN IP address can optionally be specified to reactivate only
the PDP contexts that are connected to that GGSN. The GGSN
address is the GGSN address in use for signaling, that is, the
address provided by the GGSN to the SGSN-MME at PDP context
setup.
Usage: re_activate_pdp.pl -h or
re_activate_pdp.pl -f ImsiFile [-g GgsnAddress]
-h  Prints help information.
-g GgsnAddress  Variable GgsnAddress specifies the GGSN address.
-f ImsiFile  Variable ImsiFile specifies the name of the file containing the IMSI
numbers.
NOTE:
The toolbox script for reactivation of PDP contexts is only allowed to start if the
Node Controller Board (NCB) Central Processing Unit (CPU) load is below 40%.

re_activate_pdp.pl ctd
Example:
Reactivate the PDP contexts for the IMSIs included in
ImsiFile.txt:
re_activate_pdp.pl -f ImsiFile.txt
Reactivate the PDP contexts for the IMSIs included in
ImsiFile.txt that are connected to the GGSN with the IP
address 123.123.123.123:
re_activate_pdp.pl -f ImsiFile.txt -g 123.123.123.123

fdump
Name: fdump
Description: Tool to force the creation of a nodedump
on the SGSN-MME.

Output: /tmp/DPE_COMMONLOG/NodeDump
Usage: fdump

NodeDump
NodeDump is an archive of log files.
NodeDumps are stored in /tmp/DPE_COMMONLOG/NodeDump/
E.g. /tmp/DPE_COMMONLOG/NodeDump/NodeDump-200903111432.tar.gz
NodeDumps are created at e.g. PM-failures, Small Restarts,
Large Restarts.
A NodeDump can be manually created by running ndump or
fdump.
Unpack with: gzcat NodeDump-20090311-1432.tar.gz | tar xf -
inflateND.sh can be used to decode the ringbufs. The script
uses the proper TZ to get the right time when decoding.
/vobs/gsn/product/test/system_test/scripts/bin/inflateND.sh
cd NodeDump-20090311-1432/PM
inflateND.sh

NodeDump Content
drwxrwxr-x  8 ervhatr users 4096 Mar 11 20:33 App
drwxrwxr-x  5 ervhatr users 4096 Mar 12 13:03 NCB
drwxrwxr-x 48 ervhatr users 4096 Mar 12 11:12 PM
-rw-rw-r--  1 ervhatr users   47 Mar 11 20:32 reason.txt

The App directory contains data from the subsystems Link
and Routing.
The NCB directory contains various files from the active
NCB.
The PM directory contains ringbufs from all processors.
reason.txt contains the triggering reason, e.g. Manual
"forcedump".
NodeDumps from MkVI also contain an FSB directory,
which holds message files, debug files and ringbufs
from the FSBs.

Nodedump > App Directory


drwxrwxr-x 2 ervhatr users 4096 Mar 11 20:33 tsApp
drwxrwxr-x 2 ervhatr users 4096 Mar 11 20:33 ipsecApp
drwxrwxr-x 2 ervhatr users 8192 Mar 11 20:33 routApp
drwxrwxr-x 2 ervhatr users 4096 Mar 11 20:33 Link
drwxrwxr-x 2 ervhatr users 4096 Mar 11 20:33 filterApp
drwxrwxr-x 2 ervhatr users 4096 Mar 11 20:33 dnsApp

The files info.<hostnames>.txt in the routApp directory contain a lot of important
information: mRouteShow, arpShow, feVpnStatShow, etc.
This information helps to find, for example, missing routes.

The named_dump.db is included in the NodeDump:
dnsApp/named_dump.db

Broking Index (bi) tool

The Broking Information (bi) tool gives details about
internal index-based broking for APs and DPs.

Internal ttx command.

Tool syntax for AP indices:


bi -ap [indices|replicas|details index[,index2,...,indexn] |
dist {all|ap[,ap2,...,apn]}|check]

Tool syntax for DP indices:


bi -dp [indices|nstored {all|dp[,dp2,...dpn]}|details
index[,index2,...,indexn]|check]

Command Options for bi -ap

Syntax
bi -ap [indices|replicas|details index[,index2,...,indexn] |
dist {all|ap[,ap2,...,apn]}|check]
Options:
indices: prints the AP index distribution
replicas: prints the replica distribution
dist ap: prints the replica distribution for AP <ap>
details index: prints details for given index/indices
check: checks if the indices are evenly distributed
and each index has a replica.

Example Output for bi -ap


=== root@eqm01s14p2 ANCB ~ # bi -ap indices
Broking Information - Index Distribution
--------------------------------------------------------------------------------
EquipmentID   No Indices   %       Indices
--------------------------------------------------------------------------------
1.15.2.1      32           6.25%   5 12 25 64 65 72 84 90 109 127 165 185
                                   200 212 244 301 302 323 332 389 391 392 435 448
                                   472 473 485 489 497 507
1.12.2.1      32           6.25%   9 15 23 37 53 59 66 76 124 133 134 155
                                   176 196 210 214 241 252 327 328 354 355 358 397
                                   427 441 443 450 460 487
...

=== root@eqm01s14p2 ANCB ~ # bi -ap details 2
Broking Information - Details
--------------------------------------------------
Index   AP         Replica
--------------------------------------------------
2       1.13.2.1   1.11.2.1

Command Options for bi -dp


Syntax
bi -dp [indices|nstored {all|dp[,dp2,...dpn]}|details
index[,index2,...,indexn]|check]

Options:
indices: prints the DP index distribution
nstored DP: lists the Cids for not yet stored
connections for specified DP
details index: prints details for given index/indices
check: checks if the indices are evenly distributed

Example Output for bi -dp


=== root@eqm01s14p2 ANCB ~ # bi -dp nstored 2.2.2.1
Broking Information - Not Stored Connections
--------------------------------------------------------------
EquipmentID   Cids
--------------------------------------------------------------
2.2.2.1       34232
...

=== root@eqm01s14p2 ANCB ~ # bi -dp check
Broking Information - Check Indices
ID (DP)       No Indices   Diff(Avg:42)   Distribution
--------------------------------------------------------------------------------
2.10.2.1      42           +0             OK
2.13.2.1      42           +0             OK
2.16.2.1      42           +0             OK
2.17.2.1      42           +0             OK
...
Broking Information - Not Stored Connections
--------------------------------------------------------------
EquipmentID   No Cids
--------------------------------------------------------------
2.10.2.1      0
2.13.2.1      0
...

getPatchStatus
Name:
getPatchStatus
Description:
Tool to retrieve the status of the patches
on the SGSN-MME.
Output:
To screen, or redirect to file.
Usage:
getPatchStatus [-l] [-scp] [-cp] [-f <file>] [-a] [-SC]
Example:
getPatchStatus -l

show_tables.pl
Name: show_tables.pl
Description: Tool to display internal system tables
possibly useful for troubleshooting

Output: Where specified in the -f parameter.
Usage:
show_tables.pl -f /tmp/showtab.log

SGSN-MME
Troubleshooting
Restart Levels

Objectives
Upon the completion of this chapter, the student will be able to:
List and interpret the different restart levels
Explain and react to the escalation procedures on the SGSN-MME
Explain and manage the different HW and SW recovery
functions of the SGSN-MME
Describe Session Resilience

Restart Hierarchy
The recovery function in the SGSN-MME is
implemented as a hierarchy of restart levels.
A failure triggers the lowest probable level that can
resolve the problem.
If a restart level is unsuccessful at resolving a fault,
the restart level is escalated.
All restart levels (except connection restart) trigger
an alarm or event.
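
The escalation rule can be pictured as a simple ladder. The sketch below is illustrative only: the order of levels is taken from the escalation chains listed in the Restart Summary of this chapter, and the function name is hypothetical.

```python
# Escalation ladder, lowest level first (order as given in the Restart Summary).
ESCALATION_LADDER = [
    "Small Local Restart",
    "Small Restart",
    "Large Restart",
    "Node Restart",
]


def escalate(level: str):
    """Return the next restart level after an unsuccessful restart, or None at the top."""
    position = ESCALATION_LADDER.index(level)
    if position + 1 < len(ESCALATION_LADDER):
        return ESCALATION_LADDER[position + 1]
    return None


if __name__ == "__main__":
    level = "Small Local Restart"
    while level is not None:
        print(level)
        level = escalate(level)
```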

SGSN-MME Main Restart Levels Overview


Subscribers performing control signaling (e.g. to HLR) are said to be in an
unstable state and during any restart will be detached
Type | Connection Recovery | Software Reloaded | Payload Cut Off | Time taken | Network Signalling
AP Takeover | Yes | No | No | 15-30 secs for single subscriber | N/A
DP Takeover | Yes (except for unstable subs) | No | No | 3-6 secs for single subscriber | N/A
Connection Restart | No | No | Yes | One single subscriber removed | Delete PDP Context Request sent to GGSN
Small Local Restart | Yes | No | No | <10 secs | 1 AP PIU restarted, unstable subscribers removed (payload also)
Small Restart | Yes | No | No | <30 secs | All APs restarted, unstable subscribers removed (payload also)
Large Restart | No | No | Yes | <60 secs | All APs restarted, ALL subscribers removed (payload also)
Node Restart | No | No | Yes | 7.5 mins | All APs and DPs rebooted, pre-defined SC loaded

Restart levels with manual invocation

AP/DP takeover is triggered manually with the CLI command gsh block_eq.

NCB Failover (1/2)


If an NCB failover is triggered, the passive NCB
becomes active. Note that a manual restart of the
SGSN-MME might also swap the roles of the active and
passive NCB.
An NCB failover triggers a small restart.

NCB Failover (2/2)


The NCB failover is triggered by:
A hardware fault on the active NCB, which will obstruct
communication between the passive and the active NCBs.
A hardware fault in the redundant Ethernet backplane,
which will also obstruct communication between the
passive and the active NCBs.
The Node Control Logic (NCL) on the passive NCB detects a
failure of the NCL on the active NCB and triggers an NCB
failover within 30 seconds.
A failure of the Equipment Management Agent (EQMA) on
the active NCB triggers an NCB failover after 13 seconds.
The former active NCB reboots after 10-15 seconds.

FSB Failover
When the master FSB fails, file operations on the HANFS file systems are held for a few seconds and then
continue. No data is lost in an FSB failover.
The boot service running on the failing master FSB is
started on the standby FSB as it assumes the role of
master FSB. This may cause booting PIUs to retry boot
operations before succeeding.
Note: This section is only applicable for SGSN-MMEs with
standalone FSBs (MkV hardware and later). On MkIV, the file
and boot services are run on the NCB.

Hardware Recovery Overview


If a hardware fault occurs on a PIU, the SGSN-MME has
procedures to hand off the failing PIU's processing to another PIU:
Subscriber sessions may be preserved with a takeover procedure
(AP takeover or DP takeover).
For hardware that supports failovers (e.g. NCB or FSB), the services of
the active PIU are taken over by the passive (standby) PIU.

After takeover/failover, the failing PIU is rebooted with a PM
restart. The following results can occur:
The processor is restarted successfully and brought back into
service.
The processor cannot be restarted successfully, and the PIU is
blocked.
The processor is restarted successfully, but fails again. Another PM
restart is triggered. This type of repeated restart results in
escalation.

PM Restart
A PM restart is the restart of a Processing Module (CPU) on a
PIU.
Can be applied to either AP or DP.
Causes reboot of the PIU at the operating system level.
Is interpreted by the SGSN-MME control system as a hardware
loss of the PIU.
The PIU is returned to service if it restarts successfully. The PIU
is blocked if it is unable to restart successfully.
Six PM failures on the same payload DP within a given time
interval will block the PIU.
Twelve PM failures on any combination of payload DPs within a
given time interval escalates to a large restart.
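
The two thresholds above amount to a sliding-window count of PM failures. The following sketch illustrates that logic only; it is not the node's implementation. The window length is not stated here, so it is a parameter, and the class name, the return labels and the precedence of the large restart over blocking are assumptions.

```python
from collections import deque


class PmFailureTracker:
    """Count PM failures on payload DPs inside a sliding time window."""

    def __init__(self, window_seconds: float):
        self.window = window_seconds     # the "given time interval" (value assumed by caller)
        self.failures = deque()          # (timestamp, dp_id), oldest first

    def record(self, now: float, dp_id: str) -> str:
        """Register a PM failure and return the resulting action."""
        self.failures.append((now, dp_id))
        # Drop failures that have fallen out of the window.
        while self.failures and now - self.failures[0][0] > self.window:
            self.failures.popleft()
        same_dp = sum(1 for _, dp in self.failures if dp == dp_id)
        if len(self.failures) >= 12:     # twelve on any combination of payload DPs
            return "large_restart"
        if same_dp >= 6:                 # six on the same payload DP
            return "block_piu"
        return "pm_restart"


if __name__ == "__main__":
    tracker = PmFailureTracker(window_seconds=3600)
    for second in range(6):
        print(second, tracker.record(second, "2.2.2.1"))
```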

Hardware Recovery: APs


If a hardware fault occurs on the active NCB, an NCB failover is
performed.
NCB Failover is described in the SW recovery section of this module.
A small restart is performed as part of NCB failover.

If a hardware fault occurs on the passive NCB, a small local
restart is performed on the board.
If a hardware fault occurs on an AP that is not an NCB, an AP
takeover is triggered, followed by a PM restart.

Hardware Recovery: DPs


If a hardware fault occurs on a DP handling
payload, a DP takeover is triggered, followed by a PM
restart.
If a hardware fault occurs on a DP that acts as an IP
router or has SS7 Front and Back Ends, it continues to
PM restart until it starts successfully or is removed and
replaced.

Frame Relay Stack and Devices


Failure on a processor handling FR stack and devices results
in PM Restart.
The BVC devices on the processor redirect traffic from the
failing FR device to other FR devices according to the
configuration of redundant NS-VC for each NSE.
Connections and PDP contexts are maintained, but some
packets may be lost before redirection is complete.
Signaling traffic for connections handled by other devices
on the restarting PM is blocked during recovery. (Other
devices are MS, GTU, Charging.)
After successful restart, the NS-VCs are reset to initialize
them for normal traffic.

Payload IP Stack, GTU, MS, BVC, Charging


Failure on a processor handling Payload IP stack, GTU
device, MS device, BVC Device, or a Charging Device results
in DP takeover followed by a PM restart.
DP takeover moves stable connections whose software
components (MS or GTU) are located on the failing processor
to another DP.
Unstable connections whose software components (MS or
GTU) are located on the failing processor are removed.
Signaling traffic on the restarting PM is blocked during
recovery.

IP Routing
If an error occurs on a processor handling routing,
the payload devices will send the IP packets via
the remaining router PIUs.
Some packets will be lost before the payload IP
stacks are notified of the restart of the processor.
The router PIU will be automatically used for
traffic when it is available.

SS7 Device, Front End, Back End, NMM


Failure on a processor handling an SS7 back end and device
results in a PM restart. During the time the PM restart is in
progress, the SS7 capacity of the SGSN-MME is reduced.
If the Network Management Module (NMM) fails, it is
restarted on another PIU. All SS7 traffic may be briefly
interrupted during the last part of the restart period when
links to the SGSN-MME can be disconnected due to timer
expirations in remote nodes. Otherwise NMM failure does
not affect SS7 traffic capacity.
Failure on a processor handling SS7 front ends results in a PM
restart. Some packets may be lost before the traffic is
redirected to other IBxx PIUs or the processor recovers.

Overload Protection
The Overload Protection (OLP) mechanisms for
averting uncontrolled packet loss in overload
situations are as follows:
GPRS Mobility Management OLP with Prioritization of
Payload Users (GSM Only)
SS7 OLP for Outgoing Traffic
Application Processor OLP
Controlled Opening of the Gb Interface (GSM Only)
Generic Device Processor OLP
Ethernet Backplane OLP

OLP handling for AP


[Figure: AP1 to AP4, each reporting its OLP status (OLP OK)]

OLP constantly measures and keeps track of CPU resources
(load and memory) on the APs.
When AP takeover is requested, the OLP information is used to
make the decision to accept or reject the AP takeover. This is to
avoid overload of the remaining APs in the node.

Session Resilience Mechanisms


[Figure: Control/SAU boards 1 to 3 hold subscriber contexts (cxt 1 to cxt 3) and replicas (repl 1 to repl 3); payload boards 1 and 2 hold context info 1 and 2. Flows shown: "Demand context info" and "Download context info".]

Control board:
Continuous subscriber (context info) replication between control
boards. All boards have the same CPU load.
Payload:
At loss of a payload processor, context info is downloaded from the
control board to the remaining payload boards, which continue
the session.

No or little session interruption if boards are blocked or taken
out of service.

No or little SAU loss if boards are blocked or taken out of
service.

Automatic recovery and load balancing after de-block

Subscriber Handling with Resilience


AP indices and DP indices are groups of subscriber
connections.
A subscriber is handled by one AP index and one DP index.
The association of AP and DP indices to AP and DP PIUs is
stored in the Broking Index Table (BIT).
Per-subscriber entities, such as CID, TLLI, PTMSI and TEIDs, have an AP and a DP index encoded to enable node internal
stateless routing of signals. (This enables quick mapping of a
subscriber through the BIT to the specific AP or DP PIU that is
serving them.)
Since every AP index has an original instance and a replica
instance, every subscriber's connection has a
replica in the system which can be activated in case of an AP
loss.

Session Resilience: Before AP Takeover


[Figure: Index Broking and MS Replication. AP1 to AP4 hold the originals O1 to O3 and the replicas R1 to R3.]

O (Original): original connection data.
R (Replica): replicated connection data on a replica AP (next AP).

Session Resilience - AP Failure


[Figure: AP1 to AP4 after the failure of AP3; the replicas on the remaining APs have become originals O1 to O3.]

When a failure occurs on AP3, the Broking Index Table is updated to
convert replicas on other APs into original connection data.
The BIT is updated one index at a time during takeover to keep
end-user impact low.

Session Resilience - AP Takeover


[Figure: AP1 to AP4 after takeover; originals O1 to O3 with newly created replicas R1 to R3 on the remaining APs.]

New replicas are created for all the newly converted (activated)
originals.
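
The AP takeover sequence (promote replicas, then create new replicas) can be sketched on a toy Broking Index Table. This is an illustration under assumptions, not the node's algorithm: the table layout and function name are hypothetical, and placing the new replica on the next surviving AP follows the "Replica AP (next AP)" note in the figure legend.

```python
def ap_takeover(bit: dict, failed_ap: str, aps: list) -> dict:
    """Update a toy BIT {index: {"original": ap, "replica": ap}} after an AP loss."""
    survivors = [ap for ap in aps if ap != failed_ap]
    for index in sorted(bit):                  # one index at a time
        entry = bit[index]
        if entry["original"] == failed_ap:
            # Convert the replica into the original connection data.
            entry["original"] = entry["replica"]
            entry["replica"] = None
        elif entry["replica"] == failed_ap:
            # The original survives but its replica is lost.
            entry["replica"] = None
        if entry["replica"] is None:
            # Create a new replica on the next surviving AP (assumed placement rule).
            pos = survivors.index(entry["original"])
            entry["replica"] = survivors[(pos + 1) % len(survivors)]
    return bit


if __name__ == "__main__":
    aps = ["AP1", "AP2", "AP3", "AP4"]
    bit = {
        1: {"original": "AP1", "replica": "AP2"},
        2: {"original": "AP2", "replica": "AP3"},
        3: {"original": "AP3", "replica": "AP4"},
    }
    for index, entry in ap_takeover(bit, "AP3", aps).items():
        print(index, entry)
```

With a single surviving AP the sketch would place the replica on the same AP as the original; a real system would have to handle that case differently.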

Session Resilience: Before DP Takeover


[Figure: AP1 to AP4 hold the MS, PDP, SMS and Charging contexts; DP1 to DP4 and the Router handle payload.]

CM restart is replaced by a DP takeover procedure.

DP Takeover - PM Failure Detected


[Figure: AP1 to AP4 hold the MS, PDP, SMS and Charging contexts; a PM failure occurs on one payload DP among DP1 to DP4; Router.]

CM restart is replaced by a DP takeover procedure.

DP Takeover - Update Routing and BIT


[Figure: AP1 to AP4 hold the MS, PDP, SMS and Charging contexts; PM failure on a payload DP among DP1 to DP4; Router.]

CM restart is replaced by a DP takeover procedure.
Broking and routing are updated.

DP Takeover - DP Requests Contexts


[Figure: AP1 to AP4 hold the MS, PDP, SMS and Charging contexts; a remaining DP among DP1 to DP4 sends a request for context to the AP; Router.]

CM restart is replaced by a DP takeover procedure.

SGSN-MME DP Context not found


[Figure: AP1 to AP4 hold the MS, PDP, SMS and Charging contexts; the contexts are downloaded on demand to the remaining DPs among DP1 to DP4; Router.]

CM restart is replaced by a DP takeover procedure.

Restart Summary
Failure Type | PIU | Result | Escalation
HW | Active NCB | NCB Failover + Small Restart | Small -> Large -> Node Restart
HW | Passive NCB | Small Local Restart on NCB | Small Local -> Small -> Large -> Node Restart
HW | AP that is not an NCB | AP Takeover + PM Restart | Auto Blocking of PIU
HW | DP with payload | DP Takeover + PM Restart | Auto Blocking of PIU
HW | DP with SS7 FE, SS7 BE, IP Router | PM Restart | No escalation. Continues PM restarts until card is replaced.
SW | Active NCB | NCB Failover + Small Restart | Small -> Large -> Node Restart
SW | DPs with payload | DP Takeover + PM Restart | Auto Blocking of PIU
SW | DPs with SS7 FE, SS7 BE, IP Router | PM Restart | None. Continues PM restarts until card is replaced.
Multiple PM restarts | More than 12 PM restarts total on any combination of cards in the SGSN-MME | Large Restart | Large -> Node Restart

SGSN-MME
Troubleshooting
Support Information

Objectives
Upon the completion of this chapter, the student will
be able to:
Explain the fault handling and CSR escalation
Determine if the fault is related to a configuration or
software error
Isolate the fault
Correct the fault if it is a configuration error
Write a CSR, which contains all needed information
for the next support level

General (1/2)
A Customer Service Request (CSR) is a request to get a
solution for a problem or question.
There are three different CSR types:
Consultation, which is a question or request from the customer.
The solution is an answer to the question
Problem, which is the default value of a CSR. The solution is
either a remedy or a restoration
Internal, which is an internally found problem or consultation that
needs to be logged as a CSR. The internal type must be used
when issuing Emergency Correction (EC) requests

General (2/2)
A Trouble Report (TR) is written if the analysis of the
CSR shows that there is a fault in the product or the
documentation. TRs should only be written by second
line support.
An EC request is a CSR, which is written to request an
EC for an already existing TR.

General CSR Rules


The following rules apply for every CSR:
All requests and questions escalated from second line
organizations must be sent as CSRs with Service Management
System (SMS)
Always report only one problem/consultation per CSR
Standard slogans must always be used
All CSRs must contain a link to a Primus solution or solution frame
When referring to any document, document number and chapter
must be mentioned.

CSR Content (1/2)


Every CSR must contain the following information:
Business Partner
Customer
Contact person
Installed Base
Network = GSM/WCDMA/LTE
Node = SGSN-MME/CGSN/GGSN-MPG/CPG
Site = Name of the node/site
CSR Slogan

CSR Content (2/2)


Every CSR must contain the following information:
Severity
Emergency
High
Medium
Low
CSR Description
Necessary Attachments

CSR Slogan
The Slogan gives the CSR a meaningful name
The use of Standard Slogans enables GSN PLM to
categorize related CSRs
It is necessary to keep the keywords short, since
SMS has a limit of 40 characters
The following Standard slogan has to be used:
<GSN release> <AC-A/CP level> <description>
Example: S80 Possible number of RNC

The <GSN release> input string is:


SGSN2010BW for SGSN-MME 2010B (W)
SGSN2010BG for SGSN-MME 2010B (G)
SGSN2010BDA for SGSN-MME 2010B (DA)
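
A slogan can be assembled and checked against the 40-character SMS limit before the CSR is submitted. The helper below is a hypothetical sketch, not an Ericsson tool; the description text in the example is invented for illustration.

```python
MAX_SLOGAN_LENGTH = 40  # SMS limit stated above


def build_slogan(gsn_release: str, level: str, description: str) -> str:
    """Build '<GSN release> <AC-A/CP level> <description>' and enforce the length limit."""
    slogan = " ".join(part.strip() for part in (gsn_release, level, description) if part.strip())
    if len(slogan) > MAX_SLOGAN_LENGTH:
        raise ValueError(
            f"slogan is {len(slogan)} characters, limit is {MAX_SLOGAN_LENGTH}: {slogan!r}"
        )
    return slogan


if __name__ == "__main__":
    # "Gb link down" is an invented description used only for illustration.
    print(build_slogan("SGSN2010BG", "A02", "Gb link down"))
```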

CSR Slogan for upgrade/update support
The following Standard slogan has to be used:
<GSN release> <SW level> Upgrade/Update support
for <Planned GSN release> <Planned SW level>
Example: SGSN2010BG A02 Update support for A04

CSR Severity Emergency


The following events are classified as Emergency CSRs:
Two or more Node Restarts/Large Restarts on one node within 24
hours
Four or more PM Reboots on one node within 1 hour
12 or more Small Local Restarts/Small Restarts on one node within
24 hours
Disturbance causing more than 30% of service unavailability for
more than 1 hour
Complete loss of O&M (including alarm handling) for more than
four hours
Complete stop of Performance monitoring for more than four hours
Charging disturbance causing charging inaccuracy for more than
2% of CDRs

CSR Severity High (1/2)


The following events are classified as High CSRs:
Any single Node Reload (automatic or escalated)
Faults that require a manual Node Restart to be cleared.
Single automatic Large Restarts
Infrequent faults that require a manual Large Restart to be cleared.
Seven or more automatic PM Reboots within 24 hours.
Nine or more automatic Small Local Restarts/Small Restarts within
24 hours.

CSR Severity High (2/2)


The following events are classified as High CSRs:
Disturbance causing more than 10% of service unavailability for
more than 1 hour.
Complete loss of O&M (including alarm handling) for less than four
hours
Complete stop of Performance monitoring for less than four hours
Charging disturbance causing charging inaccuracy for less than
2% of CDRs for commercial operators.
Mobility Management, Session Management and IP Routing
functions systematically failing for more than 30 % of the
subscribers.

CSR Severity Medium (1/2)


The following events are classified as Medium CSRs:
Fewer than seven automatic PM Reboots within 24 hours.
Fewer than nine automatic Small Local Restarts/Small Restarts
within 24 hours.
Connection abort of single subscribers.
Disturbance causing less than 10% of service unavailability for
more than 1 hour.
Partial disturbance of O&M
Partial disturbance of Fault Management
Incorrect Performance monitoring

CSR Severity Medium (2/2)


The following events are classified as Medium CSRs:
Minor inaccuracy in charging data.
Mobility Management, Session Management (including LI
functionality) and IP Routing functions temporarily not
working, or failing for 5-30 % of the subscribers.
Documentation faults or missing documents that could
result in serious node handling errors and/or node outages.
Node cannot handle requested Load.
Performance problems
Security problems
Any attempt of upgrade or update, which has or will result
in a rollback

CSR Severity Low


The following events are classified as Low CSRs:
All other faults.
RCA - Root Cause Analysis for faulty HW units
Faults that do not cause any downtime or service outage.
Cosmetic faults
Spelling faults, incorrect printouts
Faults resulting from manually provoked negative tests.
Faults in documentation that do not result in serious node
handling errors and/or node outages.
Improvements
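
The restart and unavailability thresholds from the severity slides can be expressed as a simple decision function. This is only a sketch covering the countable criteria; the class `Observation`, its field names, and the function `classify` are hypothetical, and the remaining criteria (O&M loss, charging, documentation and so on) still need manual judgment.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Counters for one node; all field names are illustrative."""
    node_reloads_24h: int = 0
    node_or_large_restarts_24h: int = 0
    pm_reboots_1h: int = 0
    pm_reboots_24h: int = 0
    small_restarts_24h: int = 0
    unavailability_pct: float = 0.0
    unavailability_hours: float = 0.0


def classify(o: Observation) -> str:
    """Return the CSR severity implied by the countable criteria only."""
    long_outage = o.unavailability_hours > 1
    if (o.node_or_large_restarts_24h >= 2
            or o.pm_reboots_1h >= 4
            or o.small_restarts_24h >= 12
            or (long_outage and o.unavailability_pct > 30)):
        return "Emergency"
    if (o.node_reloads_24h >= 1
            or o.node_or_large_restarts_24h == 1
            or o.pm_reboots_24h >= 7
            or o.small_restarts_24h >= 9
            or (long_outage and o.unavailability_pct > 10)):
        return "High"
    # The slides do not say how exactly 10 % is classified; this sketch
    # treats it as Medium.
    if (o.pm_reboots_24h > 0
            or o.small_restarts_24h > 0
            or (long_outage and o.unavailability_pct > 0)):
        return "Medium"
    return "Low"


print(classify(Observation(pm_reboots_24h=7)))  # High
```

The checks run from the highest severity downwards, so an event that meets several criteria is reported at the highest matching level.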

CSR Description Template


CONTACT INFORMATION
Country:
Customer:
FACTS
GSN release:
SW level:
HW platform:
Loaded ECPs:
RECENT CHANGES
SW changes:
HW changes:
O&M procedures:
Environment:

CASE DESCRIPTION
Case description:
Problem frequency:
Problem reproduction:
Problem effects:
Network diagram to illustrate the problem:
Ref. to Alert:
ATTACHMENTS
Type of attachments:
MEASURES
On-site/online support:
Work around:

Request Upgrade/Update Support Template
CONTACT INFORMATION
Country:
Customer:
2nd line technical contact (name and phone):
2nd line management contact (name and phone):
1st line technical contact (name and phone):
1st line management contact (name and phone):

FACTS
Node name:
Node location:
GSN release:
AC-A/CP level before installation:
HW platform (e.g. MK level):
System properties:
Loaded ECPs:

PLANNED CHANGES
Planned GSN release:
Planned AC-A/CP level:
Planned HW platform (e.g. MK level):
System properties:
Planned ECPs:
UPGRADE/UPDATE DETAILS
Planned SW installation date:
Upgrade/update procedure:
Remote login details to node:

Root Cause Analysis Template


CONTACT INFORMATION
Country:
Customer:
FACTS
GSN release:
SW level:
HW platform (e.g. MK level/J20 chassis serial No):
Loaded ECPs:
Board type:
Serial number:
RECENT CHANGES
SW changes:
HW changes:
O&M procedures:
Environment:

CASE DESCRIPTION
Case description:
Problem frequency:
Problem reproduction:
Problem effects:
Network diagram to illustrate the problem:
Ref. to Alert:
ATTACHMENTS
Type of attachments:
Target date for returning the board:

Attachments
All attachments that are necessary to find a solution to
the CSR should be attached under the CSR tab Documents.
For a CSR that is not reporting a problem but a question
or request, the mandatory log files are not required.
Not more than 50 Mb can be attached to SMS.
Log files and information obtained from traces and
further fault analysis should be attached.
The following compressing programs are accepted:
compressed tar - tar.Z
gzip'd tar - tar.gz
win zip - .zip
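
One way to prepare the attachments is to pack the collected files into a gzip'd tar and check the result against the size limit before uploading. The sketch below uses only the Python standard library; it reads the "50 Mb" limit as 50 megabytes, which is an assumption, and the function name `pack_attachments` and the sample file are hypothetical.

```python
import os
import tarfile
import tempfile

# Assumption: "50 Mb" on the slide is read as 50 megabytes.
MAX_ATTACHMENT_BYTES = 50 * 1024 * 1024


def pack_attachments(paths, archive_path):
    """Create a gzip'd tar of the given files and check it against the limit."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            # Store only the file name so the archive has no absolute paths.
            tar.add(path, arcname=os.path.basename(path))
    size = os.path.getsize(archive_path)
    if size > MAX_ATTACHMENT_BYTES:
        raise ValueError(f"{archive_path} is {size} bytes, above the limit")
    return size


# Self-contained demonstration with a dummy log file.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "isp.log")
    with open(sample, "w") as handle:
        handle.write("sample log line\n")
    archive = os.path.join(workdir, "csr_logs.tar.gz")
    print(pack_attachments([sample], archive), "bytes")
```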

The mandatory log files for SGSN-MME 2010B are given
on the next slides.

Attachments for a Node Reload


The following log files have to be attached to a CSR for a
Node Reload:
Log backup covering the time of the fault should be collected.
Located in
export/Core/log/LogBackup
NodeDump (if any) covering the time of the fault
should be collected from the NCB that is active after
the fault occurred. Located in
export/Core/log/< NCB >/NodeDump

Attachments for a CM Restart/PM Reboot

The following log file has to be attached to a CSR for a
CM Restart/PM Reboot:
NodeDump covering the time of the fault should be
collected from the NCB that is active after the fault
occurred. Located in
export/Core/log/< NCB >/NodeDump

Attachments for Backup Problem


Log backup containing all GBS-commands.
Located on GIS server:
/gsn/log/GBS_log

Attachments for other faults (1/4)


The following files have to be attached to a CSR for
any other faults:
dyn_worker_crash_report*[1-5]. Located in
/export/Core/log/eqmMMsPPp2/dyn_worker*[1-5]

fm_event.*. Located in
/tmp/OMS_LOGS/fm_event/~

fm_alarm.*. Located in
/tmp/OMS_LOGS/fm_alarm/~

OMS_SM_Log.*. Located in
/tmp/OMS_LOGS/OMS_SM_Log/~

PDC. Located in
/tmp/DPE_COMMONLOG/PDC/archive/*

~ Stands for directories /ready and /tmp
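
The "~" shorthand can be expanded into the concrete search patterns when collecting the files. The sketch below assumes that "~" is replaced by the subdirectories ready and tmp under the listed directory; the function name `expand_location` is hypothetical.

```python
def expand_location(file_pattern, located_in):
    """Expand a 'Located in' path, replacing a trailing '~' by ready and tmp."""
    if not located_in.endswith("/~"):
        return [f"{located_in.rstrip('/')}/{file_pattern}"]
    base = located_in[:-2]  # drop the trailing "/~"
    return [f"{base}/{sub}/{file_pattern}" for sub in ("ready", "tmp")]


for pattern in expand_location("fm_event.*", "/tmp/OMS_LOGS/fm_event/~"):
    print(pattern)
```

The resulting strings are glob patterns and can be passed to a tool such as `glob.glob` on the node's file system.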

Attachments for other faults (2/4)


Printouts from
/<ttx path >[1]/app_log
/<ttx path >[1]/eqma_log
/<ttx path >[1]/erl_log
/<ttx path >[1]/ncl_log
Collect logs from NCB that is/was active during the time of
the fault.
[1] /tmp/DPE_SC/LoadUnits/ttx/bin/

isp.log. Located in
/export/Core/log

ss7trace.log. Located in
/export/Core/log/eqmMMsPPp2/

Attachments for other faults (3/4)


Monitoring/* Located in:
/export/Core/log/eqmMMsPPp2/monitoring/*
Collect logs from both active and passive NCB

Customer CLI scripts
Any customer CLI scripts running on the node

Any available trace logs
E.g. available Erlang or C trace logs

Any available protocol analyzer logs
Should be readable ASCII and original file, specifying file
type and analyzer model.

Printout from listSCs. Executes command
/tmp/DPE_SC/Scripts/listSoftwareConfigurations

Attachments for other faults (4/4)


Printout from getPatchStatus l and getPatchStatus
cp. Executes command
/tmp/DPE_SC/Scripts/getPatchStatus

Printout from /check_config.pl. Executes command
/tmp/DPE_SC/LoadUnits/ttx/bin/check_config.pl

Printout from gsh check_config and
gsh export_config_active
See Alex CPI documentation for details.
