ZXCTN/PTN Troubleshooting R2.0
Training Materials (Level A)

Bearer Product Support Dept.
Wang Yuankai

Version history
Version 1.0, drafted by Wang Yuankai.
Version 2.0, drafted by Wang Yuankai: added troubleshooting for L3VPN, MLPPP, DCN, and high CPU usage; added packet capture skills and refined the diagnostic procedure in accordance with actual situations.
Version 2.2, drafted by Zhao Keyan: added troubleshooting methods for the ZXCTN 6500; updated troubleshooting of PWE3-CES, TMP-OAM, and L2/L3 bridging; added single-service/multiple-service fault handling instructions; updated troubleshooting of the TDM services on L2/L3 bridging; updated some typical features of the tool set on the EMS; and updated some typical examples.

Related Documentation
ZXCTN MPLS-TP OAM Troubleshooting Manual
ZXCTN L2VPN Troubleshooting Manual
ZXCTN CES Troubleshooting Manual

ZXCTN L3VPN Troubleshooting Manual
ZXCTN Physical Interface and Hardware Troubleshooting Manual
ZXCTN Alarm and Performance Manual
ZXCTN AGENT Feature Manual R1.0
ZXCTN Narrowband NNI Troubleshooting Manual
Course Objectives
At the end of this course, you will be able to:
• Understand and be familiar with common fault location methods;
• Locate services and query alarms and performance through the EMS;
• Consolidate your knowledge of PTN by relating principles and implementation methods to failure phenomena, and gain a better understanding of the principles and equipment.
Requirements for Troubleshooting
• Understanding the following local network operation conditions:
  - Network monitoring approaches on the EMS
  - Service models carried by the network
  - Main network protection methods
  - Network clock synchronization methods
• Understanding and mastering the following skills and knowledge:
  - Equipment structure
  - PTN principles and common protection modes
  - OAM and the alarm signaling flow
  - Monitoring methods between the PTN EMS and NEs
  - Operations on the PTN EMS (to reduce failure location time)
  - Use of common instruments and tools
Basic Habits in Troubleshooting
1. View end-to-end service alarms on the EMS;
2. Use the service manager to analyze alarms of the services traveling a specific path;
3. Analyze problem causes by viewing NNI/UNI-port performance and counts of sent/received packets;
4. Use the fault diagnosis tool and the fault information collection tool to analyze problems;
5. Use common commands to analyze causes of underlying failures;
6. Summarize, sort, and share cases.
Troubleshooting Policies
• Policy 1: Seizing common phenomena
  Sort and analyze common phenomena and summarize the features that the faulty services have in common, such as the sites and ports they traverse.
• Policy 2: Analyzing core factors
  If a large number of services are faulty, remain calm and make a careful end-to-end analysis of several faulty services. Generally, removing some faults helps to uncover a series of failures.
• Policy 3: Combining policy 1 and policy 2 to query end-to-end service-layer paths.
Course Contents
• Chapter 1 Requirements for Troubleshooting Skills
• Chapter 2 General Principles of Troubleshooting
• Chapter 3 Emergency Troubleshooting Methods
• Chapter 4 Processes and Methods for Handling Common Failures
• Chapter 5 Examples of Typical Failures
• Chapter 6 Knowledge Point Chart and Summary
General Principles of Troubleshooting
Start
1. Observe and record the failure phenomena.
2. Collect information related to the failure.
3. Evaluate the failure by experience and analyze it theoretically.
4. List the probable causes: artificial operation, physical fault, or equipment/EMS fault.
5. Check the causes.
6. Is the fault removed? If not, contact technical support personnel and provide solutions.
7. Summarize and analyze this case and upload it to the case database.
End
Artificial Errors Include:
• Configuration errors
• Accidental deletion or alteration of configurations. Analyze the EMS or logs to identify the operator.
• Accidental connection of equipment that sends a large number of broadcast packets, resulting in high CPU usage, disconnection of NEs from the EMS, or even restart of line cards. Identify the line cards with CPU threshold-crossing alarms (filter these alarms by using the alarm customization feature of the EMS), analyze the related information such as the ports with a large number of ARP packets, and use the ter mo command to view the related events.
Physical Problems Include:
• Line card restart or rack restart due to voltage instability
• Physical interruption of optical fibers
• Equipment restart due to poor grounding

EMS/Equipment Failures Include:
• Software running failure or active/standby tunnel switching failure
• Hardware failure of a line card or main control board
• Accidental deletion of services configured in the EMS
Interruption of a Large Number of Services
• Physical fault of optical fibers
• Broadcast storm
  Eliminate the broadcast sources. In the ZXCTN9000 (V32R3) and ZXCTN6000 (V1.10P), broadcast storms caused by L2 switching (L2 SW) being enabled by default are prevented.
• Equipment power off
• Equipment configuration loss (periodic backup of startrun is recommended)
  Locate the faulty node, compare the data, and execute the download.
• Service failure after the main control boards are switched
  Locate the faulty node, then switch boards, or delete the corresponding configurations and configure them again.
Single Service Interruption
• Physical faults of optical fibers, equipment power off
• Service configuration fault
  - For example: PW control word consistency, E1 service sequence number, service driver not loaded, missing or incorrect L2VPN MTU configuration, port MTU configuration, static MAC configuration, ARP configuration, or speed limit configuration
• Equipment configuration loss (periodic backup of startrun is recommended)
  - Locate the faulty node, compare data, and execute the download.
• Service application scenario faults, including:
  - Abnormal PW mode and AC access mode in heterogeneous services, or abnormal access mode selection; some protocol packets in service packets are sent to the CPU, causing service transfer failure
• Single node fault
  - Node switching modules or clock modules are abnormal
Single Service/Multiple Service Interruption Handling Methods
Start
Are single service or multiple services interrupted?
• Multiple services are interrupted:
  1. Check whether multiple optical fibers are broken.
  2. Check whether the following faults are raised in the core nodes: abnormal configuration, power off, CPU surge, key line card fault, clock module fault, or switching module fault.
  3. Check whether a broadcast storm exists.
• Single service is interrupted:
  1. Check service configurations.
  2. Check whether the fault occurs in special scenarios such as protocol packets and heterogeneous service scenarios.
  3. Check whether the single-node switching module and the clock operate properly.
End
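The branch between single-service and multiple-service interruption can be sketched as a small lookup. This is only an illustration of the flow above, assuming the number of interrupted services is already known; the function name interruption_checks is hypothetical and not part of any ZXCTN or EMS tool.

```python
from typing import List

# Check lists copied from the flow above, in the order they are performed.
MULTIPLE_SERVICE_CHECKS = [
    "Check whether multiple optical fibers are broken",
    "Check the core nodes for abnormal configuration, power off, CPU surge, "
    "key line card fault, clock module fault, or switching module fault",
    "Check whether a broadcast storm exists",
]
SINGLE_SERVICE_CHECKS = [
    "Check service configurations",
    "Check for special scenarios such as protocol packets and heterogeneous services",
    "Check whether the single-node switching module and the clock operate properly",
]


def interruption_checks(interrupted_services: int) -> List[str]:
    """Return the ordered checks for the given number of interrupted services."""
    if interrupted_services < 1:
        raise ValueError("expected at least one interrupted service")
    if interrupted_services == 1:
        return list(SINGLE_SERVICE_CHECKS)
    return list(MULTIPLE_SERVICE_CHECKS)


# Example: twelve services are down, so the multiple-service branch applies.
for step, check in enumerate(interruption_checks(12), start=1):
    print(f"{step}. {check}")
```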
Static Service Fault Diagnosis and Location on the EMS, Step 1
1. Locate services in the service manager
2. Use service extension information in the service manager (associated CIP/PW)
3. Query client-layer, service-layer, or local-layer alarms based on services
4. Locate services in the channel view
5. Query MAC learning of VFI services
6. Use the traffic statistics function
7. Fast query local-layer, client-layer, service-layer, history-layer, and pass-through performance
8. Analyze the protocol link failures in history alarms
9. Handle the damaged or abnormal services
10. Apply the fault information collection tool (this version only supports the Windows system)
Static Service Fault Diagnosis and Location on the EMS, Step 2
11. Use the statistics-by-frequency function to quickly analyze a large number of instantaneous alarms raised in the network and locate the equipment by alarm codes.
12. Assess network robustness over a past period by using the history alarms (analyze whether port traffic threshold-crossing alarms, CRC threshold-crossing alarms, TMP/TMC OAM alarms, and PWE3 alarms are raised in the network).
13. Quickly discover the Ethernet ports with CRC errors in the entire network.
14. Query the top CPU usage values on the EMS and sort them.
15. Discover the network-wide ports with a received optical power lower than -20 dBm on the EMS.
16. Use the traffic statistics function to analyze whether a service is interrupted.
17. Quickly query services based on traffic classification (VLAN).
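The optical power check in step 15 amounts to a threshold filter. The following sketch assumes the port readings have already been exported from the EMS into a dictionary; the port names, readings, and the function name low_power_ports are hypothetical.

```python
from typing import Dict, List, Tuple

LOW_RX_POWER_DBM = -20.0  # threshold named in step 15


def low_power_ports(readings: Dict[str, float],
                    threshold_dbm: float = LOW_RX_POWER_DBM) -> List[Tuple[str, float]]:
    """Return (port, power) pairs strictly below the threshold, weakest first."""
    weak = [(port, dbm) for port, dbm in readings.items() if dbm < threshold_dbm]
    return sorted(weak, key=lambda item: item[1])


# Hypothetical readings in dBm; a port at exactly -20.0 is not reported.
sample = {
    "NE1:gei_1/1": -8.5,
    "NE2:gei_2/3": -23.1,
    "NE3:gei_1/2": -20.0,
    "NE4:gei_3/1": -27.4,
}
print(low_power_ports(sample))
```

The same pattern (filter, then sort) would apply to the CPU usage ranking in step 14.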

Emergency Troubleshooting Procedure
Start
1. Is the fault caused by any artificial operation? If yes, roll back to restore the services.
2. Is any important NE powered off, is any board not in position, or is any NNI alarm raised? If yes, modify the service paths (or perform a switchover).
3. Are UNI alarms raised in the key nodes? If yes, check the UNI optical power/negotiation status and restore them; if MSP/PW dual-homing is configured, perform the corresponding switchover.
4. Is NNI traffic transferred properly? If not, eliminate abnormal traffic such as a broadcast storm, or switch the services to another path.
5. If the CPU usage of the main control board is too high, switch the main control board over.
6. Is the fault eliminated? If not, contact technical support personnel.
End
Replace boards
• Replace line cards
• Replace one main control board
• Replace two main control boards
• Replace the whole equipment
Routine Maintenance Operations
• Process routine alarms and faults: sort and handle the alarms
• Check alarms requiring attention/performance threshold-crossing alarms
• Check performance tasks

Alarms Requiring Attention on the EMS (1)
• Service alarms:
  1. TMP
  2. TMC
  3. PWE3-CES
• Hardware alarms:
  1. Active/standby MAC synchronization failure
  2. Board off-line
  3. Board hardware failure
  4. Power board failure
• Performance threshold-crossing alarms:
  1. CPU usage threshold-crossing
  2. Memory usage threshold-crossing
  3. Port traffic threshold-crossing
  4. Abnormal performance count threshold-crossing
• Clock alarms:
  1. Clock locking
  2. Clock sub-card damage
• Physical port alarms:
  1. Half-duplex
  2. CRC error threshold-crossing
  3. Ethernet port DOWN
  4. Ethernet port LOS
The above alarms can be classified and processed by area.
Alarm Classification Rules
Alarm Classification Results
Alarm and Performance Report Description
• Alarm report
  - In the ZXCTN6200/ZXCTN6300 (V2.0) and later versions, alarms are reported to the EMS through QX interfaces over TCP instead of SNMP.
• Performance acquisition
  - In the ZXCTN6000 (V2.0), performance data is reported through QX interfaces over TCP. In earlier versions, it is acquired in SNMP get mode.
Performance Parameters Requiring Attention
on the EMS
1. Ethernet port CRC count
2. Port bandwidth usage
3. Port optical power
4. Abnormal performance drop count
The above parameters are covered by the performance template.

Common Failure Handling Procedure
• CES troubleshooting procedure
• CES performance degradation handling procedure
• Static L2VPN Ethernet service interruption handling procedure
• Static L2VPN Ethernet service performance degradation handling procedure
• MPLS-TP OAM troubleshooting procedure
• MPLS-TP tunnel protection, PW protection troubleshooting procedure
• Dynamic L3VPN troubleshooting procedure (China Unicom TP)
• Static L3VPN troubleshooting procedure (China Mobile LTE)
• L2+L3 failure diagnosis procedure
• Monitoring MCC/DCN failure handling procedure
• Clock/time failure handling procedure
• Line card startup failure/line packet error handling procedure
Steps of Static Service Fault Diagnosis and Location
1. Locate services in the service manager
2. Use service extension information in the service manager
(associated CIP/PW)
3. Query client-layer, service-layer, or local-layer alarms based on
services
4. Locate services in the channel view
5. Query MAC learning of VFI services
6. Use the traffic statistics function
7. Fast query local-layer, client-layer, service-layer, history-layer,
and pass-through performance
8. Use OAM LT/LB
9. Use the LM function
10. Process damaged/abnormal services
11. Use the fault information collection tool (this version only
supports the WINDOWS system)
12. Use the fault diagnosis tool
Command Lines of Static Service Fault Diagnosis and Location
• In global mode:
  - Service type:
    1. Configure and query tunnels, PWs, services, CIPs, OAM, and tunnel protection
    2. Query PW, CIP, and OAM traffic statistics in command line
    3. Use ter mo to query and print information
    4. Analyze whether common protocol packets are sent and received properly in debug mode
    5. Query port traffic
  - Monitoring type:
    1. OSPF information statistics, including routing table query, OSPF neighbor setup, and database information
    2. Perform the trace action on IP packets
    3. Use the ter mo command to query and print information
Advanced Command Lines of Static Service Fault Diagnosis and Location
Diagnosis Mode
1. Check whether PWE3-CES packets are sent/received properly and whether the clock is locked properly in diagnosis mode
2. Check whether services (tunnel, PW, L2VPN) are sent properly (SSP/driver) in diagnosis mode
3. Check whether OAM alarms are reported properly, and whether packets (SSP/driver) are received/sent properly in diagnosis mode
4. Check whether tunnel protection groups are switched in diagnosis mode
5. Check loss of received/sent packets on the current port in diagnosis mode
6. Query traffic of received/sent packets to/from the CPU in diagnosis mode
7. Query packet loss causes and packet loss ratio in diagnosis mode
8. Check the Ethernet port underlying layer status
9. Check the MSP underlying layer status in diagnosis mode
10. View the line card hardware error count
Fault Handling Action Set 1 - TMP Alarm (1)
Periodically observe new TMP alarms (the same applies to other alarms)
Confirm the alarm report time
View the location of the services related to the current alarms
Fault Handling Action Set 1 - TMP Alarm (2)
View the related service-layer alarms
80% of alarms are caused by service-layer link interruption, broken NE links, or port down, and 15% are related to ARP/MAC configuration errors, so the LT function can be used to locate the fault.
Fault Handling Action Set - Enabling Traffic Statistics Query (1)
Modify the specific Ethernet services to collect their traffic;
Provision services that meet the E1 service requirements, and then modify them to collect PW traffic.
Fault Handling Action Set - Enabling PW Traffic Statistics Query (2)
Enable the traffic statistics switch
Actions on the EMS: Service-Based Fault Location, Action Set 1
Select an NE and query the services that terminate at the local NE: select the NE to query in the AR loop, and then select the corresponding ETH service to expand the service layer.
Actions on the EMS: Service-Based Fault Location, Action Set 2
Select the corresponding service and apply OAM LB/LM/LT (the 40P01 SP002 and later versions support the LT function).
Actions on the EMS: Monitoring Traffic Packet Loss
Use the LM function to detect packet loss in the network. Ensure that the PHB of the LM packets is identical to that of the service packets in the other layers. If no TMP LM traffic is carried, the sent/received TMC-CV traffic can be used as a reference.
Actions on the EMS: Handling Damaged Services
Periodically handle damaged services: determine the creation time and the creator, and then delete the services.
Query PW Traffic in Command Line Mode
The ZXCTN6000/ZXCTN9000 and the ZXCTN6500 support querying PW traffic statistics.
NK3G-TPCR52-9801#show mpls l2transport vc pw 1 de
Service name: 366
Service type: VLL
Local interface: cip: 367 ETH
Destination address: 3.2.2.2, VC ID: 18, VC status: up, FRR: null
Tunnel label: TE tunnel connected
Output interface: null, imposed label stack { 569384 }
Create time: 86d21h, last status change time: 86d21h
Signaling protocol: LDP, peer 3.2.2.2 up
MPLS VC labels: local 569483, remote 569384
Group ID: local 0, remote 0
MTU: local 1500, remote 0
Remote interface description:
Sequencing: receive disabled, send disabled
PW flow statistic: on
inBytes: 0 inPkts: 0
outBytes: 0 outPkts: 0
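When the same query is run twice, comparing the counters shows whether traffic is flowing in each direction. The sketch below parses only the inBytes/inPkts/outBytes/outPkts fields shown in the output above; the helper names and the second (non-zero) sample are assumptions for illustration, not device output.

```python
import re
from typing import Dict

COUNTER_RE = re.compile(r"(inBytes|inPkts|outBytes|outPkts):\s*(\d+)")


def parse_flow_counters(cli_text: str) -> Dict[str, int]:
    """Extract the four flow counters from a captured show output."""
    return {name: int(value) for name, value in COUNTER_RE.findall(cli_text)}


def counter_deltas(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return how much each counter grew between two captures."""
    return {name: after[name] - before[name] for name in before}


first = parse_flow_counters("PW flow statistic: on\n"
                            "inBytes: 0 inPkts: 0\n"
                            "outBytes: 0 outPkts: 0\n")
# Hypothetical second capture taken some time later.
second = parse_flow_counters("inBytes: 0 inPkts: 0\n"
                             "outBytes: 6400 outPkts: 100\n")
print(counter_deltas(first, second))
```

A direction whose packet counter does not grow between captures is the one to investigate first.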
Query CIP Traffic in Command Line Mode
NK3G-TPCR52-9801(config-cip367) #show mpls l2vpn ac cip 367
cip367
C/S Attribute : Server Interface
Learn MAC number: 0 Static MAC number: 0
AC flow statistic: on
inBytes: 0 inPkts: 0
outBytes: 0 outPkts: 0
Note: The ZXCTN 6500 currently supports querying sub-interface traffic, and the document will be updated after new methods are provided.
Query the Total Number of Services on
Equipment in Command Line
NK3G-TPCR52-9801(config-cip367) #show mpls l2vpn service all b //6k, 9k
Total l2vpn instance number is 268
Instance-name Instance-type Service-type Ethernet-mode
385 vll ETHERNET vlan-all
382 vll ETHERNET vlan-stripping
381 vll ETHERNET vlan-stripping
ZXR10#show l2vpn brief //The command used to query the L2VPN on the ZXCTN6500
VPLS count: 1 VPWS count: 2 VLSS count: 0 MSPW count: 0
name type Default-VCID PW AC description
6 VPLS - 1 1
5 VPWS - 1 1
9 VPWS - 1 1
ZXR10(config) #show l2vpn summary
The summary information about configured L2VPN:
vpn type configure/maximum
VPLS 1/256
VPWS 2/4096
MSPW 0/32768
VLSS 0/2048
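The configure/maximum pairs in the summary can be turned into a quick capacity check. This is a minimal sketch that parses the rows exactly as they appear in the output above; the function name parse_l2vpn_summary is hypothetical.

```python
import re
from typing import Dict, Tuple

# One row per VPN type: "<type> <configured>/<maximum>"
ROW_RE = re.compile(r"^(\w+)[ \t]+(\d+)/(\d+)[ \t]*$", re.MULTILINE)


def parse_l2vpn_summary(cli_text: str) -> Dict[str, Tuple[int, int]]:
    """Map each VPN type to its (configured, maximum) instance counts."""
    return {kind: (int(used), int(limit))
            for kind, used, limit in ROW_RE.findall(cli_text)}


SUMMARY = """The summary information about configured L2VPN:
vpn type configure/maximum
VPLS 1/256
VPWS 2/4096
MSPW 0/32768
VLSS 0/2048
"""

summary = parse_l2vpn_summary(SUMMARY)
for kind, (used, limit) in summary.items():
    print(f"{kind}: {used} of {limit} configured")
print("total configured:", sum(used for used, _ in summary.values()))
```

The total of 3 agrees with the VPLS and VPWS counts reported by show l2vpn brief.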
MPLS-TP OAM Troubleshooting Procedure
1. Service layer (link layer alarm handling): analyze the causes of the NNI link alarms on the Ethernet/MLPPP links.
2. Common MPLS-TP OAM faults: classify and analyze common OAM alarms. Refer to the table below.
3. Is the fault eliminated? If not, contact technical support personnel.
End
Probable Causes of MPLS-TP OAM Faults (1)
[Figure: network nodes A to H between the Base Station and the RNC]
Fault Type | Probable Fault Causes | Probable Alarms
NNI | Port bandwidth congestion on individual sites on the network | TMP-LOC
NNI | Damaged configuration on the network | TMP-MISMERGE
NNI | CRC errors raised in a segment of link on the network | TMP-MISMERGE, OAM alarm flapping
NNI | LOS/LINK DOWN or other faults raised on the network links | TMP-LOC
NNI | MLPPP channel fault on a network link | TMP-LOC, MLPPP LINK FAILED
Configuration | ARP, MAC, and port attribute VLAN configurations are faulty | TMP-LOC
Configuration | Tunnel configurations are damaged | TMP-LOC/RDI
Configuration | The OAM parameters on the two ends are inconsistent | Inconsistent TMP alarms
Configuration | Connection of optical fibers is faulty | TMP-MISMERGE/LOC
Configuration | Speed limit is configured on network nodes | TMP-LOC
Configuration | Service drivers are not sent properly | TMC-LOC
Probable Causes of MPLS-TP OAM Faults (2)
Fault Type | Probable Fault Causes | Probable Alarms
Equipment | The CPU usage of the line card which the tunnel passes through is too high. | TMP-LOC/TMP-RDI; alarms may flap frequently
Equipment | Hardware chips/FPGA/clock sub-card of individual nodes are faulty. | -
Equipment | Tunnel switchover failure causes service failure. | TMC-LOC
Equipment | The control word is not selected. | TMC-LOC
Equipment | ZXCTN6500: After tunnels are configured in end-to-end mode, LOC alarms are raised at one end. This error occurs in version R1B14 and will be removed in version R1P01. | TMC-LOC
Equipment | The collection tool is used to collect the ZXCTN6263 protection group data and it discovers that the protection group was bound to services that were not deleted completely, so some data remains, causing service interruption after switchover. | No alarm
Equipment | If tunnel mode is not configured for the ZXCTN9000 (“tunnel mode traffic-engineer static” is not configured), when version 32R3 is loaded, the configuration under this tunnel is not loaded, but the configuration related to the MEG is still configured, causing OAM configuration failure. | No alarm
Equipment | The ZXCTN9000 OAM state machine timer is abnormal and cannot report the OAM status to the software. This fault can be removed by updating to version 32R3B70. | Abnormal alarms
MPLS-TP OAM Fault Diagnosis Actions
Step 1: Determine whether NNI physical-layer alarms are raised
  - Sub-action: Determine whether physical-layer alarms are raised (Ethernet interface/MLPPP).
    Handling: Query end-to-end tunnel alarms and ensure that the alarms are consistent and the reported alarms on each node are correct. View the SNMP configuration if necessary.
  - Sub-action: Determine whether service layer line cards/main control boards are faulty.
    Handling: Check hardware status alarms (including hardware status and board not in position). The alarm template function is recommended.
Step 2: Determine whether NNI link-layer alarms are raised
  - Sub-action: Ethernet negotiation (electrical interface, optical port)
    Handling: Handle Ethernet DOWN alarms, such as viewing optical links or replacing optical modules.
  - Sub-action: MLPPP negotiation
    Handling: Refer to MLPPP alarm analysis.
Step 3: Common OAM fault handling procedure by class
  - TMP-LOC: Check whether the configurations of the two ends on the EMS are identical.
  - TMC-LOC (SS-PW): Check whether the configurations of the two ends on the EMS are identical.
  - TMC-LOC (MS-PW): Check whether the configurations of the two ends on the EMS are identical.
  - TMP/TMC-Unexpectedmep/Unperiod: Check whether the configurations of the two ends on the EMS are identical.
  - TMC-CSF: Check whether LINK DOWN occurs on the UNI.
  - TMP/TMC-MNG: The root alarm is LOC. If services are not affected, check whether RDI occurs on the alarm module or the underlying layer.
  - TMP/TMC-RDI: The root alarm is LOC. If services are not affected, check whether RDI occurs on the alarm module or the underlying layer.
MLPPP(E1) Fault Diagnosis and Handling Procedure
Start
1. Are services broken?
   a. Are MP group failure alarms raised? If yes, the MP group fails: modify the MP group configurations or restart the MLPPP protocol. If the fault persists, replace the cables or the boards.
   b. Is the PPP_LCP_FAIL or PPP_NCP_FAIL alarm raised? If yes, the protocols configured on both ends of the MP group members are not consistent: modify the interface configurations of the MP group members.
   c. Is the E1_LOS or E1_AI alarm raised? If yes, interface signals are lost: check and handle physical problems such as cables. If the fault persists, replace the board.
2. Are service packets lost?
   a. Does PPP-related performance data exist? If yes, PPP links are faulty: check the network PPP links. If the fault persists, replace cables or boards.
   b. Does ML_PPP group-related performance data exist? If yes, the MP group member delay exceeds the threshold: modify the MP group member that exceeds the threshold. If the fault persists, replace the board.
   c. Otherwise, the MP group bandwidth cannot meet requirements: expand the maximum reserved bandwidth of the MP group. If the fault persists, replace the board.
3. Are service errors raised? If yes, check PDH alarms and performance for service channel bit errors. If the fault persists, replace cables or boards.
4. Is the fault eliminated? If not, contact ZTE technical support.
End
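The alarm-driven part of the E1 procedure (the branch taken when services are broken) can be sketched as an ordered rule table. PPP_LCP_FAIL, PPP_NCP_FAIL, E1_LOS, and E1_AI are the alarm names used in the flowchart; MP_GROUP_FAIL is a hypothetical label for the MP group failure alarm, and the function name is illustrative only.

```python
from typing import Iterable, Tuple

# Rules are evaluated in flowchart order: (alarm names, cause, first action).
# MP_GROUP_FAIL is a made-up label; the other names come from the flowchart.
RULES = [
    ({"MP_GROUP_FAIL"},
     "The MP group fails",
     "Modify MP group configurations or restart the MLPPP protocol"),
    ({"PPP_LCP_FAIL", "PPP_NCP_FAIL"},
     "Protocols configured on both ends of MP group members are not consistent",
     "Modify the interface configurations of the MP group members"),
    ({"E1_LOS", "E1_AI"},
     "Interface signals are lost",
     "Check and handle physical problems such as cables"),
]


def diagnose_broken_service(alarms: Iterable[str]) -> Tuple[str, str]:
    """Return (probable cause, first action) for the first matching rule."""
    raised = set(alarms)
    for names, cause, action in RULES:
        if raised & names:
            return cause, action
    # Simplification: with no matching alarm, go straight to escalation.
    return "No matching alarm", "Contact ZTE technical support"


print(diagnose_broken_service(["PPP_NCP_FAIL", "E1_LOS"]))
```

Because the rules are ordered, a PPP negotiation alarm is handled before an E1 signal alarm when both are present, matching the order of the decision boxes.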
Analysis of MLPPP(STM-1) Services
Start
1. Are services broken?
   a. Are MP group failure alarms raised? If yes, the MP group fails: modify the MP group configurations or restart the MLPPP protocol. If the fault persists, replace the cables or the boards.
   b. Is the PPP_LCP_FAIL or PPP_NCP_FAIL alarm raised? If yes, the protocols configured on both ends of the MP group members are not consistent: modify the interface configurations of the MP group members.
   c. Are SDH optical interface LOS or LOF alarms raised? If yes, interface optical signals are lost: check and handle physical problems such as cables. If the fault persists, replace the board.
   d. Is the HP_SLM or TU_AIS/VC12 alarm raised? If yes, the high-order or low-order channel fails: modify the configurations of signal C2. If the fault persists, clear physical link faults.
2. Are service packets lost?
   a. Does PPP-related performance data exist? If yes, PPP links are faulty: check the network PPP links. If the fault persists, replace optical fibers, cables, or boards.
   b. Does ML_PPP group-related performance data exist? If yes, the MP group member delay exceeds the threshold: modify the MP group member that exceeds the threshold. If the fault persists, replace the board.
   c. Otherwise, the MP group bandwidth cannot meet requirements: expand the maximum reserved bandwidth of the MP group. If the fault persists, replace the board.
3. Are service errors raised? If yes, check SDH alarms and performance for service channel bit errors, and adjust the optical power into the allowed optical power range. If the fault persists, change optical fibers, cables, or boards.
4. Is the fault eliminated? If not, contact ZTE technical support.
End
Analysis of MLPPP 6263 E1 Services
MLPPP link negotiation fault diagnosis
1. Is the service driver sent? If not, ensure that the platform is configured properly and the underlying layer driver is created properly. If the fault is still not eliminated, collect information and ask for technical support.
2. Determine the negotiation packets flow direction.
3. Locate the faulty equipment and the faulty module: use the main control board to locate faults in PPP packet receiving/sending. If the fault is eliminated, end.
4. Check the statistics of sent/received packets of line card modules to locate the fault: use the line card diagnosis command to determine the fault type (FPGA or LLP). If the fault is eliminated, end.
5. Check connectivity of MLPPP lines: ensure that the MLPPP connection is correct through port alarms and by setting loopback. If the fault is eliminated, end.
6. If the fault is not eliminated, collect the fault information and ask for technical support.
End
MLPPP(E1) Fault Handling and Diagnosis Procedure
[Figure: MLPPP packet path through the framer, AAL1, SIB, and FPGA to the main control board, with the TX/RX counters ingrsTransmitCntr, egrsTransmitCntr, and egrsReceiveCntr for ingress and egress ML packets]
When services are carried on the MLPPP board, the line card modules receive and send packets.
No. | Action Items | Sub-Actions
1 | Check whether negotiation of MLPPP group links is successful. | Run the show ppp multilink command to check the status of each PPP link. If any MLPPP group link negotiation fails, check whether the MLPPP connection is correct. If yes, determine the negotiation failure cause based on the PPP packet sending/receiving status of the MLPPP line card modules.
2 | hpcount (bDevNum, bLinkNum) | Displays performance statistics (including service packets and protocol packets) on a PPP link channel specified by the ML/MC/PPP HDLC module in the LLP AAL1 chip.
3 | allhpcount (bDevNum, bBundleNum) | Displays the performance statistics of all PPP links in a specified MLPPP group, which can be used to analyze whether load in the MLPPP group is shared evenly.
4 | hperrevtget (bDevNum) | Displays statistics on HDLC error events. Run the command several times and analyze the statistics read after each run.
5 | mlpppfpgastats (bDevNum, bBundleNum) | This command is provided by the FPGA card to collect information on the packets sent/received by the specified MLPPP group.
Key MLPPP Command Lines (1)
Uplink (Ingress): statistics on the service packets received by the FPGA card in an MLPPP group in the uplink direction:
lookup table fail drop pkts: number of packets dropped due to table lookup failure.
pkts over 9K bytes drop pkts: number of packets dropped because the packet size exceeds 9K bytes.
recv ppp pkts: number of received PPP packets.
recv 2-layer pkts: number of received layer 2 packets.
recv IP pkts: number of received IP packets.
recv MPLS pkts: number of received MPLS packets.
recv no-MPLS pkts: number of received non-MPLS IP packets.
Downlink (Egress) includes two types of statistics:

1 Statistics on normal service packets received by an MLPPP
group in the downlink direction:
recv ppp pkts: number of received PPP packets.
recv 2-layer pkts: number of received layer 2 packets.
recv IP pkts: number of received IP packets.
recv MPLS pkts: number of received MPLS packets.
recv no-MPLS pkts: number of received non-MPLS IP packets.
2 Statistics on error service packets received by MLPPP group in
a line card in the downlink direction:
recv pkts (include correct and error): number of received packets
(correct and error packets).
lookup table fail drop pkts: number of dropped packets due to
table lookup failure.
RAM insufficiency drop pkts: number of dropped packets due to
lack of logical FIFO.
If any of the drop counters are non-zero, the FPGA card is receiving error packets. Contact R&D personnel for troubleshooting. If the above statistics items contain nothing, the FPGA card is not receiving packets. In that case, use the spi3count command of the LLP and the packet sending and receiving commands of the main control board to check the cause of the FPGA packet receiving failure.
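The interpretation rule for the FPGA statistics can be written as a short classifier. This sketch assumes the counters have already been read into a dictionary keyed by the counter names listed above; the actual output layout of mlpppfpgastats is not shown in this document, so the input format and the function name are assumptions.

```python
from typing import Dict


def classify_fpga_counters(counters: Dict[str, int]) -> str:
    """Apply the rule above: drops first, then 'nothing received', else normal."""
    drops = {name: value for name, value in counters.items()
             if "drop" in name and value > 0}
    if drops:
        return "error packets received: contact R&D personnel"
    if not any(value > 0 for value in counters.values()):
        return "no packets received: check spi3count and the main control board"
    return "packets received without drops"


# Hypothetical readings keyed by the counter names from the list above.
sample = {
    "recv ppp pkts": 1200,
    "recv MPLS pkts": 1180,
    "lookup table fail drop pkts": 0,
    "RAM insufficiency drop pkts": 0,
}
print(classify_fpga_counters(sample))
```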
This command is provided by the FPGA card to collect statistics on the packets sent/received by a specified PPP link. The statistics have the same meaning as those of mlpppfpgastats. In normal conditions, only the uplink/downlink recv ppp pkts counters are collected, showing the sending and receiving status of uplink/downlink PPP packets. The downlink all-port statistics are the sum of the statistics collected for the MLPPP groups and PPP links.
MPLS-TP OAM Troubleshooting Procedure
1. Alarms: TMS, TMP, TMC UNPhb, UNPeriod, UNMep
   Probable causes: The configurations on both ends are inconsistent.
   Handling: Ensure that MEP configurations on both ends are consistent.
2. Alarms on end A: TMC-CSF. Alarms on end Z: none.
   Probable causes: LINK DOWN on the UNI corresponding to end Z.
   Handling: Check LINK DOWN on the UNI corresponding to end Z and handle it.
3. Alarms on end A: TMP/TMC-RDI. Alarms on end Z: none.
   Probable causes: The alarm on end A is not cleared; the alarms on end Z are screened.
   Handling: Perform an LB check on the MEG. If loopback is successful, the alarm is reported mistakenly. If loopback fails, go to the LOC handling procedure.
4. Alarms on end A: TMP-LOC. Alarms on end Z: TMP-RDI/TMP-LOC.
   Probable causes: If services cannot be transferred from end Z to end A, ARP or MAC may be configured incorrectly or the end-to-end tunnel is incomplete.
   Handling: Ensure that ARP and MAC configurations are correct and ensure that the end-to-end tunnel is complete. For the detailed procedure, see the figure below.
TMP-LOC Diagnosis Actions
Action 1: Check basic EMS configurations
  - Sub-action: Check whether the two ends of the MEG are configured correctly, and check whether CV/CC is selected.
    Handling: Query the E2E OAM on the EMS and ensure that the OAM channel types configured on the two ends are consistent.
  - Sub-action: Check whether end-to-end tunnels are complete and correct.
    Handling: Query whether end-to-end services are damaged on the EMS. If yes, go to damaged service handling.
  - Sub-action: Check whether VLAN configurations of end A-Z are correct.
    Handling: Check the VLAN attributes and ARP configuration on interconnected nodes. In the versions earlier than ZXCTN6000 1.1P (version 1.1P excluded), if VLAN layer 2 switching on NNI is disabled, the OAM packets cannot be transferred properly.
  - Sub-action: Check whether ARP configurations (or MAC) of end A-Z are correct.
    Handling: Check that the end-to-end MAC and ARP configuration is correct. Ping the IP address of the service interface to check whether the MAC/ARP is correct. The ZXCTN 6100 does not support pinging IP addresses. You can use the LT function to quickly locate the fault.
  - Sub-action: Check the alarm template on the EMS.
    Handling: Use the alarm template to check the information that cannot be acquired from service-layer alarms, such as hardware faults (not in position, board chip error), high CPU usage of line cards, and port traffic threshold-crossing.
Action 2: Analyze NNI path performance
  - Sub-action: NNI: check optical power of Ethernet ports, CRC statistics, and traffic threshold-crossing.
    Handling: The template can be used to improve performance query efficiency. It is recommended to query link ports with a CRC count. (The end-to-end view of this version does not support querying performance by template.)
  - Sub-action: NNI: check link-layer MLPPP performance.
    Handling: The template can be used to improve performance query efficiency. It is recommended to query link ports with a CRC count. (The end-to-end view of this version does not support querying performance by template.)
Handle the high CPU usage of the Query path performance and collect CPU information on the
TMP-LOC Alarm Fault Handling Procedure
Start: a TMP-LOC/RDI alarm is raised.
In the service provisioning phase:
1. Check basic EMS configurations:
   - Check whether the PE OAM configurations on both ends are correct.
   - Use end-to-end analysis on the EMS to check the tunnel completeness.
   - Check whether the NNI ports have the trunk service VLAN.
2. Query NNI path performance:
   - The optical fibers may be connected incorrectly, so that the actual connections and those on the EMS are inconsistent. MMG alarms may be raised.
In the service maintenance phase:
1. Check ARP and MAC configurations:
   - Check whether the equipment or the control board was replaced but the ARP was not modified.
   - Check whether the MAC of the equipment and that of the NE on the EMS are inconsistent.
2. Check manual operations:
   - Check whether the tunnel is deleted at one end.
Then:
3. Query OAM on the underlying layer of the PE node: delete the faulty OAM and send the OAM parameters again. Determine whether to use the no basic configuration function.
4. Use the LB/LT function to check for P node faults.
5. Is the fault eliminated? If not, contact technical support personnel.
End
Does Static Tunnel Underlying Driver Take
Effect (ZXCTN9000)
ZXR10(diag) # prjexec drv npc 2 cmdname l3sw_te_show 1
TE 1 :
vpnid 8215 lspId: 0
nexthopip: 0xa010101 rosOutInex: 20
inIntfIndex: 0 bakIndex: 0
rsvpTeFrrInsideLabel: 3
indexLabel: 3 teOutLabel: 841001
lspType 2 curWorkItem: 1
teIsStatic: 1 teNodeRole 1
phyIntf 20 flags 0
inIntf :0:0:0:0
outIntf : 2 : 0 : 13 : 0
TE 1 :
vpnid 8216 lspId: -1
nexthopip: 0x0 rosOutInex: 0
inIntfIndex: 20 bakIndex: 0
rsvpTeFrrInsideLabel: 0
indexLabel: 840001 teOutLabel: 3
lspType 2 curWorkItem: 1
teIsStatic: 1 teNodeRole 3
phyIntf 0 flags 0
inIntf : 2 : 0 : 13 : 0
outIntf :0:0:0:0
value = 1 = 0x1
If no information is displayed or flags are not 0, the node tunnel forwarding item is abnormal.
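The rule above can be checked mechanically once the command output has been saved as text. The following is a minimal sketch, assuming the output keeps the layout shown above (one "TE n :" line per entry, with "vpnid" and "flags" fields); `parse_te_entries` and `tunnel_forwarding_ok` are hypothetical helper names, not part of any ZXCTN tool.

```python
import re


def parse_te_entries(output: str) -> list[dict]:
    """Split saved l3sw_te_show text into entries and extract vpnid and flags."""
    entries = []
    # Each entry starts with a line such as "TE 1 :".
    blocks = re.split(r"^[ \t]*TE[ \t]+\d+[ \t]*:[ \t]*$", output, flags=re.MULTILINE)
    for block in blocks[1:]:
        vpnid = re.search(r"vpnid\s+(\d+)", block)
        flags = re.search(r"flags\s+(-?\d+)", block)
        entries.append({
            "vpnid": int(vpnid.group(1)) if vpnid else None,
            "flags": int(flags.group(1)) if flags else None,
        })
    return entries


def tunnel_forwarding_ok(output: str) -> bool:
    """Abnormal if nothing is displayed or any entry has flags other than 0."""
    entries = parse_te_entries(output)
    return bool(entries) and all(entry["flags"] == 0 for entry in entries)
```

An empty capture or an entry whose flags field is missing is treated as abnormal, which is the conservative reading of the rule.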
Does Static Tunnel Underlying Driver Take
Effect (ZXCTN6263)
Version 2.0:
6300.22(diag) #prjexec ssp mp master tunnel 700
tunnel 700
------------------------------
prot admin state : up
drv admin state : up
prot state : up
action : none
detect state : up
forward state : valid
ing lck state : unLock
egr lck state : unLock
tunnel list id : 0
bind pw count : 0
lsp type : bidirect
tunnel type : static
tunnel entry index : 5
group index0 : 0
group index1 : 0
mpls entry0 : 6
mpls entry1 : 10
The forwarding status “valid” is normal.
Version 1.1:
63.81(diag) #prjexec l3switch tmpls mp tunnel 2010
tmpls tunnel 2010 id 2010
----------------------------------
alarm state : up
admin state : up
mpls type : tmpls
owner id : 2010
node role : ingress
label space : global
input label : 0
input port : gei_0/0
nexthop type : route nexthop
driver mpls : 0
driver nexthop : 0
group nexthop : 0
ldp frr group : 0
ldp ecmp group : 0
out inner label : 0
out outter label : 81
nexthop ip : 50.0.0.92 (838860892)
driver state : arp passive 0 //egress status
config outport : gei_8/3
driver out-port : 19
ZXCTN6500 Tunnel Configuration Check
 show mpls traffic-eng static
 show mpls traffic-eng static tunnel-id 100
 Ensure that the tunnel is up and that the ingress/egress nodes and ingress/egress labels are configured properly. If the tunnel is not up, check whether the port is up and whether the optical fibers are connected properly.
OAM Faults on a PE Node (ZXCTN6263)
Step 1
  Action item: Check whether OAM packets are received by the FPGA card connected to the driver.
  Handling method: In hiding mode, enter: diag exec mp ma cmd showallmeg
  Remarks: This command is only valid for fast OAM.
Step 2
  Action item: Check whether OAM is configured properly in the driver.
  Handling method: In hiding mode, enter: diag exec mp ma cmd showmegindex xx
  Remarks: Check whether GeCfgErrorCnt, GeTxHashCnt or GeRxHashCnt exists.
Step 3
  Action item: Check whether the FPGA card connected to the main control board is normal.
  Handling method: In hiding mode, enter: diag exec mp ma cmd SspFpgaShowDbg
  Remarks: iFpgaSsramWriteFail : 0 //Records the number of FPGA write failures. If this number is not 0, OAM is affected.
Step 4
  Action item: Check whether the line card sends and receives packets properly.
  Handling method (version 1.1): In diagnosis mode, enter: prj drv bcm np x showoamcount y
  Handling method (version 2.0): In diagnosis mode, enter: prj drv np x show port y
  Remarks: Query whether diagnosis packets are transferred properly between the line card and OAM.
Step 5
  Action item: Check whether the packets are transferred properly from a line card to a port of the main control board.
  Handling method: Prj drv bcm mp show c gexx
  Obtaining gexx: use prj drv bcm mp panelinfoshow x (x is the slot number) to find the gexx that corresponds to the port.
PE Node OAM Failure (ZXCTN9000)
Step 1. Main control platform MEG check items
  Handling method: In diagnosis mode, enter prj drv mp ma cmd Tmpls_oam_show_meg x (x is a megindex).
  Handling frequency: General.
Step 2. MEG items on the line card platform
  Handling method: In diagnosis mode, enter prj drv np x cmd TmplsOam_help.
  Handling frequency: Relatively high. All platform OAM diagnosis commands can be viewed.
  Handling method: In diagnosis mode, enter prj drv np x cmd Tmpls_oam_show_meg x.
Step 3. MEG items on the line card
  Handling method: In diagnosis mode, enter prj drv np x SpTmplsOamHelp.
  Handling frequency: Relatively high. All diagnosis commands supporting OAM can be viewed.
  Handling method: In diagnosis mode, enter prj drv np x SpTmplsOamMegShow.
  Handling frequency: Relatively high. The resource line card where the actual OAM is located can be viewed.
Step 4. Driver MEG items in the line card
  Handling method: In diagnosis mode, enter prj drv np x drv_toam_help.
  Handling frequency: Relatively high. Enter the command in diagnosis mode.
  Handling method: In diagnosis mode, enter prj drv np x drv_tmpls_oam_show_ctrl_info x.
  Handling frequency: High. Check whether the underlying OAM status is normal.
High CPU Usage
 To handle high CPU usage, refer to the diagnosis manual of high CPU usage.
Step 1
  Action item: View all the NEs that report CPU threshold crossing in the entire network.
  Handling method: Define an alarm interface that corresponds to the CPU threshold-crossing alarm in the entire network or of important NEs.
Step 2
  Action item: Check whether the process is suspended.
  Handling method:
    In ZXCTN 9000 diagnosis mode: Prj drv np x cmd i
    In ZXCTN 6000 hiding mode: Diag exec mp ma cmd i
Step 3
  Action item: Check the detailed process.
  Handling method:
    Prj drv np x cmd rosTaskInfoShow
    Prj drv np x cmd tt XXX (XXX is the process name)
    Diag exec mp ma cmd rosTaskInfoShow
  Remarks: This item can be viewed on the EMS in later versions.
Step 4
  Action item: Check the count of CPU packets received by the corresponding line card or main control card.
  Remarks: This item can be viewed on the EMS for convenience of maintenance.
  Handling method:
    Prj drv np x cmd showMuxQ (check the buffer of the software queue of received packets)
    Prj drv np x cmd showMuxPkt (check the count of CPU packets sent/received by the software)
    Prj drv np x cmd showMuxDropcnt (check the loss of packets sent/received by the software)
    Prj drv np x cmd drv_rx_shw(Check
Mirroring Packets Received by the
ZXCTN6263 to the CPU and Printing Them
Remarks: This action is supported in later versions. In version 2.0P01, dropped packets are mirrored to the CPU and sent to the EMS. In version 2.1, all packets can be mirrored to the CPU and sent to the EMS.
Step 1
  Action item: In diagnosis mode, enable the diagnosis setting switch.
  Handling method:
    prjexec drv mp 6263setenable en=1
    prjexec drv mp set qos cpu mirror en=1 //Enable the CPU mirroring switch
Step 2
  Action item: Obtain the sequence number of the physical port.
  Handling method: Prj drv mp bcm panelinfoshow x (x is the slot number)
    The displayed port number corresponds to gexx.
Step 3
  Action item: In diagnosis mode, mirror the packets to the CPU.
  Handling method: prjexec drv mp bcm dmirror gexx mode=ingress dp=cpu0
    gexx is the value obtained in step 2, such as ge15.
Step 4
  Action item: Start packet capture in hiding mode.
  Handling method: diag exe mp master cmd ssp_pkt_cap 1, 0x0604
    1 is the capture direction. 0604 is port 4 in slot 6.
Step 5
  Action item: Start displaying captured results in hiding mode.
  Handling method: diag exe mp master cmd ssp_pkt_show
TMC-LOC (SS-PW) Diagnosis Actions
Step 1. Check basic TMC configurations on the EMS
  Sub-action items:
    Check whether the two ends of the MEG are configured properly and whether CV/CC is selected.
    Check whether end-to-end PW services are configured properly.
    Check whether the PW control word is enabled at both ends.
  Handling method: Perform the check on the EMS.
Step 2. Service layer check
  Sub-action item: Check whether alarms exist on the tunnel OAM where the service layer is located.
  Handling method: Refer to the TMP alarm handling methods.
Step 3. Service-layer tunnel protection group check
  Sub-action item: If the PW is bound to a tunnel protection group, check whether both ends of the tunnel protection group are normal.
  Handling method: Refer to linear tunnel protection switching troubleshooting.
Step 4. PE node analysis
  Sub-action item: Check whether PE node services are configured properly.
  Handling method: Use the equipment CLI to check whether services are sent. For VLL services, PW OAM requires that CIP configurations are complete and correct.
  Sub-action item: Attempt LB on the PE node.
  Handling method: If LB is successful, query the underlying OAM/alarm performance module.
TMC-LOC (MS-PW) Diagnosis Actions
Step 1. Check basic TMC configurations on the EMS
  Sub-action items:
    Check whether both ends of the MEG are configured properly and whether CV/CC is selected.
    Check whether the end-to-end PW/service is configured properly.
    Check whether the PW control words of each segment are consistent.
  Handling method: Check the EMS.
Step 2. Service-layer check
  Sub-action item: Check whether alarms exist on the tunnel OAM of the service layer.
  Handling method: Refer to the TMP alarm handling methods.
Step 3. Check the service-layer tunnel protection group
  Sub-action item: Check the tunnel protection group bound to the PW in each segment and ensure that the tunnel protection group status is normal.
  Handling method: The MS-PW LT function can be used to locate the service layer of the faulty PW end and query the linear tunnel protection switching.
Step 4. UPE node analysis
  Sub-action item: Check whether the UPE node is configured properly.
  Handling method: Check the equipment CLI (ensure that the EMS and the equipment are consistent).
  Sub-action item: Check whether UPE node underlying OAM packets are sent and received properly.
  Handling method: Check whether OAM takes effect.
Step 5. SPE analysis
  Sub-action item: Check whether the SPE forwards properly.
  Handling method: TMC-LT analyzes the SPE forwarding.
TMP/TMC-MNG
Step 1. Check MEG configurations on the EMS
  Sub-action item: Check whether the MEG IDs configured on both ends are consistent.
  Handling method: Query the configurations on the EMS.
Step 2. Check the service layer
  Sub-action item: Check whether a large number of link performance CRCs are reported on the service layer and whether the optical power is too low.
  Handling method: Query performance on the EMS.
Step 3. Damaged configurations on the EMS
  Sub-action item: Analyze damaged configurations on the EMS.
  Handling method: Refer to damaged/abnormal services on the EMS.
Step 4. Incorrect MEG ID match
  Sub-action item: OAM MEG IDs can be recognized in later versions. The MEG IDs of error packets can be reported and searched for a match in the entire network.
  Handling method: The ZXCTN 6000 (Version 2.1) supports this action.
Fast OAM Packets Sending Procedure
6200/6300-PE node: Line card FPGA -> Main control switching network -> FPGA on the main control board
6200/6300-P node: Line card FPGA -> Main control switching network -> Line card FPGA
6100-PE node: Line card FPGA -> Main control FPGA
6100-P node: Line card FPGA -> Main control NP forwarding -> Line card FPGA
9000-PE node: The line card recognizes OAM packets through ACL when OAM packets pass through the chip -> Resource line card micro-code chip
9000-P node: The line card recognizes OAM packets through ACL when OAM packets pass through the chip -> NP -> The line card recognizes OAM packets through ACL when OAM packets pass through the chip
6500-PE node: Line card-stream FPGA -> NP -> FPGA on the NP side
6500-P node: Line card-stream FPGA -> NP -> Main control switching network -> NP -> Line card-series FPGA
MPLS-TP Tunnel Protection Troubleshooting Procedure
Symptom: After a tunnel is switched, the services are abnormal and cannot be restored. The links are normal.
1. Are the protection groups on the two PE ends configured properly?
   The configurations on both ends must be consistent, such as the revertive mode, the APS enabling mode, and the bound tunnels. (With end-to-end configuration, this fault rarely occurs.)
2. Are the switching statuses at both ends normal?
   Check whether both ends are in the expected states. Check whether a protection lock is configured, causing abnormal switching. If yes, clear the protection lock.
3. Are the PE nodes at both ends switched properly on the underlying layer?
   Send the forced switching command, or configure the protection group again and then send the switching command.
4. Perform an LB check on the protection tunnel. Is it normal?
   If not, eliminate the protection tunnel OAM fault.
5. Is the fault eliminated?
   If yes, end. If not, contact the technical support personnel.
MPLS-TP Tunnel Protection Actions
Step 1. Check configurations on the EMS
  Sub-action item: Check whether the protection group configurations are damaged on the EMS.
  Handling method: To determine whether tunnel protection takes effect, you only need to check whether TMC-OAM can be transferred to the protection tunnel.
Step 2. Check the protection status read on the EMS
  Sub-action item: Check whether the switching status meets the expectation.
  Handling method: Refer to actions.
Step 3. Check whether the protection tunnel is normal
  Sub-action item: Check whether LB on the protection tunnel fails, so that the protection tunnel cannot take effect.
  Handling method: If the protection tunnel cannot take effect, enter the OAM diagnosis procedure.
Step 4. Check the underlying layer switching status on the equipment
  Sub-action item: Check the underlying layer status. The underlying statuses include platform, support and driver.
  Handling method: For the ZXCTN 6000 series, the switching and control statuses exist in the main control board. For the ZXCTN 9000 distribution series, the switching statuses exist in the main control board and in the working and protection line cards. You need to ensure that the statuses on these three boards are consistent.
Step 5. Delete the protection group and configure it again
  Sub-action item: Delete the protection group and configure it again.
Tunnel Protection Group: ZXCTN 9000 Status
Switching Confirmation
TCT3-TCCR52-9801(diag) #prj drv np 4 cmd l3sw_show_tg_tbl_all
groupId workingNo nexthop protectNo curitem
------------------------------------------------------
14 4071 0x0 4070 1
12 4080 0x0 4074 2
10 4076 0x0 4075 2
9 4078 0x0 4077 2
7 4082 0x0 4081 2
3 4093 0x0 4092 1
13 4073 0x0 4072 2
6 4084 0x0 4083 2
5 4087 0x0 4086 2
4 4090 0x0 4089 1
2 4096 0x0 4095 2
-------------total number = 11---------------
curitem 1: The tunnel protection group status is received in the active path.
curitem 2: The tunnel protection group status is received in the standby path.
The output on the active/standby line cards must be consistent.
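The consistency requirement above can be checked by comparing the curitem column captured from two line cards. The following is a minimal sketch, assuming each capture has been saved as text with the five-column layout shown above; `parse_tg_table` and `inconsistent_groups` are hypothetical helper names.

```python
def parse_tg_table(output: str) -> dict[int, int]:
    """Map groupId to curitem from saved l3sw_show_tg_tbl_all text."""
    table = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows: groupId workingNo nexthop protectNo curitem
        if len(fields) >= 5 and fields[0].isdigit() and fields[4] in ("1", "2"):
            table[int(fields[0])] = int(fields[4])
    return table


def inconsistent_groups(card_a: str, card_b: str) -> list[int]:
    """Return the group IDs whose curitem differs or that exist on one card only."""
    a, b = parse_tg_table(card_a), parse_tg_table(card_b)
    return sorted(g for g in a.keys() | b.keys() if a.get(g) != b.get(g))
```

An empty result means that the two captures agree for every protection group.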
Tunnel Protection Group: ZXCTN 6263
ZXCTN 6263 2.0 Version
TC-PTNCH-2081(diag) #prj ssp mp ma tunnel group brief
Group Protect-Type State Drv W-Tnl P-Tnl Frr-Group Tms-Id Protcol

--------------------------------------------------------------------------------
3 1by1 primary success 1 4 1 0 active
6300.22(diag) #prjexec ssp mp master tunnel group 10

tunnel group 10
---------------------------
use state : yes
drv state : success
protect type : 1by1
prot state : active
protect state : primary
group node : end node
working tunnel : 20
protect tunnel : 10
tg list id : 0
frr group index : 1
Tunnel Protection Group: ZXCTN6263
ZXCTN6263 version 1.1
In diagnosis mode, enter prjexec l3switch tmpls mp group <tunnel-group-id>
In hiding mode, enter diag exec mp ma cmd show_group x
Protection Group Status Confirmation:
ZXCTN 6500
 View the protection node data in the model and
debug the following functions:
 ShowNodeProtGrp (protection group type,
protection group ID) .
 Linear tunnel protection type 39
 Linear PW protection type 37
 MSP protection type 40
 SG protection type 47
 Ring network protection group 41
 Shared ring network service node
 Virtual TEFRR protection group 42
View Data of a Protection Group (Protection
ID is 1) in the FTM model (ZXCTN6500)
ZXR10(diag-physical) #execute 104322 39, 1
************ node base info ************
node type: 39[NODE_LINEAR_PROT_LSP_TYPE], node addr: 0xb75909c
child num 0:
parent num 0:
brother num 0:
************ node attr info ************
Prot Group Node 0xb75909c info:
key info:
ProtType info: 39, ProtID info: 1
Output info:
ProtId info: globle id: 1024, local id: 1024
ProtState info: 0 (0-work 1- prot) // work activation
ProtSubType info: 2 (1, 2 1: 1 3, 4 1+1)
RcvSendStrategy info: 3 (1-both_send 3-both_rec)
FastSwitchEnable info: 254
dwGlobeID info: 1024 // Global protection group ID in FPGA
dwLocalID info: 1024
m_dwHoldOff: -2
m_bIsMCProt: 0

Work Node List:
nodeType = 22
NODE_LSP_TYPE
dwTunnelID = 1, dwLSPID = 1, ucDir = 0, bIfOut = 1
TagLable = 101
nodeType = 22
NODE_LSP_TYPE
dwTunnelID = 1, dwLSPID = 1, ucDir = 1, bIfOut = 0
TagLable = 101
Prot Node List:
nodeType = 22
NODE_LSP_TYPE
dwTunnelID = 11, dwLSPID = 1, ucDir = 0, bIfOut = 1
TagLable = 111
nodeType = 22
NODE_LSP_TYPE
dwTunnelID = 11, dwLSPID = 1, ucDir = 1, bIfOut = 0
TagLable = 111
Hold Self Node List: // Hold LSP stream node information of the protection group
nodeType = 21
NODE_LSP_FP_TYPE
dwTunnelID = 1, dwLSPID = 1, ucDir = 0, bIfOut = 1
View a Tunnel Protection Group
 ZXR10(config) #show tunnel-group 1 // View information of tunnel protection group 1
Tunnel group 1
Protection type: 1: 1 bidirectional receiving both // 1: 1 double-receiving
Protection strategy: aps
Protection section: 0
Working tunnel: 1, state: OK // The working tunnel ID is 1
Protection tunnel: 11, state: OK //The protection tunnel ID is 11
Switch: no // No switching occurs.
 ZXR10(config) #show aps linear-protect tunnel-group 1 // View the APS information of tunnel protection group 1
----------[APS Linear Instance]----------
Protection group type: tunnel
Protection group id: 1
Protection type: 1: 1 bidirectional receiving both // 1: 1 double-receiving
APS is enabled
APS state: NO_REQUEST_NULL
Protection mode: remote
Active-state: restore-run
Revertive mode: revertive, WTR time: 5min // WTR time
Hold-off time: 0ms, valid hold-off time: 0ms // HOLDOFF time
Switch command: null
APS has no switched action // No switching
PW Dual-Homing Protection Failure
Troubleshooting Flow
Start: PW dual-homing fault analysis.
1. Does any alarm trigger PW switching?
   Check whether the PW mapping alarm is selected in the service configurations at the remote end.
2. Query the remote TMC-OAM configurations.
   Check whether CSF or SSF in TMC-OAM is sent to the PW dual-homing switching node.
3. Query whether the local underlying layer is switched.
   If switching is not performed, send the forced switching command, or configure the protection group again and send the configuration again.
4. Perform LB detection on the protection PW and check whether it is normal.
   If not, eliminate the protection PW OAM problems.
5. Is the fault eliminated?
   If yes, end. If not, contact the technical support personnel.
PWE3-CES Service Faults (1)
[Figure: network topology. The base station and the RNC are connected over STM-1 links through NEs A, B, C, D, E, F, G and H.]
Fault type: Client side
  Probable fault: Poor grounding
    Probable alarm: CV threshold crossing
  Probable fault: Client-signal failure
    Probable alarm: PDH-AIS/PWE3-CSF
Fault type: Network side
  Probable fault: PE node tunnel/PW switching failure after optical fibers are broken
    Probable alarm: PWE3-LOP
  Probable faults:
    Traffic congestion on the network, causing serious packet loss.
    On the NNI side, the leased link is 155M/300M. A traffic burst on the NNI can cause TDM service loss.
    Too many CRCs in a network segment may cause serious packet loss.
    Probable alarms: PWE3 packet threshold crossing/PWE3-LOP/overflow and underflow events
  Probable fault: Service-layer tunnel/PW is abnormal
    Probable alarm: TMP-LOC/TMC-LOC
Fault type: Clock
  Probable fault: The self-adaptive clock is set and the PDVT on the network is large.
    Probable alarm: Overflow/underflow events
  Probable fault: The system clock is set, but the synchronous Ethernet is not locked properly or the network cannot transparently transfer the Ethernet clock after passing DWDM.
    Probable alarm: Service locking failure
PWE3-CES Service Faults (2)
[Figure: network topology. The base station and the RNC are connected over STM-1 links through NEs A, B, C, D, E, F, G and H.]
Fault type: Configuration
  Probable fault: The number of cascaded frames and the buffers on both ends are inconsistent.
    Probable alarm: Malformed packets threshold-crossing
  Probable fault: The configuration at one of the two ends is damaged.
    Probable alarm: PWE3-LOP
  Probable fault: The MSP interconnection configuration at one end is incorrect.
    Probable alarm: MSP switching failure
  Probable fault: Multiple E1s in the wireless base stations use MLPPP binding. The larger delay due to PTN causes continuous wireless SCTP interruptions. The transfer delay needs to be optimized or the delay parameters need to be modified.
    Probable alarm: No alarm is reported.
Fault type: Equipment
  Probable faults:
    A line card is not in position, or the line card CPU usage is too high.
    The sub-clock card on the equipment operates abnormally.
    Some line cards transfer packets abnormally, causing packet loss or a large transfer delay.
    Abnormal PW-FRR/tunnel switching.
    PW label conflicts: the labels of the local PW and the pass-through PW are identical, causing a CPU surge in version 1.10P01B46 and later versions.
  Probable alarms: CPU usage threshold-crossing and PWE3-LOP; equipment-chip alarm; sub-clock board alarm
PWE3-CES Service Faults
[Figure: network topology. The base station and the RNC are connected over STM-1 links through NEs A, B, C, D, E, F, G and H.]
Service deployment models: SS-PW service model, MS-PW single-service model, DNI-PW service model.
Faulty base stations with common points. Probable causes:
1. Forwarding failure of core nodes and core equipment; service line card failure; equipment switching network card failure.
2. Core node clock recovery failure, so that the sent clock is not the BITS source clock.
3. The service tunnel between the core node and the base station is DOWN, causing PW DOWN, or a non-deployed association causes the failure.
4. The core node MSP operates improperly, causing service interruption.
5. Multiple optical fiber interruptions prevent protection from taking effect. Both the working LSP and the protection LSP are broken, so both the working PW and the protection PW are broken.
Faulty base stations without common points. Probable causes:
1. Individual base station equipment failure in the access layer. For example, a poor equipment running environment causes an equipment short circuit, high temperature causes bit errors, the optical path degrades in an individual base station, or narrowband line card forwarding fails in an individual base station.
2. Failures of the NNI/UNI line cards on some expansion equipment or forwarding failures cause packet loss. Use Ping. The PW is normal, but PWE3-CES packets are received and sent improperly.
3. The clock recovery is abnormal. The OTN link that the PW passes causes clock recovery failure or clock synchronization deployment scheme failure. As a result, false locking occurs on the base station and frequency
PWE3-CES Troubleshooting Procedure
Start: CES services are received.
A-Z alarm resource analysis
  The hardware, chip, clock sub-card alarms and the environment alarms are added to the root alarm check.
Determine root alarms
  The alarms only give suggestions on PWE3-CSF: check the peer-end STM-N, or check whether LOS or all-1 alarms are raised on the E1 interface.
  If the root alarm is PWE3-RDI and no PWE3-LOP is reported on the peer end, you are recommended to enter the PEs at both ends to check whether the underlying-layer services are normal (information collection).
  If the root alarm is PWE3-LOP, go to step C.
  If there is no root alarm but the TDM service performance degrades on the client, go to step D.
Handle UNI alarms
  1. Handle UNI alarms and give suggestions for PDH and SDH slots. You can determine whether to give options about line card information collection; CPU information collection is associated.
  2. If a large number of CV errors exist on the UNI side, handle them.
  3. If CPU threshold crossing of narrowband line cards exists, collect information.
  4. If the service is bound to the MSP, check the underlying layer status (information collection interaction).
Check the TG at the service layer
  Check whether the current service is bound to MS-PW/SS-PW and whether it is in the switching status:
  1) If the service is bound to MS-PW, check whether the underlying layer status of the tunnel protection group that each PW is bound to is normal (fault information collection).
  2) If the service is bound to SS-PW, query whether the status of the tunnel protection group at the two ends is normal (fault information collection).
Check TMP-OAM
  Check whether TMP-OAM is enabled on the service layer. If not, configure TMP-OAM.
  1. If TMP-OAM is enabled:
     1) When TGs exist and at least one end has no TMP-OAM alarm, check whether TMC-OAM is configured. If the alarm is raised at both ends, go to the TMP-OAM diagnosis procedure.
     2) When TGs do not exist, check the current TMP-OAM. If any alarm exists, go to the alarm diagnosis procedure. Configure alarms. If no alarm exists, perform the TMC-OAM check.
  2. Enable CV for TMC-OAM; for MS-PW, configure TMC-OAM on the MS-PW.
If TMC-OAM is enabled and no OAM alarm exists
  1) Check whether the control words and sequence numbers are selected, and whether the number of cascaded frames and the jitter buffer at the two ends are consistent.
  2) Check whether the service clock types are consistent.
  3) Check whether the current service is configured with PW-FRR and analyze the underlying layer status (information collection).
  Analyze the control word alarms.
Query the end-to-end service-layer CRC count
  If the link performance CRC of the current service layer increases, perform the following operation: check the TMS-OAM LM function segment by segment to locate the segment where packets are lost.
Step C
  1) Query the packets sent/received by the PW and PWE3-CES. If the service layer is SS-PW, query the packets sent/received on the PW and PWE3-CES (refer to the 15-min performance statistics on the EMS). Query whether the PWE3-CES performance statistics contain abnormal values and analyze the abnormal performance. (Information collection equipment gives handling suggestions.)
  2) In accordance with the statistics results, further handling suggestions are given. Collect information on both ends of the PE, including line card information (equipment information collection interaction). Collect TMC-LM and DM of the corresponding PW on demand to determine whether packets are lost in the end-to-end link (PHB=EF).
Step D
  Query information on the line card that corresponds to TDM. The methods provided by the equipment are as follows:
  If self-adaptive clocks are used, check whether the service clock is normal and whether the service clock is locked.
  If the system clock is used, the synchronous Ethernet is configured in the entire network. Check the clock link E2E locking status and check whether the clock source has any unlocked 15-min history alarm.
PWE3-CES Diagnosis Actions and
Troubleshooting
Database consistency
  Sub-action item: Ensure that the current AGENT and ROS databases are consistent.
  Handling method: Check whether the corresponding services exist under the current command line. If not, the service restored by the AGENT may be faulty.
Step 1. UNI alarm/performance handling
  Sub-action item: Handle PDH-LOS/RS-LOS/TU-AIS.
  Handling method: Handle information on the client side.
  Sub-action item: Handle a large number of UNI-side CVs.
  Handling method: Handle the grounding of the 2M cable and the equipment.
  Sub-action item: Handle CPU threshold-crossing of the narrowband line card or line card hardware fault alarms.
  Handling method: Use the fault information collection tool to collect the corresponding information.
  Sub-action item: Diagnose and determine switching of MSP services on the UNI.
  Handling method: If MSP protection is configured, check whether the protection status is normal. Check whether the MSP configuration methods on both ends are consistent.
Step 2. Check service-layer alarms
  Sub-action item: Check whether PW-FRR is configured or deployed and check whether switching is normal (optional).
  Handling method: If PW-FRR is switched, check whether the current path direction is normal.
  Sub-action item: In the MS-PW scenario, perform the TMC-LB/CV/LT handling procedure. In the SS-PW scenario, perform the TMC-CV/LB handling procedure.
  Handling method: Refer to the TMC-OAM handling procedure. If TMC-CV is not configured, configure TMC-CV to check faults.
  Sub-action item: Check whether the tunnel is normal and whether the switching is normal.
  Handling method: Check whether alarms exist on the service, for example, whether the tunnel protection group is normal.
Step 3. Query and handle service-
  Sub-action item: Check whether the service-layer link CRC/optical power/traffic threshold is normal.
  Handling method: Query the performance template.
CES Troubleshooting Procedure
The control words and sequence numbers of the CES services are as follows:
Unstructured (UDT) services:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 0|L|R|RSV|FRG|   LEN     |       Sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Structured (SDT) services:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 0|L|R| M |FRG|   LEN     |       Sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
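When a captured CES packet is inspected by hand, the L and R bits can be read directly from the 32-bit control word. The following is a minimal sketch that decodes the layout drawn above (4 zero bits, L, R, a 2-bit RSV or M field, 2-bit FRG, 6-bit LEN, 16-bit sequence number, network byte order); the function name and the dictionary keys are illustrative choices, not taken from any equipment tool.

```python
import struct


def decode_ces_control_word(word: bytes) -> dict:
    """Decode a 4-byte CES control word into its fields."""
    if len(word) != 4:
        raise ValueError("control word must be exactly 4 bytes")
    (value,) = struct.unpack("!I", word)  # network byte order, unsigned 32 bit
    return {
        "first_nibble": (value >> 28) & 0xF,  # expected to be 0000
        "L": (value >> 27) & 0x1,             # local (client-side) failure indication
        "R": (value >> 26) & 0x1,             # remote packet loss indication
        "rsv_or_m": (value >> 24) & 0x3,      # RSV (unstructured) or M (structured)
        "FRG": (value >> 22) & 0x3,
        "LEN": (value >> 16) & 0x3F,
        "sequence": value & 0xFFFF,
    }
```

For example, the bytes 08 00 12 34 decode to L=1, R=0 and sequence number 0x1234.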
CES Service--PW Layer
Alarm: Loss of Packet
  Alarm report mechanism: The E1 board does not receive packets sent from the main control board within 55 ms, or the channelized board does not receive packets within four times the packet delay.
  Probable fault causes:
    1. If one end is configured with services and the other end is not, the end configured with services reports this alarm.
    2. Packets are lost on the physical layer. (The optical power is low, or the hardware board or main control board is faulty.)
Alarm: Client Signal Failure (CSF)
  Alarm report mechanism: L=1 is received at the local end.
  Probable fault cause: The E1 payloads received on the client side at the remote end are all 1, so L in the PW control word sent from the remote end to the local end is 1.
Alarm: PW-RDI
  Alarm report mechanism: R=1 is received at the local end.
  Probable fault cause: After packets are lost at the remote end, R in the PW control word sent to the local end is 1.
CES Service Troubleshooting Procedure
The troubleshooting includes the following cases:
Case 1. NE A: Loss of Packet. NE Z: Loss of Packet.
  The TDM line cards at both ends do not receive services sent from the peer end.
  1. Ensure that the tunnel and PW are connected properly.
  2. Check whether both ends send the corresponding service packets to the control board.
  3. Check whether the underlying-layer line cards receive packets sent from the control board.
Case 2. NE A: Loss of Packet. NE Z: RDI.
  TDM packets are not received in the Z-to-A direction.
  1. Check whether NE Z sends packets to the main control board.
  2. Check whether NE A receives the packets sent from the main control board.
Case 3. NE A: Loss of Packet. NE Z: No alarm.
  Check whether NE Z sends services properly.
  Note: If both structured services and non-structured services are configured in one clock domain, the services probably cannot be sent.
Case 4. NE A: RDI. NE Z: No alarm.
  Check whether NE Z sends the alarm suppression and check whether the RDI alarm on NE A is reported by the line card.
Case 5. NE A: RDI. NE Z: RDI.
  The probability of this phenomenon is rather low. This phenomenon occurs due to mechanism fault for processing
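The cases above amount to a lookup from the alarm pair seen on NE A and NE Z to a list of checks. The following is a minimal sketch of such a lookup; the check texts are taken from the table, case 5 is left out because its description is incomplete, and applying a case to the mirrored alarm pair (with the NE names swapped) is an assumption of this sketch rather than something the table states.

```python
# Alarm pairs (NE A, NE Z) and the checks listed in the table.
CES_ALARM_CASES = {
    ("Loss of Packet", "Loss of Packet"): [
        "Ensure that the tunnel and PW are connected properly.",
        "Check whether both ends send the service packets to the control board.",
        "Check whether the underlying-layer line cards receive packets from the control board.",
    ],
    ("Loss of Packet", "RDI"): [
        "Check whether NE Z sends packets to the main control board.",
        "Check whether NE A receives the packets sent from the main control board.",
    ],
    ("Loss of Packet", "No alarm"): [
        "Check whether NE Z sends services properly.",
    ],
    ("RDI", "No alarm"): [
        "Check whether NE Z sends the alarm suppression.",
        "Check whether the RDI alarm on NE A is reported by the line card.",
    ],
}


def suggest_checks(alarm_a: str, alarm_z: str) -> list[str]:
    """Return the checks for an alarm pair, or an empty list if no case matches."""
    if (alarm_a, alarm_z) in CES_ALARM_CASES:
        return CES_ALARM_CASES[(alarm_a, alarm_z)]
    # Assumption: a case also applies with the two ends swapped.
    mirrored = CES_ALARM_CASES.get((alarm_z, alarm_a))
    if mirrored is None:
        return []
    return [
        check.replace("NE A", "\0").replace("NE Z", "NE A").replace("\0", "NE Z")
        for check in mirrored
    ]
```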
TDM service troubleshooting tips:
Unlike packet services, TDM services are transferred continuously on the NNI side: the corresponding traffic is forwarded regardless of whether AIS is received on the network side. With the default EMS setting of 8 frames per packet, a TDM 2M service sends 1000 packets per second on the NNI.
It is computed as follows: 1 s / (125 us x 8) = 1000 pps
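The same arithmetic applies to other packetization settings. A minimal sketch, assuming one E1 frame every 125 us (8000 frames per second) and treating the EMS setting as the number of frames carried in each packet; the function names are illustrative.

```python
E1_FRAMES_PER_SECOND = 8000  # one E1 frame every 125 us
E1_FRAME_PERIOD_US = 125


def ces_packet_rate_pps(frames_per_packet: int) -> float:
    """Packets per second sent on the NNI for one TDM 2M service."""
    return E1_FRAMES_PER_SECOND / frames_per_packet


def ces_packetization_delay_us(frames_per_packet: int) -> int:
    """Time needed to fill one packet, in microseconds."""
    return E1_FRAME_PERIOD_US * frames_per_packet
```

With the default of 8 frames per packet this gives 1000 pps and a packetization delay of 1000 us.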
CSF alarms are common in TDM services. A CSF alarm means that the service has been created on the PTN side, but the signal sent from the TDM equipment to the PTN is invalid (AIS/LOS). The alarm is raised at the remote end.
For example:
[Figure: CSF example with a meter, NE1 and NE2. Labels: LOS/AIS/optical fiber unplugged on the client side; L=1 and all 1 in payload between NE1 and NE2; alarms shown: AIS, CSF.]
An example of RDI and Loss of Packet:
[Figure: meter, NE1 and NE2. Labels: Delete services; R=1 and all 1 in payload between NE1 and NE2; alarms shown: AIS, Loss of Packet.]
An example of R bit and L bit setting:
[Figure: meter, NE1 and NE2. Labels: Delete services; Packet loss; R=1, L=1 and all 1 in payload between NE1 and NE2; alarms shown: LOS (AIS), AIS.]
CES Service Performance
Degradation Handling Procedure
Start: CES service performance degrades.
1. Is the PDH-side performance count normal?
   Check whether AIS, UAS, and CV exist. Is the client-side fault eliminated?
2. Is the STM-1/E1 channelized board performance normal?
   Check whether a PW packet overflow/underflow count exists.
3. Is the TDM service clock recovery faulty?
   For the two situations that may occur when querying the underlying-layer clock, see the next page.
4. Is the fault eliminated?
   If yes, end. If not, contact the technical support personnel.
CES Service Performance Degradation
Handling Procedure
[Figure: self-adaptive clock recovery. Labels: Main Clock; Launch; packets (P); Packet Receiving Buffer; Read the data clock; Read the pointer; Controller; Set the upper limit; Set the lower limit; Read out packets; Difference of two clocks for sending TDM services.]
Core idea of the self-adaptive algorithm:
The left-side IWF device sends packets to the destination IWF device based on its own source clock. As shown in the figure above, the frequency for sending TDM signals is adjusted by setting the bucket depth.
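The idea described above can be illustrated with a simple threshold controller on the receive buffer. This is only a sketch under stated assumptions: the step size, the limits, and the function name are hypothetical, and the algorithm actually implemented by the equipment is not described in this material.

```python
def adjust_read_clock(fill: int, lower: int, upper: int,
                      freq_hz: float, step_hz: float = 1.0) -> float:
    """Return the new read-clock frequency for the current buffer fill level.

    fill, lower and upper are expressed in the same unit (for example packets).
    """
    if fill > upper:
        # Buffer is filling up: the sender clock is faster, so read faster.
        return freq_hz + step_hz
    if fill < lower:
        # Buffer is draining: the sender clock is slower, so read slower.
        return freq_hz - step_hz
    # Fill level is between the limits: keep the current frequency.
    return freq_hz
```

Calling the function once per control interval keeps the buffer fill between the two limits, so the recovered clock follows the source clock on average.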
System clock:
The system clock enables the service clock to send services based on the local clock of the PTN
equipment. To ensure that the client-edge E1 clocks are synchronized, the clocks in the PTN network
must be synchronized, and the PTN equipment must be synchronized with the service-side clocks.
CES Service Performance Degradation Handling Procedure
Clock mode: System clock
  Probable cause 1: Packet loss on the network side causes bit errors on the network side. Check whether CRC alarms occur on end-to-end links, whether the optical power is too low, or whether CPU usage is too high.
  Probable cause 2: The system clocks on NE A and NE Z are not synchronized.
    1) End-to-end clock synchronization is not locked on Ethernet.
    2) The network passes DWDM equipment (the DWDM equipment does not support OTU-2E), but only synchronous Ethernet frequency synchronization is configured among NEs.
  Probable cause 3: The client-side clock and the PTN system clock are not synchronized. The PTN system clock and the client clock have different clock sources. For example, client 2G, 3G and conference video services use a clock source different from that of the ZTE equipment.
Clock mode: Self-adaptive clock
  Probable cause 1: Packet loss on the network side causes severe recovered clock frequency drift
  Probable cause 2: Due to severe clock drift on the CLI side or NNI side, client clock signals cannot be
  Probable cause 3: For channelized boards, client services in the same clock domain may have different sources:
Underlying Layer Actions and Instructions on the Narrowband Fault Diagnosis Module
Step | Action Item | Handling Method | Handling Frequency
1 | Ensure that the main control board can receive and send PW packets properly. | Query the count of sent/received PW packets on the EMS or by command line. | High
2 | Ensure that the FPGA module on the narrowband line card sends and receives PWE3-CES packets properly. | Query the count of sent/received PWE3-CES packets on the EMS or by command line. | High
3 | Ensure that the narrowband chip LLP sends and receives packets properly. | Refer to the command set in the CES service fault diagnosis manual. | Low
4 | Ensure that this service is sent properly by the underlying driver. | |
L2VPN Static ETH Service Faults
GE
A B C D RNC
Base Station
GE
H G F E
Fault Type Probable Causes Probable Alarms

Client Side Ethernet connection failure; half-duplex events LINK DOWN/LOS/ negotiation half-duplex
LACP negotiation failure SG port member failure/SG port negotiation
DOWN
A large amount of protocol packets are received, causing CPU threshold-crossing/notification events of
CPU surge. a protocol
Network Side No protection is configured or protection switching failure TMC-LOC
after optical fiber interruption
Traffic congestion in a network segment causes severe Bandwidth usage threshold crossing.
packet loss
Too many CRCs in a link on the network cause severe CRC threshold crossing
packet loss
Abnormal service layer tunnel/PW TMP-LOC/TMC-LOC
Configuration Control word configuration inconsistency No alarm
Type
Abnormal VLAN processing and forwarding mode No alarm
Abnormal HUB/SPOKE mode setting No alarm
Damaged service configurations No alarm
For the ZXCTN 6100 series, damaged OAM No alarm
L2VPN Static ETH Service Troubleshooting
GE
A B C D RNC
Base Station
GE
H G F E

Equipment The line card is not in position on the equipment or line Card not in position or CPU usage threshold
card CPU usage is too high crossing
Clock sub-card on the equipment operates improperly Hardware clock sub-card alarm
Some line cards on the equipment operate improperly, No alarm; probably due to CRC threshold
causing packet loss or too long delay crossing or loss of packet threshold crossing
PW-FRR switching is abnormal No alarm
Other After traffic surge on the CIP side, the traffic is No alarm
broadcast to all PWs, causing instantaneous service
interruption.
Loopback occurs on the VFI service mode, MAC address flapping alarm
CIP/PW (1.10P 6k: VIP/CIP under TREE/LAN) MAC
address flapping
Single CIP/PW traffic is broadcast to CIP/PW, causing Port traffic threshold-crossing/no alarm
broadcast traffic burst and instantaneous interruption
or interruption of all services
The equipment sends part of protocol packets to the No alarm
CPU and these packets cannot be sent to the L2
channel.
During service deletion and association, the No alarm
L2VPN Static Service Fault Diagnosis Actions
Action Sub-Action Items Handling Methods
Items
Normal Ensure that the current AGENT Check whether the command line
configuration and ROS databases are configuration exists;
consistent
1 Alarm The client CIP is a physical port Check whether the physical port UP/DOWN
processing alarm frequently, and ensure that negotiation rates
on the UNI are consistent.
side
Client CIP is an SG port alarm Check whether LACP negotiation is abnormal
and check whether the configurations on two
ends of LACP are consistent.
2 Abnormal Check whether CRC packets of Check whether cable connection, optical
performance the line card port is increasing. power, and grounding are normal.
processing
Check whether traffic on the line Port traffic threshold crossing: check whether
on the UNI
card UNI port exceeds the broadcast packets exist on the client side.
side
threshold. CPU threshold crossing: check whether the
line card received abnormal packets, or
whether OAM vibrates frequently on the NNI
side of the line card. Use the information
collection tool to collect the information.
3 Service- MS-PW scenario: TMC- Check whether service-layer TMC is abnormal
layer alarm LB/CV/LT troubleshooting and query the previous troubleshooting
handling procedure procedure analysis
SS-PW scenario: TMC-CV/LB
troubleshooting procedure
Service-layer path line card Search hardware board through the
6263 1.1P Traffic Query Method
In ZXCTN 6263 (version1.1)
TC-PTNCH-2081(diag) #prj drv bcm mp vshow service srv=xx
L2VPN_ETH Service Packet Loss Handling Procedure
[Flowchart; each check is followed by its handling action]
Start.
Check whether CRC errors occur on the NNI-side link or whether optical power is too low. -> Handle physical-layer performance and switch services to improve the services.
Check whether a broadcast storm occurs on the NNI side or whether MAC address flapping occurs on the VPLS side. -> Handle the broadcast storm/service-side MAC address flapping and analyze the service-side network.
Check whether the speed is limited on the tunnel or PW. -> Remove the speed limit and check whether the services are improved.
Check whether CPU usage of the main control boards or line cards on the end-to-end path is too high. -> Extract the information and analyze the cause of high CPU usage.
Is the fault eliminated? If yes, end. If not, contact R&D personnel.
Packet Loss Errors
Remark (applies to the commands below): x is the line card slot number; Y is the chip unit, in the range of 0 to 5. For detailed command descriptions, refer to the ZXCTN Physical Interface and Hardware Troubleshooting Manuals.
Action 1: Packet loss on the NNI side
On PE1, ping all probable faulty spans that packets pass through to locate the nodes where the packets are lost. If CRC errors exist on these nodes, handle the CRC errors.
If no CRC errors exist, check whether speed threshold crossing exists in the fixed spans. If not, check whether packets are lost as a result of consecutive PING operations. If yes, use trace to locate the range of spans where packets are lost and use the following detailed steps to check whether the hardware is faulty.
Action 2: Viewing route oscillation and CPU threshold crossing on the platform
Use the command to perform ter mo, and check whether the CPU usage is too high and whether routes oscillate in the spans where packets are lost. Perform the notify check.
Action 3: Check faulty spans through querying the ZXCTN 9000 line card faults
prjexec bcm npc slot number unit 0 cmdname pe ------FE_2000 error
prjexec bcm npc slot number unit 1 cmdname pe ------QE_2000 error
prjexec bcm npc slot number unit 2 cmdname pe -----TME_2000 error
prjexec bcm npc slot number unit 3 cmdname pe ------For full-configured line cards, BCM0～5 need to be queried.
Prjexec bcm npc slot number unit 1 cmdname ps -------Query all sfi/sci bus status----check whether the connection between the line card and the main control board is normal.
Action 4: Query faulty spans (ZXCTN 6000 line card statistics)
1.1ver: prj drv np x showoamcount y: y is a port number. Check whether the preambles are correct.
2.0ver: prj drv np x show port y
Action 5: Query faulty spans by ZXCTN 6000 main control card statistics
1.1ver: prj drv bcm mp panelinfoshow x
Prj drv bcm mp show c yy
2.0ver: prj drv bcm mp panelinfoshow x
Prj drv mp bcm show c yy
Action 6: Is the line card often off-line?
If the line card is often off-line, contact R&D personnel for locating the causes.
Viewing PW/CIP Traffic Statistics for L2VPN/PWE3-
CES Static Services
PE 1 PE 3
P1 PE 2 P2 CE2
CE1
SPE
Steps NE CIP/PWE3-CES/PW Traffic Remark
Forwarding Query
1 PE 1 Query services from PE1 to ZXCTN 6263 2.0 supports statistics on VFI CIP
PE3: query CIP for the traffic but not on VLL traffic. The ZXCTN 9000
Ethernet services and query supports both of them. Before querying the traffic,
PWE3-CES traffic for CIP ensure that the outer tunnel is normal.
traffic. You are recommended to query packet
sending/receiving status on the UNI physical port .
2 PE 1 Query ingress and egress Both the ZXCTN 6000 2.0/ZXCTN9000 support
PW traffic from PE1 to PE2. PW traffic statistics. The ZXCTN 6000 6263 1.1
supports statistics on Ethernet traffic, but not statistics
on CES traffic. The ZXCTN 61 1.1 supports
real-time statistics on PW.
3 PE Query ingress and egress The ZXCTN 6263 2.0 supports statistics on
2-1 PW traffic from PE2 to PE1. received MS-PW traffic, but not on the sent MS-
PW traffic.
4 PE Query ingress and egress
The ZXCTN 9000 supports both the received and
2-2 PW traffic from PE2 to PE3.
the sent MS-PW traffic.
5 PE 3 Same as PE1
Probable Fault Analysis on L2VPN+L3VPN
GE
A B C D RNC
Base Station
GE
H G F E
Probable Fault Probable Faults Probable Alarms Phase When

Type Faults Occur
Client side DOWN/CRC threshold-crossing on the client LINK DOWN Provisioning/maint

interface enance
CE-side routes are not set or set incorrectly Dynamic routing neighbor is Provisioning/maint
not UP enance
The line card on the UNI received too many CPU threshold crossing, and Maintenance
protocol packets, causing CPU rise and cannot too many packets of a
learn client-side ARP packets protocol exist
Network side After the optical fiber is broken, protection or TMP-LOC, VPN-FRR Maintenance
tunnel is not configured or VPN-FRR protection switching event
switching is abnormal.
Congestion on a network segment causes Bandwidth threshold-crossing Maintenance
severe packet loss
Too many CRCs on a segment of link cause CRC threshold crossing Maintenance
severe packet loss
Equipment Line card not in position or line card CPU Board not in position Maintenance
usage too high
Clock sub-card on the equipment operates Clock sub-card hardware fault Maintenance
improperly
Probable Fault Analysis on L2VPN+L3VPN
GE
A B C D RNC
Base Station
GE
H G F E
Probable Probable Faults Probable Alarms Phase When

Fault Type Faults Occur
Configurations BGP and VRF configurations are incorrect and the BGP PEER DOWN Provisioning/maint
routes cannot be distributed correctly. enance
BGP faults include the source address not being configured
as a loopback address and route distribution errors;
Label assignment and other parameters in VRF are
incomplete.
Outer tunnel interfaces are not configured and traffic No alarm Provisioning
cannot be forwarded.
VPN-FRR switchback time is not configured, causing No alarm Provisioning/maint
service interruption during immediate service switchback. enance
L3VPN static routes are incomplete, and the correct No alarm Maintenance
routes are deleted by mistake (static L3VPN)
L2 service TMC loc causes interruption after switchover, No alarm Provisioning/maint
and the standby bridging point is not configured with enance
static ARP, or standby PW tunnel is abnormal.
L3 service VPN switching causes failures, such as item No alarm Provisioning/maint
switching and platform not switching. enance
Tunnel interfaces and other configurations are deleted, No alarm Provisioning/maint
Probable Fault Analysis on L2+L3 ETH Service (3)
GE
A B C D RNC
Base Station
GE
H G F E
Probable Causes of Base Station Interruption/Packet Loss, by Fault Type
Causes under different bridging points/terminal points:
One base station, one network segment scenario:
1. Check whether the base stations pass the same intermediate node, because a hardware fault on that node may cause a forwarding fault. Check hardware alarms in the entire network.
2. Check public paths of the base stations and public links. Ping the public network IP to check whether the links are normal. Check the CRC count in the entire network and abnormal optical power items, or use PING packets to query and locate faulty nodes and faulty links.
Multiple base stations, one network segment scenario:
3. Check whether these base stations belong to the same RNC, and check whether forwarding of the interconnection line card from the terminal to the RNC is normal. The RNC line card may be faulty.
4. The routing protocol interconnection between the core terminal and RAN-CE is faulty.
All the above information is included:
5. Some core aggregate NEs receive a large
Causes under the same bridging point/terminal point:
1. The VPN-FRR is triggered in the downlink direction but the standby bridging point does not have ARP information on the base station, causing service interruption.
2. The bridging point/terminal forwards the main control switching data improperly or the NNI line card operates improperly. Switch over to view the hardware status information.
3. For the association problem, refer to the above page.
4. The routing redistribution function is deleted. Check the configuration and operation logs.
5. The core terminal and bridging point tunnel is down or link packet loss causes the fault. Check whether the association function is configured and check the network-wide CRC count.
Static L3VPN_ETH Service Fault Handling Procedure
(China Mobile LTE)
[Flowchart; checks in reading order. After each handling action, check whether the fault is eliminated.]
Start: static L3VPN service forwarding fails.
Are UNI alarms raised? -> Handle interface faults or CRC errors.
Can VRF PING ping through the remote CE-connected interface address on the PE side? -> Check and restore local CE-side configurations.
Check and correct the underlying VRF support and the driver routing table.
Are VRF routes formed? Are VRF routes formed on the remote PE? -> Check service-layer TP tunnel forwarding status/query the underlying routing table.
Are static global routes and ingress/egress labels in VRF configured properly? -> Ensure that VRF GLOBAL routing configurations and VRF private-network ingress/egress labels are correct.
Is the correct TP tunnel interface-based?
Is the fault eliminated? If yes, end. If not, contact R&D personnel.
Dynamic L3VPN_ETH Service Troubleshooting Procedure
(China Unicom/China Mobile TP Scenario)
[Flowchart; checks in reading order. After each handling action, check whether the fault is eliminated.]
Start: L3VPN service forwarding fails.
Are UNI alarms raised? -> Handle interface faults or CRC errors.
Can VRF PING ping through the remote CE-connected interface address on the PE side? -> Check and restore local CE-side configurations.
Check and correct the underlying VRF support and the driver routing table.
Are VRF routes formed? Are VRF routes formed on the remote PE? -> Check outer TP tunnel forwarding status.
Is the outer TP tunnel status interface-based? Is the peer included in the tunnel in the VRF, or is the tunnel route to the remote PE specified in global?
Are BGP VPNV4 routes formed? -> Are the BGP configurations at the two PE nodes, VRF RT configuration, PE/CE-side routes, and BGP label configurations correct?
Is the BGP connection established? -> Does the CPU receive and send packets properly? Check the packets sent to the CPU and analyze the fault.
Is the route to the remote BGP neighbor address reachable (is ping successful)? -> If not, analyze IGP routes. If yes, check whether the BGP configurations at the two ends are correct.
Is the fault eliminated? If yes, end.
L3VPN Query and Diagnosis Commands
9004-5#show ip vrf de LTE //This command is applicable to the ZXCTN 9000
VPN LTE; default RD 65412: 1
Flow Statistic:
InPktsHigh : 0 InBytesHigh : 167
InPkts : 717201209 InBytes : 2810475404
OutPktsHigh : 0 OutBytesHigh : 172
OutPkts : 736344507 OutBytes : 3499807824
Interfaces:
l3access1
Connected addresses are not in global routing table
Export VPN route-target communities
65412: 1
Import VPN route-target communities
65412: 1
No import route-map
No export route-map
Route warning limit 4294967295, current count 3
VRF label allocation mode: per-prefix
No static outlabel configuration
Static tunnel configuration:
10.0.0.4 5
10.0.0.3 4
GG-6500-4(config) #show l3vpn perfvalue wlbl3vpn // This command is applicable to the ZXCTN 6500
Last Clear Time : 2013-08-06 13: 37: 24 Last Refresh Time: 2013-08-06 15: 42: 40
120s input rate : 6030416Bps 109413Pps
120s output rate: 5808764Bps 109413Pps
In_Bytes 37050962844 In_Packets 673663664
E_Bytes 35703562828 E_Packets 673660644
L3VPN Query and Diagnosis Commands
9008-1#show ip route vrf zzz //This command applies to ZXCTN6000/ZXCTN9000, check the vrf routes
Total number of routes: 4
IPv4 Routing Table:
Dest Mask Gw Interface Owner Pri Metric
0.0.0.0 0.0.0.0 0.0.0.0 null1 bgp 254 8
10.0.0.0 255.255.255.0 10.0.0.5 vlan1101 direct 0 0
10.0.0.5 255.255.255.255 10.0.0.5 vlan1101 address 0 0
10.2.0.0 255.255.255.0 172.2.100.3 tunnel5 static 1 0
9008-1#show ip protocol routing vrf zzz //General

Routes of VPN:
Status codes: *-valid, >-best, s-stale
Dest NextHop Intag Outtag RtPrf Protocol

*> 0.0.0.0/0 0.0.0.0 45066 notag 254 special
*> 10.0.0.0/24 10.0.0.5 45066 notag 0 connected
*> 10.0.0.5/32 10.0.0.5 45066 notag 0 connected
*> 10.2.0.0/24 172.2.100.3 45066 45059 1 static
5
show ip vrf interface // Check the VRF interface status and information. This command applies to the
ZXCTN6000/ZXCTN9000.
9004-1(config) #sh ip vrf interfaces 1
Interface IP-Address VRF Protocol
vlan100 100.1.1.1 1 up
GG-6500-4(config) #show ip forwarding route vrf wlbl3vpn // This command applies to the ZXCTN 6500, and
check VRF routes.
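When many VRF routes must be checked, the output of show ip route vrf can be pasted into a small script. The following is a minimal sketch that assumes the seven-column layout shown above (Dest, Mask, Gw, Interface, Owner, Pri, Metric). The function names are hypothetical, and the longest-prefix lookup is only an approximation of forwarding; it does not model the equipment's priority handling.

```python
import ipaddress

# Rows copied from the "show ip route vrf zzz" example above.
SAMPLE = """\
Dest Mask Gw Interface Owner Pri Metric
0.0.0.0 0.0.0.0 0.0.0.0 null1 bgp 254 8
10.0.0.0 255.255.255.0 10.0.0.5 vlan1101 direct 0 0
10.0.0.5 255.255.255.255 10.0.0.5 vlan1101 address 0 0
10.2.0.0 255.255.255.0 172.2.100.3 tunnel5 static 1 0
"""


def parse_routes(text):
    """Parse route rows; header and malformed lines are skipped."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:
            continue
        dest, mask, gateway, interface, owner, pri, metric = fields
        try:
            network = ipaddress.ip_network(f"{dest}/{mask}")
            priority, cost = int(pri), int(metric)
        except ValueError:
            continue  # not a route row (for example the header line)
        routes.append({"network": network, "gateway": gateway,
                       "interface": interface, "owner": owner,
                       "priority": priority, "metric": cost})
    return routes


def lookup(routes, address):
    """Return the longest-prefix match for an address, or None."""
    addr = ipaddress.ip_address(address)
    matches = [r for r in routes if addr in r["network"]]
    if not matches:
        return None
    return max(matches, key=lambda r: r["network"].prefixlen)


routes = parse_routes(SAMPLE)
hit = lookup(routes, "10.2.0.9")
print(hit["interface"], hit["owner"])
```

With the sample rows, an address in 10.2.0.0/24 resolves to the static route over tunnel5, which is the outer tunnel interface whose status should then be checked.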
ZXCTN L2+L3VPN Query Steps
PE 1
P1 P2 PE 3
CE1 CE2
L2L3 bridging point
PE 2
Steps Actions Executable Actions

1 Query received/sent PW packets on PW traffic statistics
PE1
2 Query sent/received PW packets on PW traffic statistics
PE 2; for VFI, check VPLS MAC
learning situations
3 Query ARP learning of virtual layer 3 Ping the base station IP address from the bridging
interfaces/sub-interface corresponding point gateway IP address. The backup IP address
to the bridging point on PE2 cannot be pinged.
4 Check tunnel interface status of VRF Ensure that the outer tunnel interface is normal.
route corresponding to PE 2 on the On the bridging point, PING/TRACE IP address of
L3VPN. interfaces of PE 3 and CE2 or RNC logical
Routes obtained by BGP VPNV4 addresses.
corresponding to PE2 Check BGP VPNV4 to obtain routes and BGP link
establishment.
5 VRF routes corresponding to PE3 Ensure that the outer tunnel interface is normal.
Routes obtained by BGP VPNV4 PE 3 uses the IP address of the interface
corresponding to PE3 connecting to RNC as the source IP to PING the IP
address of the base station.
Common MCC/DCN Monitoring Failure Troubleshooting Procedure
[Flowchart; boxes in reading order]
Start: analyze the MCC monitoring failure.
PING/TELNET the NEs.
Take the NE offline and bring it online again.
Check whether the NE type and the version created on the EMS are consistent with the actual version of the NE.
On the EMS, tracert the IP address of the offline NE and observe which hop is faulty.
PING/telnet the directly connected IP address of the last reachable node.
Enter the NE to check whether the routing table has routes to the IP address of the EMS; check whether OSPF routes are correct and whether IP configuration conflicts exist on the network.
Run the show lldp en command to check whether neighbors can be discovered; use the show ip ospf nei command to check whether OSPF neighbors can be established and whether the current ROUTER ID is the IP address of the corresponding interface or the loopback address.
Check line card CPU usage and sent/received packets on the NNI port.
Switch over the main control boards.
Is the fault eliminated? If not, contact R&D personnel.
DCN Diagnosis Commands (ZXCTN6000)
For the detailed diagnosis, refer to the Fast Provisioning Guide.
Step | Action | Executable Action
1 | L2 status of the port | In configuration mode, enter Dcn show l2
2 | Query the route neighbor status | In hiding mode, enter diag exec mp ma cmd show_dcn_ospf_neighbor
3 | Query the route protocol LSDB | Enter sho ip ospf database to check the routing process 65536
4 | Check underlying routing table status in global mode | In hiding mode, enter diag exec mp ma cmd rt_list 8198 to check the underlying DCN routing status
5 | Check the OSPF configurations sent from the underlying layer of the platform | In diagnosis mode, enter diag mode mp ma showdcnconfig
6 | Check whether the port sends PPP packets | In common mode, enter debug ppp pa
DCN Commands (ZXCTN 6500)
ZXR10# show dcnbaseinfo
global: 1
qxip qxipmask qxmac qxospfenable qxflood
qxospfarea
0.0.0.0 0.0.0.0 0001.0200.0e01 0 0 0.0.0.0
mngip mngipsubfix ospfarea
198.2.1.154 255.255.255.255 2.0.0.0
show dcnxxx:
dcnfib dcnl2 dcnl3 dcnnbrglobal dcnnbrl2 dcnnbrl3
dcnnbrmngip dcnnbrqx dcnnbrroutaggr dcnnbrstatirout
dcnportstatus dcnroutagg dcnstaticroute dcntopoInfo
DCN Commands(ZXCTN 6500)
ZXR10#show dcnfib //Check the DCN routing table
total: 7
desip desipmask nexthopip mac slot port
198.2.1.154 255.255.255.255 198.2.1.154 0000.0000.0000 0 0
198.2.1.155 255.255.255.255 198.2.1.155 00d8.8040.c00a 10 3
ZXR10#show dcnportstatus //DCN port status
DCN L2 port type: dcn_pos(N) - DCN POS interface, dcn_eth(N) - DCN ETHernet
dcn L3 interface ip address type: U - unnumbered, N - numbered
L3 cfg Base: B - L3 cfg base port, NB - L3 cfg not base port
ifname dcnl2port isenable/vlan/band/pri dcnl3port ip addr/ipmask/remoteip/ospfarea/dt
xgei-1/10/0/1 dcn_pos2/down/NB enable/4094/10M/7 --- ---/---/---/---/0
xgei-1/10/0/2 dcn_pos3/down/NB enable/4094/10M/7 --- ---/---/---/---/0
xgei-1/10/0/3 dcn_pos4/up/NB enable/4094/10M/7 dcn_mcc66/up/U 198.2.1.154/255.255.255.255/198.2.1.155/2.0.0.0/3
ZXR10#show dcnl2 xgei-1/10/0/3 //Configurations of DCN port l2
interface portenable mac vlanid band
xgei-1/10/0/3 1 0000.0000.0000 4094 10
Clock Synchronization Troubleshooting Procedure
[Flowchart; boxes in reading order]
Start: analyze the clock fault.
Check whether the clock forms a loop:
1. Check whether any sites extract the clock from each other (current clock status).
2. Check whether active/standby clock sources exist. If yes, use the customized SSM algorithm.
3. If a large number of crystal oscillator aging or phase-locked loop alarms are raised and the clock loop needs to be removed, stop extracting the line clock on one site in the loop and use the local oscillator clock.
The clock cannot trace the upstream:
Check whether the SSM algorithms on both ends are consistent. Check whether SSM is enabled on one end but not on the other.
If the external clock is extracted, check whether the external clock needs to support SSM. If PPS is extracted, check whether PPS has signal input.
Is the fault eliminated? If yes, end. If not, contact R&D personnel.
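The first check in the procedure (sites that extract the clock from each other) amounts to finding a cycle in the clock-tracing relationships. The following is a minimal sketch of that check. The NE names and the dictionary layout are hypothetical; in practice the tracing data would come from the current clock status queried on each NE.

```python
def find_clock_loop(traces):
    """Return the NEs that form a clock-tracing loop, or [] if none.

    `traces` maps each NE to the NE whose line clock it extracts.
    None means the NE uses an external or local oscillator clock.
    """
    for start in traces:
        seen = []
        node = start
        # Follow the tracing chain until it ends or revisits an NE.
        while node is not None and node not in seen:
            seen.append(node)
            node = traces.get(node)
        if node is not None:
            return seen[seen.index(node):]
    return []


# Hypothetical example: NE-B and NE-C extract the clock from each other.
traces = {"NE-A": None, "NE-B": "NE-C", "NE-C": "NE-B", "NE-D": "NE-A"}
print(find_clock_loop(traces))
```

Once a loop is found, setting one NE in the returned list to the local oscillator clock, as the procedure describes, breaks the cycle.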
Clock Module Diagnosis Commands
Steps Actions Sub-Actions
1 Failed to extract external The SSM byte formats are inconsistent. Set to not trust SSM
clock and synchronize the non-framing mode.
1588 frequency
2 The synchronous Ethernet Check whether the extraction priority is set between two-end
clock cannot be locked. nodes and whether to check SSM priority.
Check whether the port obtains ESMC packets
3 The1588 frequency Ensure that BMC priority is set properly on the two-end NEs.
restoration clock cannot be
locked or is frequently Ensure that the frequency for sending packet of MASTER and
unlocked SLAVE meets the requirements; ensure that packet sending
mode (layer 2 multicast/layer 3 multicast/layer 3 multicast) at
two-end NEs meet the equipment requirements.
In case of long-term loss of lock, check whether the clock
subcard on the slave is a Synchronization Supply Unit (SSU). The
node with 1588 frequency restoration needs to use the SSU,
and the oscillator frequency deviation must be within ±0.05 ppm.
4 A large amount of nodes It is generally caused by clock loopback. Change a core NE to
report clock source status local oscillator mode and check whether the fault is eliminated.
failure
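To give a feel for the 0.05 ppm oscillator tolerance in step 3, the arithmetic below converts it to an absolute frequency offset on a 2.048 MHz E1 clock and to the time needed to drift by one 125 µs frame. This is a back-of-the-envelope sketch; the function names are hypothetical, and the one-frame drift interval assumes a constant worst-case offset.

```python
def max_offset_hz(nominal_hz, tolerance_ppm):
    """Largest frequency offset allowed by a ppm tolerance."""
    return nominal_hz * tolerance_ppm / 1_000_000


def frame_drift_interval_s(tolerance_ppm, frame_s=125e-6):
    """Seconds until a constant offset accumulates one frame of drift."""
    return frame_s / (tolerance_ppm / 1_000_000)


E1_HZ = 2_048_000  # nominal E1 line rate

print(f"offset: {max_offset_hz(E1_HZ, 0.05):.4f} Hz")
print(f"one-frame drift after: {frame_drift_interval_s(0.05):.0f} s")
```

At 0.05 ppm the offset is about 0.1 Hz on an E1 clock, so a full frame of drift takes roughly 2500 seconds to accumulate. A free-running oscillator with a much larger deviation would drift by one frame correspondingly faster.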
Review
 How to handle L2VPN service interruption?

 How to handle L3VPN service interruption?
 How to fast locate faults on the EMS?
Example 1: Hybrid Dedicated Line Services
Instantaneously Interrupted
[Figure: network topology. Backbone ring, PTN2-1, router HW9306, and a chain of four access sites including the People’s Hospital. About 50 Internet users.]
 A hybrid dedicated service is provided from PTN2 to each of the four sites
in the chain (the central office has a VLAN; the access sites do not have VLAN
services). The service ports on the central office are aggregate ports. Fault
phenomena: when users in the People’s Hospital use a download tool, a large
number of packets are lost. In addition, the four sites report tunnel Loss of
Connectivity (LOC) alarms and PW Service Signal Failure (SSF) alarms.
Example 1: Hybrid Dedicated Service Interruption
 When XunLei (Fast Thunder, a download manager) is used for
download, the network traffic surges, indicating that faults are related
to network traffic. There are two probable causes:
 1 The port reaches the peak rate, causing congestion and packet loss;
 2 Because the speed limit is configured, if traffic exceeds the speed limit,
packets will be lost. Check whether a speed limit or shaping is configured
by viewing traffic threshold limit alarms or the configurations on all the
ports. It is found that the speed limit configured on the intermediate port
causes the fault.
 Summary:
 This is a typical fault that occurs only when network traffic is heavy; it
is caused by packet loss due to congestion. Congestion occurs in the following
two cases: 1 traffic exceeds the port rate; 2 traffic exceeds a
configured port or PW speed limit.
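The second case (a configured speed limit) can be illustrated with a token-bucket policer. This is a generic sketch, not the ZXCTN implementation: the function name, the 1 Mbit/s limit, the 3000-byte burst size, and the traffic patterns are all hypothetical values.

```python
def police(packets, rate_bps, burst_bytes):
    """Apply a token-bucket speed limit to a packet sequence.

    `packets` is a list of (time_s, size_bytes) sorted by time.
    Returns (forwarded, dropped) packet counts.
    """
    tokens = float(burst_bytes)   # bucket starts full
    last = 0.0
    forwarded = dropped = 0
    for time_s, size in packets:
        # Refill at the configured rate, capped at the burst size.
        tokens = min(burst_bytes, tokens + (time_s - last) * rate_bps / 8)
        last = time_s
        if size <= tokens:
            tokens -= size
            forwarded += 1
        else:
            dropped += 1          # traffic exceeds the speed limit
    return forwarded, dropped


# Hypothetical traffic: 1000-byte packets against a 1 Mbit/s limit.
normal = [(i * 0.010, 1000) for i in range(100)]    # about 0.8 Mbit/s
download = [(i * 0.001, 1000) for i in range(100)]  # about 8 Mbit/s

print("normal:", police(normal, 1_000_000, 3000))
print("download:", police(download, 1_000_000, 3000))
```

Under the normal load nothing is dropped, while the download burst loses most of its packets. This matches the symptom in this example, where the fault appears only when a download tool drives the traffic above the limit configured on the intermediate port.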
Example 2- 2G Service Error of an Operator
 【Fault Phenomena】
 In the ZXCTN6200 equipment on Wudangxincun site of an operator, six 2M
services have errors. The blue line is the main link and the yellow line is the
standby link. Both the active link and the standby link have bit errors.
 【Location Procedure】
 Ensure that the CRC and optical power on the physical link are normal.
 Use the LM function and it is found that some packets are lost. The packet loss is
0.4%. The intermediate link may be faulty.
 Use the meter on site to perform the test. To send packets different from the on-
site service packets, the meter sends 5000 jumbo frames of 3000 bytes.
Example 2- 2G Service Error of an Operator
 【 Location Procedure 】
 Use the following commands to check whether the 5000 jumbo packets of 3000 bytes are received and sent
properly. Check the nodes in the link one by one to determine the node where the packets are lost.
 It is found that Xurijingcheng loop 13 receives 4987 jumbo frames of 3000 bytes, so 13 packets are
lost. This link is faulty.
 Use the similar method to check the standby link.
[Figure callouts on the command output]
GE44 is the internal interface ID. In diagnosis mode, run 107(B) (diag) #prjexec drv bcm-shell mp panelinfoshow 3 //3 is the line card slot.
Port gei_3/2, whose logical port number is “ge45”, is the port where “Mufushan DXC” and “Xurijingchen13” connect. The packet statistics show that the count of packets sent and received on that port is consistent with the count of packets sent by the Smartbits meter.
Note: Port 4095 collects the count of 3000 bytes. The services on the current network rarely use packets larger than 3000 bytes. The packets of 5000 bytes are collected to GR9216.
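The hop-by-hop comparison can be sketched as follows. The counts 5000 and 4987 come from this example; the function name and the names of the intermediate hops are hypothetical, and the per-hop counters would be read manually from the panelinfoshow output on each node.

```python
def first_lossy_hop(sent, hops):
    """Return (hop_name, lost_count) for the first hop whose received
    count is lower than the count seen at the previous hop, or None."""
    previous = sent
    for name, received in hops:
        if received < previous:
            return name, previous - received
        previous = received
    return None


# Test packets injected by the meter and counted at each node in order.
SENT = 5000
hops = [
    ("hop-1 (hypothetical)", 5000),
    ("hop-2 (hypothetical)", 5000),
    ("Xurijingcheng loop 13", 4987),
]

name, lost = first_lossy_hop(SENT, hops)
print(f"{name}: {lost} packets lost ({lost / SENT:.2%})")
```

Using a packet size that live services rarely use, as in this example, keeps the test packets distinguishable from service traffic in the per-size counters.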
Example 2-2G Service Error of an Operator
 【Summary】
 This method is complicated and is used on site. There is another software
testing solution:
 Configure an EVPL service in VLAN ALL mode. Two devices are configured
in different VLANs. Terminal-side loopback is configured on the remote
equipment cip.
 Use the CPU tx command to replace the Smartbits meter to send packets.
Run the clear c command to count the chips of all ports in the link. Run the
clear c command in diagnosis mode instead of command line interface
mode.
 Send access packets corresponding to the VLAN to the CIP loopback port.
After loopback on the port, these packets are sent to the peer PE as service
packets. In version 1.1, enable the bcm diagnosis command switch. Run the
ssp_spy_safe_mode_en 0 command to use the tx command to send
packets. Port, length, VLAN, packet count can be specified. Run the p d b m
tx 1000 pbm=ge6 Length=3000 Vlantag=100 command to send 1000
packets whose length is 3000 bytes and VLAN is 100. After packets are sent,
run ssp_spy_safe_mode_en 1 to restore the bcm diagnosis command
switch.
 Check the sent and received packet counts on the ports in the entire link to determine the
port or link where packets are lost. After the fault is located, restore the on-site
configuration.
Example 3-Line Clock Alarm Raised on the
Wireless Side (TDM)
 Fault Phenomena :
 The wireless base station is upgraded from SDRV7.11.10.08P06 to
SDRV7.11.12.14P06. The “wireless line clock” alarms are raised on multiple sites.
Changes in the wireless version and the wireless clock alarm mechanism: the new
SDR base station version fixes a code defect in the earlier version, which could not
report the line clock fault alarm properly. The alarm is therefore raised
after the version update.
 PTN Handling and Analysis

 The probable causes of the clock fault include:
 The clock fault may be caused by network clock deployment fault.
 The link quality causes clock signal degradation.
 The clock board software or hardware of the PTN equipment is faulty.
 Locate the Faults:

 Ensure the network clock synchronization deployment schema. Ensure that
the OTN equipment supports network-wide clock synchronization.
 Ensure that the current network uses adaptive clock mode. In accordance
with planning recommendation and requirements, the network-wide clock can
be optimized as the sending clock.
 Check whether any link quality problem exists in the entire network. Check
whether CRC or any optical link alarm exists in the entire network through the
Example 3-Line Clock Fault Alarm
Reported on the Wireless Side
 Summary
 Both the adaptive clock and system clock of PTN E1 service can
meet the wireless network application requirements.
 Due to constraints of the implementation plan, the adaptive clock is
prone to be affected by network conditions, such as congestion,
delay and jitter, and it is sensitive to network packet loss. When
packets are lost in the NNI-side links, adaptive clock may be out of
lock, causing an unstable clock and large instantaneous jitter.
 Because the system clock is less affected by network quality, the
clock restoration frequency is stable, meeting network clock transfer
synchronization requirements. Before deployment of system clock,
ensure that the equipment, such as DWDM, supports clock transfer,
because incorrect system clock may cause more severe clock
quality faults.
Example 4- ZXCTN 6500-All Base Station Services in
Single Network are Instantaneously Interrupted in Tree
Service Model
 Fault phenomena:
 Tree service model: a UNI port on a leaf node forms a self-loop. All services in the sub-network are
interrupted instantaneously.
[Figure: core network equipment, PE1 and PE2 (L3VPN), and bridging nodes PE3 and PE4. The enlarged view of PE4 shows a VRF, an L3ULEI sub-interface, bridging, and an L2ULEI sub-interface in an E-Tree (EVPtree) service instance with PW1 to PW4. The PWs lead to ZXCTN 6000 NEs 6000-1 to 6000-4, which serve base stations 1 to 4 (100.92.3.8, 100.92.3.9, 100.92.3.10, 100.92.3.11).]
 Notes on the figure:
 PE3 and PE4 have similar internal configurations. The difference is that the real IP address of the L3ULEI sub-interface is 100.92.3.3 on PE3 and 100.92.3.2 on PE4. The virtual IP of VRRP is 100.92.3.1 on both.
 On the ZXCTN 6000, although the service is one-to-one, the default end-to-end configuration on the EMS is based on Tree mode: the PW is the root and the UNI port is the leaf.
 ETREE service instance: the L2ULEI sub-interface is the root, and the PWs are leaves.
Example 4- All Services in Single Sub-Network are
Instantaneously Interrupted in Tree Service Model
 Fault Cause Analysis:
 Figure (labels translated from Chinese): in the Tree instance, the L2 Ulei is the root and is connected to the L3 Ulei in the VRF (layer 3 interface MAC: MACa). The leaves are PW1 (normal), PW2 (normal), and PW3 (loopback).
 1. The L3 Ulei sends a packet to the PW side, for example, an ARP request packet or another unicast packet. The source MAC address of the packet is the MAC address of the L3 Ulei (MACa).
 2. After passing the L2 Ulei, the packet is forwarded in the Tree service instance. If it is sent to PW3, the source MAC address of the packet received on PW3 is MACa, because a loopback exists at the remote end of PW3.
 3. MAC learning is performed in the Tree service instance, and MACa is learned on the L2 Ulei port. At this point, packets sent from PW1/PW2 to the Tree instance with destination MAC address MACa are switched to the layer 3 interface normally.
 4. After passing the L2 Ulei, the packet is sent from the Tree service instance to PW3. Because a loopback exists at the remote end of PW3, the source MAC address of the packet received on PW3 is MACa.
 5. MAC learning is performed in the Tree service instance, and MACa is learned on the virtual port corresponding to PW3.
 6. A packet is now received from PW1 with payload DMAC MACa. Normally, the L2 switching table lookup would send the packet to the L2 Ulei port. Because of the remote loopback on PW3, the lookup sends the packet to the PW3 port instead, and the packet is dropped because of split horizon between PW1 and PW3.
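The six steps above can be reproduced with a toy model of MAC learning. The following is a minimal sketch, not the equipment's actual forwarding logic: the class `TreeInstance`, the port names, and the address `MAC_bs1` are hypothetical, and the only rules modeled are source-MAC learning, flooding of unknown destinations, and split horizon between leaf PWs.

```python
ROOT = "L2_Ulei"                 # root port toward the L3 interface (hypothetical name)
LEAVES = {"PW1", "PW2", "PW3"}   # leaf ports of the Tree instance
MAC_A = "MACa"                   # MAC address of the L3 Ulei interface
BROADCAST = "ff:ff:ff:ff:ff:ff"


class TreeInstance:
    def __init__(self):
        self.mac_table = {}      # MAC address -> port it was last learned on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC, then return the sorted list of output ports."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown or broadcast destination: flood to every other port
            candidates = ({ROOT} | LEAVES) - {in_port}
        else:
            candidates = {out_port} - {in_port}
        if in_port in LEAVES:
            # Split horizon: a leaf never forwards to another leaf
            candidates -= LEAVES
        return sorted(candidates)


def simulate(loop_on_pw3):
    tree = TreeInstance()
    # Step 1: the L3 Ulei sends an ARP request with source MACa
    flooded_to = tree.receive(ROOT, MAC_A, BROADCAST)
    # Steps 4-5: the remote loopback on PW3 returns the same frame
    if loop_on_pw3 and "PW3" in flooded_to:
        tree.receive("PW3", MAC_A, BROADCAST)
    learned_on = tree.mac_table[MAC_A]
    # Step 6: a base station behind PW1 sends a frame to MACa
    return learned_on, tree.receive("PW1", "MAC_bs1", MAC_A)


print(simulate(loop_on_pw3=False))  # MACa stays on the root; the frame reaches L2_Ulei
print(simulate(loop_on_pw3=True))   # MACa moves to PW3; the frame has no output port
```

In the looped case the output port list is empty, which corresponds to the drop described in step 6.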
Other Failures in Provisioning ZXCTN 6500 LTE Services
 VLAN translation configuration errors prevent the
ZXCTN 6500 on the bridging node from pinging the
base stations:
 Because VLAN translation is not configured on the ZXCTN 6500,
packets sent from the ZXCTN 6500 NNI still carry the VLAN of the service UNI
and are dropped on the equipment.
 Handling method: On the ZXCTN 6500, clear the “keep ingress
VLAN” check box, and then ping the IP address of the base station. If the
ping succeeds, the service is
provisioned properly.
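The effect of the “keep ingress VLAN” option can be illustrated with a simplified model. This is only a sketch under stated assumptions: the function names, the VLAN IDs 100 and 200, and the rule that the receiving side accepts only its expected VLAN are all hypothetical and are not taken from the equipment documentation.

```python
def egress_vlan(uni_vlan, keep_ingress_vlan, translated_vlan):
    """Return the VLAN ID carried by the frame leaving the NNI.

    Assumption: with "keep ingress VLAN" selected, the UNI VLAN is kept
    unchanged; with it cleared, the configured translation is applied.
    """
    return uni_vlan if keep_ingress_vlan else translated_vlan


def peer_accepts(frame_vlan, expected_vlan):
    """Assumption: the receiving side drops frames with an unexpected VLAN."""
    return frame_vlan == expected_vlan


UNI_VLAN = 100        # hypothetical VLAN of the service UNI
EXPECTED_VLAN = 200   # hypothetical VLAN expected on the receiving side

for keep in (True, False):
    vlan = egress_vlan(UNI_VLAN, keep, EXPECTED_VLAN)
    result = "forwarded" if peer_accepts(vlan, EXPECTED_VLAN) else "dropped"
    print(f"keep ingress VLAN={keep}: frame VLAN {vlan} is {result}")
```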
 The PW control word is not enabled globally on the access
equipment, causing service failure:
 The PW control word is not enabled globally on the access equipment
but is enabled on the ZXCTN 6500, so the PW control word settings
at the two ends are inconsistent, causing packet forwarding failure.
The CIP and VIP learn incorrect MAC. The MAC address is
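One way to see why a control word mismatch leads to incorrect MAC learning: the generic PW control word (RFC 4385) is a 4-byte field placed in front of the encapsulated Ethernet frame. If only one end uses it, the other end reads the MAC addresses at the wrong offset. The sketch below assumes an all-zero control word (no sequencing); the helper names and the example MAC addresses are hypothetical.

```python
CONTROL_WORD = bytes(4)  # 4-byte generic PW control word, all zero here (assumption)


def encapsulate(eth_frame, cw_enabled):
    """Build the PW payload as the sending end would."""
    return (CONTROL_WORD if cw_enabled else b"") + eth_frame


def parse_macs(pw_payload, cw_enabled):
    """Return (destination MAC, source MAC) as the receiving end would read them."""
    frame = pw_payload[4:] if cw_enabled else pw_payload
    return frame[0:6].hex(":"), frame[6:12].hex(":")


# Hypothetical Ethernet frame: destination MAC, source MAC, EtherType, payload
dst = bytes.fromhex("0011223344aa")
src = bytes.fromhex("0011223344bb")
frame = dst + src + bytes.fromhex("0800") + bytes(46)

# Both ends enable the control word: the addresses are read correctly
print(parse_macs(encapsulate(frame, cw_enabled=True), cw_enabled=True))

# Sender enables it, receiver does not: the addresses are shifted by 4 bytes
print(parse_macs(encapsulate(frame, cw_enabled=True), cw_enabled=False))
```

In the mismatched case the receiver would learn a source MAC address that no device in the network actually uses.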
Common Tools and Function Description
 Common fault information collection tools and check tools:
 ZXCTN 9000 fault information collection tool
 ZXCTN 9000 enhanced equipment health check tool
 ZXCTN 6263 BSP engineering information collection tool
 DbCheck: tool for checking consistency between the Agent DB and the platform database on a variety of
equipment
 ZXCTN 6200\ZXCTN 6300 V2.0 information collection and diagnosis tool
 ZXCTN 6263 equipment health check tool
 Common EMS functions on site (refer to the Routine Maintenance Manual):
 Traffic statistics for PW, CIP, and tunnel interfaces: enable the traffic statistics
function on the corresponding interface, and then query the traffic through the EMS or the CLI.
For example, to query a tunnel on the U31: right-click an NE. From
the shortcut menu, select NE Management > Service Management > MPLS-TP Tunnel
Configuration > Modify. In the Modify MPLS-TP Tunnel dialog box, enable Traffic
Parameter. You can then query the tunnel traffic in the performance query area.
 Batch equipment alarm query: set an alarm query template on the
EMS and query whether chip, board, clock, or other hardware alarms exist. On the U31,
select Fault > Alarm Monitoring > Custom Query > My Query > New. In the displayed dialog
box, enter the query conditions.
 Equipment performance query: use a performance query template to check important
performance items, such as PWE3 and CRC performance. On the U31,
select Performance > Current Performance Data Query. Enter the query conditions and filter
NEs and performance items.
 Real-time traffic monitoring: if any performance problem occurs, use the real-time traffic
