
QoS Delivery/Methodology

Network Monitoring (October 2008) Nicolas Palumbo

1. QoS Reports Delivery
2. Investigation Reports delivery
3. Investigation methodology
4. Migration Reports/templates delivery
5. Migration methodology
6. QoS Alerter
7. Investigation of problems
   7.1 Bad Attach Setup Success Rate
   7.2 WAC Abnormal Release/Handover exec fail
   7.3 Handover Preparation failure
   7.4 CPU Max = 100% on WAC


2 | Presentation Title | Month 2006 All Rights Reserved Alcatel-Lucent 2006, #####

QoS Reports Delivery



All the reports presented in this document can be imported into W3MR1ed2D. The files to import are available at the following link: http://aww.quickplace.alcatel.com/QuickPlace/mnd_pcspsf/PageLibraryC12570D0006048F2.nsf/h_Index/34298E19D672E336C12574C20054A478/?OpenDocument&Form=h_PageUI
For the QoS Report Delivery, see part III); for the Migration Delivery, see part III bis).

Four reports are delivered to follow the quality of service per WAC; they are available on the commercial network.
General remark: when a report shows 2 scales, the left scale applies to the columns and the right scale applies to the lines.
Remark: these reports are an update of the reports made in W3MR1ed2b, so the report names are kept even though the new release is W3MR1ed2D.



NP_Mono_W3MR1ed2b_1, to get the global behavior of the WAC:
- Complete Attach Setup procedure (applicable at BS level also):
  - Ranging procedure
  - Attach Setup success rate
  - Attach Setup duration
- Session (applicable at WAC level only):
  - Maximum number of simultaneous sessions
  - Average duration of sessions
- Release causes (applicable at BS level also):
  - Done by BS
  - Done by WAC
- Data traffic (IP User Plane):
  - Data traffic between BS and WAC (applicable at BS level also)
  - Data traffic between WAC and CN (applicable at WAC level only)
- WAC capacities used (applicable at WAC level only):
  - CPU (max and average)
  - RAM (max and average)



NP_Mono_W3MR1ed2b_2, for mobility:
- Handover Preparation success rate (applicable at BS level also)
- Intra-WAC Handover:
  - Execution success rate (applicable at BS level also)
  - Duration max and average (applicable at WAC level only)
- Inter-WAC Handover:
  - Execution success rate (applicable at BS level also)
  - Duration max and average (applicable at WAC level only)

NP_Mono_W3MR1ed2b_3, for VoIP and VoD (streaming) services (applicable at WAC level only):
- VoIP Setup procedure:
  - Call Setup Success rate
  - Call Setup Duration
- VoIP drop rate and release causes
- VoD Setup procedure:
  - Call Setup Success rate
  - Call Setup Duration
- VoD drop rate and release causes



NP_Mono_traffic_and_radio, for the global BS behavior (applicable at BS level also):
- Downlink:
  - Number of slots used per modulation type
  - Number of bytes sent per modulation type
  - CINR distribution and percentage of CINR < 15 dB
  - RSSI distribution and percentage of RSSI < -85 dBm
- Uplink:
  - Number of slots used per modulation type
  - Number of bytes sent per modulation type
  - Tx Power
  - CINR distribution and percentage of CINR < 15 dB
  - RSSI distribution and percentage of RSSI < -115 dBm


Investigation Reports delivery



Three reports are delivered to diagnose attach setup procedure failures or abnormal releases and to decide which actions to take.
NP_Mono_attach_setup_WAC, available at WAC level only, shows the failure cases for each step of the attach setup procedure:
- Bad rate of ranging req compared to ranging CDMA (available at BS level also)
- T9 expiration: no SBC req after reception of the ranging req (available at BS level also)
- Authentication failure (available at BS level also)
- PKM failure (available at BS level also)
- T17 expiration: no REG RSP sent; the CPE failed in the attach procedure but no release was done before the T17 expiration (available at BS level also)
- RAC rej (Radio Admission Control): a BS has reached its maximum capacity in terms of Service Flow or Service Flow & Bandwidth, depending on the OMC configuration (available at BS level also)
- %Attach BS fail: shows whether the problem occurs before the attachment to the BS and is therefore related to the BS (available at BS level also)
- %Fail after Attach to the BS: shows whether the problem occurs after the attachment to the BS and could therefore be a DHCP, MIP or Diameter Relay problem
- Avg duration: if too high, it can be due to a long DHCP procedure (completely transparent to the WAC) (available at BS level also)
- %DHCP or MIP fail
- All the release causes (available at BS level also)



NP_attach_setup: warning report available at BS level.
This warning report shows the worst cells for the following cases:
- Bad rate of ranging req compared to ranging CDMA
- Connection setup succ rate
- Highest average duration
- T9 rate
- Authentication failure rate
- PKM failure rate
- T17 rate
- RAC rej rate
- Release cause other failure rate
- Release cause OVERLOAD rate
- Attach failure rate with all the causes displayed
- Attach failure rate before connection to the BS

NP_Mono_attach_setup: same report as NP_attach_setup, but used to follow one BS over time.

NP_Mono_HO_with_fail: report to investigate handover problems.


Investigation methodology
The goal is to start from a global overview and go deeper into the analysis in order to:
- check the root cause of the problem
- identify the entity concerned by the problem
- take an action to solve/clarify the problem

The reports delivered concern the attach setup procedure:
- Run the report NP_Mono_W3MR1ed2b_1 with periodicity day.
- If the attach setup procedure performs badly, run the report NP_Mono_attach_setup_WAC with periodicity day for the day concerned. This determines whether the problem occurs before or after the attachment to the BS. If before, the view shows the reason for the failure.
- If the problem occurs before the attachment to the BS, run the warning report NP_attach_setup with periodicity day for the day of the problem. In the view concerned, check which cell is the worst. Make sure the cell has enough samples (e.g. 100% failure with 2 requests and 0 success is not meaningful).
- Run the mono report NP_Mono_attach_setup for the day concerned with periodicity 1/4. This shows whether the problem is occasional or spread over the whole day.
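The "enough samples" check on the worst cells can be sketched as follows. This is only an illustration, not an NPO feature: the function name, the input layout (cell name mapped to request and success counts) and the threshold of 50 requests are assumptions.

```python
def worst_cells(stats, min_requests=50, top=5):
    """Rank cells by attach failure rate, ignoring cells with too few samples.

    stats: dict mapping cell name -> (setup requests, setup successes).
    min_requests is a hypothetical threshold; tune it to the network.
    """
    ranked = []
    for cell, (req, succ) in stats.items():
        if req < min_requests:
            continue  # e.g. 2 requests / 0 success is not meaningful
        ranked.append((cell, (req - succ) / req, req))
    # Highest failure rate first, then highest number of requests
    ranked.sort(key=lambda r: (-r[1], -r[2]))
    return ranked[:top]


sample = {"Cell_A": (2, 0), "Cell_B": (1000, 100), "Cell_C": (500, 450)}
for cell, rate, req in worst_cells(sample):
    print(f"{cell}: {rate:.1%} failures over {req} requests")
```

Cell_A is dropped despite its 100% failure rate because it only has 2 requests.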

- Re-run the report NP_Mono_attach_setup for the previous days and the day concerned with periodicity 1/4 to check whether the problem is periodic or episodic.
- On NPO, right-click the view concerned and select Properties. In some views, I added comments in the description field indicating specific points to check first.
- If this is not enough, the analysis above tells when and on which BS/WAC to start the trace.

FOR HANDOVER
- Run the report NP_Mono_W3MR1ed2b_2 with periodicity day.
- Run NP_Mono_HO_with_fail to identify the different failure cases.
- In case of handover preparation failure for the day concerned: select all the cells of the WAC and drag and drop the indicators HO_WAC_prep_req, _NP_BS_prep_fail, _NP_BS_prep_succ. Check whether the problem is located on one or a few cells or spread over all the cells.

- In case of handover execution failure for the day concerned: select all the cells of the WAC and drag and drop the indicators HO_WAC_intraWAC_exec_req_sBS, HO_WAC_intraWAC_succ_sBS, NP_HO_WAC_intraW_sBS_fail. Check whether the problem is concentrated on one site or spread over all the sites.
- If the problem is on one site (for example), select the object W-adjacency, select all the adjacencies related to this site and drag and drop the indicators HO_WAC_intraWAC_exec_req_sBS, HO_WAC_intraWAC_succ_sBS, NP_HO_WAC_intraW_sBS_fail. This gives the list of the worst adjacencies.
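To decide whether failures are concentrated on a few objects or spread over all of them, the exported failure counters can be sorted and accumulated. A minimal sketch, assuming the counters were exported into a dictionary; the adjacency names and values below are made up for illustration.

```python
def failure_concentration(fail_counts):
    """Return (name, failures, cumulative share) sorted by failures, worst first.

    fail_counts: dict mapping a cell or adjacency name -> number of failures.
    """
    total = sum(fail_counts.values())
    if total == 0:
        return []
    result, cumulated = [], 0
    for name, count in sorted(fail_counts.items(), key=lambda kv: -kv[1]):
        cumulated += count
        result.append((name, count, cumulated / total))
    return result


# Hypothetical export: three adjacencies and their HO execution failures
example = {"A-B": 80, "B-C": 15, "C-A": 5}
for name, count, share in failure_concentration(example):
    print(f"{name}: {count} failures, cumulative share {share:.0%}")
```

If the first one or two entries already carry most of the cumulative share, the problem is localized; otherwise it is spread.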


Migration Reports /templates delivery

Five reports are delivered for migration:
- Migration_WAC: mono-report available at WAC level, to run before OMC migration
- Migration_BS: multi-report available at BS level only, to run before OMC migration
- Migration_WAC: to run after WAC migration
- T_Mono_Evolution_BS: to run after BS migration on one particular BS in case of doubt
- T_Multi_Migration_BS: to run on all the BS after migration


Migration methodology

The migration is done in 2 parts:
- OMC migration
- WAC/BS migration
The WAC/BS migration is usually done 2 or 3 days after the OMC migration.

During the OMC migration there is a risk of losing part or all of the indicator database. Recovering the database takes time, and during this period there is no reference. So before the OMC migration:
- Export the customer's dictionary for reports/indicators
- Run the mono-report Migration_WAC for each WAC:
  - with periodicity week, for the 10 previous weeks
  - with periodicity day, for the 21 previous days
  - with periodicity hour, for the 7 previous days
- Run the multi-report Migration_BS for all the BS:
  - with periodicity day, for the 7 previous days
- Save the pm_xml files regularly before the OMC migration. As the pm_xml files are kept for only 5 days, and knowing the OMC migration date, the backup must be done regularly (at least during the 10 days before the migration). These files contain all the counters, so any information missing from the reports can be recovered from them.
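The report windows and a possible backup calendar can be derived from the OMC migration date. This is a sketch under assumptions: the function names are hypothetical, and the 4-day backup interval is one choice that stays below the 5-day retention of the pm_xml files; the source only asks for regular backups.

```python
from datetime import date, timedelta


def pre_migration_windows(migration_day):
    """Start/end dates of the reports to run before the OMC migration."""
    return {
        "Migration_WAC week": (migration_day - timedelta(weeks=10), migration_day),
        "Migration_WAC day": (migration_day - timedelta(days=21), migration_day),
        "Migration_WAC hour": (migration_day - timedelta(days=7), migration_day),
        "Migration_BS day": (migration_day - timedelta(days=7), migration_day),
    }


def backup_days(migration_day, lead_days=10, interval_days=4):
    """Backup dates for pm_xml files (interval_days is an assumed value)."""
    days = []
    current = migration_day - timedelta(days=lead_days)
    while current < migration_day:
        days.append(current)
        current += timedelta(days=interval_days)
    days.append(migration_day)  # last backup just before the migration
    return days


for name, (start, end) in pre_migration_windows(date(2008, 7, 22)).items():
    print(name, start, "->", end)
print([d.isoformat() for d in backup_days(date(2008, 7, 22))])
```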

After the OMC migration:
- Check that NPO is still running. If it is KO for a long time, work with the pm_xml files.
- Check that the customer's views/reports/indicators are still present. If not, re-import them.
- Run the mono and multi reports (used before migration) for the different periodicities and with the same dates to check that they are equivalent. If the database is corrupted, the comparison is made with the reports produced before the OMC migration.

Before the WAC/BS migration:
- Save all the OMC parameters.
- If the OMC migration succeeded, on the day before the WAC/BS migration repeat the same operations as before the OMC migration: for the mono-report, and for the multi-report completed with the missing days.

After the WAC/BS migration:
First day, reports to run each hour:
- Mono report Migration_WAC starting from the previous complete days with periodicity hour (if possible with periodicity 1/4, but this is only meaningful if the Granularity Period is 15 min). To have a good reference, compare the previous hour with the same hour of the previous day (and if possible with the same hour of the same day of the previous week).
- Multi-report T_Multi_Migration_BS with periodicity hour for the current day, to check that all the BS are running and generating traffic (attach setup attempts/successes).
- Results are sent as soon as a BS is not operational or a significant degradation is seen.
- Results of the previous hour for the migrated cells, excluding the BS that are bad compared to before migration (for any reason), with the list of these bad BS (remark: this can be done by creating a cell zone).
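The hour-to-hour comparison can be sketched as below. The helper names and the 5-point tolerance are assumptions for illustration; the rates themselves would come from the Migration_WAC report.

```python
from datetime import datetime, timedelta


def reference_hours(hour_start):
    """Hours to use as reference for a given hour after migration."""
    return {
        "previous_day": hour_start - timedelta(days=1),
        "previous_week": hour_start - timedelta(weeks=1),
    }


def degraded(current_rate, reference_rate, max_drop=0.05):
    """True if the success rate dropped by more than max_drop (assumed threshold)."""
    return (reference_rate - current_rate) > max_drop


refs = reference_hours(datetime(2008, 7, 23, 10))
print(refs["previous_day"], refs["previous_week"])
print(degraded(0.80, 0.95))  # success rate fell from 95% to 80%
```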

Second day:
- Same as the first day, with regular checks if no problem was seen before
- Compare the previous day with the same day of the previous week; they should be similar
- Results of the previous days for all the BS
- Results of the previous days excluding the BS that are bad compared to before migration (for any reason), with the list of these bad BS (remark: this can be done by creating a cell zone)

Other days:
- Regular check for the current day
- Results of the previous days for all the BS
- Results of the previous days excluding the bad BS
- Follow-up of the bad BS if any action has been taken

First week: comparison of the previous week with the week before.
QoS follow-up.


QoS Alerter

Regarding QoS alerters, please refer to the dedicated KTS session and its slides on the Sleeping, Dead & Lazy cell topic: KTS QoS Alerter link

Sleeping cell:
- Description: BS-WAC link down
- Impact: triggers a BS reset [the keep-alive mechanism will not work]; all users on that BS are affected and disconnected
- Workaround: use the WBS monitoring tool to determine/measure the impact

Dead cell:
- Description: the WBS stops carrying traffic
- Impact: connected users no longer carry traffic and new users cannot connect (the BS is not detected by the CPE)
- Workaround: use the QoS alerter (usage depends on the SW release). If a BS is dead, the workaround has several steps: check the status of the WBS and other alarms; check whether the WBS has no traffic simply because there are no CPEs in that area; if no reason is found for the BS having no traffic, Reset Telecom; if the problem still exists on the next GP, Reset WBS

Lazy cell:
- Description: traffic ongoing but no new CPE can connect
- Impact: some mobiles cannot connect to the WBS (this can affect all the mobiles or only a few)
- Workaround: detection available in W3.1 Ed2D P2 only. The final QoS alerter formula is under test in a commercial network. As soon as the final formula with its final threshold is available, it will be announced.
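The dead cell workaround is an ordered list of checks, which can be written as a small decision function. This is an illustrative sketch only: the function and its boolean inputs are hypothetical and have to be filled in by the operator from the OMC alarms and the QoS alerter.

```python
def dead_cell_action(alarms_present, cpes_in_area, telecom_reset_done):
    """Next step for a BS that shows no traffic, following the steps above."""
    if alarms_present:
        return "investigate WBS status and alarms"
    if not cpes_in_area:
        return "no action: no CPE in the area"
    if not telecom_reset_done:
        return "Reset Telecom"
    # Still no traffic on the next GP after the Reset Telecom
    return "Reset WBS"


print(dead_cell_action(False, True, False))
print(dead_cell_action(False, True, True))
```

The function is meant to be evaluated once per GP for a BS that still shows no traffic.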

INVESTIGATION OF PROBLEMS


Bad Attach Setup Success Rate



Running the report NP_Mono_W3MR1ed2b_1 shows that the connection setup success rate is very low (<1%).

[Figure: Attach Setup Success Rate - WAC: waclu01 (RanCtrl-01-Cluster01.packet1.com), 2008 Week 23 to 2008 Week 28 (Working Zone: Global - Medium). Series: setup succ, %connection setup succ; the percentage axis ranges from 0% to 0.8%.]


Running the report NP_Mono_attach_setup_WAC with periodicity day:
- The view %Attach BS fail shows that the attach setup failure occurs before the attachment to the BS
- The rates of the listed failure causes are near or equal to 0% (T9, T17, auth_fail, pkm fail)
- The view displaying all the release causes shows a high level of release cause other
[Figure: Attach fail before BS attached - WAC: waclu01 (RanCtrl-01-Cluster01.packet1.com), 07/06/2008 to 07/15/2008 (Working Zone: Global - Medium). Series: setup req, %attach BS fail; the percentage axis ranges from 97% to 99.5%.]

[Figure: Attach fail at BS level - WAC: waclu01 (RanCtrl-01-Cluster01.packet1.com), 07/06/2008 to 07/15/2008 (Working Zone: Global - Medium). Series: overload, other fail, preProv SF fail, PKM exchange fail, auth fail, setup succ, %attach fail, %T17 expired, %RAC rej.]

Running the warning report NP_attach_setup with periodicity day: in the view Attach fail cause other, we took the worst cell with many attempts.

[Figure: Attach fail cause other - BSCELL - 07/15/2008 (Working Zone: Global - Medium). Series: setup req, %other fail. Highlighted cell: RESTSRIRATU2, Attach_setup_req=73374, %other fail = 100%.]


Running the mono report NP_Mono_attach_setup with periodicity 1/4 for the day concerned: in the view Attach fail at BS level, release cause other is present at a steady level throughout the day. This means the problem is permanent.
[Figure: Attach fail at BS level - BSCELL: Cell_RESTSRIRATU2 (000012a00132), 07/15/2008 00:00 to 07/16/2008 00:00 (Working Zone: Global - Medium). Series: overload, other fail, preProv SF fail, PKM exchange fail, auth fail, setup succ, %attach fail, %T17 expired, %RAC rej.]


Checking the view NP_102_Attach_Fail_Other, the description indicates that if cause other is very high, the problem can come from a bad anonymous identity.
Running the report NP_Mono_attach_setup_WAC with periodicity hour over several days, we saw that the problem is present both during the day and during the night. A Wireshark trace was taken during the night. After analysis of the trace (mainly focused on ss_data_ind, which contains the anonymous identity), we found a bad anonymous identity: an anonymous identity without the extension @P1 (see the screenshot of the Wireshark trace on the next slide).



After analysis of the whole trace over a period of 1 hour (around 6 GB), a list of MAC addresses with bad anonymous identities was established.
The 3 worst CPEs of the list were switched off/on (unplug/plug), and afterwards they worked fine. This means the problem is not due to a bad configuration but to bad behavior of the CPEs. The problem will probably reappear on the same CPEs or on other CPEs.

As it is not possible to systematically take Wireshark traces to obtain the list of MAC addresses with a bad anonymous identity, another solution is to use the WAC log file CC_callControlThread_x.log. On the WAC machine, collect all the CC_callControlThread_x.log files from /diagnosis/log/WUM_<PID>_<date>/CC_<PID>, with <date> being the most recent date. A tool is in preparation to give the list of MAC addresses with a bad anonymous identity and the corresponding cells. When the tool is available, this check must be done every day to verify whether new CPEs have a bad anonymous identity.
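Pending that tool, the check itself is simple once (MAC address, anonymous identity) pairs have been extracted from a trace or a log. The sketch below assumes such pairs are already available; the extraction step, the record layout and the sample values are hypothetical, and the format of CC_callControlThread_x.log is not described here.

```python
def bad_anonymous_identities(records, extension="@P1"):
    """Group identities lacking the expected extension by MAC address.

    records: iterable of (mac_address, anonymous_identity) pairs,
    assumed to be already extracted from a trace or log.
    """
    bad = {}
    for mac, identity in records:
        if extension not in identity:
            bad.setdefault(mac, []).append(identity)
    return bad


# Made-up sample values for illustration
records = [
    ("00:11:22:33:44:55", "anonymous@P1"),
    ("00:11:22:33:44:66", "anonymous"),
    ("00:11:22:33:44:66", "anonymous"),
]
for mac, identities in bad_anonymous_identities(records).items():
    print(mac, len(identities), "bad identities")
```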


WAC Abnormal Release /Handover exec fail



Running the report NP_Mono_W3MR1ed2b_1 on WACLU01 with periodicity week over several weeks shows that the number of WAC abnormal releases is increasing.

[Figure: Session Release - WAC: waclu01 (RanCtrl-01-Cluster01.packet1.com), 2008 Week 23 to 2008 Week 28 (Working Zone: Global - Medium). Series: Rel other, WAC Abnormal Rel, BS Rel, WAC Normal Rel, MS Rel, Session Ended.]


Running the report NP_Mono_W3MR1ed2b_2 on WACLU01 with periodicity week over several weeks shows:
- a high number of handover executions (for intra-WAC handover; inter-WAC is low)
- an increase of handover execution failures

Remark: you can see a case with %HO succ > 100%. It is under investigation. If you see a similar anomaly, let me know (Nicolas Palumbo) so that the problem can be investigated.
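A success rate above 100% means that more successes than requests were counted in the period. Such periods can be flagged automatically from exported counters; the sketch below uses hypothetical names and made-up values.

```python
def ho_success_rate(req, succ):
    """Handover execution success rate, or None when there is no request."""
    return succ / req if req else None


def anomalous_periods(samples):
    """Periods where successes exceed requests (rate above 100%).

    samples: list of (period label, execution requests, successes).
    """
    return [period for period, req, succ in samples if succ > req]


weekly = [("W23", 40000, 30000), ("W24", 100, 140), ("W25", 0, 0)]
print(anomalous_periods(weekly))
print(ho_success_rate(100, 140))
```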
[Figure: INTRA WAC HO Exec Proc - WAC: waclu01 (RanCtrl-01-Cluster01.packet1.com), 2008 Week 23 to 2008 Week 28 (Working Zone: Global - Medium). Series: HO fail, HO succ, %HO succ; the percentage axis ranges from 0% to 160%.]


- There is a correlation between the number of intra-WAC handover execution failures and the number of WAC abnormal releases. In week 25, there is no handover failure and no WAC abnormal release.
- The number of Release other increases at the same rate as the WAC abnormal releases (but the number of Release other is lower).
- To confirm the correlation between handover and release, create a multi-report containing the intra-WAC handover execution req, success, failure, WAC abnormal release and release other causes.
- Applying this multi-report to all the cells belonging to WACLU1, 4 sites generate 95% of the handovers, 95% of the WAC abnormal releases and a large part of Release cause other.
- Moreover, as the network is mainly composed of indoor CPEs, it is strange to have so many handovers. There may be a ping-pong effect.
- To confirm that, a trace was taken (next slide). In this trace, several handovers occur within a few seconds. At the end, the handover fails (No 2869) and the session is released (No 2871). In this case, the WAC abnormal release indicator is incremented.
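A ping-pong effect can be looked for in a list of handover events extracted from a trace: the same CPE goes from cell A to cell B and back to A within a few seconds. The sketch below assumes the events are already available as tuples; the tuple layout and the 10 s window are assumptions, not values from the trace.

```python
def ping_pong_counts(events, window_s=10.0):
    """Count, per MAC address, handovers that reverse the previous one.

    events: iterable of (time in seconds, mac, source BS, target BS).
    """
    last = {}    # mac -> (time, source, target) of the previous handover
    counts = {}
    for t, mac, src, dst in sorted(events):
        prev = last.get(mac)
        if prev and prev[1] == dst and prev[2] == src and t - prev[0] <= window_s:
            counts[mac] = counts.get(mac, 0) + 1
        last[mac] = (t, src, dst)
    return counts


events = [
    (0, "m1", "A", "B"), (3, "m1", "B", "A"), (5, "m1", "A", "B"),
    (100, "m2", "A", "B"), (400, "m2", "B", "A"),
]
print(ping_pong_counts(events))
```

In the sample, m1 bounces twice between A and B within seconds, while m2 returns to A only after 300 s and is not counted.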



The trace shows another cause of release after handover. After several successful handovers (so no failure), the CPE sends a DHCP Release (No 1774) and the WAC releases the session (No 1775). The DHCP Release sent by the CPE is seen by the WAC as an abnormal release, and the indicator MS_rel_other (Rel other in the legend) is incremented. It is therefore important to track the CPEs that generate DHCP Releases; if a DHCP log file exists, it can give the MAC addresses of these CPEs.



To reduce the ping-pong effect, and so reduce the number of handovers, the following actions were taken on 22/07/2008:
- On site TMNSETAPAK, the adjacencies were removed
- The hard handover Hysteresis Margin parameter was changed from 2 to 5 on the following sites: PPRGBONUS, TUNEHOTEL, DANAUKOTA

Running the reports NP_Mono_W3MR1ed2b_1 and NP_Mono_W3MR1ed2b_2 with periodicity day for the previous week shows an improvement of the main KPIs:
- Max simultaneous connections ==> much better (x 2)
- Average session duration ==> much better (x 2)
- IP User Plane traffic between WAC and BS ==> much better (x 2)
- IP User Plane traffic between WAC and CN ==> much better (x 2)
- Handovers (see NP_Mono_W3MR1ed2b_2_day.xls):
  - The number of handover preparations/executions decreased sharply (divided by 4)
  - The handover execution failure rate dropped from 90% to 50% ==> we have to determine on which cell(s) the problem occurs
- The number of WAC abnormal releases decreased, but not as much as expected, because handovers are still performed and many handover failures remain
[Figure: Session - WAC: waclu01 - 07/16/2008 to 07/23/2008. Series: Start, Max (simultaneously opened).]
[Figure: Session Duration - WAC: waclu01 - 07/16/2008 to 07/23/2008. Series: Total Duration, Avg Duration (in s).]
[Figure: UP IP WAC-BS - WAC: waclu01 - 07/16/2008 to 07/23/2008. Series: WAC to BS in bytes, BS to WAC in bytes, WAC to BS in bps, BS to WAC in bps.]
[Figure: UP IP WAC-CN - WAC: waclu01 - 07/16/2008 to 07/23/2008. Series: WAC to CN in bytes, CN to WAC in bytes, WAC to CN in bps, CN to WAC in bps.]


[Figure: Session Release - WAC: waclu01 - 07/16/2008 to 07/23/2008. Series: Rel other, WAC Abnormal Rel, BS Rel, WAC Normal Rel, MS Rel, Session Ended.]
[Figure: Handover Preparation - WAC: waclu01 - 07/16/2008 to 07/23/2008. Series: Prep fail, Prep succ, %Prep fail, %RAC Rej.]
[Figure: INTRA WAC HO Exec Proc - WAC: waclu01 - 07/16/2008 to 07/23/2008. Series: HO fail, HO succ, %HO succ.]

We can see a small reduction of WAC abnormal releases and a big decrease of %HO succ.
The next step is to see whether this is due to one cell or spread over all the cells.


On the BS Cell object, select the 3 sites where the hard HO hysteresis margin was modified and check the indicators related to HO and release. The main problem is on site DANAUKOTA.

Indicators per cell on 07/23/2008 (exec_req = HO_WAC_intraWAC_exec_req_sBS, succ = HO_WAC_intraWAC_succ_sBS, fail = _NP_HO_WAC_intraWAC_sBS_fail):

Cell                 exec_req   succ   fail   MS_rel_WAC_abnormal   MS_rel_other
Cell_DANAUKOTA1            13      5      8                   393             41
Cell_DANAUKOTA2           113     20     93                    15             26
Cell_DANAUKOTA3           293      0    293                    11             81
Cell_HOTELTUNE1           262    221     41                    35             25
Cell_HOTELTUNE2           280    247     33                    41             29
Cell_HOTELTUNE3            83     69     14                    52              4
Cell_PPRSGBONUS1           96     80     16                    77             16
Cell_PPRSGBONUS2           94     70     24                    17              9
Cell_PPRSGBONUS3            7      7      0                     1              0
Sum of the 3 sites       1241    719    522                   642            231
WACLU1                   1460    791    669                   762            732

Now select the object W-adjacency and select all the adjacencies related to the DANAUKOTA site (so cells 1, 2, 3).

Indicators per adjacency on 07/23/2008:

Adjacency                            exec_req   succ   fail
Cell_DANAUKOTA1-Cell_DANAUKOTA2             6      3      3
Cell_DANAUKOTA1-Cell_DANAUKOTA3             7      2      5
Cell_DANAUKOTA2-Cell_DANAUKOTA1            56      9     47
Cell_DANAUKOTA2-Cell_DANAUKOTA3             7      7      0
Cell_DANAUKOTA2-Cell_PPRSGBONUS1           50      4     46
Cell_DANAUKOTA3-Cell_DANAUKOTA1           274      0    274
Cell_DANAUKOTA3-Cell_DANAUKOTA2            10      0     10
Cell_DANAUKOTA3-Cell_TMNSETAPAK2            9      0      9

Conclusion: the HO failures are mainly due to the following adjacencies:
- DANAUKOTA3-DANAUKOTA1
- DANAUKOTA2-DANAUKOTA1
- DANAUKOTA2-PPRSGBONUS1


Now that the problem has been narrowed down to specific adjacencies, drive tests must be done on DANAUKOTA3-DANAUKOTA1 to understand why 100% of the handovers fail.
- Action 1: On WACLU01, take a Wireshark trace during a GP with a high number of handover failures.
- Action 2: In CCC, filter the trace on the BS ID corresponding to DANAUKOTA3 as source or target to get everything related to this BS. You can then check which MAC addresses attempt many handovers that fail.
- Action 3: Find the physical position of the CPEs with the MAC addresses identified.
- Action 4: Make drive tests to check whether the problem is reproduced. If not, the CPE must be traced and, if necessary, BS DANAUKOTA3 and DANAUKOTA1 must be traced.
- Action 5: Analyze the traces if necessary.


Handover Preparation failure



Running report NP_Mono_W3MR1ed2b_2 on WACLU01 with a weekly periodicity over several weeks shows a high number of handover preparation requests and a high number of handover preparation failures.
[Chart: Handover Preparation - WAC: waclu01 (RanCtrl-01-Cluster01.packet1.com) - 2008 Week 23 to 2008 Week 28 (Working Zone: Global - Medium). Series: Prep succ, Prep fail, %Prep fail, %RAC Rej. Count axis (no unit) from 0 to 100000, percentage axis from 0% to 90%, weekly points from 06/02/2008 to 07/07/2008.]


Investigation of problems Handover Preparation/Failure


The WAC in Release W3 doesn't support handover during the attach setup procedure.
Concerning the high number of handover preparations, the following view shows that HO_BS_fail_empty_BSlist_sBS is very high. One cause is the following: when a CPE requests a handover preparation during the attach setup procedure, the WAC rejects the request because WAC W3 doesn't support HO during this phase, and the BS sends the message MOB_BSHO-RSP with cause "empty BS list" to the CPE. The indicator HO_BS_fail_empty_BSlist_sBS is incremented. After the reject, the CPE re-attempts, is rejected again, re-attempts, and so on.
[Chart: Handover Preparation - WAC: waclu01 - 07/16/2008 to 07/23/2008. Series: Prep ko RTVR flush expired, Prep ko NRT flush expired, Prep ko empty list, Prep succ, %Prep succ. Count axis (no unit) from 0 to 20000, percentage axis from 0% to 60%, daily points.]
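The reject and re-attempt loop described above explains why a single CPE can inflate HO_BS_fail_empty_BSlist_sBS. A toy model of it, not the real WAC or BS logic: the state name "attach_setup", the return values and the function names are assumptions made for the illustration.

```python
from collections import Counter


def handle_ho_prep_request(cpe_state, counters):
    """Toy model: a handover preparation requested during attach setup is
    rejected, and the empty-BS-list indicator is incremented."""
    if cpe_state == "attach_setup":
        # WAC W3 rejects the request; the BS answers MOB_BSHO-RSP with an empty BS list.
        counters["HO_BS_fail_empty_BSlist_sBS"] += 1
        return "rejected"
    return "accepted"


def simulate_retries(attempts):
    """One CPE that keeps re-attempting while still in attach setup."""
    counters = Counter()
    for _ in range(attempts):
        handle_ho_prep_request("attach_setup", counters)
    return counters["HO_BS_fail_empty_BSlist_sBS"]


print(simulate_retries(5))
```

In this model each re-attempt adds one to the indicator, so the counter grows with the number of retries rather than with the number of CPEs.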


Investigation of problems Handover Preparation/Failure


Logically, if the CPE is on the best cell, there is no reason to ask for a handover to a better cell. As this problem occurs many times, it means the CPE doesn't select the best cell before the attach setup procedure.
Checking the BS cells with this problem, the worst cells are:
07/23/2008

Cell               HO_WAC_prep_req   _NP_BS_prep_fail   _NP_BS_prep_succ
Cell_DANAUKOTA1                 17                  4                 13
Cell_DANAUKOTA2                231                118                113
Cell_DANAUKOTA3                890                597                293
Cell_HOTELTUNE1                846                584                262
Cell_HOTELTUNE2                949                669                280
Cell_HOTELTUNE3                228                145                 83
Cell_PPRSGBONUS1               151                 55                 96
Cell_PPRSGBONUS2               104                 10                 94
Cell_PPRSGBONUS3                 7                  0                  7

Because the CPE is not on the right cell, the ranging req vs CDMA rate is directly impacted (poor coverage, and so many ranging CDMA are sent).
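The worst cells can be ranked from the per-cell counters above by computing the preparation failure rate (_NP_BS_prep_fail divided by HO_WAC_prep_req). A minimal sketch, assuming the counter values are listed in the same order as the cell names; the function name is made up for the example.

```python
# (HO_WAC_prep_req, _NP_BS_prep_fail) per cell for 07/23/2008, taken from the view above.
# The cell-to-value mapping assumes the values follow the order of the cell list.
cells = {
    "Cell_DANAUKOTA1": (17, 4),
    "Cell_DANAUKOTA2": (231, 118),
    "Cell_DANAUKOTA3": (890, 597),
    "Cell_HOTELTUNE1": (846, 584),
    "Cell_HOTELTUNE2": (949, 669),
    "Cell_HOTELTUNE3": (228, 145),
    "Cell_PPRSGBONUS1": (151, 55),
    "Cell_PPRSGBONUS2": (104, 10),
    "Cell_PPRSGBONUS3": (7, 0),
}


def rank_by_prep_failure(cells):
    """Return (cell, requests, failures, failure_rate) sorted by failure rate, worst first."""
    rows = [
        (name, req, fail, fail / req if req else 0.0)
        for name, (req, fail) in cells.items()
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


for name, req, fail, rate in rank_by_prep_failure(cells):
    print(f"{name:<18}{req:>6}{fail:>6}{rate:>8.1%}")
```

Ranking by rate rather than by raw failure count keeps low-traffic cells from being hidden, but the request count should still be read alongside it, since a cell with very few requests gives an unreliable rate.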

Investigation of problems Handover Preparation/Failure


The first action is to take a WireShark trace on the WAC to see which CPE (MAC address) requests many handover preparations during the attach setup procedure.
The second action is to take a Radio Trace to follow this MAC address and to check whether or not it is on the best BS cell.


CPU Max = 100% on WAC


Investigation of problems CPU Max = 100% on WAC


Checking the CPU view with periodicity day for a duration of one week, we have:

[Chart: CPU and RAM used - WAC: waclu01 - 07/16/2008 to 07/23/2008. Series: CPU_Max, RAM_Max, CPU_Avg, RAM_Avg. Two percentage axes (0% to 120% and 0% to 20%), daily points. Callout on the chart: the 21/07/2008.]

Investigation of problems CPU Max = 100% on WAC


Checking the CPU view for the 21/07/2008 with a quarter-hour periodicity, we have:
[Chart: CPU and RAM used - WAC: waclu01 - 07/21/2008 14:00 to 07/21/2008 17:00. Series: CPU_Max, RAM_Max, CPU_Avg, RAM_Avg. Two percentage axes (0% to 120% and 0% to 20%), one point every 15 minutes.]

On 21/07/2008 at 15h58, WAC log traces were collected on WACLU01: a USB key was plugged in and the WAC transferred around 500MB of log files, which took around 30mn.
Performing actions on the WAC can increase the CPU load and so have a direct influence on network behavior.


Investigation of problems CPU Max = 100% on WAC


Coincidence or not, the save of the log files started at 15h58 and ended around 16h28, and the WAC01 didn't send the counters (hence the hole in the previous view for the period 15h45-16h00).
In the following graphs, you can see an increase of the max number of simultaneous sessions, followed by an increase of the release cause Rel Other.
Next time actions are done on the WAC, the 3 previous points must be rechecked, as well as the CPU reaching 100%.
[Chart: Session - WAC: waclu01 - 07/21/2008 14:00 to 07/21/2008 17:00. Series: Start, Max (simultaneously opened), Session Ended. No unit, one point every 15 minutes.]

[Chart: Session Release - WAC: waclu01 - 07/21/2008 14:00 to 07/21/2008 17:00. Series: Rel other, WAC Abnormal Rel, BS Rel, WAC Normal Rel, MS Rel. No unit, one point every 15 minutes.]
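A hole in the counters such as the 15h45-16h00 period noted above can be detected by comparing the periods received with the expected 15-minute grid. A minimal sketch, assuming the period start times are available as a list; the function name and the sample data are made up for the example.

```python
from datetime import datetime, timedelta


def missing_periods(received, start, end, step=timedelta(minutes=15)):
    """Return the period start times in [start, end) for which no counters were received."""
    seen = set(received)
    missing = []
    current = start
    while current < end:
        if current not in seen:
            missing.append(current)
        current += step
    return missing


start = datetime(2008, 7, 21, 14, 0)
end = datetime(2008, 7, 21, 17, 0)

# Sample data: every 15-minute period received except the one starting at 15:45.
received = [
    start + i * timedelta(minutes=15)
    for i in range(12)
    if start + i * timedelta(minutes=15) != datetime(2008, 7, 21, 15, 45)
]

for period in missing_periods(received, start, end):
    print("no counters for period starting", period.strftime("%m/%d/%Y %H:%M"))
```

Running such a check after each action on the WAC gives a quick way to confirm whether counters were lost during the intervention.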

www.alcatel-lucent.com
