
PCI Conflict and RSI Collision Detection in LTE Networks Using Supervised Learning Techniques


R. Veríssimo (1), P. Vieira (2,3), A. Rodrigues (1,3) and M. P. Queluz (1,3)
(1) Instituto Superior Técnico, University of Lisbon, Portugal
(2) Instituto Superior de Engenharia de Lisboa, Instituto Politécnico de Lisboa, Portugal
(3) Instituto de Telecomunicações, Lisbon, Portugal
Email: rodrigo.verissimo@ist.utl.pt, pvieira@deetc.isel.pt, [ar, paula.queluz]@lx.it.pt

Abstract: This work tests three hypotheses regarding how well two distinct Long Term Evolution (LTE) network problems can be detected through supervised techniques with near real time performance. The tested network problems are Physical Cell Identity (PCI) conflicts and Root Sequence Index (RSI) collisions, which were labeled through configured cell relations that verified these two conflicts. Furthermore, data from a real LTE network was used. The obtained results showed that both problems were best detected by using each Key Performance Indicator (KPI) measurement as an individual feature. The highest average Precision obtained for PCI conflict detection was 31% and 26% for the 800 MHz and 1800 MHz frequency bands, respectively. The highest average Precision obtained for RSI collision detection was 61% and 60% for the 800 MHz and 1800 MHz frequency bands, respectively.

Keywords: Wireless Communications, LTE, Machine Learning, Classification, PCI Conflict, RSI Collision.

I. INTRODUCTION

Two of the major concerns of Mobile Network Operators (MNO) are to optimize and to maintain network performance. However, maintaining performance has proved to be a challenge, mainly for large and complex networks. In the long term, changes made in the networks may increase the number of conflicts and inconsistencies that occur in them. These changes include changing the tilting of antennas, changing the cell's power or even changes that cannot be controlled by the MNOs, such as user mobility and radio channel fading.

In order to assess the network performance, quantifiable performance metrics, known as Key Performance Indicators (KPI), are typically used. KPIs can report network performance, such as the handover success rate and the channel interference averages of each cell, and are calculated periodically, resulting in time series. A time series can be either univariate or multivariate. As this study uses data samples that represent LTE cells with several measured KPIs, the data consists of multivariate time series.

This work focuses on applying supervised techniques for detecting two known LTE network conflicts, namely Physical Cell Identity (PCI) conflicts and Root Sequence Index (RSI) collisions. The labeling used was only possible due to a CELFINET product that obtains the cell relations which label the two mentioned network conflicts; real data obtained from an LTE network was also used. The aim of this work is to test three hypotheses regarding how well two distinct LTE network problems can be detected through supervised techniques with near real time performance.

As this work aims to create models for near real time detection of PCI conflicts and RSI collisions, the popular k-Nearest Neighbors (kNN) with Dynamic Time Warping (DTW) classification approach was not tested [1]. The reason for this decision is that it is computationally intensive and very slow for large data sets, as is the case of this work.

In order to automatically detect the network fault causes, some work has been done by using KPI measurements with unsupervised techniques, as in [2].

This work is organised as follows. Section II introduces the analysed network problems, namely PCI conflicts and RSI collisions. Section III presents the chosen KPIs and Machine Learning (ML) models, the three proposed hypotheses and describes how the obtained models were evaluated. Section IV presents the obtained results. Finally, conclusions are drawn in Section V.

II. ANALYSED NETWORK PROBLEMS

A. Physical Cell Identity Conflict

Each LTE cell has two identifiers with different purposes: the Global Cell Identity (ID) and the PCI. The Global Cell ID is used to identify the cell from an Operation, Administration and Management (OAM) perspective. The PCI is used to scramble the data in order to help mobile phones separate information from different transmitters [3]. As an LTE network may contain a much larger number of cells than the 504 available PCI values, the same PCI must be reused by several cells. However, the User Equipment (UE) cannot distinguish between two cells if they both have the same PCI and frequency, a situation called a PCI conflict.

PCI conflicts can be divided into two cases: PCI confusions and PCI collisions. PCI confusions occur whenever an LTE cell has two different neighbor LTE cells with equal PCI, in the same frequency band [4]. PCI collisions happen whenever an LTE cell has a neighbor LTE cell with a PCI identical to its own in the same frequency band [4].

A good PCI plan can be applied to avoid PCI conflicts. However, it can be difficult to make such a plan without getting any PCI conflicts in a dense network. Moreover, network
changes, namely increased power of a cell and variable radio conditions, can lead to PCI conflicts. PCI conflicts can lead to an increase in the dropped call rate due to failed handovers, as well as an increase in blocked calls and channel interference [4].

B. Root Sequence Index Collision

A UE has to perform the LTE random access procedure to: connect to an LTE network; establish or re-establish a service connection; perform intra-system handovers; synchronize for uplink and downlink data transfers. The LTE random access procedure can be performed in two ways: non-contention based and contention based. An LTE cell uses 64 Physical Random Access Channel (PRACH) preambles, where 24 of those preambles are reserved by the evolved NodeB (eNB) for non-contention based access, and the remaining 40 preambles are randomly selected by the UEs for contention based access [3].

The 40 PRACH preambles that a UE can use are calculated by the UE through the RSI parameters that the LTE cell transmits in the System Information Block 2 (SIB2) through the PRACH [5]. Whenever two or more neighbor cells operate in the same frequency band and have the same RSI parameter, the connected UEs calculate the same 40 PRACH preambles, increasing the occurrence of preamble collisions. This problem is known as an RSI collision and can lead to an increase of failed service establishments and re-establishments, as well as an increase of failed handovers.

III. METHODOLOGY

This study was performed using real data from an LTE network of an MNO. Data was collected for the same weekday of three consecutive weeks, for every period of 15 minutes, resulting in a daily total of 96 measurements.

Using a CELFINET tool, it was possible to label cells that had PCI conflicts and/or RSI collisions. Source cells that had configured neighbor cells with a PCI equal to their own, in the same frequency band, were labeled as having a PCI collision. Source cells that had two or more neighbor cells with equal PCI between themselves, in the same frequency band, were labeled as having a PCI confusion. Source cells that had neighbor cells with equal RSI in the same frequency band were labeled as having an RSI collision. Cells that did not present any of these conflicts were labeled as nonconflicting.

A. Proposed Key Performance Indicators

The first step, after collecting a list of the KPIs available in the LTE equipment, was to choose the most relevant KPIs for detecting PCI conflicts and RSI collisions. The KPIs were chosen by taking into account the theory behind LTE and how PCI and RSI are used. Accordingly, the following KPIs were chosen for PCI conflict detection:
• Average CQI - the average Channel Quality Indicator (CQI) that represents the effective Signal-to-Interference-plus-Noise Ratio (SINR) measured by the UE;
• UL PUCCH Interference Avg - the average measured interference in the Physical Uplink Control Channel;
• UL PUSCH Interference Avg - the average measured interference in the Physical Uplink Shared Channel;
• Service Establish - the amount of established service connections;
• Service Drop Rate - the ratio of the dropped service occurrences;
• DL Avg Cell Throughput Mbps - the average measured cell downlink throughput in Mbit/s;
• DL Avg UE Throughput Mbps - the average measured UE downlink throughput in Mbit/s;
• DL Latency ms - the average time an Internet Protocol (IP) packet takes from being sent by the UE until it returns to it;
• RandomAcc Succ Rate - the success rate of established services made through the Random Access Channel (RACH);
• IntraFreq Exec HO Succ Rate - the success rate of processed handovers between cells operating in the same frequency band;
• IntraFreq Prep HO Succ Rate - the success rate of the handover preparation between cells operating in the same frequency band.

To detect RSI collisions, a subset of the aforementioned KPIs was selected, namely:
• UL PUCCH Interference Avg;
• UL PUSCH Interference Avg;
• Service Establish;
• IntraFreq Exec HO Succ Rate;
• IntraFreq Prep HO Succ Rate;
• RandomAcc Succ Rate.

After discarding cells with a high number of null KPI measurements and interpolating those of the remaining cells, it was decided to separate the data into different frequency bands, namely the 800 MHz and 1800 MHz bands. The 2100 MHz and 2600 MHz frequency bands were not considered, as they represented only 9% of the data and had few occurrences of PCI conflicts and RSI collisions. The decision to separate the data into different frequency bands was taken in order to create frequency dependent models, as different frequency bands have different purposes.

The cleaned data for PCI conflict detection in the 800 MHz frequency band consisted of 8666 nonconflicting cells, 1551 PCI confusions and 6 PCI collisions. The 1800 MHz frequency band data had 16675 nonconflicting cells, 1294 PCI confusions and no PCI collisions. The data concerning each frequency band was split into 80% for the training set and 20% for the test set. Additionally, as PCI collisions are very rare, it was decided to do a 50% split for collisions, yielding 3 collisions in both training and test sets.

The cleaned data for RSI collision detection in the 800 MHz frequency band consisted of 10128 nonconflicting cells and 6774 RSI collisions. The 1800 MHz frequency band data consisted of 17634 nonconflicting cells and 10916 RSI collisions. The data relative to each frequency band was split into 80% for the training set and 20% for the test set.
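The split described above (80% training and 20% test, with a 50% split for the rare PCI collisions) can be illustrated with a small sketch in plain Python. The function `stratified_split`, its `overrides` parameter and the label strings are hypothetical names introduced here for illustration only. The sketch also assumes that the split is applied per class, which the text does not state explicitly.

```python
import random

def stratified_split(labels, test_fraction=0.2, overrides=None, seed=0):
    """Split sample indices class by class.

    `overrides` maps a class label to its own test fraction, e.g. 0.5 for a
    very rare class. Per-class splitting is an assumption of this sketch.
    """
    overrides = overrides or {}
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for label, indices in by_class.items():
        rng.shuffle(indices)  # random assignment within each class
        fraction = overrides.get(label, test_fraction)
        n_test = round(len(indices) * fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test

# Class counts reported for the 800 MHz PCI conflict data set.
labels = ["nonconflicting"] * 8666 + ["confusion"] * 1551 + ["collision"] * 6
train, test = stratified_split(labels, overrides={"collision": 0.5})

collisions_in_test = sum(1 for i in test if labels[i] == "collision")
print(len(train), len(test), collisions_in_test)
```

Splitting each class separately keeps the rare collision class present in both sets, which a purely random 80/20 split over all cells would not guarantee with only 6 samples.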
B. Considered Classification Algorithms

In order to reduce the bias of this study, five different classification algorithms were considered. The aim of the classifiers was to classify cells as either nonconflicting or conflicting, depending on the detection use case. The considered classification algorithm implementations were taken from the Python Scikit-Learn library [6] and were the following:

1) Adaptive Boosting (AB): an ensemble method, a class of ML approaches based on the concept of creating a highly accurate classifier by combining several weak and inaccurate classifiers. AB uses subsets of the original data to produce weak performing models (high bias, low variance) and then boosts their performance by combining them based on a chosen cost function. AB was the first practical boosting algorithm and remains one of the most used and studied classifiers [7]; its implementation uses Decision Tree (DT) classifiers as weak learners.

2) Gradient Boost (GB): another popular boosting algorithm for creating collections of classifiers. It differs from AB as it calculates the negative gradient of a cost function (the direction of quickest improvement) and picks the weak learner that is closest to the obtained gradient to add to the model [8]. The considered GB implementation uses DTs as weak learners.

3) Extremely Randomized Trees (ERT): belongs to the family of tree ensemble methods and uses a technique different from boosting, known as bagging. Bagging based algorithms aim to control the generalization error by perturbing and averaging the generated weak learners, such as DTs. The ERT algorithm stands out from other tree based ensemble classifiers as it strongly randomizes both the feature and the cut-point choice while splitting a tree node [9]. Compared to other algorithms, ERT aims to strongly reduce variance through a full randomization of the cut-point and feature, combined with ensemble averaging. By training each weak learner with the full training set, instead of data subsets, ERT also minimizes bias.

4) Random Forest (RF): another bagging based algorithm in the family of tree ensemble methods. Similarly to ERT, several small and weak trees can be grown in parallel and this set of weak learners results in a strong classification algorithm by either averaging or majority vote [10]. RF is similar to ERT, but differs in two aspects: it uses data subsets for growing its trees, while ERT uses the whole training set; it chooses the split among a small subset of features when splitting a node, while ERT chooses a random feature from all features.

5) Support Vector Machines (SVM): aim to separate data samples of different classes through hyperplanes that define decision boundaries. Similarly to DT based classifiers, SVMs are capable of handling linear and non-linear classification tasks. The main idea behind SVMs is to map the original data samples from the input space into a high-dimensional feature space such that the classification task becomes simpler [11].

C. Proposed Hypotheses

In order to reduce bias even further, three hypotheses were proposed to find the one that leads to the best performing models for PCI conflict and RSI collision detection.

1) Peak Traffic Data Classification: PCI conflicts and RSI collisions can be detected by only analysing KPI values at the instant of highest traffic of each individual cell. This hypothesis was proposed first because radio network conflicts are most noticeable through KPI observation in busy traffic hours. Furthermore, analysing only one daily measurement per KPI in each cell considerably reduces the complexity and processing power needed to detect PCI conflicts and RSI collisions, as the number of data rows per cell is highly reduced.

2) Statistical Data Extraction Classification: PCI conflicts and RSI collisions are better detected by extracting statistical calculations from each KPI daily time series and using them as features for classification. The Python tsfresh tool was used to extract statistical data from the time series [12]. Tsfresh applies several statistical calculations on the data, followed by feature elimination through statistical significance testing. As it resulted in hundreds of features, Principal Component Analysis (PCA) was applied for dimensionality reduction before feeding the data into the SVM classifier. This decision was taken because SVM takes longer to converge as the dimensionality increases, while a higher dimensionality does not significantly increase the training and testing times of the tree based classifiers. It was decided to use the number of Principal Components (PCs) that led to 98% of the Cumulative Proportion of Variance Explained, maintaining most of the original variance.

3) Raw Cell Data Classification: PCI conflicts and RSI collisions are better detected by using each of a cell's daily KPI measurements as an individual feature. This hypothesis was proposed to compare a more computationally intensive, but simpler approach, with the previous hypothesis. As there are 96 daily measurements per KPI in each cell, using, for instance, 10 KPIs would yield 96 × 10 = 960 features. Due to the high dimensionality of the data used to test this hypothesis, PCA was applied once again to reduce its dimensionality before using the SVM classifier. It was decided to use the number of PCs that led to 98% of the Cumulative Proportion of Variance Explained.

D. Model Evaluation

In a binary decision problem, a classification algorithm labels predictions as either positive or negative. A prediction for conflict detection fits into one of these four categories: True Positive (TP) - conflicting cells correctly labeled as conflicting; False Positive (FP) - nonconflicting cells incorrectly labeled as conflicting; True Negative (TN) - nonconflicting cells correctly labeled as nonconflicting; False Negative (FN) - conflicting cells incorrectly labeled as nonconflicting.

As there was a high interest in knowing how well the obtained models could classify PCI conflicts and RSI collisions, the classic Accuracy metric by itself was not enough. Classifications where a nonconflicting cell is erroneously classified as a conflict should be avoided, thus it was chosen to additionally evaluate the obtained models through the Precision and Recall metrics. The used metrics can then be defined as follows:
$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (1)$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (2)$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (3)$$

where Recall measures the fraction of conflicting cells that are correctly labeled, Precision measures the fraction of cells classified as conflicting that are truly conflicting and Accuracy measures the fraction of correctly classified cells [13]. Precision can be thought of as a measure of a classifier's exactness (a low Precision can indicate a large number of FPs), while Recall can be seen as a measure of a classifier's completeness (a low Recall indicates many FNs).

Since a classification algorithm can output the probabilities of a sample belonging to a specific class, the probability decision threshold can be tuned to alter the model classification outputs. For instance, increasing the probability decision threshold to classify a specific class may lead to an increase in Precision at the cost of a lower Recall. Precision-Recall (PR) curves are built by changing the decision probability threshold for a specific class. Thus, it was decided to also evaluate models through their PR curves in order to perform a thorough model evaluation. PR curves, often used in Information Retrieval [14], have been cited as an alternative to Receiver Operator Characteristic (ROC) curves for tasks with a large skew in the class distribution, as in PCI conflict detection [15]. Additionally, the average Precision is also represented by the PR curves through the areas under the curves.

IV. RESULTS

A. Physical Cell Identity Conflict Detection

1) Peak Traffic Data Classification: The first hypothesis presented in Section III-C was tested using the data presented in Section III-A. As the data in this subsection does not consist of time series, each KPI was considered as a feature. Thus, there is a total of 10 features, each corresponding to a KPI measurement at a daily peak traffic instant.

The optimal hyperparameters to create each model were obtained through a grid search on the training set with 10-fold cross validation, maximizing the Precision metric. After training the models, they were tested on the test set, based on a decision probability threshold of 50%, and the results are presented in Table I. It should be added that when a classifier does not produce any TPs or FPs, the Precision is represented as Not a Number (NaN), since it results in a division by zero. It was clear that GB was the best performing classifier, as it was the only one that classified data samples as conflicting with a certainty above 50%, but with low Precision and low Recall for both frequency bands. Nevertheless, the best Precision and Recall were delivered on the 800 MHz frequency band. The remaining models were unable to return any TP or FP. This may indicate that the data did not have enough information for this classification task.

TABLE I
PEAK TRAFFIC PCI CONFUSION CLASSIFICATION RESULTS.

         800 MHz Band                     1800 MHz Band
Model    Accuracy  Precision  Recall      Accuracy  Precision  Recall
ERT      84.94%    NaN        00.00%      92.43%    NaN        00.00%
RF       84.94%    NaN        00.00%      92.43%    NaN        00.00%
SVM      84.94%    NaN        00.00%      92.43%    NaN        00.00%
AB       84.94%    NaN        00.00%      92.43%    NaN        00.00%
GB       84.01%    29.41%     04.42%      92.13%    03.33%     02.87%

In order to see whether or not GB led to the best performing model, the PR curves were obtained and are presented in Figure 1. The area under each classifier's curve is its average Precision. Through a close analysis of the plot for the 800 MHz frequency band, it was clear that GB was the best performing classifier, with Precision peaking at 35%, until reaching 25% Recall. Thenceforth, RF and ERT show higher Precision, with ERT having the highest average Precision of 0.24. Regarding the 1800 MHz frequency band, ERT was the best performing one, with the highest average Precision and with Precision as high as 80% until reaching 20% Recall. From that point onwards, its performance was approximately tied with RF and AB. For both cases, SVM was clearly the worst performing classifier.

[Plot: Precision (vertical axis) versus Recall (horizontal axis), one panel per frequency band. Legend, area under each curve: 800 MHz Band: ERT 0.24, RF 0.23, SVM 0.14, AB 0.22, GB 0.23; 1800 MHz Band: ERT 0.17, RF 0.16, SVM 0.09, AB 0.14, GB 0.13.]
Fig. 1. Smoothed Precision-Recall curves for statistical data based PCI confusion detection.

The training and testing running times to obtain the PR curves were also collected. For the 800 MHz frequency band, the best performing model, which was obtained from GB, had training and testing times of 11.8 and 0.1 seconds, respectively. GB led to the third fastest model to train and to the fastest one to test. Regarding the 1800 MHz frequency band, ERT, which led to the best performing model, was the second fastest model to train, but the second slowest model to test. Specifically, it had training and testing durations of 17.8 and 6 seconds, respectively. The learning curves were obtained and they showed that the average Precision would only marginally increase with more data.

For PCI collision detection, the optimal hyperparameters were obtained through grid search, and the test results were collected after training the models. A table with the results is not shown as no tested model was able to classify a sample as conflicting. The PR curves were obtained, but not shown since the maximum
achieved Precision (obtained through SVM) was 6% for 33% Recall (corresponding to one of three samples correctly predicted) and it would not add much value. Due to the low number of PCI collisions, these results were not significant.

2) Statistical Data Extraction Classification: The second hypothesis presented in Section III-C was tested using the data presented in Section III-A. Regarding PCI confusion detection, tsfresh yielded 798 and 909 significant features for the 800 MHz and 1800 MHz frequency bands, respectively. Concerning PCI collision detection, a total of 2200 features were extracted for the 800 MHz frequency band; these were not selected through hypothesis testing, as the dataset contained only 6 PCI collisions. PCA was applied for dimensionality reduction for a faster SVM convergence. For PCI confusion detection, it resulted in 273 and 284 PCs for the 800 MHz and 1800 MHz frequency bands, respectively.

TABLE II
STATISTICAL DATA BASED PCI CONFUSION CLASSIFICATION RESULTS.

         800 MHz Band                     1800 MHz Band
Model    Accuracy  Precision  Recall      Accuracy  Precision  Recall
ERT      85.24%    NaN        00.00%      93.27%    NaN        00.00%
RF       85.24%    NaN        00.00%      93.27%    NaN        00.00%
SVM      85.24%    NaN        00.00%      93.27%    NaN        00.00%
AB       85.24%    50.00%     02.83%      93.27%    NaN        00.00%
GB       85.18%    46.00%     02.43%      93.27%    NaN        00.00%

The optimal hyperparameters to create each model were obtained through a grid search on the training set with 10-fold cross validation, maximizing the Precision metric. After training the models, they were tested on the test set, based on a decision probability threshold of 50%, and the results are presented in Table II. It should be added that when a classifier does not produce any TPs or FPs, the Precision is represented as NaN, since it results in a division by zero. The AB model had the best performance, with a 50% Precision for the 800 MHz frequency band. However, no model classified a sample as conflicting in the 1800 MHz frequency band data.

In order to obtain more insights about the models' performance, the PR curves were obtained and are represented in Figure 2. The highest average Precision was 27%, obtained with the GB classifier. GB presented the highest Precision throughout most of the plot. SVM was clearly the worst performing model, especially in the 1800 MHz frequency band.

[Plot: Precision (vertical axis) versus Recall (horizontal axis), one panel per frequency band. Legend, area under each curve: 800 MHz Band: ERT 0.26, RF 0.26, SVM 0.21, AB 0.22, GB 0.27; 1800 MHz Band: ERT 0.17, RF 0.17, SVM 0.07, AB 0.13, GB 0.18.]
Fig. 2. Smoothed Precision-Recall curves for statistical data based PCI confusion detection.

The training and testing running times to obtain the PR curves were also collected. GB, which resulted in the two best models, had a testing time below one second and a training time below 30 seconds, for both frequency bands. The learning curves were obtained and they showed that the average Precision would only marginally increase with more data. Thus, GB resulted overall in the best performing models for both frequency bands by using statistical calculations as features.

Regarding PCI collision detection, PCA resulted in 619 PCs to be used by the SVM classifier for the 800 MHz frequency band. The optimal hyperparameters were obtained, and the test results were collected after training the models. A table with the results is not shown as no tested model was able to classify a sample as conflicting. The PR curves were obtained and plotted, showing a maximum Precision of 23% with 100% Recall by RF, while it was approximately zero for the remaining classifiers; the plot is not included in this paper as it would not add much information.

3) Raw Cell Data Classification: The third hypothesis presented in Section III-C was tested using the data described in Section III-A. Using each individual KPI measurement as a feature, an average filter with a window of size 20 was applied to reduce the noise interference. PCA was applied, which resulted in 634 PCs to be used by the SVM classifier for both the 800 MHz and 1800 MHz frequency bands.

TABLE III
RAW CELL DATA PCI CONFUSION CLASSIFICATION RESULTS.

         800 MHz Band                     1800 MHz Band
Model    Accuracy  Precision  Recall      Accuracy  Precision  Recall
ERT      85.37%    22.22%     00.71%      93.57%    100%       00.45%
RF       85.63%    NaN        00.00%      93.60%    100%       00.90%
SVM      85.63%    NaN        00.00%      93.54%    NaN        00.00%
AB       85.63%    NaN        00.00%      93.54%    NaN        00.00%
GB       85.73%    75.00%     01.07%      93.63%    80.00%     01.80%

Once again, the optimal hyperparameters were obtained through grid search, and the test results were collected after model training. The classification results for a 50% decision probability threshold are shown in Table III. Overall, GB was the classifier that led to the best performance, having the highest Accuracy and Recall for both frequency bands, but not the best Precision for the 1800 MHz frequency band. Both models created by the ERT and RF classifiers had a 100% Precision for the 1800 MHz frequency band, which meant that, of these two, RF resulted in the best model, as it had higher Recall.

In order to see if GB led to the best performing model, the PR curves were obtained and are presented in Figure 3. Regarding the 800 MHz frequency band, GB showed the highest average Precision, with a peak of 60% Precision for 4% Recall. Concerning the 1800 MHz frequency band, ERT presented the best average Precision, while GB achieved higher Precision
TABLE IV
1.0
800 MHz Band 1800 MHz Band P EAK TRAFFIC RSI COLLISION CLASSIFICATION RESULTS .
ERT (area = 0.30) ERT (area = 0.26)
RF (area = 0.25) RF (area = 0.16)
0.8 SVM (area = 0.14) SVM (area = 0.07) 800 MHz Band 1800 MHz Band
AB (area = 0.23) AB (area = 0.13)
0.6 GB (area = 0.31) GB (area = 0.22) Model Accuracy Precision Recall Accuracy Precision Recall
Precision

ERT 62.04% 73.91% 02.79% 61.35% 75.00% 00.27%


0.4 RF 61.66% 58.06% 02.95% 61.63% 81.25% 01.19%
SVM 62.42% 54.75% 16.07% 62.19% 57.22% 09.41%
0.2 AB 62.67% 54.79% 19.67% 61.95% 66.67% 03.47%
GB 62.48% 52.22% 34.75% 62.09% 57.42% 08.14%
0.00.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0
Recall Recall

Fig. 3. Smoothed Precision-Recall curves for raw cell data based PCI model as all behaved similarly. Additionally, for both cases,
confusion detection. SVM was clearly the worst performing classifier with this
data.
for a Recall lower than 5%. Additionally, RF was not the best
performing model as what was seen in Table III. 1.0
800 MHz Band 1800 MHz Band
The training and testing running times for each model ERT (area = 0.50) ERT (area = 0.49)
RF (area = 0.50) RF (area = 0.51)
were obtained. In the 800 MHz frequency band, GB, which 0.8 SVM (area = 0.48) SVM (area = 0.48)
led to the best performing model, had a testing time below AB (area = 0.53) AB (area = 0.51)
0.6 GB (area = 0.50) GB (area = 0.51)
one second and a training time below 14 seconds. Regarding

Precision
the 1800 MHz frequency band, ERT, which led to the best 0.4
performing model, was one of the quickest to train (i.e. 40.3
seconds), but one of the slowest to test (i.e. 1.4 seconds). 0.2
Nevertheless, its overall performance was in near real time.
0.00.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0
Regarding PCI collision detection, PCA resulted in 634
Recall Recall
PCs for both frequency bands. The test results were collected
with the optimal hyperparameters. The best performing model Fig. 4. Smoothed Precision-Recall curves for peak traffic data based rsi
collision detection.

was the one obtained from AB, as it detected one out of three PCI collisions with 100% Precision. However, due to the very low number of PCI collisions in the dataset, the results were not significant enough to draw any conclusions.

B. Root Sequence Index Collision Detection

1) Peak Traffic Data Classification: The first hypothesis presented in Section III-C was tested using the data presented in Section III-A. As the data in this subsection does not consist of time series, each KPI was considered as a feature. Thus, there is a total of 6 features, each corresponding to a KPI measurement at a daily peak traffic instant.

The optimal hyperparameters were obtained through grid search, and the test results were collected after model training. The classification results for a 50% decision probability threshold are shown in Table IV. At first glance, no major conclusions could be drawn from the results, as the highest metrics were almost evenly distributed across the models. However, the highest Precisions were obtained by the ERT and RF models, at the cost of delivering the lowest Recall scores. This may indicate that the data did not contain enough information for this classification task.

In order to see which classifier led to the best performing model, the PR curves were obtained and are presented in Figure 4. Regarding the 800 MHz frequency band, the AB curve clearly shows an abnormal behaviour. This behaviour was due to the AB model assigning the same probability value to several cells. For both frequency bands, there was no clear best performing model.

The training and testing running times for each model were obtained. The fastest model was GB, with training and testing times as low as 0.1 seconds. Furthermore, it was one of the best performing models. With these results, ERT was chosen for the 800 MHz frequency band due to its fast training and testing times as well as its high Precision peak of 75% for low Recall. For the 1800 MHz frequency band, it was harder to choose the best performing classifier, as the results were very similar. However, GB was chosen as it was the fastest model. The learning curves were also obtained, and the main insight taken from them was that the average Precision scores had already approximately stabilized for the two largest training set sizes in both frequency bands. Thus, the results would not be significantly better if more data was added.

2) Statistical Data Extraction Classification: The second hypothesis presented in Section III-C was tested using the data described in Section III-A. Regarding RSI collision detection, tsfresh yielded 732 and 851 significant extracted features for the 800 MHz and 1800 MHz frequency bands, respectively. In order to reduce the data dimensionality before applying the SVM model, PCA was applied, resulting in 273 and 284 PCs for the 800 MHz and 1800 MHz frequency bands, respectively.

The optimal hyperparameters were obtained through grid search and the test results are presented in Table V. The ERT model delivered the highest Precision for both frequency bands, but GB had the highest Accuracy and Recall, overall.
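The dimensionality reduction and hyperparameter tuning steps described for the SVM model can be sketched as follows. This is a minimal illustration with scikit-learn [6] on synthetic data, not the authors' code: the standardization step, the 95% explained-variance criterion for PCA, the default RBF kernel, the grid values, the 3-fold cross-validation and the average-Precision scoring are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import average_precision_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the extracted feature matrix (NOT the paper's data).
X, y = make_classification(
    n_samples=300, n_features=60, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scale, project onto principal components, then classify with an SVM.
# Keeping 95% of the variance is an assumption, not the paper's criterion.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=0.95)),
    ("svm", SVC()),
])

# Hypothetical grid; the paper does not list the values that were searched.
param_grid = {"svm__C": [0.1, 1, 10], "svm__gamma": ["scale", 0.01]}
search = GridSearchCV(pipeline, param_grid, scoring="average_precision", cv=3)
search.fit(X_train, y_train)

# Number of principal components actually kept by the best pipeline.
n_components = search.best_estimator_.named_steps["pca"].n_components_

# Score the held-out set only after tuning is finished.
test_scores = search.decision_function(X_test)
test_ap = average_precision_score(y_test, test_scores)

print("best parameters:", search.best_params_)
print("principal components kept:", n_components)
print("test average precision:", round(test_ap, 3))
```

Placing PCA inside the pipeline means the projection is refitted on each training fold, so the cross-validated score is not influenced by the validation samples.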
TABLE V
STATISTICAL DATA BASED RSI COLLISION CLASSIFICATION RESULTS.

         800 MHz Band                     1800 MHz Band
Model    Accuracy   Precision   Recall    Accuracy   Precision   Recall
ERT      60.32%     100%        00.48%    62.27%     72.97%      02.00%
RF       64.93%     61.30%      32.62%    64.13%     66.94%      12.12%
SVM      60.94%     54.80%      11.55%    61.79%     NaN         00.00%
AB       64.02%     56.79%      40.83%    66.37%     59.88%      36.29%
GB       66.87%     61.60%      44.88%    69.39%     63.97%      45.53%

TABLE VI
RAW CELL DATA RSI COLLISION CLASSIFICATION RESULTS.

         800 MHz Band                     1800 MHz Band
Model    Accuracy   Precision   Recall    Accuracy   Precision   Recall
ERT      59.49%     50.00%      00.83%    59.83%     75.00%      00.22%
RF       61.70%     62.64%      13.52%    65.55%     63.86%      33.07%
SVM      60.07%     52.24%      16.61%    59.25%     46.67%      09.14%
AB       64.73%     60.38%      37.60%    64.99%     59.59%      40.32%
GB       66.41%     60.84%      47.92%    66.22%     62.72%      39.52%
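Two patterns in these tables follow directly from the metric definitions: a model that flags very few cells can reach a very high Precision with a near-zero Recall, and a model that flags no cell at all has an undefined Precision, which is one way a NaN entry can arise. The sketch below illustrates both cases. The confusion-matrix counts are hypothetical and are not taken from the paper's data.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    Precision is NaN when no sample is predicted positive (0/0),
    and Recall is NaN when there is no positive sample at all.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) > 0 else math.nan
    recall = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    return accuracy, precision, recall


# Hypothetical test set: 1000 cells, 400 of them with a collision.
# A cautious model flags only 2 cells, both correctly.
cautious = classification_metrics(tp=2, fp=0, fn=398, tn=600)

# A silent model never predicts a collision.
silent = classification_metrics(tp=0, fp=0, fn=400, tn=600)

print("cautious (accuracy, precision, recall):", cautious)
print("silent   (accuracy, precision, recall):", silent)
```

In both cases the Accuracy stays close to the share of non-colliding cells, which is why Accuracy alone says little about how many collisions are actually found.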

[Figure: two panels (800 MHz Band, 1800 MHz Band); x-axis: Recall, y-axis: Precision. Legend areas, 800 MHz: ERT 0.54, RF 0.58, SVM 0.48, AB 0.60, GB 0.61; 1800 MHz: ERT 0.54, RF 0.55, SVM 0.43, AB 0.61, GB 0.61.]
Fig. 5. Smoothed Precision-Recall curves for statistical data based RSI collision detection.

[Figure: two panels (800 MHz Band, 1800 MHz Band); x-axis: Recall, y-axis: Precision. Legend areas: ERT 0.54, RF 0.58, SVM 0.44, AB 0.56, GB 0.60 for one band and ERT 0.54, RF 0.56, SVM 0.45, AB 0.60, GB 0.61 for the other.]
Fig. 6. Smoothed Precision-Recall curves for raw cell data RSI collision detection.
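The following sketch shows how a Precision-Recall curve and a single summary score can be obtained from predicted probabilities, and why a model that assigns the same probability to several cells produces a curve with very few points, which is the effect the text attributes to the AB model. The labels and scores are invented for the illustration. Using scikit-learn's average_precision_score as the summary is an assumption, and it may differ from how the areas under the smoothed curves in the figures were computed.

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

# Hypothetical ground truth for 8 cells (1 = collision).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])

# Two hypothetical models: one gives every cell a distinct probability,
# the other gives the same probability (0.5) to six of the eight cells.
model_scores = {
    "distinct": np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]),
    "tied": np.array([0.9, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.2]),
}

results = {}
for name, scores in model_scores.items():
    # One curve point per distinct decision threshold.
    precision, recall, thresholds = precision_recall_curve(y_true, scores)
    results[name] = {
        "n_thresholds": len(thresholds),
        # Assumed stand-in for the area values shown in the figure legends.
        "average_precision": average_precision_score(y_true, scores),
    }

for name, summary in results.items():
    print(name, summary)
```

With tied scores, many cells cross the decision threshold at once, so the curve jumps between a handful of operating points instead of tracing a gradual trade-off.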

In order to gain more insight into the models' performance, PR curves were obtained and are presented in Figure 5. The GB model was the best for both frequency bands, having a Precision peak of 85% and an average Precision of 61%. The abnormal curve behaviour of the AB model was due to several cells being assigned the same probability value.

The training and testing running times for each model were obtained. The GB model showed testing times lower than one second; however, it had one of the highest training times. More specifically, it reached 28.4 and 246 seconds of training time for the 800 MHz and 1800 MHz frequency bands, respectively. Nonetheless, the GB model presented higher performance than the other models while still operating in near real time, thus being the best model, overall. The obtained learning curves showed that the performance would not significantly increase if more data was added to the dataset.

3) Raw Cell Data Classification: The third hypothesis presented in Section III-C was tested using the data described in Section III-A. Using each individual KPI measurement as a feature, an average filter with a window of size 20 was applied. PCA was then applied, yielding 332 PCs to be used by the SVM classifier for both the 800 MHz and 1800 MHz frequency bands for RSI collision detection.

The optimal hyperparameters were obtained through grid search and the results are presented in Table VI. Once more, the GB model achieved the highest Accuracy for both frequency bands. The RF and ERT models had the highest Precision for the 800 MHz and 1800 MHz frequency bands, respectively.

The PR curves were obtained and are presented in Figure 6. The GB model had the highest average Precision, while the RF and ERT models showed slightly worse average Precision.

The training and testing running times for each model were obtained. The GB model showed testing times lower than one second and the third highest training times for both frequency bands. More specifically, it took 12.8 and 24.4 seconds to train in the 800 MHz and 1800 MHz frequency bands, respectively. Nevertheless, the GB model still operated in near real time and was thus the best performing model, overall. The obtained learning curves showed that the results would improve if more data was added to the training set, especially for the GB model.

V. CONCLUSIONS

This work tests three hypotheses regarding how well two distinct LTE network problems can be detected through supervised techniques with near real time performance.

The PCI confusions are better detected by using each cell's daily KPI measurement as an individual feature; this was concluded because the obtained average Precision was higher when testing this hypothesis. Specifically, the average Precisions reached 31% and 26% for the 800 MHz and 1800 MHz frequency bands, respectively. No conclusions were drawn regarding PCI collision detection due to the low number of PCI collisions in the data set.

The RSI collisions were detected with similar performance by two proposed hypotheses; however, the best detection was obtained by using each cell's daily KPI measurement as an individual feature because the learning curves have shown that the results would further improve if more data was added for the second hypothesis. The best performing model was the one that used the GB classifier, reaching average Precisions of 61% and 60% for the 800 MHz and 1800 MHz frequency bands, respectively.
ACKNOWLEDGMENTS
The authors would like to thank FCT for the support by the project UID/EEA/50008/2013. The authors also acknowledge project MESMOQoE (no. 023110 - 16/SI/2016), supported by the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). The authors would also like to thank CELFINET for the KPI measurements.
REFERENCES
[1] X. Wang et al., “Experimental comparison of representation methods
and distance measures for time series data,” Data Min. Knowl. Discov.,
vol. 26, no. 2, pp. 275–309, Mar. 2013.
[2] A. Gómez-Andrades et al., “Automatic root cause analysis for LTE
networks based on unsupervised techniques,” IEEE Transactions on
Vehicular Technology, vol. 65, no. 4, pp. 2369–2386, 2016.
[3] H. Holma and A. Toskala, WCDMA for UMTS: HSPA Evolution and
LTE, 4th ed. Wiley Publishing, 2007, ISBN:978-0-470-31933-8.
[4] R. Acedo-Hernández et al., “Analysis of the impact of PCI planning
on downlink throughput performance in LTE,” Comput. Netw., vol. 76,
no. C, pp. 42–54, Jan. 2015.
[5] C. Cox, An Introduction to LTE, LTE-advanced, SAE, VoLTE and 4G
Mobile Communications, 2nd ed. Wiley Publishing, 2014, ISBN:978-
1-118-81803-9.
[6] F. Pedregosa et al., “Scikit-learn: Machine learning in python,” J. Mach.
Learn. Res., vol. 12, pp. 2825–2830, Nov. 2011.
[7] J. Zhu et al., “Multi-class adaboost,” 2009.
[8] A. Natekin and A. Knoll, “Gradient boosting machines, a tutorial,” Front.
Neurorobot., vol. 2013, 2013.
[9] P. Geurts, D. Ernst, and L. Wehenkel, “Extremely randomized trees,”
Machine Learning, vol. 63, no. 1, pp. 3–42, Apr 2006.
[10] L. Breiman, “Random forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32,
Oct. 2001.
[11] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, “A practical guide to support
vector classification,” Department of Computer Science, National Taiwan
University, Tech. Rep., 2003.
[12] M. Christ, “TSFRESH,” https://github.com/blue-yonder/tsfresh, 2016.
[13] J. Davis and M. Goadrich, “The relationship between precision-recall
and ROC curves,” in Proceedings of the 23rd International Conference
on Machine Learning, ser. ICML ’06. New York, NY, USA: ACM,
2006, pp. 233–240.
[14] C. D. Manning and H. Schütze, Foundations of Statistical Natural
Language Processing. Cambridge, MA, USA: MIT Press, 1999.
[15] M. Goadrich, L. Oliphant, and J. Shavlik, Learning Ensembles of First-
Order Clauses for Recall-Precision Curves: A Case Study in Biomedical
Information Extraction. Berlin, Heidelberg: Springer Berlin Heidelberg,
2004, pp. 98–115.
