
Statistical Guidelines for Collecting KPI Data

Technical Reference
Statistical Guidelines for Collecting KPI Data 2006-04-12

© Ericsson TEMS AB 2006. All rights reserved.

No part of this document may be reproduced in any form without the written permission of the copyright
holder.

TEMS is a trademark owned by Telefonaktiebolaget L M Ericsson, Sweden. All other trademarks belong
to their respective owner.

The contents of this document are subject to revision without notice due to continued progress in
methodology, design and manufacturing. Ericsson shall have no liability for any error or damage of any
kind resulting from the use of this document.

Public

Contents

1. Introduction

2. KPI Data Sample Sizing

2.1. Estimation of Proportional Frequencies
2.2. Estimation of Time, Data Rate, and Other Quantities

TMS-06:000204 Uen Rev A



1. Introduction
The accuracy of a statistic depends on the amount of data on which it is based. To
be able to compute KPIs (Key Performance Indicators) with a specified accuracy,
you must have collected a data sample of appropriate size. This document explains
how to calculate that sample size given the accuracy requirement.

2. KPI Data Sample Sizing

2.1. Estimation of Proportional Frequencies


Many of the KPIs calculated by TEMS Investigation are estimates of the proportional
frequency (probability) of a specific event (e.g. Data Transfer Cut-off Ratio). The
larger the measurement sample, the more reliable the estimate. Given a desired
level of accuracy, the required sample size may be calculated as follows.

If we denote the sample size by n and the estimated proportional frequency (i.e. the
measured KPI) by f, then, using the Gaussian approximation of the binomial
distribution, we obtain the 95% confidence interval (CI) for the true proportional
frequency p as

$$p = f \pm 1.96 \cdot \sqrt{\frac{f(1 - f)}{n}}$$

This means that the true proportional frequency p falls within the confidence interval

$$\left[\, f - 1.96 \cdot \sqrt{\frac{f(1 - f)}{n}},\;\; f + 1.96 \cdot \sqrt{\frac{f(1 - f)}{n}} \,\right]$$

with 95% probability.
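
The interval formula can be evaluated with a few lines of Python. The following is
a minimal sketch, not part of TEMS Investigation: the function name `proportion_ci`
and its parameters are illustrative assumptions, and the input values (f = 10%,
n = 200) are those of the Attach Failure Ratio example in this section.

```python
import math
from typing import Tuple

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def proportion_ci(f: float, n: int, z: float = Z_95) -> Tuple[float, float]:
    """Return (lower, upper) bounds of the confidence interval for p.

    f is the measured proportional frequency (between 0 and 1) and n is the
    sample size. Uses the Gaussian approximation of the binomial distribution.
    """
    if n <= 0:
        raise ValueError("sample size n must be positive")
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie between 0 and 1")
    half_width = z * math.sqrt(f * (1.0 - f) / n)
    return f - half_width, f + half_width


lower, upper = proportion_ci(0.10, 200)
print(f"[{lower:.2%}, {upper:.2%}]")  # prints [5.84%, 14.16%]
```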

In other words, if we want a confidence interval $[f - e, f + e]$ for p, we set e
equal to the half-width of the interval above and solve for n. The required sample
size is approximately

$$n = \frac{1.96^2}{e^2} \cdot f \cdot (1 - f)$$

Example:

The Attach Failure Ratio for packet-switched is estimated at f = 10% with a sample
size of $n_1 = 200$ measurements. By the formula for p, the uncertainty of the
measurement at the 95% confidence level is $e_1 = 4.16\%$. Thus, with 95% probability,
the true proportional frequency p falls within the interval [5.84%, 14.16%].

To reduce the uncertainty to ±3% at the same confidence level, a sample size of
$n_2 = (1.96^2 / 0.03^2) \cdot 0.1 \cdot 0.9 \approx 384$ would be needed.
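
The sample size calculation in this example can be reproduced as follows. This is
an illustrative sketch; the function name `sample_size_for_proportion` is an
assumption made here, not a TEMS Investigation function. The formula yields 384.16
for the values above. Rounding to the nearest integer gives the 384 quoted in the
example, while rounding up to the next whole measurement gives 385.

```python
import math


def sample_size_for_proportion(f: float, e: float, z: float = 1.96) -> float:
    """Sample size n for which the 95% interval for p is [f - e, f + e].

    f is the expected proportional frequency and e the desired half-width,
    both given as fractions (0.03 means 3%). The raw, unrounded value of
    the formula is returned.
    """
    if e <= 0.0:
        raise ValueError("e must be positive")
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie between 0 and 1")
    return (z ** 2 / e ** 2) * f * (1.0 - f)


n2 = sample_size_for_proportion(0.10, 0.03)
print(f"formula value: {n2:.2f}")        # 384.16
print(f"rounded up:    {math.ceil(n2)}")  # 385
```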



In practice, at least 200 measurements should be performed for all KPIs defining
success or failure ratios, and preferably 500 if at all practicable.
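
To get a feeling for what these sample sizes mean in terms of accuracy, the
half-width e of the 95% confidence interval can be tabulated for a few values of f.
The sketch below is illustrative only: the helper name `half_width` and the chosen
values of f are assumptions. The widest interval occurs at f = 50%, where e is
about ±6.9% for n = 200 and about ±4.4% for n = 500.

```python
import math


def half_width(f: float, n: int, z: float = 1.96) -> float:
    """Half-width e of the 95% confidence interval for a proportion."""
    return z * math.sqrt(f * (1.0 - f) / n)


# Recommended minimum (200) and preferred (500) sample sizes from the text,
# evaluated for a few illustrative values of the measured frequency f.
for n in (200, 500):
    for f in (0.05, 0.10, 0.25, 0.50):
        print(f"n = {n}, f = {f:4.0%}: e = ±{half_width(f, n):.2%}")
```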

2.2. Estimation of Time, Data Rate, and Other Quantities

The remaining KPIs in TEMS Investigation are estimates of some quantity (e.g. setup
time, data rate, quality) in the form of a measurement average $\bar{x}$. According
to the central limit theorem, the distribution of the sample average is
asymptotically Gaussian, even if the underlying measurements themselves are not
Gaussian. Therefore the uncertainty of the estimate (again assuming a 95%
confidence interval) is

$$e = 1.96 \cdot \frac{s}{\sqrt{n}}$$

where the standard deviation s is the square root of the variance

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

and n is the sample size. The sample size corresponding to a predefined estimation
accuracy (confidence interval $[\bar{x} - e, \bar{x} + e]$) can therefore be
calculated as

$$n = \frac{1.96^2 \, s^2}{e^2}$$

If measurements show a higher standard deviation than expected, the sample size
must be increased further in order to give the desired accuracy.
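
The check described above can be sketched in Python as follows. This is a minimal
illustration under stated assumptions: the function names, the list of setup times,
and the 50 ms target are all hypothetical and serve only to show the mechanics.
`statistics.stdev` is used because it divides by n − 1, matching the variance
formula in this section.

```python
import math
import statistics

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def mean_and_half_width(samples, z=Z_95):
    """Return (average, s, e) for a list of measurements."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two measurements are needed to estimate s")
    average = statistics.mean(samples)
    s = statistics.stdev(samples)  # sample standard deviation (n - 1 denominator)
    e = z * s / math.sqrt(n)
    return average, s, e


def required_sample_size(s, e, z=Z_95):
    """Sample size needed for half-width e, rounded up to a whole measurement."""
    return math.ceil((z * s / e) ** 2)


# Hypothetical setup times in milliseconds (illustrative values only).
setup_times_ms = [1790.0, 1850.0, 1620.0, 2100.0, 1730.0, 1905.0, 1810.0, 1680.0]
target_e_ms = 50.0  # hypothetical accuracy requirement

average, s, e = mean_and_half_width(setup_times_ms)
needed = required_sample_size(s, target_e_ms)
print(f"average = {average:.0f} ms, s = {s:.0f} ms, e = ±{e:.0f} ms")
if needed > len(setup_times_ms):
    print(f"collect at least {needed} measurements to reach ±{target_e_ms:.0f} ms")
```

Because s itself is estimated from the data, the required sample size should be
recomputed as more measurements arrive.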

Example:

Attach Setup Time is estimated from a sample of size $n_1 = 200$. The measured
standard deviation is s = 250 ms. Then $e_1 = 1.96 \cdot 250 / \sqrt{200} \approx 35$ ms.
If the measured average Attach Setup Time is 1820 ms, we can be 95% confident that
the true average Attach Setup Time is between 1785 and 1855 ms.

To shrink the confidence interval so that $e_2 = 20$ ms, a sample size of
$n_2 = 1.96^2 \cdot 250^2 / 20^2 \approx 600$ would be required.
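
Because n grows with the inverse square of e, tightening the accuracy requirement
quickly becomes expensive. The sketch below evaluates the sample size formula for
the standard deviation of this example (s = 250 ms) and a few target accuracies.
The function name `sample_size_for_mean` and the targets other than 20 ms are
illustrative assumptions. For e = 20 ms the formula gives 600.25, which matches the
figure of about 600 above.

```python
import math


def sample_size_for_mean(s: float, e: float, z: float = 1.96) -> float:
    """Raw (unrounded) sample size n = z^2 * s^2 / e^2."""
    if e <= 0.0:
        raise ValueError("e must be positive")
    return (z ** 2) * (s ** 2) / (e ** 2)


s_ms = 250.0  # measured standard deviation from the example
for target_e_ms in (50.0, 30.0, 20.0):
    n = sample_size_for_mean(s_ms, target_e_ms)
    print(f"e = {target_e_ms:.0f} ms: n = {n:.2f}, rounded up to {math.ceil(n)}")
```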
