Manual - SQuad Voice Test Result Description

SQuad Voice Measurement Description Manual
February 2009
SwissQual License AG
Baarerstrasse 78
CH-6301 Zug
Switzerland
Internet: http://www.swissqual.com
Office: +41 32 686 65 65
Fax: +41 32 686 65 66
Part Number: 16-100-200047-3 Rev 2.20
Copyright 2000 - 2009 SwissQual AG. All rights reserved.

No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system,
or translated into any human or computer language without the prior written permission of SwissQual AG.
SwissQual has made every effort to ensure that any instructions contained in the document are
adequate and free of errors and omissions. SwissQual will, if necessary, explain issues which may not be
covered by the documents. SwissQual's liability for any errors in the documents is limited to the correction of
errors and the aforementioned advisory services.
When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo
trademark somewhere in your text.
SwissQual, Seven.Five, SQuad, QualiPoc, NetQual, VQuad as well as the following logos are
registered trademarks of SwissQual AG.
Diversity, NQDI, VMon, NiNA, NiNA+, NQView, NQComm, NQTM, QualiWatch-M,
QualiWatch-S, NQAgent, NQWeb, QPControl, SystemInspector, Diversity Unattended are
trademarks of SwissQual AG.
SwissQual acknowledges the following trademarks for company names and products:
Adobe, Adobe Acrobat, and Adobe Postscript are trademarks of Adobe Systems Incorporated.
Apple is a trademark of Apple Computer, Inc.
DIMENSION, LATITUDE, and OPTIPLEX are registered trademarks of Dell Inc.
ELEKTROBIT is a registered trademark of Elektrobit Group Plc.
Google is a registered trademark of Google Inc.
Intel, Intel Itanium, Intel Pentium, and Intel Xeon are trademarks or registered trademarks of Intel
Corporation.
INTERNET EXPLORER, SMARTPHONE, TABLET are registered trademarks of Microsoft Corporation.
Java is a U.S. trademark of Sun Microsystems, Inc.
Linux is a registered trademark of Linus Torvalds.
Microsoft, Microsoft Windows, Microsoft Windows NT, and Windows Vista are either registered
trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
NOKIA is a registered trademark of Nokia Corporation.
Oracle is a registered US trademark of Oracle Corporation, Redwood City, California.
SAMSUNG is a registered trademark of Samsung Corporation.
SIERRA WIRELESS is a registered trademark of Sierra Wireless, Inc.
TRIMBLE is a registered trademark of Trimble Navigation Limited.
U-BLOX is a registered trademark of u-blox Holding AG.
UNIX is a registered trademark of The Open Group.
Contents
1 About this Guide
  Introduction
2 SQuad Listening Quality
  Introduction
  Speech Quality Definition
  SQuad Method
  MOS Rating
  Speech and Noise Level Received Signal
  Channel Gain
  Clipping
  DC-Offset
  Frequency-Shift
  Delay Spread (Voice Jitter)
  Speech Threshold
  Degradations
    AGC Problems
    Speech Enhancer / Noise Suppressors
    Impulsive Noise
    Background Noise
    Interruptions
    VAD resp. Silence Suppression Problems
    Variable Delay (Voice Jitter)
    Delays Deviation
    Frequency Shifts
    Quality Code
  Option: P.862 'PESQ'
3 SQuad Noise Suppression
  Introduction
  Listening Quality
  NS-Speech Power Classes
  SNRI, Signal-to-Noise Ratio Improvement
  NPLR, Noise Power Level Reduction
  SPLR, Signal Power Level Reduction
  Overall NS Quality
  Quality Index
  Convergence Time
  Noise Reduction/Suppression Test
  Examples
  Evaluation of the transmitted signal
  Evaluation of the transmitted signal
4 DTMF Tests
  Introduction
  DTMF-Test Overview
  Criterions
  Results
5 SQuad Advanced Echo Check (Passive Test)
  Introduction
  Echo Measurement
  Measurement Results
6 SQuad Advanced Echo Check (Active Test)
  Introduction
  Echo Measurement
  Measurement Results Echo Evaluation
  Measurement Results Listening Quality
7 Round Trip
  Introduction
  The Round Trip Method
  Results
  References
Appendix
  Abbreviations
Figures
Figure 2-1 Block Diagram of the SQuad
Figure 2-2 Main outcomes of SQuad-LQ
Figure 2-3 Typical MOS-LQ values for Different Codecs
Figure 2-4 Frequency Shift
Figure 2-5 Histogram of Noised Speech Sample
Figure 2-6 Level Chart with AGC
Figure 2-7 Similarity Chart with Impulsive Noise
Figure 2-8 Background Noise
Figure 2-9 Level Chart with Handover
Figure 2-10 Time Clipping
Figure 2-11 Variable Delay, Voice Jitter
Figure 2-12 Example for Variable Delay, which shows that Block B is delayed for 244 samples
to the left (arrives earlier when compared with the same reference block). Block B arrives later by
244 samples.
Figure 2-13 Frequency Shift
Figure 2-14 P.862 result representation
Figure 2-15 P.862 vs. P.862.1 scale transformation
Figure 3-1 The Principle of MOS Calculation in SQuad-NS
Figure 3-2 The Five Energy Windows for 16 bit Digital System (90.3 dB dynamics)
Figure 3-3 Speech Power Class Chart
Figure 3-4 SPLR Calculation out of Five Values Calculated in Five Different Energy Windows
Figure 3-5 Speech Power Class Chart
Figure 3-6 Overall NS Quality
Figure 3-7 Calculation of Quality Index
Figure 3-8 Some Experimental Results for Different Configuration of NR in the Network
Figure 3-9 MOS vs. MOSobj & Quality Index
Figure 3-10 Example of Convergence Time Evaluation
Figure 3-11 Filtered Difference Envelope is Compared with the Threshold Value
Figure 3-12 Five Point Analysis of the Difference Envelope during Decision on Noise Reduction
State
Figure 3-13 Additional Condition before a Final Decision is Calculated
Figure 3-14 Typical Reference Signal with White Noise Added
Figure 3-15 Noise Suppression Applied on Signal
Figure 3-16 Noise Reduction Applied on Signal
Figure 3-17 NS Signal Envelope
Figure 3-18 Results presentation within NQDI
Figure 4-1 Allocation of Frequencies to the Various Digits and Symbols of a Push-button Set
Figure 4-2 Block Diagram of DTMF-Test
Figure 5-1 Results of SQuad-AEC Test, shown in NQDI presentation
Figure 5-2 EOR (Echo Objection Rate) derived from G.131
Figure 5-3 Echo Loss during scanning versus echo delay
Figure 6-1 Results presentation SQuad-AEC active
Figure 6-2 Echo Loss as profile versus echo delay
Figure 6-3 Result presentation in an echo free/non echo detectable connection
Figure 7-1 Detail of the NQDI Representation
Tables
Table 2-1 Example for Variable Delay where five blocks are delayed at different offsets regarding
Reference Speech Sample
Table 3-1 Energy Windows used in Calculation of SPLR and PLR
Table 4-1 DTMF Result Code
Table 7-1 One Way Delay Quality Classes

About this Guide
Introduction
This document describes the parameters that are measured with the SwissQual QoS
Measurement System. It also briefly describes the algorithms used and gives some background
information on the causes of different kinds of quality degradation. The screenshots
are taken from SwissQual's post-processing system NQDI.

SQuad Listening Quality
Introduction
For network operators and equipment manufacturers, it is important to know where and why
speech quality is degraded. Since speech quality is a major factor determining customer
satisfaction, encoding techniques must be designed for optimal speech quality. Large-scale
auditory tests are commonly employed to assess the quality of speech encoding techniques.
However, results obtained in this way are practically impossible to reproduce. Furthermore, such
results depend on the level of motivation of the individual test candidates. It is therefore a
big advantage to have an instrumental method that is capable of physically measuring speech quality
parameters and that produces results which correlate as closely as possible with subjectively
acquired results. Even the perfect transmission of speech via a telecommunications channel with a
bandwidth of 0.3 - 3.4 kHz results in a sentence intelligibility of approx. 98%. The speech coders
introduced for handsets used in digital mobile radio networks impair intelligibility further.
Speech quality is a vague term compared with bit rate, echo or loudness. Since customer
satisfaction can be measured directly by the quality of the transmitted speech, encoding
techniques must be selected and optimized based on their speech quality.
Speech Quality Definition

Speech Quality is defined as a measure of a listener's satisfaction based on his experience and
expectation regarding voice communication. It is generally expressed as a Mean Opinion Score
(MOS). This measurement denotes the average of many individual opinions on speech quality,
which are obtained from a representative number of listeners. Speech quality is a complex
psycho-acoustic phenomenon within the process of human perception. As such, it is necessarily
subjective. Most objective algorithms are based on a comparison between a reference sample
and a coded version of the reference.
SQuad Method
SQuad consists of three main parts. First, a pre-processing unit adjusts reference and coded
sample. Then, an auditory model is used to reduce both samples to their perceptually relevant
features. Finally, an assessment unit evaluates the perceptual difference between reference and
coded sample and outputs the result as a MOS value.
A speech sample is transmitted over a line with a generally unknown combination of speech coders.
This speech sample is available in digital form. The sampling frequency is 8 kHz and the digital
quantization is 16 bits. As an initial step, the source speech signal is read into the vector x(i) and
the coded speech signal into the vector y(i). These speech signals are synchronized with respect
to both time and level. The DC offset is removed from every sample. In addition, the signals
are normalized to a common RMS (Root Mean Square) level, to ensure that a constant
amplification factor is not taken into account.
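The level part of this pre-processing can be illustrated with a minimal Python sketch. It only covers DC removal and RMS normalization; time synchronization is not shown. The function names and the target level of 1000 are assumptions chosen for illustration and are not taken from SQuad.

```python
import math

def remove_dc(samples):
    """Subtract the mean value so that the signal has no DC offset."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]

def rms(samples):
    """Root mean square of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_rms(samples, target_rms):
    """Scale the signal to the target RMS level (silent input is returned unchanged)."""
    current = rms(samples)
    if current == 0:
        return list(samples)
    scale = target_rms / current
    return [s * scale for s in samples]

def preprocess(reference, coded, target_rms=1000.0):
    """Remove DC and bring both signals to a common RMS level.

    target_rms is a hypothetical value; any common level removes a constant
    amplification factor from the later comparison.
    """
    x = normalize_rms(remove_dc(reference), target_rms)
    y = normalize_rms(remove_dc(coded), target_rms)
    return x, y
```

After this step, a coded signal that differs from the reference only by a constant gain and a DC offset becomes identical to the reference.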
The signals are split into processing units of 32 ms duration, also called frames. The unit overlap
is 50%. During the first processing step, each frame is multiplied by a Hamming window. The
source signal x(t) in the time domain is then transformed to the frequency domain using a discrete
Fourier transform, followed by computation of the squared magnitude FFT spectrum. Both signals
are filtered using a filter equivalent to the receiving curve of the corresponding telephone handset.
A rough approximation of the time masking is already achieved through the frame overlapping
during the signal pre-processing. The comparison method of SQuad is based on the following
principle: signal parts with high energy are more important for the perceived speech quality. A
similarity coefficient for reference and impaired signal is computed for 4 different energy
thresholds. Only the parts of the signal exceeding the respective threshold are considered. This
can be viewed as a multi-resolution analysis with respect to signal energy. The overall
similarity is then computed using the coefficients from all thresholds. A polynomial is used to
transform the comparison result to the ITU MOS scale. The length of the speech sample varies
between 4 and 30 seconds.
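The framing step can be sketched in Python as follows. At 8 kHz a 32 ms frame holds 256 samples and a 50% overlap gives a hop of 128 samples. The sketch uses a plain discrete Fourier transform from the standard library instead of an optimized FFT, and it leaves out the handset filter, the energy thresholds and the similarity computation. All function names are assumptions for illustration.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000
FRAME_MS = 32
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 256 samples per frame
HOP = FRAME_LEN // 2                           # 50% overlap

def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def split_frames(signal, frame_len=FRAME_LEN, hop=HOP):
    """Cut the signal into overlapping frames; an incomplete tail is dropped."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def power_spectrum(frame):
    """Squared magnitude spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [a * w for a, w in zip(frame, hamming(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum
```

With these settings the bin spacing is 8000 / 256 = 31.25 Hz, so a 1 kHz tone appears in bin 32.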
[Figure: block diagram. Blocks: Reference; Network (NUT); Degraded signal; IRS-filtering & BG Noise detection; Time & Level alignment; Frequency equalization; Psychoacoustics modelling (reference and degraded signal); Listening only Quality Estimation; Other measured data (round-trip delay, jitter, echo, call setup quality); Overall Audio Quality]
Figure 2-1 Block Diagram of the SQuad
MOS Rating
Speech Quality is defined as a measure of a listener's satisfaction and is generally expressed as
a Mean Opinion Score (MOS). SQuad delivers the MOS rating as one number, ranging from 1 to 4.5, fully
in accordance with the Listening Scale defined in ITU-T Recommendation P.800. This range is not exactly
the same as that of the MOS scale, which is defined from 1 to 5. The restriction is acceptable because, in the subjective
tests used for the validation of SQuad-LQ, values above 4.5 almost never appeared.
As described in ITU-T Recommendation P.800, Annex B.4.5, various five-point category-judgment
scales may be used for different purposes. The Listening Only quality scale is the most
frequently used for ITU-T applications:
Quality of the speech    Score
Excellent                5
Good                     4
Fair                     3
Poor                     2
Bad                      1
The following picture gives an overview of the results obtained in the main section of NQDI:
Figure 2-2 Main outcomes of SQuad-LQ
Codec            Typical MOS Value    Typical SQuad-LQ
G.711            4.3                  4.4
G.729            3.8                  3.7
G.723.1 (6.3)    3.5                  3.5
GSM-EFR          4.0                  3.9
GSM-HR           3.4                  3.3
AMR 12.2         4.0                  3.9
AMR 7.4          3.8                  3.7
AMR 4.75         3.4                  3.4
Figure 2-3 Typical MOS-LQ values for Different Codecs
Speech and Noise Level Received Signal

Within the SQuad-LQ algorithm, the Active Speech Level (according to ITU-T P.56) of the
received signal is also calculated. This value describes the r.m.s. level of the active speech parts only.
Speech pauses do not influence this value. The result is given in dBov. The level should be in the
range from -20 to -38 dBov. Relative to a sending level of -26 dBov, this corresponds to a
gain/attenuation of +6 / -12 dB.
The Noise Level describes the noise floor of the received signal during speech pauses, also in dBov. In
normal noise-free connections, a Noise Level below -55 dBov can be obtained. Please note that
this is an unweighted level; the common A-weighting is not applied.
Both results are used to calculate a basic signal-to-noise ratio, which describes the distance
between the speech level and the noise floor.
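The arithmetic behind these values is simple enough to show as a short Python sketch. The function names are assumptions for illustration; the numeric limits are the ones stated in this section.

```python
NOMINAL_SEND_LEVEL_DBOV = -26.0  # sending level stated in this section

def level_gain_db(active_speech_level_dbov, send_level_dbov=NOMINAL_SEND_LEVEL_DBOV):
    """Gain (positive) or attenuation (negative) relative to the sending level."""
    return active_speech_level_dbov - send_level_dbov

def basic_snr_db(active_speech_level_dbov, noise_level_dbov):
    """Distance in dB between the active speech level and the noise floor."""
    return active_speech_level_dbov - noise_level_dbov

def speech_level_in_range(active_speech_level_dbov, upper=-20.0, lower=-38.0):
    """True if the level lies in the range recommended in this section."""
    return lower <= active_speech_level_dbov <= upper
```

For example, a received level of -20 dBov gives a gain of +6 dB, and a speech level of -26 dBov over a noise floor of -55 dBov gives a basic signal-to-noise ratio of 29 dB.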
Channel Gain
This is a value in dBr, which shows the power level of the received signal relative to the
reference (input) signal. Because SQuad-LQ is applied to the electrical interfaces of the
connection, the terminal-dependent Send Loudness Rating (SLR) and Receive Loudness
Rating (RLR) are modelled in SQuad-LQ itself. In principle, SQuad-LQ is connected to the
so-called 0 dBr point of the network's input. At this 0 dBr point a nominal level of -26 dBov
(corresponds to -20 dBm at a four-wire 600 Ohm interface) is inserted.
The Channel Gain reflects only gains or attenuation caused by the network (exception: attenuating
PSTN subscriber loops). It is close to the so-called JLR (Junction Loudness Rating) but does not
apply any spectral weighting.
In a transparent ISDN connection the Channel Gain should be around 0 dB. In principle, this value
should also be around 0 dB in a Mobile-to-ISDN or Mobile-to-Mobile connection. It may differ
because of individual signal amplification by cellular network providers. Mostly they
amplify the signals, so a gain in the positive range can be observed. If an overall gain of 6 dB is
exceeded, amplitude clipping may occur. As in a real call, this leads to quality impacts and
results in a lower SQuad-LQ score.
Conversely, an attenuating PSTN subscriber loop may lead to negative Channel
Gains because it is part of the evaluated transmission chain. Like a PSTN phone, which is more
sensitive, SQuad-LQ internally amplifies such attenuated signals to a nominal level of -26 dBov
(corresponds to 79 dB sound pressure level at the subscriber's ear).
To inform the user of SQuad-LQ, Channel Gains outside of the expected range are
highlighted within NQDI. The expected range is +6 to -9 dB, with an extended range down to -15 dB.
The Channel Gain is available as a single overall value in dBr (total gain) and also as a series of
values in the time domain (every 16 ms), like an attenuation profile. Based on the values of this attenuation
profile, a chart can be created that provides information on:
AGC (Adaptive Gain Control) elements that are not working correctly
Level jumps (for example after a handover)
Level interruptions (for example interruptions in the audio path or during handovers)
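A gain value and a 16 ms gain profile of this kind can be sketched in Python as below. The sketch assumes that both signals are already time-aligned, computes the gain as a plain power ratio, and uses hypothetical function names and class labels. It is an illustration of the idea, not the SQuad-LQ computation itself; in particular, profile values during speech pauses of the reference are not meaningful here.

```python
import math

SAMPLES_PER_16_MS = 128  # 16 ms at a sampling frequency of 8 kHz

def mean_power(samples):
    return sum(s * s for s in samples) / len(samples)

def gain_db(reference, received, floor=1e-12):
    """Power level of the received signal relative to the reference, in dB."""
    return 10 * math.log10(max(mean_power(received), floor)
                           / max(mean_power(reference), floor))

def gain_profile(reference, received, step=SAMPLES_PER_16_MS):
    """One gain value per 16 ms segment of the time-aligned signals."""
    n = min(len(reference), len(received))
    return [gain_db(reference[i:i + step], received[i:i + step])
            for i in range(0, n - step + 1, step)]

def classify_gain(gain):
    """Ranges from this section: +6 to -9 dB expected, extended down to -15 dB."""
    if -9.0 <= gain <= 6.0:
        return "expected"
    if -15.0 <= gain < -9.0:
        return "extended"
    return "outside"
```

A level jump after a handover would appear in such a profile as a step between two plateaus, while a badly working AGC would appear as a slow drift.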
Clipping
Temporal Speech Clipping (also called front-end clipping) is the loss of speech frames. It may
occur when voice activity detection is used, when Digital Circuit Multiplication Equipment
(DCME) is used, or during uncontrolled slips. Time clipping is presented as clipped frames as a
function of time.
Clipping is an annoying phenomenon that cuts off a bit of speech in the instant it takes for the
transmitter to detect the presence of speech. It is almost impossible to eliminate clipping in a
traditional circuit-switched voice conversation. With circuit switching, the transmitter is not turned
on until sound is detected, and by then a piece of the speech has been clipped off. SQuad
detects this clipping and reports the results as a distribution over time. The resolution of the
clipping measurement is 8 milliseconds. First, the mean energy per 8 milliseconds is calculated.
The energy values are then saved for each frame (both reference and coded). After the whole
speech sample has been processed, the time clipping data is post-processed. A few
simple rules apply during this post-processing:
Time clipping can only occur during pause-to-speech transitions.
A minimum pause length must be reached. In our case, it is 64 milliseconds.
The difference Energy(ref) - Energy(cod) must be at least 10 dB.
Clipped frames are consecutive frames.
The clipping measurement values are indicated as an average % value per sample (number of
clipped speech frames / number of active speech frames) and as a time domain distribution. Time
clipping in SQuad-LQ is calculated every 8 ms, but only the average value of two consecutive
frames is reported in the output file.
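The post-processing rules can be sketched in Python as follows. The sketch works on per-frame energies in dB (one value per 8 ms). How SQuad decides whether a reference frame is speech or pause is not described here, so the parameter speech_threshold_db is an assumption, as are all function names.

```python
import math

def frame_energies_db(samples, frame_len=64, floor=1e-10):
    """Mean energy in dB per 8 ms frame (64 samples at 8 kHz)."""
    return [10 * math.log10(max(sum(s * s for s in samples[i:i + frame_len]) / frame_len, floor))
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def clipped_frames(ref_db, cod_db, speech_threshold_db,
                   min_pause_frames=8, min_diff_db=10.0):
    """Flag frames that are clipped at a pause-to-speech transition.

    min_pause_frames=8 corresponds to the 64 ms minimum pause length.
    speech_threshold_db is a hypothetical speech/pause decision level.
    """
    clipped = [False] * len(ref_db)
    pause_run = 0
    for i, level in enumerate(ref_db):
        if level < speech_threshold_db:
            pause_run += 1
            continue
        if pause_run >= min_pause_frames:
            # Speech starts after a long enough pause: flag the run of
            # consecutive frames in which the coded signal is at least 10 dB lower.
            j = i
            while (j < len(ref_db) and ref_db[j] >= speech_threshold_db
                   and ref_db[j] - cod_db[j] >= min_diff_db):
                clipped[j] = True
                j += 1
        pause_run = 0
    return clipped

def clipping_percent(ref_db, clipped, speech_threshold_db):
    """Clipped speech frames as a percentage of active speech frames."""
    active = sum(1 for level in ref_db if level >= speech_threshold_db)
    return 100.0 * sum(clipped) / active if active else 0.0
```

In this sketch a sample that begins with speech cannot be flagged, because no preceding pause has been observed.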
DC-Offset
This number shows the DC offset of the coded signal as a percentage. This is an important piece of
information if the measured speech quality is lower than expected. Various interface problems
(impedance, coding technique, hardware) can produce DC offset discrepancies.
The DC offset is calculated as
DC_Offset = 100 * average_audio_voltage / Max_audio_voltage
Max_audio_voltage for 16 bit digital resolution is equal to 2^15 (32768).
For example, average_audio_voltage = 300 results in DC_Offset = 100 * 300 / 32768 = approx. 0.92%.
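The same calculation as a minimal Python sketch, with a function name chosen for illustration:

```python
FULL_SCALE_16BIT = 2 ** 15  # 32768, Max_audio_voltage for 16 bit resolution

def dc_offset_percent(samples, full_scale=FULL_SCALE_16BIT):
    """DC offset of a signal as a percentage of full scale."""
    average = sum(samples) / len(samples)
    return 100.0 * average / full_scale
```

For a signal whose average value is 300, the function returns about 0.916.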

Frequency-Shift
A low bit rate encoder can move the formants (spectral peaks) of the speech. This degradation
can be described as a frequency shift of one or more components of the source signal. This
drift is measured as the percentage of moved frequency components in the active speech phases.
The result is the number of positively and negatively shifted frames in %, evaluated on a compressed
frequency scale (Bark). Figure 2-4 shows a typical situation for one processing buffer of the voice signal
(32 ms).
Figure 2-4 Frequency Shift
For the detection of the frequency shift, the peaks above the loudness threshold in both the
reference and the degraded signal are analyzed. The threshold for compressed loudness is set to
10. The position of each peak in the reference is compared with the position of the peak in the
coded signal (within +/- 1 Bark). A frequency shift is found if the locations of the two peaks are not
the same. The amplitudes of the coded and reference loudness need not be equal, but both must be above the
threshold value. This is permissible because the level and frequency alignment is done beforehand in
a separate module.
Typical Network Elements that are responsible for frequency shift are:
Very low bit rate vocoders
Speech enhancer (Noise suppressors)
Non linear filter elements
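The peak comparison can be illustrated with a rough Python sketch for one frame. It assumes that the compressed loudness of each frame is available as a list with one value per band and that neighbouring bands are 1 Bark apart; both the data layout and the function names are assumptions, and the real resolution on the Bark scale may be finer.

```python
LOUDNESS_THRESHOLD = 10.0  # threshold for compressed loudness from this section

def peaks(loudness, threshold=LOUDNESS_THRESHOLD):
    """Indices of local maxima above the loudness threshold."""
    return [i for i in range(1, len(loudness) - 1)
            if loudness[i] > threshold
            and loudness[i] > loudness[i - 1]
            and loudness[i] >= loudness[i + 1]]

def frame_shift(ref_loudness, cod_loudness, max_offset=1):
    """Return +1 (positive shift), -1 (negative shift) or 0 (no shift) for a frame.

    max_offset is the search window in bands (assumed to equal +/- 1 Bark).
    """
    cod_peaks = peaks(cod_loudness)
    for p in peaks(ref_loudness):
        candidates = [q for q in cod_peaks if abs(q - p) <= max_offset]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda q: abs(q - p))
        if nearest != p:
            return 1 if nearest > p else -1
    return 0

def shifted_frames_percent(shifts):
    """Percentage of positively and negatively shifted frames."""
    total = len(shifts)
    pos = sum(1 for s in shifts if s > 0)
    neg = sum(1 for s in shifts if s < 0)
    return 100.0 * pos / total, 100.0 * neg / total
```

Only peak positions are compared; peak heights are ignored as long as they exceed the threshold.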
Delay Spread (Voice Jitter)

The first stage in SQuad-LQ is the time alignment. This stage is able to deal with variable
delays, which can occur in packet networks and are usually indicated by high jitter/delay or packet loss
values. It collects information about shifted frames by comparison with the reference speech sample.
The result of this alignment is a delay distribution of the coded signal. A histogram is
presented, which shows the number of speech frames as a function of arrival time (delay) in
milliseconds. The results are generated for each 32 ms frame.
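Given one delay value per 32 ms frame from the time alignment, such a histogram can be built with a few lines of Python. The input format and the function name are assumptions for illustration.

```python
from collections import Counter

def delay_histogram(frame_delays_ms):
    """Number of speech frames per observed delay value, sorted by delay."""
    return dict(sorted(Counter(frame_delays_ms).items()))
```

A connection with constant delay produces a single bar, while voice jitter spreads the frames over several delay values.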
Speech Threshold
This is a value in dBov, which shows the level of the speech in a coded signal. The measurement
is based on building r.m.s. histograms for both the coded and the reference signal. dBov means
decibels relative to the digital overload point. The range for this value is -90 to 0 dBov. For signals
containing background noise, this value is between -55 and -40 dBov.

A histogram counts how often the values of a data set fall into each of a set of data bins. The result is the number of
occurrences of a value in the data set. A histogram table presents the energy-grade boundaries and
the number of scores between the lowest bound and the current bound.
[Chart: Energy Histogram for noisy signal. x-axis: RMS of the coded Signal (dB), from -15.0 to -53.0; y-axis: Count, 0 to 25; annotations: Noise position, Speech level, Bound position]
Figure 2-5 Histogram of Noised Speech Sample
Figure 2-5 shows an example of a histogram for a noisy speech signal of 10 seconds duration.
Internally, SQuad-LQ builds the histogram with 50 bins between the minimum and
maximum r.m.s. values. There are two maxima, one for the speech pauses and one for the speech active
intervals. In our example, the first maximum is found at -45.1 dB, which is the level of the silent intervals.
The second peak is at about -26 dB, which is the speech active level. The speech threshold measured in
SQuad-LQ is defined as the boundary between these two peaks (Bound position).
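A histogram-based threshold of this kind can be sketched in Python as below. The text does not say how the two maxima are located or where exactly the boundary is placed, so this sketch makes two assumptions: the second maximum is the highest bin at least min_separation bins away from the first, and the boundary is the lowest bin between the two maxima. Function and parameter names are hypothetical.

```python
def histogram(levels_db, bins=50):
    """Histogram with equal-width bins between minimum and maximum level."""
    lo, hi = min(levels_db), max(levels_db)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant input
    counts = [0] * bins
    for value in levels_db:
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    return centers, counts

def speech_threshold_db(levels_db, bins=50, min_separation=5):
    """Boundary between the pause maximum and the speech maximum (assumed rule)."""
    centers, counts = histogram(levels_db, bins)
    first = max(range(bins), key=counts.__getitem__)
    others = [i for i in range(bins) if abs(i - first) >= min_separation]
    second = max(others, key=counts.__getitem__)
    low_peak, high_peak = sorted((first, second))
    valley = min(range(low_peak, high_peak + 1), key=counts.__getitem__)
    return centers[valley]
```

The input is one r.m.s. level in dB per analysis frame of the coded signal; the returned value lies between the level of the silent intervals and the speech active level.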
Degradations
The list below presents some possible reasons for a degradation of the Listening Quality value when
a clean reference sample is used:
AGC (Adaptive Gain Control) Elements
Speech Enhancer / Noise Suppressors
Impulsive Noise
Background Noise
Interruptions
VAD (Voice Activity Detectors)
Variable Delay or Jitter in Packet Networks
AGC Problems
Indications: LQ less than expected. Level Chart indicates an abnormal level trend.
Example of an AGC of a mobile handset that attenuates too strongly toward the end of a sample:

Figure 2-6 Level Chart with AGC
Speech Enhancer / Noise Suppressors

Indications: LQ less than expected. Similarity Chart shows a bigger degradation over the
complete speech signal. This must be checked with the SQuad Noise Suppression Test.
Impulsive Noise
Indications: LQ less than expected. Similarity Chart shows many large degradation peaks.
Figure 2-7 Similarity Chart with Impulsive Noise
Background Noise
Indications: LQ less than expected. Signal Envelope Chart shows some additional energy
during the speech pause.
Example:

Figure 2-8 Background Noise
Interruptions
Indications: LQ less than expected. Similarity Chart shows blue bars and Signal Envelope
indicates a peak drop.
Example of an Interruption due to a Handover (interruption is indicated in blue):
Figure 2-9 Level Chart with Handover
Interruption measurement is based on processing frames of 32 ms duration.

Each frame is divided into 16 sub-frames (2 ms) in order to achieve better resolution. For each
sub-frame, the signal level is calculated for both the reference and the degraded signal. The interruption flag for a sub-frame is set to TRUE if the level in the reference signal is higher than -35 dBov (r.m.s. = 400) and
the level in the degraded signal is lower than -61 dBov (r.m.s. = 20).
The result is the ratio of the number of sub-frames with signal interruption to the total number of sub-frames
(16):
$$\mathrm{Interruption\_result} = \frac{\mathrm{Nr\_Of\_IntFrames}}{16}$$
The Interruption result is in the range 0 to 1 with a step of 1/16. If only one sub-frame is lost (interrupted)
in the signal, then Interruption = 1/16 = 0.0625. When the signal in all sub-frames is lost,
then Interruption = 1.
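The sub-frame rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SQuad implementation: the function names are hypothetical, the levels are compared directly as r.m.s. values (400 and 20, as quoted above) instead of being converted to dBov, and the input is assumed to be one 32 ms frame of 16 bit PCM sampled at 8 kHz (256 samples).

```python
import math

def rms(samples):
    """Root-mean-square level of a list of PCM sample values."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def interruption_result(ref_frame, deg_frame, n_sub=16,
                        ref_rms_min=400.0, deg_rms_max=20.0):
    """Fraction of sub-frames flagged as interrupted in one frame.

    A sub-frame is flagged when the reference is active (r.m.s. above
    ref_rms_min) while the degraded signal is nearly silent (r.m.s. below
    deg_rms_max). Both thresholds are the r.m.s. values quoted in the text.
    """
    sub_len = len(ref_frame) // n_sub
    flagged = 0
    for i in range(n_sub):
        ref_sub = ref_frame[i * sub_len:(i + 1) * sub_len]
        deg_sub = deg_frame[i * sub_len:(i + 1) * sub_len]
        if rms(ref_sub) > ref_rms_min and rms(deg_sub) < deg_rms_max:
            flagged += 1
    return flagged / n_sub

# 32 ms at 8 kHz = 256 samples, i.e. 16 sub-frames of 16 samples (2 ms) each
ref = [1000] * 256
deg_one_lost = [0] * 16 + [1000] * 240   # first sub-frame missing
deg_all_lost = [0] * 256                 # complete frame missing

print(interruption_result(ref, deg_one_lost))  # 0.0625
print(interruption_result(ref, deg_all_lost))  # 1.0
```

The two printed values reproduce the limiting cases named in the text: one lost sub-frame gives 1/16, and a completely lost frame gives 1.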

VAD or Silence Suppression Problems

Indications: LQ less than expected. Average clipping values are high; Clipping Chart shows
some significant clipping.
Example: Clipping at the beginning of the sentence. At the top is shown the signal envelope chart,
at the bottom the clipping chart.
Figure 2-10 Time Clipping
Variable Delay (Voice Jitter)

Indications: LQ less than expected. Variable Delay Chart shows some delay values. This
typically happens if a packet network (or backbone) was used and if jitter buffers
were used.
Figure 2-11 Variable Delay, Voice Jitter
Delays Deviation
DelaysDeviation is placed in the section !SQuad_LQ_AVG (in the SQuad result file) and is
defined as the absolute value of the standard deviation of the block delays (D), divided by the average
of the block delays [in samples]. The duration of one sample at a sampling frequency of 8000 Hz is 125
µs. DelaysDeviation shows the smoothness of an array of delays. A small DelaysDeviation
value means there is a uniform delay distribution, whereas a large value indicates a large delay jitter
as in IP networks. For a single delay, this value is equal to zero.
$$\mathrm{DelaysDeviation} = \left| \frac{\mathrm{stdev}(D)}{\mathrm{average}(D)} \right|$$
Example: The coded file has a fixed offset of 1024 samples to the reference file. Six blocks with
different variable delays are found with SQuad-LQ:
Table 2-1 Example for Variable Delay where five blocks are delayed at different offsets regarding Reference
Speech Sample
Block   Delays (D) in samples
All     1024
        780     -244
        1024     244
        1536     512
        1024    -512
        1304     280
$$\mathrm{Stdev}(D) = \sqrt{\frac{N \sum D^2 - \left(\sum D\right)^2}{N^2}} = 241.53$$
$$\mathrm{average}(D) = \frac{1}{N} \sum D = 1115.3$$
DelaysDeviation = 241.53 / 1115.3 = 0.217
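The worked example can be checked with a short Python sketch. The function name `delays_deviation` is hypothetical. The population standard deviation (`statistics.pstdev`) is used because it reproduces the value 241.53 quoted above; that SQuad uses exactly this estimator is an assumption.

```python
from statistics import mean, pstdev

def delays_deviation(delays):
    """Absolute value of stdev(D) / average(D) for a list of block delays.

    Uses the population standard deviation, which matches the numbers in
    the worked example. A single delay yields zero, as stated in the text.
    The average delay is assumed to be non-zero.
    """
    if len(delays) < 2:
        return 0.0
    return abs(pstdev(delays) / mean(delays))

# Block delays in samples, taken from Table 2-1
block_delays = [1024, 780, 1024, 1536, 1024, 1304]

print(round(pstdev(block_delays), 2))            # 241.53
print(round(mean(block_delays), 1))              # 1115.3
print(round(delays_deviation(block_delays), 3))  # 0.217
```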
Delay Spread is another important parameter, which describes the maximum delay
amplitude calculated over all single group delays. Based on the example above, we can calculate
new delay values (D''), which are the D values scaled by subtracting a fixed delay from all
values.
$$D''_i = D_i - \mathrm{fix\_delay}$$

For example, fix_delay = 1024 samples.
Figure 2-12. Example for Variable Delay, which shows that Block B is delayed for 244 samples to the left
(arrives earlier when compared with the same reference block). Block B arrives later by 244 samples.
Delay Spread is calculated as the distance between the minimum and the maximum block delay. In
our example, the minimum value is -512 samples and the maximum is +512 samples. So the distance
between max and min equals 1024 samples. This is then converted to time in ms.
$$\mathrm{DelaySpread} = \mathrm{Delay\_smp} \cdot \mathrm{smp\_duration}$$
$$\mathrm{smp\_duration} = \frac{1}{F_s}, \qquad F_s = 8000\ \mathrm{Hz}$$
In the calculation for our example, we get DelaySpread = 1024/8000 s = 128 ms.
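A minimal sketch of the Delay Spread conversion, assuming the block offsets of the example (-244, 244, 512, -512 and 280 samples) as input. The function name `delay_spread_ms` is hypothetical and not taken from the SQuad software.

```python
def delay_spread_ms(block_offsets, fs_hz=8000):
    """Distance between the largest and smallest block offset, in ms.

    block_offsets are given in samples; one sample lasts 1 / fs_hz seconds.
    """
    spread_samples = max(block_offsets) - min(block_offsets)
    return spread_samples * 1000.0 / fs_hz

# Block offsets in samples from the example above
offsets = [-244, 244, 512, -512, 280]
print(delay_spread_ms(offsets))  # 128.0
```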

Frequency Shifts
The distribution of the frequency shifts is shown in the histogram below, with the number of
frames in which a shift at a certain frequency occurred. The diagram covers the whole range of
frequencies in steps of 31.25 Hz.
Figure 2-13 Frequency Shift
Quality Code
The thresholds for each degradation descriptor are as follows:
MOS-drops
Quality distribution is unsteady, such as during handovers or interruptions.
Received signal level out of recommended range
The level difference to the reference level exceeds +9 dB or falls below -12 dB.
Signal interruptions
Temporal clipping for more than 8 ms.
High DC-Offset
Malfunction of terminal or interface card. DC-Offset > 0.2%.
Variable delay
Indicates possible packet-switched transmission.
Variable delay during speech
Same as Variable delay, but occurring during speech-active intervals.
Background noise
High level of circuit noise. Higher than -50 dBov.
Impulse noise
Relay/switching problems detected. More than 1 pulse / second.
Low bitrate coding / coding artefacts
A low bit rate coding scheme has been used (e.g. less than 8 kbit/s) or residual errors from
decoding are introduced (e.g. by frame loss concealment).
Not Specified
Signals that the speech quality is degraded but no outstanding reason for that degradation
could be classified.
OK
Shows that the speech quality is nearly non-degraded.
Furthermore, special problems in the audio-path will be reported:
Silence/Audio Level Too Low
There is no signal activity in the audio path or the signal level is below -45 dBov. SQuad-LQ will
not be calculated, since it would lead to misleading results.
Corrupted Signal/Wrong Reference
Here the received audio signal is heavily corrupted (e.g. only partly transmitted, or the audio
stream was lost completely). Such behaviour can be observed e.g. during a call drop. Normally,
SQuad will score those signals close to 1.0. For statistical reasons, NQDI allows the
exclusion of such results from the reporting.
This indicator will also signal if a wrong reference signal was used for SQuad.
Option: P.862 'PESQ'

Optionally, the SQuad-LQ framework can also include ITU-T P.862 'PESQ', an additional model
for objective speech quality prediction. The principal function of P.862 as a psycho-acoustically
driven comparison method is very close to SQuad-LQ, so the received signals can
also be evaluated by P.862.
The ITU-T Recommendation P.862 was finalized in 1999 and approved in February 2000. It was
trained on a large number of databases, mainly from codec standardization activities in ITU-T.
If the P.862 option is used in SQuad-LQ, the SQuad framework will report two additional quality
results:
P.862 score: raw outcome of the ITU-T algorithm
Listening Quality (P.862.1): transformed result according to P.862.1 into a MOS scale from 1 to 5
Figure 2-14 P.862 result representation
Note that both results are based on the same algorithm. The transformation according to P.862.1
describes only a scale mapping.
[Figure: P.862.1 score (y-axis, 1.0 to 5.0) plotted against P.862 score (x-axis, -0.5 to 5.0). Scale limit = 4.5.]
Figure 2-15 P.862 vs. P.862.1 scale transformation
Note:
The P.862 results tend to be a bit lower than those of SQuad-LQ, especially in the range from 3.0 to 4.0. This is mainly caused by a high
sensitivity of P.862 to clipping and time-variant filtering.
It must also be taken into account that P.862 does not rate any linear distortions such as frequency
responses. Those linear distortions are compensated completely by P.862 itself before the
quality prediction starts.

The P.862 option has to be enabled by the test-type 'Speech-P.862' and requires a special
software key.

SQuad Noise Suppression
Introduction
The noise suppression is a feature designed to enhance speech quality in a range of
environments where there is significant (acoustic) background noise. The noise suppression
function is a pre-processing module that is used to improve the signal to noise ratio of a speech
signal prior to voice coding.
For noise suppressors, there are certain requirements that need to be fulfilled:
The noise suppression function must not have a statistically significant distorting effect on
clean speech in comparison with the performance of the speech codec without noise
suppression applied.
The noise suppression function must not introduce any degradation of speech and no
undesirable effects in the residual noise when there is (acoustic) background noise in the
speech signal.
DTMF and other signalling tones transmission performance during the application of noise
suppression shall be no worse than when noise suppression is turned off.
The above requirements are all checked with the SQuad Noise Suppression test.
The algorithm measures the Noise Power Level Reduction (NPLR) and the Signal-to-Noise Ratio
Improvement (SNRI), similar to the definitions in the ETSI STC SMG11 (GSM 06.77) document. A
comparison of the SNRI and NPLR measures is used to obtain an indication of possible
speech distortion produced by the tested NS method.
For the Noise Suppression test, two reference signals are used:
Clean speech reference
Clean speech with background noise
The sample with background noise is sent as a test sample.
Listening Quality
Speech quality is measured according to ITU-T P.800, where the coded file and the clean
reference are the inputs for the SQuad-LQ algorithm. The algorithm is described in Section 2.
The Listening Quality evaluation runs twice: the LQ of the noisy input signal is estimated,
and the LQ of the de-noised output signal as well. The change in speech quality is derived
from both results.

Figure 3-1 The Principle of MOS Calculation in Squad-NS.
First, the internal reference (MOS_ref) is calculated. The degraded signal is assessed by comparing it
with the clean reference. The result is presented on the CCR scale by subtracting MOS_ref from the
measured MOS.
Speech quality measurement in a noisy environment is done by sending the noisy reference
through the network under test. The noisy reference is made by adding a noise signal to the clean
reference. Comparing the clean reference with the coded signal would not produce stable results,
since the SNR of the noisy reference would impact the results. To make this measurement
independent of the reference properties, MOS_ref is calculated first. MOS_ref defines the
reference speech quality that would be measured in the degraded signal if there were no
degradations or improvements in the network. This value is mostly lower than 4.5 (excellent
quality) because of the noise influence. The range of MOS generated by SQuad-NS is -3.5 to +3.5,
which is slightly different from the ITU definition.
Comparison Category Rating (CCR)
The range of the Comparison Category Rating (CCR) scale as defined in ITU-T P.800:
 3: Much Better
 2: Better
 1: Slightly Better
 0: About the Same
-1: Slightly Worse
-2: Worse
-3: Much Worse
The CCR methods are particularly useful for assessing the performance of telecommunications
systems when the input has been corrupted by background noise. An advantage of the CCR
method over the other scales is the possibility to assess speech processing that either degrades
or improves the quality of the speech.

NS-Speech Power Classes

SNRI and SPLR are calculated once as overall average values and once per speech power class.
There are 6 different power classes:
Definition: ETSI
Performance objective: Speech level = -26 dBov, determined according to ITU-T P.56
Table 3-1 Energy Windows used in Calculation of SPLR and PLR
Description                 Range                      Level class
high power frames           > speech level - 1 dB
medium power frames         > speech level - 10 dB
low power frames
noise only frames           < speech level - 19 dB     noise
pause frames                                           -1
not used for calculation                               -2 (unused)
Figure 3-2 The Five Energy Windows for 16 bit Digital System (90.3 dB dynamics)

Example of the Signal Level Group Chart:
Figure 3-3 Speech Power Class Chart
SNRI, Signal-to-Noise Ratio Improvement

Definition: ETSI
Performance objective: 6 dB or higher
Formula:
$$SNRI_x = 10 \cdot \left( \log(SNR_{cod\_x}) - \log(SNR_{ref\_x}) \right)$$ per speech power class x,
$$SNRI = \frac{1}{N_h + N_m + N_l} \cdot \left( N_h \cdot SNRI_h + N_m \cdot SNRI_m + N_l \cdot SNRI_l \right)$$
$$SNR_{y\_x} = cod / nse$$
Range   Description                                        Quality
0       SNR from coded and reference signal are equal      no improvement
<0      SNRref > SNRcod, lower SNR in coded signal         worse
>0      SNRref < SNRcod, higher SNR in coded signal        better
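The SNRI formulas can be illustrated with a small Python sketch. It assumes that `log` denotes the base-10 logarithm and that the SNR values are linear power ratios. The function names and the example numbers are hypothetical and not taken from the SQuad software.

```python
import math

def snri_class(snr_cod, snr_ref):
    """SNRI of one speech power class in dB, from linear SNR values."""
    return 10.0 * (math.log10(snr_cod) - math.log10(snr_ref))

def snri_overall(snri_h, snri_m, snri_l, n_h, n_m, n_l):
    """Frame-count weighted average over the high, medium and low classes."""
    total = n_h + n_m + n_l
    return (n_h * snri_h + n_m * snri_m + n_l * snri_l) / total

# Illustrative values only: SNR per class as linear power ratios
snri_h = snri_class(snr_cod=100.0, snr_ref=10.0)   # 10 dB improvement
snri_m = snri_class(snr_cod=10.0, snr_ref=10.0)    # 0 dB, no improvement
snri_l = snri_class(snr_cod=1.0, snr_ref=10.0)     # -10 dB, worse

print(snri_h, snri_m, snri_l)
print(snri_overall(snri_h, snri_m, snri_l, n_h=50, n_m=30, n_l=20))
```

A positive result corresponds to the "better" row of the table above, a negative result to the "worse" row.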
NPLR, Noise Power Level Reduction

Definition: ETSI
Performance objective: -7 dB or lower
Formula:
$$NPLR = 10 \cdot \left( \log(PL_{cod\_nse}) - \log(PL_{ref\_nse}) \right)$$
Range   Description                                              Quality
0       Noise levels from coded and reference signal are equal   no noise reduction
<0      PLref > PLcod, lower noise level in coded signal         good
>0      PLref < PLcod, higher noise level in coded signal        bad
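As a sketch of the NPLR formula, again assuming a base-10 logarithm and linear noise power values as input. The names `nplr_db` and `meets_nplr_objective` are hypothetical helpers for illustration.

```python
import math

def nplr_db(pl_cod_nse, pl_ref_nse):
    """Noise Power Level Reduction in dB from linear noise power levels."""
    return 10.0 * (math.log10(pl_cod_nse) - math.log10(pl_ref_nse))

def meets_nplr_objective(nplr, objective_db=-7.0):
    """True if the noise reduction reaches the quoted objective of -7 dB or lower."""
    return nplr <= objective_db

# Illustrative values: the noise power in the coded signal is one tenth
# of the noise power in the reference signal
nplr = nplr_db(pl_cod_nse=1.0, pl_ref_nse=10.0)
print(nplr)                        # -10.0
print(meets_nplr_objective(nplr))  # True
```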

SPLR, Signal Power Level Reduction

Both SNRI and NPLR are defined in the ETSI document TS 101 512, V8.0.0. Signal Power Level
Reduction (SPLR) is a SwissQual improvement of the NPLR measurement, where NPLR is a
subset of SPLR. SPLR is the difference between coded and reference energy, calculated
separately for each energy window.
Note:
Noise reduction should reduce only the noise parts of a signal.
The definition of the windows is given in Table 3-1. The aim of this measurement is to detect the
influence of noise reduction circuits on the speech parts of the signal.
Five SPLR values are calculated: SPLRh, SPLRm, SPLRl, SPLRn and SPLRp. SPLRn is
equal to the NPLR value. Good noise reduction would generate SPLRh close to zero and
SPLRp below -10 dB. The downward trend curve through these five values shows the quality and
ability of the noise reduction circuit to reduce only noisy frames and to keep the speech-active
frames unchanged. In other words, the first coefficient (a) of the trend curve y=ax+b must be negative
(see example in Figure 16). The SPLR measure in the SQuad-NS algorithm is equal to this coefficient
(a) of the trend curve.
Figure 3-4 SPLR Calculation out of Five Values Calculated in Five Different Energy Windows.
The bottom picture shows good noise reduction, whereas on the right is shown poor noise
reduction.
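The trend-curve coefficient can be sketched as an ordinary least-squares line fit through the five SPLR values. The x positions 0 to 4 for the classes high, medium, low, noise and pause, as well as the use of a least-squares fit, are assumptions made for this illustration; the text only states that a trend curve y=ax+b is used.

```python
def splr_trend_slope(splr_values):
    """Slope a of the least-squares line y = a*x + b through the SPLR values.

    splr_values are ordered from the high power class to the pause class and
    are placed at x = 0, 1, 2, ... (an assumption made for this sketch).
    """
    n = len(splr_values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(splr_values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, splr_values))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

# Illustrative SPLR values in dB for the classes h, m, l, n, p
good_nr = [0.0, -2.0, -5.0, -8.0, -12.0]   # speech untouched, pauses reduced
poor_nr = [-6.0, -6.0, -6.0, -6.0, -6.0]   # everything attenuated equally

print(splr_trend_slope(good_nr))  # negative slope
print(splr_trend_slope(poor_nr))  # zero slope
```

A clearly negative slope matches the behaviour described above: the speech-active classes stay almost unchanged while the noise and pause classes are reduced.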

SPLR is then mapped to a new range of 1 to 4.5 (like the MOS scale). This mapping from SPLR to
SPLRm is shown in Figure 16. SPLRm > 2.5 should be achieved for good noise reduction.
Figure 3-5 Speech Power Class Chart
Overall NS Quality
Figure 3-6 Overall NS Quality

Quality Index
Figure 3-7 Calculation of Quality Index
The Quality Index is calculated from four input parameters: NPLR, SNRI, SPLR
and MOS_acr.
The Quality Index was first introduced in SW Release 2.2. Four values, SMOS, SNRI, NPLR and
SPLR, are combined into one objective number. SMOS is measured with SQuad-LQ, where the
clean reference is compared with the coded signal. The range of the Quality Index is 1 to 4.5 (as for
MOS). A rating of 1 stands for bad quality and 4.5 for excellent quality. The following equation
shows the calculation of the Quality Index based on the four input parameters, previously scaled into the range
1 to 4.5:
$$Q_{idx} = \alpha_{NPLR} \cdot NPLR_m + \alpha_{SNRI} \cdot SNRI_m + \alpha_{SPLR} \cdot SPLR_m + \alpha_{MOS} \cdot MOS, \qquad \sum \alpha = 1$$
Note:
The Quality Index describes the performance of the noise
reduction system in combination with the network and not the Listening
Quality of the de-noised signal.
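The weighted combination behind the Quality Index can be sketched as follows. The actual weights used by SQuad are not given in this manual, so the equal weights below are purely hypothetical placeholders, as is the function name. The only property taken from the text is that the inputs are pre-scaled to the range 1 to 4.5 and that the weights sum to 1.

```python
def quality_index(nplr_m, snri_m, splr_m, mos,
                  weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of four inputs that are already scaled to 1 .. 4.5.

    The default equal weights are an assumption for illustration only.
    Because the weights sum to 1, the result stays within 1 .. 4.5.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    inputs = (nplr_m, snri_m, splr_m, mos)
    for value in inputs:
        if not 1.0 <= value <= 4.5:
            raise ValueError("inputs must be scaled to the range 1 to 4.5")
    return sum(w * v for w, v in zip(weights, inputs))

# Illustrative, pre-scaled input values
print(quality_index(nplr_m=4.0, snri_m=3.5, splr_m=3.0, mos=2.5))  # 3.25
```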
The following table shows some measurement examples for different network conditions including
noise reduction effects:

Figure 3-8 Some Experimental Results for Different Configuration of NR in the Network
Figure 3-9 MOS vs. MOSobj & Quality Index
The Quality Index correlates much better with MOS_CCR than MOSobj based only on speech
quality evaluation.
Convergence Time
For the measurement of the Convergence Time in a noisy signal, the algorithm examines the first
two seconds of the given signal. For the calculations it uses the filtered difference between the
coded and the reference signal (red color, see Figure 3-6).
Figure 3-10 Example of Convergence Time Evaluation
First it checks whether the signal belongs to the noise or pause group, and then it compares the data
with the set threshold. The threshold is calculated as NPLR plus 25 percent (default, use PERCENT to
change) of the difference between the maximum value of the filtered signal in these first 2
seconds and the noise level afterwards (NPLR). If the filtered data is lower than the threshold, the first
condition for convergence is fulfilled (see Figure 3-7).
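The threshold rule can be sketched as shown below. The function names are hypothetical, the filtered difference is assumed to be available as a list of dB values, and the preceding check for the noise or pause group is not modelled here.

```python
def convergence_threshold(filtered, nplr_db, percent=25.0):
    """Threshold = NPLR + percent % of (maximum of filtered signal - NPLR)."""
    peak = max(filtered)
    return nplr_db + (percent / 100.0) * (peak - nplr_db)

def first_index_below(filtered, threshold):
    """Index of the first value below the threshold, or None if there is none."""
    for index, value in enumerate(filtered):
        if value < threshold:
            return index
    return None

# Illustrative filtered difference in dB that settles at the noise level (NPLR)
filtered = [0.0, -4.0, -8.0, -12.0, -16.0, -20.0, -20.0]
threshold = convergence_threshold(filtered, nplr_db=-20.0)

print(threshold)                               # -15.0
print(first_index_below(filtered, threshold))  # 4
```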

[Figure: Filtered difference (y-axis, 0 to -25) over sample index 11 to 191, with the Threshold level and the Convergence point marked.]
Figure 3-11 Filtered Difference Envelope is Compared with the Threshold Value
The second condition is that the signal has a falling tendency. To verify this, we check 5 (default,
use CT_NR_POINTS to change) equally spaced points over the tested convergence time. For
a falling signal, the difference in values between every two consecutive points has to be less
than zero. In Fig. 21, we see that the difference between the signal values at the third and fourth points is
greater than 0, which signifies a rising tendency of the signal. In this case we perform an additional check to
clarify what is actually going on.
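A sketch of the falling-tendency check, assuming the filtered difference is available as a list and the tested convergence time ends at a known index. The function name is hypothetical, and the way the equally spaced points are picked (rounded indices between 0 and the end index) is an assumption.

```python
def has_falling_tendency(filtered, end_index, n_points=5):
    """Check n_points equally spaced values in filtered[0 .. end_index].

    Returns True only if every consecutive difference is less than zero.
    end_index should be at least n_points - 1 so that the points are distinct.
    """
    step = end_index / (n_points - 1)
    indices = [round(k * step) for k in range(n_points)]
    points = [filtered[i] for i in indices]
    return all(later - earlier < 0
               for earlier, later in zip(points, points[1:]))

# Illustrative envelopes in dB
steadily_falling = [0, -1, -2, -3, -4, -5, -6, -7, -8]
with_rise = [0, -2, -4, -6, -5, -3, -4, -8, -10]  # rises between points 3 and 4

print(has_falling_tendency(steadily_falling, end_index=8))  # True
print(has_falling_tendency(with_rise, end_index=8))         # False
```

In the second example the value at the fourth point is higher than at the third, which corresponds to the case in the text where an additional check is needed.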
[Figure: Difference envelope (y-axis, -40 to 30) over sample index 15 to 190, with the Threshold level marked.]
Figure 3-12 Five Point Analysis of the Difference Envelope during Decision on Noise Reduction State
This test is based on the average level of the signal before and after the first convergence
criterion is met. If the average level of the signal after falling below the threshold is less than that
threshold, and the average level of the signal before that point is higher than the same threshold,
we say that the signal has converged. If not, the algorithm continues searching for convergence
until the end of the 2-second buffer.

[Figure: Difference envelope (y-axis, -40 to 30) over sample index 11 to 191, with the Threshold, the mean signal level before the threshold and the mean signal level after the threshold marked.]
Figure 3-13 Additional Condition before a Final Decision is Calculated.
The levels before and after place of discontinuity are calculated.
Noise Reduction/Suppression Test

This measurement gives general information about the type of noise treatment applied in the
communication channel. Based on measurement of the differences between signal power level
reductions in the speech power classes, the algorithm decides whether noise suppression or noise
reduction was applied. Experimental results have shown that it is necessary to define one more
situation: if the signal power level reduction in the pause class is higher than the SPLR in the high
class plus an offset of 3 dB, we say that the noise in the communication channel was not treated in either
way. Reference suitability is a fourth possible result of this measurement.
Figure 3-14 Typical Reference Signal with White Noise Added
Noise suppression and noise reduction are both used to enhance speech quality in a
range of environments where there is significant (audible) background noise (see Fig. 24).
Noise suppression reduces the noise in the pause and noise power classes and has very little or no
influence on the higher power classes (speech-active intervals). In contrast, noise
reduction reduces the noise equally in all power classes. Therefore we have based our algorithm
on measurement of the difference between the signal power level reduction in the high and medium
speech power classes. If the difference between the two levels is less than a calculated threshold, we
say that noise suppression was applied.

Figure 3-15 Noise Suppression Applied on Signal
Figure 3-16 Noise Reduction Applied on Signal
The level of noise in the reference signal plays an important role. If the signal-to-noise ratio of the
reference signal is higher than 30 dB, we say that the reference signal is not suitable for conducting the
measurement, because the noise level is too low. The same reference SNR is used for calculating the
threshold offset between the SPLRs of the high and medium power classes.
Examples
The Signal Envelope shows that the noise is really reduced and the speech part is more or less
the same as for the reference signal.
Figure 3-17 NS Signal Envelope
Evaluation of the transmitted signal

In addition to the ratings of the noise reduction systems described above, the results of the listening
quality evaluation of the transmitted and de-noised signal are given in a separate section.
Here SQuad-LQ is applied to the transmitted signal.

Figure 3-18 Results presentation within NQDI
Besides the pure LQ, the speech and noise levels are also shown. Furthermore, the clipping
value can be used for evaluating non-linear processing by the NS device.
References

ETSI TS 101 512, V8.0.0, (GSM 06.77 version 8.0.0 Release 1999), Digital cellular
telecommunication system (Phase 2+), Minimum Performance Requirements for Noise
Suppresser, Application to the AMR Speech Encoder
ETSI TS 101 745, V8.0.0, (GSM 02.76 version 8.0.0 Release 1999), Noise Suppression for the
AMR Codec, Service Description, Stage 1
ETSI TS 101 831, V8.0.0, (GSM 06.78 version 8.0.0 Release 1999), Digital cellular
telecommunication system (Phase 2+), Results of the AMR Noise Suppression Selection Phase,
Application to the AMR Speech Encoder

DTMF Tests
Introduction
In telecommunications today, the most used signalling system is DTMF signalling. DTMF stands
for Dual Tone Multi-Frequency. As the name suggests, the DTMF signal consists of two
superimposed sinusoidal waveforms with frequencies chosen from a set of eight standardized
frequencies.
When a DTMF signal is sent over a network it can be degraded, especially when it is encoded.
For an operator of a network, it is of interest to know if the receiver of the DTMF signals can
convert the DTMF signal back into a digit or a symbol. The objective is to measure the percentage
of detected and undetected DTMF digits.
In the first part of SwissQual's algorithm for the DTMF test, the algorithm scans through a given signal
and detects the locations of DTMF signals. Once a DTMF signal is found, the algorithm calculates
its characteristics and decides whether the signal is valid. If the tone is invalid, the DTMF-Test
reports which condition was not met. The algorithm collects all characteristics
and saves them in a file.
The DTMF signal used for tests consists of two frequencies. According to the CCITT
Recommendations Q.23 [5] and Q.24, there are two frequency groups, each with four frequencies:
Low group: 697 Hz, 770 Hz, 852 Hz, 941 Hz
High group: 1209 Hz, 1336 Hz, 1477 Hz, 1633 Hz
The figure below shows how the frequencies are allocated to the various digits and symbols of a
push-button set. Every digit and symbol consists of one frequency from the low group and one from the high group.
Figure 4-1 Allocation of Frequencies to the Various Digits and Symbols of a Push-button Set
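The frequency allocation can be written down as a small lookup table. The sketch below uses the standard Q.23 frequencies and the usual 4x4 push-button layout. The names `DTMF_FREQS` and `dtmf_tone` and the tone parameters (70 ms, 8 kHz) are chosen only for this illustration and are not part of the SwissQual software.

```python
import math

LOW_GROUP_HZ = (697, 770, 852, 941)        # one frequency per keypad row
HIGH_GROUP_HZ = (1209, 1336, 1477, 1633)   # one frequency per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")

# Map every key to its (low, high) frequency pair
DTMF_FREQS = {
    key: (LOW_GROUP_HZ[row], HIGH_GROUP_HZ[col])
    for row, keys in enumerate(KEYPAD)
    for col, key in enumerate(keys)
}

def dtmf_tone(key, duration_ms=70, fs_hz=8000, amplitude=0.5):
    """Samples of the two superimposed sine waves for one key."""
    f_low, f_high = DTMF_FREQS[key]
    n_samples = fs_hz * duration_ms // 1000
    return [
        amplitude * 0.5 * (math.sin(2 * math.pi * f_low * n / fs_hz)
                           + math.sin(2 * math.pi * f_high * n / fs_hz))
        for n in range(n_samples)
    ]

print(DTMF_FREQS["5"])      # (770, 1336)
print(len(dtmf_tone("5")))  # 560
```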
DTMF-Test Overview
One or more DTMF signals are sent over a network. The coded signal is used for the DTMF-Test. This signal is available in digital form; the data format is PCM (without compression). The
sampling frequency is 8 kHz or 16 kHz. The digital quantization of the signal can be 8 bit
(unsigned or signed) or 16 bit (big or little endian). Internally, the algorithm works with 16 bit
resolution. The figure below illustrates the basic algorithm of the DTMF-Test. The DTMF-Test saves its
result in a comma-delimited text file.
Figure 4-2 Block Diagram of DTMF-Test
Criteria
The objective of the SwissQual model for DTMF testing is to measure the percentage of
undetected DTMF digits processed through the network. The DTMF signals are generated at the
frequencies specified in ITU-T Rec. Q.23.
The algorithm follows the ETSI guidelines defined in "TS 101 235-1" "Technical Specification of
Dual Tone Multi-Frequency (DTMF)".
The received DTMF signal shall be detected as valid when:
Only two of the signalling frequencies are present, one from the high group and one from the low
group, fulfilling the conditions as described above
Each of these signalling frequencies is within +/-(1.5 % + 2 Hz) of the nominal value
The level of each of these two signalling frequencies is within the range -27 dBV to -5 dBV
The difference in level of these two signalling frequencies is not more than 6 dB.
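The numeric criteria above can be sketched as a simple check. It assumes that the two dominant frequencies and their levels have already been measured by some earlier analysis step; the condition that only two signalling frequencies are present is therefore not modelled. All names are hypothetical and not part of the DTMF-Test software.

```python
LOW_GROUP_HZ = (697, 770, 852, 941)
HIGH_GROUP_HZ = (1209, 1336, 1477, 1633)

def nearest_nominal(freq_hz, group):
    """Nominal group frequency closest to the measured frequency."""
    return min(group, key=lambda nominal: abs(nominal - freq_hz))

def within_tolerance(freq_hz, nominal_hz):
    """True if the deviation is within +/-(1.5 % + 2 Hz) of the nominal value."""
    return abs(freq_hz - nominal_hz) <= 0.015 * nominal_hz + 2.0

def is_valid_dtmf(f_low_hz, f_high_hz, level_low_dbv, level_high_dbv):
    """Apply the frequency, level and level-difference criteria listed above."""
    freq_ok = (
        within_tolerance(f_low_hz, nearest_nominal(f_low_hz, LOW_GROUP_HZ))
        and within_tolerance(f_high_hz, nearest_nominal(f_high_hz, HIGH_GROUP_HZ))
    )
    level_ok = all(-27.0 <= level <= -5.0
                   for level in (level_low_dbv, level_high_dbv))
    twist_ok = abs(level_high_dbv - level_low_dbv) <= 6.0
    return freq_ok and level_ok and twist_ok

print(is_valid_dtmf(700.0, 1340.0, -10.0, -12.0))  # True
print(is_valid_dtmf(697.0, 1336.0, -10.0, -18.0))  # False, level difference 8 dB
print(is_valid_dtmf(720.0, 1336.0, -10.0, -10.0))  # False, low frequency off by 23 Hz
```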
Results
Table 4-1 DTMF Result Code
Code                 Description
Tone Length          Length of a DTMF Tone
Pause Length         Pause Length between two DTMF tones
Measured Level       Average Level of a DTMF Tone
Level Deviation      Level Deviation of the two frequencies of a DTMF Tone
Freq. Low            Low Frequency value in Hertz
Freq. High           High Frequency value in Hertz
DevFreqLow [Hz]      Deviation of the low frequency from the standard in Hertz
DevFreqHigh [Hz]     Deviation of the high frequency from the standard in Hertz
DevFreqLow [%]       Deviation of the low frequency from the standard in percent
DevFreqHigh [%]      Deviation of the high frequency from the standard in percent
Twist [dB]           Level difference between the high and the low frequency
Signal Valid
Signal valid code:

Cause: Valid (0)
If the received tone matches all conditions, then the signal is valid
and the field 'SignalValid' is set to '0'
Cause: TooShort (-1)
If the received tone is too short (<40ms) then the signal is invalid
and the field 'SignalValid' is set to the code '-1'
Cause: NoDigit (-2)
If the received tone does not contain the two frequencies as
specified, the signal is invalid and the field 'SignalValid' is set to '-2'
Cause: LowLevel (-3)
If the Noise Level is more than 10% of the Signal Level of a tone,
then the signal is invalid and the field 'SignalValid' is set to '-3'
Cause: FreqDeviation (-4)
If one of the two frequencies is out of the specified range (+/- 1.5%),
then the signal is invalid and the field 'SignalValid' is set to '-4'
Cause: LevelDiff (-5)
If the level difference of the two frequencies for one tone is more
than 10dB, then the signal is invalid and the field 'SignalValid' is set
to '-5'
Cause: Unknown (-6)
If there is no tone at all, then the signal is invalid as well and the field
'SignalValid' is set to '-6'
Cause: TooLong (-7)
If the received tone is too long (>90ms), then the signal is invalid
and the field 'SignalValid' is set to the code '-7'
Signal Match
Signal match code:

Cause: NotRegular (A)
If a tone in the coded file is too short, or wrong in any other aspect, it
is not matched with the reference tone and the field 'SignalMatch' is set
to the code 'A'
Cause: AdditionalNotRegular (B)
If there are one or more irregular tones with no reference, the field
'SignalMatch' is set to the code 'B'
Cause: MissingTone (C)
If a reference tone has no match but there are two or more irregular
tones, the field 'SignalMatch' is set to the code 'C'
Cause: MultipleMissingTones (D)
If there are two or more reference tones with no match but three or
more irregular tones, the field 'SignalMatch' is set to the code 'D'
Cause: MultipleMissingTones (E)
If there are more reference tones with no match than irregular tones,
the field 'SignalMatch' is set to the code 'E'
Cause: MultipleTone (F)
If a reference tone has two or more matching tones in the coded file,
the field 'SignalMatch' is set to the code 'F'
Cause: AdditionalTone (G)
If there is no reference for a tone in the coded file, the field
'SignalMatch' is set to the code 'G'
Cause: MissingTone (H)
If for a reference tone there is no tone in the coded file, the field
'SignalMatch' is set to the code 'H'
Cause: Disparity (I)
If the number and order for a string of tones in reference and coded
file cannot be matched, the field 'SignalMatch' is set to the code 'I'

SQuad Advanced Echo Check (Passive Test)
Introduction
The measurement application of the Advanced Echo Check (also called Acoustic Echo Check) can
be applied to a SwissQual measurement probe at the far-end side as well as to any number to
which an automatic hook-up device is connected. SQuad-AEC does not require any artificial test
signals; it is optimized to detect echoes by using human speech as the measuring signal. It therefore
works for all technologies that serve voice communications, which makes the algorithm ready for in-service live monitoring.
The SQuad-AEC measurement detects echoes in the active connection by sending a speech
signal to the far-end side and observing the receiving direction for any reflections. If no signal is
inserted at the far-end side, the procedure measures during that single talk situation only.
Because commonly used non-linear processors such as VADs suppress low-power send signals,
echoes will also not occur in such connections. Therefore the SQuad-AEC algorithm is also
designed to detect echoes during double talk situations. Such a double talk situation may be simulated
by an actively playing answering station or by using a real phone at the far-end side and talking
during the measurement. This double talk at the far-end side will switch the sending path through,
so that the echo can also be transmitted.
The SQuad-AEC Test is especially designed to detect electrical as well as acoustical echoes. It
is able to detect 'dry' and 'hollow' acoustical echoes as well as hybrid echoes and is more robust
against double talk. If the far-end station is connected via a 4-wire connection, the echoes introduced by
the network will be found. By using real (echo-producing) terminals, the echoes
caused by the network AND the terminal can be calculated.
Echo Measurement
The Advanced Echo Check Passive Test (AEC passive) does not simulate anything on the B-Side. The A-Side starts a call, and after the B-Side has answered the call, the collection of the downlink (B->A) audio stream is started. When the recording of the stream has finished, the search for the
echo signal is started by comparing the recorded signal with the reference signal.
On the B-Side, we can use any (self-answering) voice terminal or a SwissQual Diversity
measurement probe. The AEC algorithm is able to detect echo in presence of background noise
and double talk.
The algorithm runs in two steps:
Observing a wide range of echo delay for possible echoes (scan procedure)
Analysing accepted echo regions in detail for calculating the echo loss and the other results
Measurement Results
The AEC algorithm generates the following results:
Signal type
Echo Delay in milliseconds
Echo Loss during Single Talk acc. ITU-T G.122
Echo Loss for the complete signal (incl. Double Talk)
Echo Objection Rate acc. ITU-T G.131 in %
Distance to 1% Echo Objection Rate acc. ITU-T G.131
ECHO status
GSM 03.50 test
Double Talk Ratio
Level of Received Signal in dBov
Signal type can be Echo, SideTone, Double Talk, Silence or combinations of them.
As Sidetone received signal parts will be rated, which are correlated to the send signal and were
received with less than 20ms delay. Double Talk will be signalized if in more than 40% of the
signal duration the receiving signal is exceeding the defined double talk threshold.
A Signal type NoEcho informs that the connection is echo-free or the echo will be not
perceptible. Additionally, a signal type NoEchoFound signalizes that the echo detection was
hardly disturbed be unexpected distortions in the signal. It might be echo-free but also can include
a masked echo.
An additional parameter shows the Double Talk Ratio in %. The Double Talk Ratio shows in
which ratio the found echo is superset by Double Talk. Only signal parts are defined as
doubletalk, where the Near-End sending signal exceeds -36dBov (approx. -30dBm) and the FarEnd signal is -48dBov (approx. -42dBm) in minimum. In case of strong echoes this double talk
threshold will be increased to minimize the classification of strong echoes as double talk. 1
Echo Delay in milliseconds is the time offset between the reference signal and the returned
echo signal. The values range from 0 to 1000 ms. If no echo was detected, the Echo Delay is
0.0 ms.
Echo Loss is the weighted echo signal level measured relative to the reference signal level. The
calculation is in accordance with ITU-T G.122. If no echo was detected, the Echo Loss is
reported as 99 dB, which in effect means no echo. Under the assumption of SLR + RLR = 10 dB,
the TELR can be estimated from the Echo Loss result as TELR = Echo Loss + 10 dB. SLR stands for
Send Loudness Rating and RLR for Receive Loudness Rating.
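The TELR estimate is a single addition. A minimal sketch, with hypothetical function names, that also treats the reported 99 dB value as the "no echo detected" marker:

```python
NO_ECHO_LOSS_DB = 99.0          # value reported when no echo was detected
DEFAULT_SLR_PLUS_RLR_DB = 10.0  # assumption stated in the text


def echo_detected(echo_loss_db: float) -> bool:
    """False if the Echo Loss carries the 'no echo' value of 99 dB."""
    return echo_loss_db < NO_ECHO_LOSS_DB


def estimate_telr_db(echo_loss_db: float,
                     slr_plus_rlr_db: float = DEFAULT_SLR_PLUS_RLR_DB) -> float:
    """Estimate TELR = Echo Loss + (SLR + RLR)."""
    return echo_loss_db + slr_plus_rlr_db
```

For example, an Echo Loss of 21 dB gives an estimated TELR of 31 dB under the 10 dB assumption.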
Figure 5-1 Results of SQuad-AEC Test, shown in NQDI presentation
Echo status shows the grade of annoyance of the echo signal. There are three possible values:
GOOD stands for good echo performance, FAIR for an acceptable echo performance and POOR
for annoying echoes. The Echo Status is derived from the Echo Objection curves given in ITU-T
G.131. For this purpose the TELR is estimated by adding 10 dB to the Echo Loss, and the point
where this TELR meets half of the Echo Delay (= one-way transmission time of the echo) is
located in the G.131 chart.
Please note that in pure single talk situations a powerful echo region might be classified
as Double Talk, and a Double Talk Ratio of a few percent will be shown, if the adaptive Double
Talk Threshold is exceeded. This may be observed especially for time-varying echo paths.
Furthermore, a large amount of noise in the received signal may also be classified as double talk.
Figure 5-2 EOR (Echo Objection Rate) derived from G.131
The GSM 03.50 test is defined in ETSI GSM 03.50 (Section 3.4) and derives the required
Terminal Coupling Loss (TCL) from the G.131 TELR (talker echo loudness rating) chart. Under
the assumption of a no-loss 4-wire connection from the measuring point to the terminal, the ERL
can be interpreted directly as the TCL, because the terminal itself is the only existing source of
echoes. SQuad-AEC therefore measures the TCL value directly in this case, and we can set:
TCL = TELR - (SLR + RLR) dB, where typically SLR + RLR = 10 dB
TCL = TELR - 10dB
Ideally, the TCL should be 40 dB to 46 dB. The 46 dB value is derived from the 1% EOR curve of
G.131 at maximum delay (about 400 ms). If a TCL higher than 46 dB is reached, the 1% EOR curve
is never crossed, even for high delays (provided no other echo sources besides the terminal exist).
If the measured TELR = TCL + 10 dB is higher than the 1% EOR curve, the GSM3.50 test shows a
passed value. A lower TCL is considered to be in the failed range.
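The pass/fail decision can be sketched as follows. The G.131 1% EOR curve itself is not reproduced here; its TELR value at the measured delay is passed in as a parameter, and the function names are hypothetical.

```python
def tcl_from_telr_db(telr_db: float, slr_plus_rlr_db: float = 10.0) -> float:
    """TCL = TELR - (SLR + RLR), with SLR + RLR typically 10 dB."""
    return telr_db - slr_plus_rlr_db


def gsm0350_passed(tcl_db: float,
                   telr_at_1pct_eor_db: float,
                   slr_plus_rlr_db: float = 10.0) -> bool:
    """Pass if TELR = TCL + (SLR + RLR) lies above the 1% EOR curve.

    telr_at_1pct_eor_db is the TELR of the G.131 1% EOR curve at the
    relevant delay; it must be read from the chart by the caller.
    """
    return tcl_db + slr_plus_rlr_db > telr_at_1pct_eor_db
```

With the 46 dB limit quoted above, the 1% curve at maximum delay corresponds to a TELR of 56 dB, so `gsm0350_passed(47.0, 56.0)` passes and `gsm0350_passed(40.0, 56.0)` fails.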
EOR (Echo Objection Rate) in % is an estimate of the percentage of listeners who perceive a
talker echo when listening to a given telephone setup. ITU-T G.131 shows two
different curves, one for 1% EOR and one for 10% EOR. It is assumed that a set of equally
shaped curves describes each EOR between 0 and 100%. Based on the described crossing point
of the estimated TELR and half of the Echo Delay, a (theoretical) corresponding curve can be
derived and the assumed EOR can be read off.
If this EOR is less than 1%, the Echo Status is GOOD; if it is above 10%, the Echo Status is POOR.
Between these two values the echo is rated as FAIR.
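The mapping from EOR to Echo Status is a pair of thresholds. A minimal sketch with a hypothetical function name; how the EOR itself is interpolated between the G.131 curves is not shown.

```python
def echo_status_from_eor(eor_percent: float) -> str:
    """Map an Echo Objection Rate in % to GOOD, FAIR or POOR."""
    if eor_percent < 1.0:
        return "GOOD"
    if eor_percent > 10.0:
        return "POOR"
    return "FAIR"  # between 1% and 10%, both limits included
```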
The Distance to 1% EOR is also taken directly from the chart given in ITU-T G.131. This
value gives the distance to the 1% EOR curve at the calculated echo delay. All negative values
are in the 'green region'; values above 10 dB are in the 'red region'.
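A minimal sketch of the distance value, assuming the sign convention that a positive distance is the number of dB by which the echo loss would have to increase to reach the 1% curve. The TELR of the 1% curve at the measured delay is a caller-supplied value read from G.131, and the function names are hypothetical. The text names only the green and red regions, so values from 0 dB to 10 dB are left unlabelled here.

```python
from typing import Optional


def distance_to_1pct_eor_db(telr_db: float, telr_at_1pct_eor_db: float) -> float:
    """Distance in dB from the estimated TELR up to the 1% EOR curve (assumed sign)."""
    return telr_at_1pct_eor_db - telr_db


def distance_region(distance_db: float) -> Optional[str]:
    """Return 'green' for negative distances, 'red' above 10 dB, otherwise None."""
    if distance_db < 0.0:
        return "green"
    if distance_db > 10.0:
        return "red"
    return None  # region between 0 dB and 10 dB is not named in the text
```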
Echo Loss profile: The Echo Loss is shown graphically versus the delay time. This figure provides
detailed information and visualizes the echo region found. The Echo Loss profile is a
result of the scan process and represents only the situation during single talk. It is used for
pin-pointing echoes only. The detailed echo analysis itself is done in a separate step, and
therefore the echo loss cannot be derived directly from this curve.
Figure 5-3 Echo Loss during scanning versus echo delay
The Level of Received Signal in dBov gives the overall r.m.s. level of the received signal.
It covers echoes, double talk sequences and noise.
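An r.m.s. level in dBov can be computed from the samples as sketched below. The sketch assumes 16-bit PCM with an overload amplitude of 32768 and the common convention that a full-scale square wave corresponds to 0 dBov; the function name is hypothetical, and the exact convention used by SQuad is not stated in this manual.

```python
import math
from typing import Sequence


def rms_level_dbov(samples: Sequence[float], overload_amplitude: float = 32768.0) -> float:
    """R.m.s. level of the samples relative to the digital overload point, in dBov."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 10.0 * math.log10(mean_square / (overload_amplitude ** 2))
```

Because the level is taken over the whole received signal, echoes, double talk and noise all contribute to the result.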
SQuad Advanced Echo Check (Active Test)
Introduction
The active variant of the SQuad AEC measurement can only be used with a SwissQual measurement
probe at the far end, because the far end has to take an active part in the test: it
generates an echo at the far-end side.
Echo Measurement
The SQuad AEC active measurement uses the same echo detection approach as the passive
measurement described above. In contrast to the passive measurement, where the far-end side is
silent, the far-end side actively creates an echo in active mode. The SQuad AEC active
measurement includes an inband synchronization between both sides. In a first transmission,
the incoming signal is recorded at the far-end side. Based on that signal, an echo is
generated by applying a selectable echo path response to it. If required, the generated echo
can be interlaced with double talk. While the signal is received a second time, the pre-processed
echo is played back to the sending side for evaluation.
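The echo generation at the far end can be sketched as a convolution followed by an attenuation. This is only an illustration under stated assumptions: the echo path response is given as a list of filter coefficients, interlacing with double talk is modelled as sample-wise addition, and the function names are hypothetical rather than part of the probe software.

```python
from typing import List, Optional, Sequence


def convolve(signal: Sequence[float], response: Sequence[float]) -> List[float]:
    """Direct-form linear convolution of a signal with an echo path response."""
    out = [0.0] * max(len(signal) + len(response) - 1, 0)
    for i, s in enumerate(signal):
        for j, h in enumerate(response):
            out[i + j] += s * h
    return out


def generate_echo(recorded: Sequence[float],
                  echo_path: Sequence[float],
                  attenuation_db: float,
                  double_talk: Optional[Sequence[float]] = None) -> List[float]:
    """Apply the echo path, attenuate by attenuation_db and optionally add double talk."""
    gain = 10.0 ** (-attenuation_db / 20.0)
    echo = [gain * v for v in convolve(recorded, echo_path)]
    if double_talk is not None:
        for i in range(min(len(echo), len(double_talk))):
            echo[i] += double_talk[i]  # assumption: simple additive mix
    return echo


# Usage: a one-sample delay path with 20 dB attenuation.
echo = generate_echo([1.0, 0.0], [0.0, 1.0], attenuation_db=20.0)
```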
This measurement is especially designed to detect and rate echo cancellers or suppressors in the
network. The generated echo challenges these echo cancellers, and any integrated level-switching
devices are forced to act by the inserted double talk signal.
The echo detection is more reliable if the remaining echo has linear components. Especially
during double talk, only linearly dependent echoes can be detected. In connections including low
bit-rate codecs and/or non-linear processors, residual or low-level echoes might not be
detectable by the measurement. For more reliable results, choose higher echo levels so that the
echoes are easier to distinguish from double talk and other non-linear components.
Measurement Results Echo Evaluation

The SQuad AEC active measurement generates the same results as described in Chapter 5:
Signal type
Echo Delay in milliseconds
Echo Loss during Single Talk acc. ITU-T G.122
Echo Loss for the complete signal (incl. Double Talk)
Echo Objection Rate acc. ITU-T G.131 in %

Distance to 1% Echo Objection Rate acc. ITU-T G.131
ECHO status
GSM3.50 test
Double Talk Ratio
Level of Received Signal in dBov
Furthermore, SwissQual's database interface NQDI displays the settings used at the far end
together with the measurement results:
Figure 6-1 Results presentation SQuad-AEC active
The results presented in Figure 6-1 are also used here to discuss the results. The
measurement was made on a Mobile to PSTN connection. The PSTN side was the echo-generating
loop. At this side, the incoming signal was convolved with the echo path response M1
from ITU-T G.168 (G168_M1). Afterwards it was attenuated by 20 dB and interlaced with a double
talk signal containing 50% active speech (dt_50_08kHz.wav). No additional delay was configured
at the PSTN side.
The results show that this echo was detected at the mobile side. The echo path delay of 224 ms is
typical for a Mobile to PSTN connection. The echo loss over the complete signal is 21 dB, which
reflects quite well the level of the echo defined at the PSTN side. The echo loss during single
talk is slightly lower, which indicates that there is an active component that reduces the echo
in speech pauses at least slightly.
From these results the corresponding Echo Objection Rate is calculated (here: 54%), as well as
the distance to the G.131 1% curve (12 dB). This means that the echo loss would have to increase
by 12 dB to reach the 1% curve and thus the echo status GOOD.
Consequently, with these results the echo status is rated as POOR.
Additionally, the Double Talk Ratio is 47%, which is caused by the defined signal at the far-end
side.
Figure 6-2 Echo Loss as profile versus echo delay
The echo loss profile is also displayed for SQuad-AEC in active mode.
If no echo is found or if it was not detectable, only the status messages and the level of the
received signal will be displayed:
Please note that the channel gain also influences the measured echo loss. Basically, the
channel attenuation in both directions has to be added to the echo loss defined at the far-end side.
The measuring signal is attenuated by the transmission from A to B, is attenuated there again
(by the defined loss value) and is attenuated once more by the transmission from B to A.
The echo loss reflects the level of the received echo compared to the original measuring
signal.
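The addition described above can be written out as a small helper. This is a sketch with a hypothetical function name; it assumes all three contributions are expressed as attenuations in dB.

```python
def expected_measured_echo_loss_db(attenuation_a_to_b_db: float,
                                   defined_echo_loss_db: float,
                                   attenuation_b_to_a_db: float) -> float:
    """Sum of the channel attenuation in both directions and the defined far-end loss."""
    return attenuation_a_to_b_db + defined_echo_loss_db + attenuation_b_to_a_db
```

For example, a defined loss of 20 dB with 2 dB attenuation from A to B and 3 dB from B to A would be expected to show up as roughly 25 dB of measured echo loss.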
Figure 6-3 Result presentation in an echo free/non echo detectable connection
Measurement Results Listening Quality

The SQuad AEC active measurement also integrates an additional evaluation of the Listening
Quality. Here SQuad-LQ is applied to the signal received by the echo-generating far-end side.
In this way, a simple Listening Quality measurement can be made in parallel. The interesting
question here is how the Listening Quality is affected by double talk/echo in the other
direction. By comparing both Listening Quality values, the double talk capability can be
evaluated. If a network is fully duplex, both Listening Quality values should be the same even
if a double talk signal is chosen.
One value gives the Listening Quality for the first transmission, where no echo or double talk
is played back. The other value gives the Listening Quality while the echo / double talk is sent
at the same time. In addition, the channel gain and the clipping of the received signal are given.
Please note that a strong side-tone at the B-side may affect the SQuad-LQ measurement,
because it interferes with the received and evaluated signal.
Round Trip
Introduction
The Round Trip Time is the time a signal needs to travel from the near-end side to the far-end
side and back. The Round Trip Time is usually close to the delay of the latest possible echo. The
time speech needs to travel from one talker to the other (One Way Signal Delay) is an
important indicator of the conversational quality of a call. A travel time that is too long leads
to the annoying effect that the talkers interrupt each other unintentionally.
The Round Trip Method

The RTT inband measurement measures the Round Trip Time of a connection by using short
voice-like sequences. This guarantees transmission over the complete link and avoids the
suppression that may occur with artificial signals such as sweeps or impulses. The RTT
inband measurement is a point-to-point measurement, i.e. an A-Side user calls a B-Side user.
After a successful call establishment, the A-Side sends the RTT synchronisation signal
(RTTvoiceA) to the B-Side three times in succession, separated by silence gaps of 5.4 s.
After receiving this sequence, the B-Side sends back the RTTvoiceB sequence. Compared to
a pure reflection at the B-Side, the use of different sequences at the A- and B-Side avoids the
suppression of the returned signal by an echo-compensation system in the network. At least
two of the three samples have to be detected again at the A-Side.
Results
The measurable Round Trip Time is limited to a range from 4 ms to 3000 ms; the
maximum delay jitter between the three repetitions within one measurement has to be below
500 ms. The results of the measurement are presented in the following table. In addition, the
lowest one-way time and the lowest round trip time of the measurement, in milliseconds, are
shown as the final results.
Figure 7-1 Detail of the NQDI Representation
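The validity rules stated above can be combined into one check. This is a minimal sketch, not the SwissQual implementation: undetected repetitions are represented as None, the delay jitter is taken as the spread between the largest and smallest detected value (an assumption), and the function name is hypothetical.

```python
from typing import Optional, Sequence

RTT_MIN_MS = 4.0        # lower limit of the measurable Round Trip Time
RTT_MAX_MS = 3000.0     # upper limit of the measurable Round Trip Time
MAX_JITTER_MS = 500.0   # jitter between repetitions must stay below this
MIN_DETECTED = 2        # at least two of three samples must be detected


def final_round_trip_ms(samples_ms: Sequence[Optional[float]]) -> Optional[float]:
    """Return the lowest valid round trip time, or None if the measurement fails."""
    detected = [s for s in samples_ms
                if s is not None and RTT_MIN_MS <= s <= RTT_MAX_MS]
    if len(detected) < MIN_DETECTED:
        return None
    if max(detected) - min(detected) >= MAX_JITTER_MS:
        return None
    return min(detected)
```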
The quality classes according to ETSI:

Table 7-1 One Way Delay Quality Classes
                 4 (BEST)    3 (HIGH)    2 (MEDIUM)    1 (BEST EFFORT)
One Way Delay    < 100 ms    < 100 ms    < 150 ms      < 400 ms
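A lookup against the delay limits in Table 7-1 might look as follows. This sketch considers the one-way delay only; classes 4 and 3 share the same 100 ms limit, so the delay alone cannot separate them and the function reports the highest class whose limit is met. The function name and the None return value for delays of 400 ms and above are assumptions.

```python
from typing import Optional


def highest_delay_class(one_way_delay_ms: float) -> Optional[int]:
    """Highest quality class whose one-way delay limit from Table 7-1 is met."""
    if one_way_delay_ms < 100.0:
        return 4  # class 3 (HIGH) has the same delay limit
    if one_way_delay_ms < 150.0:
        return 2
    if one_way_delay_ms < 400.0:
        return 1
    return None  # no class in the table covers this delay
```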
References
ETSI TS 101 329-2 V1.1.1 (2000-07), Part 2: Definition of Quality of Service (QoS) Classes
Appendix
Abbreviations
Abbreviation   Description
ACR            Absolute Category Rating
CELP           Code Excited Linear Prediction
DCR            Degradation Category Rating
DMOS           Degradation Mean Opinion Score
MOS            Mean Opinion Score
dBov           dB relative to the overload point of a digital system
ADPCM          Adaptive Differential Pulse Code Modulation
BFI            Bad Frame Indication
CCITT          Comité Consultatif International Télégraphique et Téléphonique (The
               International Telegraph and Telephone Consultative Committee)
CDMA           Code-Division Multiple Access
CRC            Cyclic Redundancy Check (3 bit)
DAC            Digital to Analogue Converter
DMR            Digital Mobile Radio
DTMF           Dual Tone Multi-Frequency (signalling)
DTX            Discontinuous Transmission (mechanism)
EPROM          Erasable Programmable Read Only Memory
ETR            ETSI Technical Report
ETS            European Telecommunication Standard
ETSI           European Telecommunications Standards Institute
FER            Frame Erasure Ratio
FR             Full Rate
GMSK           Gaussian Minimum Shift Keying (modulation)
GSM            Global System for Mobile communications
GSM MS         GSM Mobile Station
HANDO          Handover
HDLC           High level Data Link Control
HR             Half Rate
IEC            International Electrotechnical Commission
ISDN           Integrated Services Digital Network
ISO            International Organization for Standardization
ITU            International Telecommunication Union
LAN            Local Area Network
MSC            Mobile-services Switching Center, Mobile Switching Center
OSI            Open System Interconnection
PABX           Private Automatic Branch eXchange
PDN            Public Data Networks
PSPDN          Packet Switched Public Data Network
PSTN           Public Switched Telephone Network
QOS            Quality Of Service
RXLEV          Received signal level
RXQUAL         Received Signal Quality
S/W            Software
SIM            Subscriber Identity Module
SS7            Signalling System No. 7
TDMA           Time Division Multiple Access
TE             Terminal Equipment
VAD            Voice Activity Detection