
VoIP QoS in practice: About Network Congestion

by Smartvox on October 26, 2011


My previous two articles explored QoS tagging of voice data packets using ToS/DiffServ values
and of Ethernet frames using CoS or Priority values. QoS is often advocated as an essential part
of any self-respecting VoIP solution and there is no doubt it can make a big difference in the right
circumstances. However, it would be a mistake to expect too much of QoS or to assume that it
will always make a difference no matter what. To understand where and how it can help, we need
first to examine the underlying causes of network congestion: how and where it can happen.
What causes network congestion?

A network is a mesh of interconnected nodes. The nodes are pieces of network equipment, like
routers and Ethernet switches, and the interconnections could be fibre optic or Ethernet (copper)
cables.

Let's be clear: the data being transmitted over a single copper pair or fibre-optic cable all travel at
the same speed; you do not get frames overtaking each other! So the points at which network
congestion can occur are the nodes. Congestion happens at a node when the rate of ingress of
data exceeds the rate at which the data can be forwarded to the next destination, the next node.

How does network equipment respond to congestion?

There are potentially several ways that the network equipment could deal with congestion and
the actual method will depend in part on the capabilities of the equipment itself. For example, it
might be possible for a downstream node to signal to the upstream node to stop sending for a
while, to pause. This method is referred to as flow control.
Another option: if the router is able to detect congestion on a downstream route, it might be
able to send data via a different, less congested route. This would require quite a sophisticated
device that is able to learn about alternative routes and evaluate them to choose the preferred
route under different circumstances of network congestion. It would need to be able to read and
interpret the Explicit Congestion Notifications coming back to it in the Layer 3 packets.
The mechanism that is of most interest to us now, however, is the prioritisation of packets or
frames within the network equipment (the devices at the network nodes). As we will see, this is
dependent on the presence of memory-based data buffers within the routing equipment, the
management of virtual queues and algorithms that determine how data are assigned to and
removed from the queues.
Flow control

Flow control is generally only possible with a full-duplex interconnection. A flow control
mechanism may take the form of a PAUSE frame at Layer 2 or an ICMP Source Quench
request at Layer 3 (the latter has since been deprecated). The concept is illustrated below:

However, flow control is not a complete solution and is not always possible. Consider, for
example, the case of a media stream carrying voice or video: such data are time-critical and
cannot therefore tolerate significant interruptions. Indeed, if flow control is not simply going to
push the problem back onto other upstream nodes, then the flow control signal would have to be
sent all the way back to the original source that is sending the data. That may not be possible and,
even if it is, the source may simply not be able to pause: the ultimate source of an audio media
stream is the person speaking, and they are hardly likely to be controlled by a flow control signal
in the network.
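As a rough software analogue, a bounded queue gives the same backpressure behaviour as flow control: the sender is forced to pause whenever the downstream buffer is full. This is only an illustration of the concept, not a model of Ethernet PAUSE frames:

```python
import queue
import threading
import time

# A bounded queue models a downstream node's buffer. put() blocking when
# the queue is full is the software analogue of a PAUSE signal: the
# upstream sender must wait until the downstream node drains some data.
link = queue.Queue(maxsize=4)

def downstream(n_frames):
    for _ in range(n_frames):
        link.get()              # forward (consume) one frame
        time.sleep(0.01)        # slow egress: the congested hop

def upstream(n_frames):
    for i in range(n_frames):
        link.put(f"frame-{i}")  # blocks ("pauses") when the buffer is full

consumer = threading.Thread(target=downstream, args=(20,))
consumer.start()
upstream(20)                    # returns only as fast as egress allows
consumer.join()
print("all frames forwarded without loss")
```

The sender never overruns the four-slot buffer, so nothing is dropped; the cost is that the sender itself is slowed down, which, as noted above, a live audio source cannot tolerate.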

Memory buffers

No matter which mechanisms are used to manage congestion, there is an inescapable necessity
for the network equipment at each node to have some form of memory-based buffering.
Buffering allows it to receive data and decide how to handle it before sending it on to the next
node. Without buffering, any situation where data are being received at the equipment faster than
they can be transmitted out the other end has only one option: the excess must be discarded (so
called packet loss).

All network equipment, beyond a simple hub, must therefore have some memory buffering.
Furthermore, the core mechanism for discrimination and prioritisation of network traffic works
by using these internal buffers to queue the data.
On a very basic router or switch the memory buffer would be quite small and, without being
sub-divided, it would behave as a simple queue: new data are added to the back of the queue
while the older data at the front of the queue get processed (so-called FIFO). On more
sophisticated and expensive boxes the buffers are larger and are sub-divided into multiple queues:
as new data packets arrive they are assigned to the back of the most appropriate queue
depending on their QoS settings.
The software running on the router takes packets from the front of each queue in a pre-defined
way that allows some queues to have a higher priority than others. The algorithms used to
process these queues have to be clever enough to take account of preferences for low latency and
risk of packet loss: basically trying to prioritise some queues while avoiding the risk that the
lowest-priority queues never get a look-in. If the buffers are getting full, flow control would be
attempted. Failing that, when packets have to be discarded, the algorithms might look at the QoS
settings to determine the drop priority and thereby choose which ones to drop.
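A minimal sketch of the queuing behaviour described above, assuming strict priority between classes and drop-from-the-lowest-class when the buffer is full (real routers use more refined algorithms such as weighted fair queuing):

```python
from collections import deque

# Toy scheduler: three queues (0 = highest priority). Arriving packets go to
# the queue matching their QoS class; when the buffer is full, a packet is
# discarded from the lowest-priority non-empty queue rather than at random.
class ToyScheduler:
    def __init__(self, capacity=8, n_classes=3):
        self.queues = [deque() for _ in range(n_classes)]
        self.capacity = capacity

    def _total(self):
        return sum(len(q) for q in self.queues)

    def enqueue(self, pkt, qos_class):
        if self._total() >= self.capacity:
            # buffer full: drop from the lowest-priority non-empty queue
            for q in reversed(self.queues):
                if q:
                    q.pop()
                    break
        self.queues[qos_class].append(pkt)

    def dequeue(self):
        # strict priority: always serve the highest-priority non-empty queue
        for q in self.queues:
            if q:
                return q.popleft()
        return None

sched = ToyScheduler()
sched.enqueue("web-1", 2)
sched.enqueue("rtp-1", 0)
sched.enqueue("sip-1", 1)
print(sched.dequeue())   # rtp-1 leaves first despite arriving second
```

Strict priority like this is exactly what risks starving the lowest queue, which is why production schedulers add weights or deficit counters.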

Limitations of memory-based buffering

While buffering of data is essential and is integral to the prioritisation of network traffic and the
general management of bandwidth, it is not a panacea. In fact, the more that buffering is used,
the greater the latency (delay). Furthermore, while buffering may help to avoid packet loss, it
cannot prevent it if the ingress rate at a node exceeds the outflow rate for a long period of time.
In effect, the buffering just helps to smooth out short-term peaks by time-shifting the excess
input at one point in time to a quieter period a little later.

The shaded area of the graph, where inbound transmission rates exceed the maximum outflow
rate, equates to data that must be temporarily put into the buffer. Only when the inbound rate
drops below the dotted red line is it possible for the equipment to empty its buffers. If the rate
stays above the red line for too long, then the buffers would become full and packets would have
to be dropped.
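The behaviour described above can be sketched numerically. This toy model (rates and units are arbitrary) tracks buffer occupancy and overflow for a sequence of per-interval inbound rates against a fixed outbound rate:

```python
# Toy model of buffering: excess input fills the buffer, deficits drain it,
# and anything beyond the buffer's capacity must be dropped.
def simulate(inbound, out_rate, buf_size):
    buffered, dropped = 0, 0
    for in_rate in inbound:
        buffered += in_rate - out_rate       # surplus fills, deficit drains
        buffered = max(buffered, 0)          # cannot drain below empty
        if buffered > buf_size:
            dropped += buffered - buf_size   # overflow is lost
            buffered = buf_size
    return buffered, dropped

# Short burst: absorbed by the buffer, no loss.
print(simulate([15, 15, 5, 5], out_rate=10, buf_size=20))   # (0, 0)
# Sustained overload: the buffer fills and packets are dropped.
print(simulate([15] * 10, out_rate=10, buf_size=20))        # (20, 30)
```

The two cases mirror the point in the text: buffering absorbs short-term peaks, but a sustained excess inevitably ends in packet loss.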
The implications for practical application of VoIP QoS

We need to draw some conclusions from this exploration of network congestion issues. What
does it mean in practical terms for VoIP system designers and IP-PBX installers? Based on the
above points and also drawing upon my own experience of real-world installations, I would
suggest that the following conclusions can be drawn:

- Network congestion happens at nodes where the data ingress rate exceeds
the outflow rate. If this speed disparity exists for too long, then packet loss
is likely to occur. If there is no congestion, then QoS is more or less irrelevant.

- QoS influences how network congestion is handled. It can determine which
packets are most likely to be dropped if buffers get full, and it can influence
the latency (delay) of certain data streams by prioritising one stream with
respect to another.

- Network equipment may allow a fixed proportion of the overall bandwidth to
be set aside for a particular QoS class of traffic; this type of bandwidth
management will normally allow lower-priority data to borrow some of the
reserved bandwidth when the high-priority traffic is not using it.

- Just using QoS settings to mark packets or frames as high priority as they
leave your IP phone or IP-PBX guarantees absolutely nothing about how they
will be handled.

- Effective bandwidth management is possible within your LAN or corporate
WAN, where you have control over the topology of the network, the choice of
equipment and its configuration. The network equipment must support traffic
shaping or traffic prioritisation and be configured to recognise and use your
chosen QoS tags such as DiffServ or CoS.

- Once network traffic goes outside your own network infrastructure, it is quite
likely that QoS tags will be overwritten or ignored. This is especially true
when traffic leaves your premises to traverse the Internet using an ordinary
broadband connection. If you need end-to-end real-time QoS over a
broadband Internet connection, you will need to look closely at the packages
being offered by different service providers, and you must expect to pay a
premium for a broadband connection that supports it.

- In most cases, QoS is far more relevant to transmissions in one direction than
in the other. For example, if network congestion is happening at a node
because the outbound connection is slower than the inward one, then the
chances are that data travelling the other way will have no problem.
Furthermore, you may be able to set the QoS tags on packets you are
sending, but you may not be able to set them on the ones you are receiving.

- Bandwidth management is a multi-faceted problem: there may be trade-offs
between decreasing latency and increasing packet loss; getting faster
throughput for voice almost certainly means there will be times when there is
slower throughput for other data, and could even result in packet loss for
other data.

Topics for further discussion and reading


If time permits, I will write a follow-up article looking at how bandwidth management works in
practice; in particular how prioritisation using QoS is generally implemented within a network
switch or router. If you want further reading, a good starting point is Wikipedia and Google: try
looking up the following topics: dropped packets, latency, jitter, weighted fair queuing, QoS.

VoIP QoS Settings part 1


by Smartvox on June 17, 2011
The QoS settings on VoIP phones and related equipment can be perplexing. Here, I will attempt
to explain what parameters like CoS, ToS, DiffServ and DSCP really mean and offer practical
suggestions for the values that should be assigned to them.
Part 1 of this article starts with a broad overview and then focuses on Layer 2 network QoS
settings; i.e. the settings associated with the VLAN tag in 802.1 Ethernet protocols. If you are
looking for information about Layer 3 settings such as DSCP, ToS and IP Precedence, then you
may prefer to skip straight to part 2.

What is QoS?
QoS (Quality of Service) is a somewhat all-encompassing term and means different things to
different people, even within the context of Voice over IP. The broader issues of audio quality on
VoIP calls, echo cancellation and the mechanisms within end-points designed to compensate for
packet loss and jitter are discussed in a different article here. The QoS settings that this article
seeks to explain are all directly related to frame and packet tagging for prioritisation of network
traffic. Even restricting ourselves to this specific area, there is enough material to fill at least one
decent-sized book. In order to keep it to a manageable scale for this article I'm going to have to
brush over some of the details very quickly. I apologise in advance if some experts out there find
my condensed descriptions over-simplistic or inaccurate.

The structure of Network Traffic


Traffic on a network is passed around as a series of discrete chunks of data called frames,
packets or messages, depending on which level they exist at. The structure of frames and
packets generally conforms to a common pattern in so far as they all could be described as an
envelope containing (a) information about the packet itself and (b) the payload. The term
payload is used to describe the core data that needs to be sent from A to B. Information about
the packet or frame typically includes the source and destination address, but may also include
other parameters such as a description of the packet's importance or priority. It is these priority
fields that I will be discussing here.
The actual structure of a frame or packet will conform to certain industry standards (without this,
you would not be able to interconnect equipment from different manufacturers). For example, the
IEEE 802 standards define what Ethernet frames look like. Internet Protocols such as IPv4 and
IPv6 are also governed by standards. The original standards may or may not have included a
field that could be used to define the priority of the data, but more on that later.

To understand the QoS parameters used for VoIP, it is vital to first understand the different
network layers within which these frames and packets are operating. Some parameters are
relevant within Layer 2 and others are relevant within Layer 3. The following diagram
summarises it:

Layer 2 QoS settings


The early standards for Ethernet defined a relatively simple structure for each frame that did not
include a field for priority. An optional additional field, 4 bytes long, was introduced by the IEEE
802.1 working group (see 802.1D, 802.1Q and 802.1p). This optional field may sometimes be
loosely referred to as the VLAN tag. Two bytes of the VLAN tag are assigned for Tag Control
Information comprising a 3-bit Priority Code Point, a 1-bit CFI flag and a 12-bit VLAN ID. The
field that we are interested in is the Priority Code Point (PCP) because it is used for VoIP QoS
tagging in layer 2.
Priority values that may be assigned to the PCP field are defined in the 802.1p standard. Values
can be between 0 and 7, but remember this value is used for traffic classification and thus
represents more than a simple linear scale. The audio stream, or RTP, for a VoIP call requires low
latency and should be assigned a priority value of 5. The signalling traffic, or SIP, can tolerate
higher latency so it is usually assigned a priority value of 3. Sometimes the layer 2 priority value
is referred to as a CoS value (Class of Service) although this terminology has the potential to be
ambiguous.
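The layout of the Tag Control Information word can be illustrated with a few lines of bit arithmetic (a sketch for clarity, not production code):

```python
# The 16-bit Tag Control Information word inside the 802.1Q VLAN tag:
#   PCP (3 bits) | CFI (1 bit) | VLAN ID (12 bits)
def pack_tci(pcp, cfi, vid):
    return (pcp & 0x7) << 13 | (cfi & 0x1) << 12 | (vid & 0xFFF)

def unpack_tci(tci):
    return tci >> 13, (tci >> 12) & 0x1, tci & 0xFFF

# RTP on VLAN 10 with the recommended priority of 5:
tci = pack_tci(pcp=5, cfi=0, vid=10)
print(hex(tci))                          # 0xa00a
assert unpack_tci(tci) == (5, 0, 10)
```

This makes it visible why the priority value only exists when a VLAN tag is present: the PCP bits live inside the tag itself.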
Cautionary note:
Layer 2 802.1p QoS settings are stored in the VLAN tag which is not always present by default.
Please take care when enabling or setting a VLAN ID on your VoIP equipment as it may result in
a complete loss of remote connectivity. You should be able to recover from such problems by
disabling VLAN options using the phone's built-in menus and LCD display, but in the worst case

you may have to return it to default settings using a factory reset. If unsure, a VLAN ID of zero
should (in theory) be recognised by non-VLAN enabled network equipment.
Aastra settings
On the Aastra IP phones, the layer 2 settings are shown on the web GUI within the Network
Settings page under the heading VLAN. It will only allow you to modify the values if you tick
the VLAN Enable box, which makes sense as the priority value is stored in the VLAN tag. You
can set a different priority for the SIP messages, the RTP stream and the RTCP messages, but the
default values already shown (3 for SIP, 5 for RTP and RTCP) will not normally need to be
changed.
Linksys settings
On Cisco and Linksys phones, the layer 2 settings are shown on the Line tab under the heading
Network Settings. Look for the fields labelled SIP CoS Value and RTP CoS value.
Snom settings
On my Snom 360 phone, some settings are available in the QoS/Security tab of the Advanced
settings page. It allows you to specify a Priority to be associated with each VLAN ID. There are
also settings for VLAN ID and Priority for the Net Port and the PC Port, but you cannot set
different priorities for SIP, RTP and RTCP.
Grandstream settings
Taking two different Grandstream phones, I found the QoS settings under the Advanced
Settings tab on the older GXP phone and under Maintenance on the newer GXV phone. On
the latter they are shown in the Network Settings sub-page. Identification is simple because
Grandstream call them Layer 2 QoS 802.1Q/VLAN Tag and Layer 2 QoS 802.1p Priority
Value. Nice unambiguous labelling.
Asterisk settings
In recent versions of Asterisk, you can set the value for the layer 2 PCP field by editing the
sip.conf file. The values are set using the following parameters (here shown with their default
values):
cos_sip=3
cos_audio=5
cos_video=4
cos_text=3

Further reading
In part 2, I look at QoS settings for prioritising network traffic using the Layer 3 IP protocol.

If QoS is to make a difference, it requires a good deal more than just setting the DSCP, ToS or
CoS values on the IP telephony device or PBX. If QoS is an important requirement in your VoIP
projects, I strongly recommend that you read my follow-up article about QoS in practice. Just
click this link to go to the article.

Feedback on this article


Please take a moment to use the coloured voting buttons below to provide some feedback to the
author. If you liked the article, great. If it fell a little short of your expectations, please leave a
brief comment or send me an email (info (at) smartvox.co.uk) so I know what needs changing.
Thanks.
VoIP QoS Settings part 2

by Smartvox on July 29, 2011



In part 1, we examined the Layer 2 QoS settings available on most VoIP equipment. In this
second part, I will explore the Layer 3 parameters and offer practical suggestions for the values
that should be assigned to them. We will briefly look at the history and structure of the ToS and
DSCP fields and their place within the DiffServ packet prioritisation model.
Recap

I explained in part 1 how Layer 2 data can be tagged with a priority value (sometimes called
CoS). The L2 priority value is stored in the 3-bit PCP field of the Ethernet frame and this field is
part of an optional 4-byte section loosely referred to as the VLAN tag.
Completely separate to this, it is also possible to tag IP packets within Layer 3 in a way that
allows routers and switches to recognise that some packets should be given a higher priority than
others. The following diagram illustrates how the different mechanisms exist at different layers
in the OSI model:

Layer 3 QoS settings

As the above diagram shows, the QoS mechanisms that operate at L3 are DiffServ, DSCP and
ToS. These L3 mechanisms have the potential to operate over the Internet thereby giving them a
wider reach than L2 (which normally only operates within the scope of a LAN or private
network). Theoretically, L3 QoS prioritisation could be applied all the way through to the remote
end-point in a VoIP connection. However, in practice you are unlikely to be able to guarantee that
routing equipment outside your control will respect the QoS tagging that has been applied to the
IP packets.
The ToS field and how it has changed over time

The original design for the L3 Internet Protocol (see RFC 791) defined a standard structure for
Internet Datagram Headers that included an 8-bit Type of Service (ToS) field. Its purpose was
to provide storage for QoS parameters to allow network equipment to prioritise certain traffic at
times of high load. Three of the eight bits were set aside for an IP Precedence value (very
similar to the PCP priority value described in part 1) and a further three bits were designated as
facility flags to indicate if a packet had preference for low delay, high throughput or high
reliability. The last two bits were reserved for future use.
In 1998 and 2001, new standards were published (see RFC 2474, RFC 2475 and RFC 3168)
which redefined the 8-bit ToS field to allow its use for the so-called Differentiated Services.
The name ToS continues to be used today, even though the original interpretation given in
RFC 791 has been superseded by the newer Differentiated Services model and the name DS
field should be used instead. The 8-bit DS field comprises the first 6 bits as a DSCP value and
the last 2 bits which are used for Explicit Congestion Notification (ECN) data:

The above diagram is copied directly from RFC 3168, which describes ECN.
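The split between DSCP and ECN is simple bit arithmetic, sketched here:

```python
# The 8-bit DS field (formerly ToS): top 6 bits are the DSCP, bottom 2 bits
# are used for Explicit Congestion Notification (ECN).
def split_ds(ds_byte):
    return ds_byte >> 2, ds_byte & 0x3     # (DSCP, ECN)

def make_ds(dscp, ecn=0):
    return (dscp & 0x3F) << 2 | (ecn & 0x3)

# EF (DSCP 46) with the ECN bits zeroed gives the familiar ToS byte 0xB8 (184):
print(hex(make_ds(46)))       # 0xb8
print(split_ds(0xB8))         # (46, 0)
```

This two-bit shift is the root of most of the confusion covered below: some equipment asks for the 6-bit DSCP, some for the whole 8-bit byte.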
Differentiated Services and the DSCP value

While the basic concepts are straightforward, the practical implementation, especially the
terminology and labelling associated with the Differentiated Services Field, is remarkably
confusing. As a VoIP engineer, you probably just want to know what value you should set for one
parameter on the VoIP equipment's setup menus. You don't want to have to read two articles on
Wikipedia and a couple of RFCs just to be able to set that one value.
Let's try to unpick the various points of confusion as briefly as possible:

- DiffServ is a broad term covering the architecture and mechanisms of this
particular approach to QoS. There are other models that can be used too, for
example IntServ and RSVP.

- Differentiated Services Code Point (DSCP) is a 6-bit value within an 8-bit field
in the IP header.

- The 8-bit field is sometimes called the ToS or TOS field. It may also be called
the DS field or DSF.

- The first 3 bits of the DSCP field may be referred to as the Class Selector.
The same bits in the ToS field definition (in RFC 791) were assigned to the
so-called IP Precedence value. Some values relevant to VoIP are shown below.

- The last 2 bits of the 8-bit field, used for congestion notification, are set
dynamically by routers. You can assume 00 whenever you are obliged to
explicitly assign a value to them.

- When specifying a value for DSCP on your VoIP equipment, be careful to
check whether it expects a 6-bit value or an 8-bit value, and whether it
requires the value to be given in hex, in decimal or as a named constant.

Wow. So what is that last point saying? What do I mean by a named constant? The answer
comes from the fact that 6 bits allows for a lot of different values. The 6 bits can in fact be further
sub-divided into elements that describe the Class Selector and the Drop Probability. To make
these values more easily readable by humans, they are often described by an abbreviated name
indicating the Per-Hop Behaviour, the Class Selector value and the Drop Probability. Here are
some examples:

- AF31: AF = Assured Forwarding, 3 indicates Class 3 and 1 is the Drop Probability

- EF: Expedited Forwarding, recommended for RTP

- CS5: Class Selector 5

A selection of ToS IP Precedence values and their equivalent Class Selector value:

- IP Precedence 0 (CS0): Routine or Best Effort, typically used for data

- IP Precedence 3 (CS3): Flash, used for voice signalling (e.g. SIP)

- IP Precedence 5 (CS5): Critical, used for RTP
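The named constants follow a simple arithmetic pattern, which this short sketch decodes (the parser is illustrative only and handles just the CSn/AFcd/EF names):

```python
# DSCP named constants follow simple arithmetic:
#   CSn  = n * 8          (class selector; matches the old IP Precedence bits)
#   AFcd = c * 8 + d * 2  (assured forwarding: class c, drop probability d)
#   EF   = 46             (expedited forwarding, a fixed code point)
def dscp(name):
    if name == "EF":
        return 46
    if name.startswith("CS"):
        return int(name[2]) * 8
    if name.startswith("AF"):
        return int(name[2]) * 8 + int(name[3]) * 2
    raise ValueError(f"unknown DSCP name: {name}")

print(dscp("AF31"))   # 26
print(dscp("CS5"))    # 40
print(dscp("EF"))     # 46
```

So AF31 is 3 × 8 + 1 × 2 = 26, and CS5 is 5 × 8 = 40; the tables linked below simply enumerate these values.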

I am not going to attempt to provide a complete list or even a complete explanation of the DSCP
values; there is plenty of existing material on the Internet that does this already. However, you
may want to try the following links:
Excellent overview of QoS includes a table of DSCP values: http://www.rhyshaden.com/qos.htm
List of commonly used DSCP values as binary, decimal and named constant:
http://www.cisco.com/en/US/docs/switches/datacenter/nexus1000/sw/4_0/qos/configuration/guide/qos_6dscp_val.pdf
A handy table for converting between DSCP and ToS; giving values in binary, hex and decimal:
http://bytesolutions.com/Support/Knowledgebase/KB_Viewer/ArticleId/34/DSCP-TOS-CoSPresidence-conversion-chart.aspx
What values should be used on VoIP equipment?

A sensible choice for SIP signalling messages is AF31. This requires the ToS field to be set to
0x68 (hex) or 104 (decimal). CS3 is also suitable for SIP.
A sensible choice for RTP and RTCP is EF. This requires the ToS field to be set to 0xB8 (hex) or
184 (decimal).
Unfortunately, different manufacturers require different formats for entry of the data.

Linksys

On my Linksys devices, the parameters are located in the Line tab in a section entitled Network
Settings (you need to login as admin and select advanced viewing). The values are described as
SIP ToS/DiffServ and RTP ToS/DiffServ and they encompass the whole 8-bit ToS field.
They are entered as hex values and the defaults are 0x68 for SIP and 0xB8 for RTP, which are
equivalent to AF31 for SIP and EF for RTP.
Snom

On the Snom 360 the settings are available in the QoS/Security tab of the Advanced settings
page. The Layer 3 settings are shown as two TOS/Diffserv values, one for RTP and another for
SIP. The values are entered as decimal numbers which are internally converted to an 8-bit value
for the whole ToS field. The default on the Snom for both SIP and RTP is 160 which is
equivalent to 10100000 in binary. To see what DSCP value this equals, you must strip off the last
two bits (the ECN field), leaving the binary value 101000. From tables we can determine that
this binary value is called CS5 when using the named constants.
Aastra

On the Aastra IP phones, the layer 3 settings are shown on the web GUI within the Network
Settings page under the heading Type of Service DSCP. The values are entered as decimal
numbers, but unlike the previous manufacturers they represent only a 6-bit DSCP field (the 2-bit
ECN field is not included). The default values are 26 for SIP and 46 for RTP/RTCP, equivalent
to AF31 and EF respectively.
Grandstream

The parameter is called Layer 3 QoS and it expects a value, entered as a decimal number, that
is equivalent to the 6-bit DSCP field. The default on my GXV3140 is set to 48, which is
equivalent to the named constant CS6, i.e. Class Selector 6. It does not allow different values to
be specified for SIP and RTP.
Asterisk settings

In recent versions of Asterisk, you can set the value for the layer 3 DSCP field by editing the
sip.conf file. The parameter names refer to the ToS field, but the values you specify are named
constants for the DSCP field, so even the developers who work on Asterisk have trouble
understanding the terminology! The values are set using the following parameters (here shown
with their default values):
tos_sip=cs3        ; Sets TOS for SIP packets.
tos_audio=ef       ; Sets TOS for RTP audio packets.
tos_video=af41     ; Sets TOS for RTP video packets.
tos_text=af41      ; Sets TOS for RTP text packets.

Further reading


If QoS is to make a difference, it requires a good deal more than just setting the DSCP, ToS or
CoS values on the IP telephony device or PBX. If QoS is an important requirement in your VoIP
projects, I strongly recommend that you read my follow-up article about QoS in practice. Just
click this link to go to the article.

RTP, Jitter and audio quality in VoIP


by Smartvox on April 24, 2012
In this article we will briefly look at what RTP is and how it is used to stream VoIP audio. The
article then considers how certain network transmission characteristics may introduce jitter or
packet loss, and the measures that are used in VoIP equipment to mitigate the effects. Other
phenomena that have a bearing on the audio quality of VoIP calls, along with the features
used on VoIP equipment to overcome them, are also briefly discussed.

RTP
RTP is Real-time Transport Protocol. It is a general purpose protocol for the streaming of audio,
video or any similar data over IP networks. In a VoIP call, each RTP packet carries a small
sample of audio (typically 20 or 30ms) which is constructed by the sending device from
analogue signals picked up by the microphone in the phones handset.

Within the RTP protocol, each packet must be numbered and time-stamped. This has to be done
by the source device, the one that is sending the packets.

The presence of sequence numbers and time stamps allows the receiving device to inspect the
packet headers and determine if the packets are arriving in the correct sequence, with constant or
varying delay or if any are missing.
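The fixed RTP header layout (RFC 3550) can be parsed with a few lines of code; the sample packet below is entirely hypothetical:

```python
import struct

# Minimal parse of the 12-byte RTP fixed header (RFC 3550). The fields the
# receiver uses for de-jittering are the 16-bit sequence number and the
# 32-bit timestamp.
def parse_rtp_header(data):
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": b0 >> 6,          # always 2 for current RTP
        "payload_type": b1 & 0x7F,   # identifies the codec
        "sequence": seq,
        "timestamp": ts,
        "ssrc": ssrc,                # identifies the stream source
    }

# A hypothetical packet: version 2, PT 0 (G.711 mu-law), seq 7, timestamp 1600.
pkt = struct.pack("!BBHII", 0x80, 0, 7, 1600, 0xDEADBEEF) + b"\x00" * 160
hdr = parse_rtp_header(pkt)
print(hdr["sequence"], hdr["timestamp"])   # 7 1600
```

A receiver walking these headers can immediately see gaps in the sequence numbers and irregularities in the timestamp spacing, which is exactly what the jitter and loss measurements below rely on.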

RTCP
RTCP is the RTP Control Protocol. It is used alongside RTP to provide reporting on the
quality of the RTP stream being received at the far end of a connection. The RTCP packets are
sent from time to time in the reverse direction of the RTP packets. RTCP packets contain data
describing the quality of the RTP stream being received. They are sent to the sending equipment
so it can know how good or bad the audio quality is at the other end of the line. Asterisk has
some limited capabilities for users to view audio quality information at the command line. For
example, you can try the commands sip show channelstats and rtcp set stats on|off. The
following article talks a bit about call quality in the context of RTCP reports.
http://www.voip-info.org/wiki/view/Asterisk+RTCP

Jitter and packet loss


Jitter is all about the timing and the sequence of the arriving RTP packets. If they arrive in a nice
steady stream at regular intervals in the correct sequence then you have low jitter. If they arrive
in bursts interspersed with gaps, or if they arrive out of sequence, then you have high jitter. Jitter
happens when the RTP packet stream traverses the network (LAN, WAN or Internet) because it
has to share network capacity with other data. The following diagram illustrates how jitter can be
created.

By the way, QoS (DSCP) is a way of marking packets so the intermediate network equipment is
aware of their relative importance. QoS packet tagging allows the network equipment to
prioritise one type of packet over another. For example, pushing newly received RTP packets
through to the output interface in preference to other data packets, even if the other packets
arrived first.
The following diagram illustrates what our original stream of RTP packets might look like after
they have traversed the network, become jittered and arrived at the receiving equipment.

The variation in packet delay is generally referred to as jitter, although a more accurate
description of this phenomenon is Packet Delay Variation (PDV).
http://en.wikipedia.org/wiki/Packet_delay_variation
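The standard way to quantify this is the interarrival jitter estimator defined in RFC 3550: for each pair of packets, compare the spacing at the receiver with the spacing of the RTP timestamps, and smooth the difference with a 1/16 gain. A sketch (times in milliseconds for simplicity; real implementations work in RTP timestamp units):

```python
# RFC 3550 interarrival jitter: J += (|D| - J) / 16, where D is the
# difference between receive spacing and timestamp spacing for each
# consecutive pair of packets.
def jitter_estimate(arrivals, timestamps):
    j = 0.0
    for i in range(1, len(arrivals)):
        d = (arrivals[i] - arrivals[i - 1]) - (timestamps[i] - timestamps[i - 1])
        j += (abs(d) - j) / 16
    return j

send = [0, 20, 40, 60, 80]   # RTP timestamps: one packet every 20 ms
recv = [0, 21, 55, 61, 83]   # arrival times: bursty, so high jitter
print(round(jitter_estimate(recv, send), 2))   # 1.77
```

A perfectly regular stream gives an estimate of zero; the bursty arrival pattern above does not.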
The sequence numbering of RTP packets allows a receiving device at the far end to check if the
packets are still in the correct sequence or if any are missing. Packets can get out of sequence if
they take different routes over the network. Packets can be dropped if there is network
congestion somewhere along the route or if there are network errors.

Jitter buffers
A jitter buffer is used at the receiving equipment to store incoming RTP packets, re-align them in
terms of timing and check they are in the correct order. If some arrive slightly out-of-sequence
then, provided it is large enough, the jitter buffer can put them back into the right sequence.
However, for this to work the receiving device must delay the audio very slightly while it checks
and reassembles the packet stream.
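A toy sketch of the reordering behaviour, using a small min-heap keyed on sequence number (real jitter buffers are time-driven and adaptive; this shows only the reordering idea):

```python
import heapq

# Toy jitter buffer: hold a few packets in a min-heap keyed on sequence
# number, releasing the lowest-numbered packet only once the buffer is
# deep enough to absorb reordering.
class JitterBuffer:
    def __init__(self, depth=3):
        self.heap = []
        self.depth = depth

    def push(self, seq, payload):
        heapq.heappush(self.heap, (seq, payload))

    def pop_ready(self):
        # only release once enough packets are buffered
        if len(self.heap) >= self.depth:
            return heapq.heappop(self.heap)
        return None

jb = JitterBuffer()
out = []
for seq in [1, 3, 2, 4, 5]:        # packets 2 and 3 arrive swapped
    jb.push(seq, b"audio")
    pkt = jb.pop_ready()
    if pkt:
        out.append(pkt[0])
print(out)                          # [1, 2, 3]
```

Note the trade-off the text describes: the buffer restores the correct order, but only by delaying playout by its depth.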
If a packet was dropped (or simply does not arrive in time) then the receiving device has
somehow to fill in the gap using a process known as Packet Loss Concealment or PLC.
Packet loss needs to be less than 1% if it is not to have too great an impact on call audio quality.
Greater than 3% would certainly be noticeable as a degradation of quality.
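The loss percentage is easy to derive from the RTP sequence numbers, as this small sketch shows (it ignores wrap-around of the 16-bit sequence number):

```python
# Loss percentage from RTP sequence numbers: "expected" is the span of
# sequence numbers seen, "received" is the count actually seen.
def loss_percent(seqs):
    expected = max(seqs) - min(seqs) + 1
    return 100.0 * (expected - len(set(seqs))) / expected

stream = [1, 2, 3, 5, 6, 7, 8, 9, 10]   # packet 4 never arrived
print(round(loss_percent(stream), 1))    # 10.0, well above the 1% comfort zone
```

This is essentially the calculation an RTCP receiver report performs before sending the loss fraction back to the sender.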
Even if the RTP packets remain in the correct sequence and there is zero packet loss, large
variations in the end-to-end transmission time for the packets may cause degradation of audio
quality that can only really be fixed through the use of a jitter buffer.

Jitter buffers in Asterisk


Jitter buffering is not enabled in the default Asterisk configuration files. Enabling them is not as
simple as you would hope because their activation is conditional on a number of different
factors. First, you must enable the jitter buffers in the conf file relevant to the appropriate leg of
your bridged calls. Typically this means in chan_dahdi.conf where you are using Asterisk to
bridge between SIP and TDM circuits (you would think it would be in sip.conf, but apparently
not). For SIP-to-SIP calls, Asterisk will often just let the jittered packets be forwarded as
received, leaving it for the downstream end-point to de-jitter the stream. If you want to learn
more about activation of Asterisk jitter buffers and PLC, look for chapter 15 in asterisk.pdf
which you will find in the source install sub-directory /doc/tex.

Latency
Latency is simply a measure of the delay and it is measured in milliseconds. Less than 140ms is
almost undetectable to the human ear. Somewhere between 150ms and 200ms it begins to
become perceptible, and as the latency gets greater it becomes more noticeable and more
annoying.
There are several potential causes of latency (one of which is the use of large jitter buffers).
Conversion between different codecs, and the technology required to join SIP to TDM (or vice
versa), will introduce small delays. There will also be a measurable time delay for packets to
traverse the network: ping can be used to give a rough indication of the round-trip time (the time
for a packet to get there and back), but ping is only a crude measure because network latency is
influenced by packet type, packet size, QoS settings and by how congested the network is at any
given moment.
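The contributions listed above can be added up into a rough one-way "mouth-to-ear" budget. The figures below are illustrative assumptions (a 20ms packetisation interval, a 40ms jitter buffer, and half of a measured ping round-trip as the one-way network delay), not measurements:

```python
# Rough one-way latency budget for a VoIP call (all figures illustrative).
packetisation_ms = 20     # one G.711 frame per RTP packet
jitter_buffer_ms = 40     # playout delay added by the receiver's jitter buffer
codec_dsp_ms = 10         # encode/decode and any transcoding delay
network_ms = 180 / 2      # half of a 180 ms ping round-trip time

one_way_ms = packetisation_ms + jitter_buffer_ms + codec_dsp_ms + network_ms
print(one_way_ms)         # 160.0 ms, already past the ~150 ms comfort zone
```

The point of the exercise: even a modest-looking ping time leaves little headroom once the jitter buffer and codec delays are added in.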

Echo
Echo occurs when a user hears their own speech coming back to them and the total latency
(including the return path) exceeds 150ms. If the time delay is low enough then the sound of
your own voice does not cause much of a problem, even at relatively high amplitude. In fact,
some return sound in the earpiece of a phone is generally a good thing because it makes the
handset feel "live". This acceptable feedback is called sidetone in the telephony industry.
There is a lot to say about echo and echo cancellation so I will publish a more comprehensive
discussion of the topic in a future article.

Silence suppression, VAD and CNG


Silence suppression is a mechanism primarily designed to reduce network bandwidth demands,
allowing VoIP equipment to send far less RTP data when the caller is not talking. The mechanism
is defined in RFC3389. The source device uses Voice Activity Detection (VAD) to detect when
the caller is speaking. During pauses in the speech it does not send audio samples in the RTP
packets, but instead sends a special instruction indicating that a period of silence has started or ended.
Ideally, the receiving device then needs to be able to regenerate suitable background noise to
replace the missing audio, a mechanism called Comfort Noise Generation (CNG). Without
CNG, the listener might find it very disconcerting to hear complete silence when the person at
the other end of the line is not talking.

Echo suppressors and Noise cancelling microphones


Don't confuse silence suppression with echo suppressors as they are not the same thing. An echo
suppressor is used to reduce or prevent acoustic feedback on a speaker-phone. It does this by
automatically reducing the microphone sensitivity whenever sound is coming out of the speaker.
Conversely, it should reduce the speaker volume when there is no speech being played through
the speaker. In effect it makes a speaker-phone operate in such a way that either the caller's voice
can be heard from the speaker or the local user's voice is being picked up and transmitted by the
microphone, but never both at the same time.

A noise cancelling microphone is subtly different again. It automatically reduces its sensitivity
when the user is not talking: in terms of gain characteristics it makes quiet things quieter and
loud things louder. The idea is that you would use a noise cancelling microphone in a noisy
environment like a call centre. When the agent stops speaking, the microphone rapidly adjusts
itself to a lower sensitivity so it won't pick up too much background noise. As soon as the agent
starts speaking again, the sensitivity will increase back to normal, thereby making both the agent's
voice and the background noise louder.

MOS
MOS is a measure of audio quality that directly relates to the caller experience. It stands for
Mean Opinion Score and was originally based on the subjective opinions of people using a phone
under controlled conditions. Now, in the VoIP world, it is used as a pseudo-objective measure allowing
different levels of audio quality and speech clarity to be compared. A MOS of 3.0 is fair and
4.0 is good. Wikipedia has a good article on this topic and the link is given below. It is interesting
to note that the G.729a codec can only deliver a MOS of about 3.9, whereas with G.711 it can be
as high as 4.5.
http://en.wikipedia.org/wiki/Mean_opinion_score
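Objective MOS estimates like those quoted for the codecs above are commonly derived from the E-model's R-factor using the conversion formula in ITU-T G.107. The sketch below applies that published formula; the sample R value of 93.2 is the E-model's default for an unimpaired narrowband (G.711) call:

```python
# Converting an E-model R-factor into an estimated MOS (ITU-T G.107 formula).

def r_to_mos(r: float) -> float:
    """Map an R-factor (0..100) onto the 1.0..4.5 MOS scale."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

print(round(r_to_mos(93.2), 2))  # 4.41, close to the G.711 ceiling quoted above
```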
