A network is a mesh of interconnected nodes. The nodes are pieces of network equipment, like
routers and Ethernet switches, and the interconnections could be fibre optic or Ethernet (copper)
cables.
Let's be clear: the data being transmitted over a single copper pair or fibre-optic cable all travel at
the same speed, so you do not get frames overtaking each other! The points at which network
congestion can occur are the nodes. Congestion happens at a node when the rate of ingress of
data exceeds the rate at which the data can be forwarded to the next destination, the next node.
There are potentially several ways that the network equipment could deal with congestion and
the actual method will depend in part on the capabilities of the equipment itself. For example, it
might be possible for a downstream node to signal to the upstream node to stop sending for a
while, to pause. This method is referred to as flow control.
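To make the pause idea concrete, here is a minimal Python sketch (not any real protocol, just an analogy) in which a bounded queue stands in for the downstream node's buffer: when the buffer is full, the sender's put() call blocks, which is the software equivalent of receiving a pause signal.

```python
import queue
import threading
import time

# A bounded queue models the downstream node's limited buffer; when it is
# full, put() blocks, which pauses the sender - loosely analogous to a
# Layer 2 PAUSE frame telling the upstream node to stop transmitting.
# The buffer size and frame count are invented for the illustration.
buffer = queue.Queue(maxsize=4)

def downstream():
    # Forwards (consumes) one frame every 10 ms - slower than the sender.
    for _ in range(20):
        buffer.get()
        time.sleep(0.01)
        buffer.task_done()

consumer = threading.Thread(target=downstream)
consumer.start()

for i in range(20):
    buffer.put(i)   # blocks (i.e. "pauses") whenever the buffer is full

buffer.join()
consumer.join()
print("all frames delivered without loss")
```

In a real network the pause is signalled explicitly (for example by an Ethernet PAUSE frame) rather than being implied by a blocking call.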
Another option: if the router is able to detect congestion in a downstream route, then it might be
able to send data via a different, less congested route. This would require quite a sophisticated
device that is able to learn about alternative routes and evaluate them to choose the preferred
route under different circumstances of network congestion. It would need to be able to read and
interpret the Explicit Congestion Notifications coming back to it in the Layer 3 packets.
The mechanism that is of most interest to us now, however, is the prioritisation of packets or
frames within the network equipment (the devices at the network nodes). As we will see, this is
dependent on the presence of memory-based data buffers within the routing equipment, the
management of virtual queues and algorithms that determine how data are assigned to and
removed from the queues.
Flow control
Flow control is generally only possible with a full duplex interconnection. A flow control
mechanism may take the form of a PAUSE Frame in Layer 2 or an ICMP Source Quench
request in Layer 3. The concept is illustrated below:
However, flow control is not a complete solution and is not always possible. Consider, for
example, the case of a media stream carrying voice or video: such data is time-critical and
cannot therefore tolerate significant interruptions. Indeed, if flow control is not going to simply
push the problem back onto other upstream nodes, then the flow control signal would have to be
sent all the way back to the original source that is sending the data. That may not be possible and,
even if it is, the source may simply not be able to pause: the ultimate source for an audio media
stream is the person speaking, and they are hardly likely to be controlled by a flow control signal
in the network.
Memory buffers
No matter which mechanisms are used to manage congestion, there is an inescapable necessity
for the network equipment at each node to have some form of memory-based buffering.
Buffering allows it to receive data and decide how to handle it before sending it on to the next
node. Without buffering, any situation where data are being received at the equipment faster than
they can be transmitted out the other end has only one option: the excess must be discarded
(so-called packet loss).
All network equipment, beyond a simple hub, must therefore have some memory buffering.
Furthermore, the core mechanism for discrimination and prioritisation of network traffic works
by using these internal buffers to queue the data.
On a very basic router or switch the memory buffer would be quite small. Without being
subdivided, it would behave as a simple queue where new data are added to the back of the queue
while the older data at the front of the queue get processed (so-called FIFO). On more
sophisticated and expensive boxes the buffers are larger and are sub-divided into multiple
queues: as new data packets arrive they are assigned to the back of the most appropriate queue
depending on their QoS settings.
The software running on the router takes packets from the front of each queue in a pre-defined
way that allows some queues to have a higher priority than others. The algorithms used to
process these queues have to be clever enough to take account of preferences for low latency and
risk of packet loss: basically, trying to prioritise some queues while avoiding the risk that the
lowest priority queues never get a look-in. If the buffers are getting full, flow control would be
attempted. Failing that, when packets have to be discarded, the algorithms might look at the QoS
settings to determine the drop priority and thereby choose which ones to drop.
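As a rough illustration of the queuing just described, here is a minimal Python sketch of strict-priority dequeuing with tail drop. The queue count, buffer limit and packet labels are invented for the example; real equipment uses more sophisticated weighted algorithms precisely to avoid starving the low-priority queues.

```python
from collections import deque

# Three virtual queues (0 = highest priority) sharing one memory buffer.
QUEUES = [deque(), deque(), deque()]
BUFFER_LIMIT = 8   # total packets that fit in the node's buffer

def enqueue(packet, priority):
    """Add a packet to the back of its queue, or drop it if the buffer is full."""
    if sum(len(q) for q in QUEUES) >= BUFFER_LIMIT:
        # A smarter scheme would consult each packet's QoS drop priority;
        # this sketch simply tail-drops the new arrival.
        return False
    QUEUES[priority].append(packet)
    return True

def dequeue():
    """Take the next packet from the front of the highest-priority non-empty queue."""
    for q in QUEUES:
        if q:
            return q.popleft()
    return None

enqueue("web", 2)
enqueue("sip", 1)
enqueue("rtp", 0)
print(dequeue())   # "rtp" leaves first despite arriving last
```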
While buffering of data is essential and is integral to the prioritisation of network traffic and the
general management of bandwidth, it is not a panacea. In fact, the more that buffering is used,
the greater the latency (delay). Furthermore, while buffering may help to avoid packet loss, it
cannot prevent it if the ingress rate at a node exceeds the outflow rate for a long period of time.
In effect, the buffering just helps to smooth out short-term peaks by time-shifting the excess
input at one point in time to a quieter period a little later.
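The time-shifting effect can be illustrated with a few lines of Python. The rates below are invented; the point is that the backlog grows while the inbound rate exceeds the outflow rate, and packets are only dropped once the backlog would exceed the buffer's capacity.

```python
# Illustrative figures, in packets per tick; not measurements.
inbound = [5, 9, 12, 10, 4, 2, 2, 2]   # inbound rate over eight ticks
OUTFLOW = 6                            # maximum outflow rate per tick
BUFFER_CAPACITY = 12                   # packets the buffer can hold

backlog, dropped = 0, 0
for rate in inbound:
    backlog += rate                    # arrivals go into the buffer
    if backlog > BUFFER_CAPACITY:      # buffer full: the excess is discarded
        dropped += backlog - BUFFER_CAPACITY
        backlog = BUFFER_CAPACITY
    backlog -= min(backlog, OUTFLOW)   # forward as much as the egress allows

print(backlog, dropped)                # → 0 7
```

The two ticks where the inbound rate (12 then 10) exceeds the buffer's headroom are where the seven drops occur; once the inbound rate falls back below the outflow rate, the backlog drains to zero.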
The shaded area of the graph, where inbound transmission rates exceed the maximum outflow
rate, equates to data that must be temporarily put into the buffer. Only when the inbound rate
drops below the dotted red line is it possible for the equipment to empty its buffers. If the rate
stays above the red line for too long, then the buffers would become full and packets would have
to be dropped.
The implications for practical application of VoIP QoS
We need to draw some conclusions from this exploration of network congestion issues. What
does it mean in practical terms for VoIP system designers and IP-PBX installers? Based on the
above points and also drawing upon my own experience of real-world installations, I would
suggest that the following conclusions can be drawn:
Network congestion happens at nodes where the data ingress rate exceeds
the outflow rate. If this speed disparity exists for too long, then packet loss
is likely to occur. If there is no congestion then QoS is more or less irrelevant.
Where bandwidth is reserved for high priority traffic, sensible bandwidth
management will normally allow lower priority data to borrow some of the
reserved bandwidth when the high priority traffic is not using it.
Just using QoS settings to mark packets or frames as high priority as they
leave your IP phone or IP-PBX guarantees absolutely nothing about how they
will be handled.
Once network traffic goes outside your own network infrastructure it is quite
likely that QoS tags will be overwritten or ignored. This is especially true
when traffic leaves your premises to traverse the Internet using an ordinary
broadband connection. If you need end-to-end real time QoS over a
broadband Internet connection you will need to look closely at the packages
being offered by different service providers and you must expect to pay a
premium for a broadband connection that supports it.
In most cases, QoS is far more relevant to transmissions in one direction than
in the other. For example, if network congestion is happening at a node
because the outbound connection is slower than the inward one then the
chances are that for data travelling the other way there will be no problem.
Furthermore, you may be able to set the QoS tags on packets you are
sending, but you may not be able to set them on the ones you are receiving.
What is QoS?
QoS (Quality of Service) is a somewhat all-encompassing term and means different things to
different people, even within the context of Voice over IP. The broader issues of audio quality on
VoIP calls, echo cancellation and the mechanisms within end-points designed to compensate for
packet loss and jitter are discussed in a different article here. The QoS settings that this article
seeks to explain are all directly related to frame and packet tagging for prioritisation of network
traffic. Even restricting ourselves to this specific area, there is enough material to fill at least one
decent-sized book. In order to keep it to a manageable scale for this article I'm going to have to
brush over some of the details very quickly. I apologise in advance if some experts out there find
my condensed descriptions over-simplistic or inaccurate.
To understand the QoS parameters used for VoIP, it is vital to first understand the different
network layers within which these frames and packets are operating. Some parameters are
relevant within Layer 2 and others are relevant within Layer 3. The following diagram
summarises it:
you may have to return it to default settings using a factory reset. If unsure, a VLAN ID of zero
should (in theory) be recognised by non-VLAN enabled network equipment.
Aastra settings
On the Aastra IP phones, the layer 2 settings are shown on the web GUI within the Network
Settings page under the heading VLAN. It will only allow you to modify the values if you tick
the VLAN Enable box, which makes sense as the priority value is stored in the VLAN tag. You
can set a different priority for the SIP messages, the RTP stream and the RTCP messages, but the
default values already shown (3 for SIP, 5 for RTP and RTCP) will not normally need to be
changed.
Linksys settings
On Cisco and Linksys phones, the layer 2 settings are shown on the Line tab under the heading
Network Settings. Look for the fields labelled SIP CoS Value and RTP CoS value.
Snom settings
On my Snom 360 phone, some settings are available in the QoS/Security tab of the Advanced
settings page. It allows you to specify a Priority to be associated with each VLAN ID. There are
also settings for VLAN ID and Priority for the Net Port and the PC Port, but you cannot set
different priorities for SIP, RTP and RTCP.
Grandstream settings
Taking two different Grandstream phones, I found the QoS settings under the Advanced
Settings tab on the older GXP phone and under Maintenance on the newer GXV phone. On
the latter they are shown in the Network Settings sub-page. Identification is simple because
Grandstream call them Layer 2 QoS 802.1Q/VLAN Tag and Layer 2 QoS 802.1p Priority
Value. Nice unambiguous labelling.
Asterisk settings
In recent versions of Asterisk, you can set the value for the layer 2 PCP field by editing the
sip.conf file. The values are set using the following parameters (here shown with their default
values):
cos_sip=3
cos_audio=5
cos_video=4
cos_text=3
Further reading
In part 2, I look at QoS settings for prioritising network traffic using the Layer 3 IP protocol.
If QoS is to make a difference, it requires a good deal more than just setting the DSCP, ToS or
CoS values on the IP telephony device or PBX. If QoS is an important requirement in your VoIP
projects, I strongly recommend that you read my follow-up article about QoS in practice.
In part 1, we examined the Layer 2 QoS settings available on most VoIP equipment. In this
second part, I will explore the Layer 3 parameters and offer practical suggestions for the values
that should be assigned to them. We will briefly look at the history and structure of the ToS and
DSCP fields and their place within the DiffServ packet prioritisation model.
Recap
I explained in part 1 how Layer 2 data can be tagged with a priority value (sometimes called
CoS). The L2 priority value is stored in the 3-bit PCP field of the Ethernet frame and this field is
part of an optional 4-byte section loosely referred to as the VLAN tag.
Completely separate to this, it is also possible to tag IP packets within Layer 3 in a way that
allows routers and switches to recognise that some packets should be given a higher priority than
others. The following diagram illustrates how the different mechanisms exist at different layers
in the OSI model:
As the above diagram shows, the QoS mechanisms that operate at L3 are DiffServ, DSCP and
ToS. These L3 mechanisms have the potential to operate over the Internet thereby giving them a
wider reach than L2 (which normally only operates within the scope of a LAN or private
network). Theoretically, L3 QoS prioritisation could be applied all the way through to the remote
end-point in a VoIP connection. However, in practice you are unlikely to be able to guarantee that
routing equipment outside your control will respect the QoS tagging that has been applied to the
IP packets.
The ToS field and how it has changed over time
The original design for the L3 Internet Protocol (see RFC 791) defined a standard structure for
Internet Datagram Headers that included an 8 bit Type of Service (ToS) field. Its purpose was
to provide storage for QoS parameters to allow network equipment to prioritise certain traffic at
times of high load. Three of the eight bits were set aside for an IP Precedence value (very
similar to the PCP priority value described in part 1) and a further three bits were designated as
facility flags to indicate if a packet had preference for low delay, high throughput or high
reliability. The last two bits were reserved for future use.
In 1998 and 2001, new standards were published (see RFC 2474, RFC 2475 and RFC 3168)
which redefined the 8-bit ToS field to allow its use for the so-called Differentiated Services.
The name ToS continues to be used today, even though the original interpretation given in
RFC 791 has been superseded by the newer Differentiated Services model and the name DS
field should be used instead. The 8-bit DS field comprises the first 6 bits as a DSCP value and
the last 2 bits which are used for Explicit Congestion Notification (ECN) data:
The above diagram is copied directly from RFC 3168, which describes ECN.
Differentiated Services and the DSCP value
While the basic concepts are straightforward, the practical implementation, especially the
terminology and labelling associated with the Differentiated Services Field, is remarkably
confusing. As a VoIP engineer, you probably just want to know what value you should set for one
parameter on the VoIP equipment's setup menus. You don't want to have to read two articles on
Wikipedia and a couple of RFCs just to be able to set that one value.
Let's try to unpick the various points of confusion as briefly as possible:
Differentiated Services Code Point (DSCP) is a 6-bit value within an 8-bit field
in the IP header.
The 8-bit field is sometimes called the ToS or TOS field. It may also be called
the DS field or DSF.
The first 3 bits of the DSCP field may be referred to as the Class Selector.
The same bits in the ToS field definition (in RFC 791) were assigned to the so-called IP Precedence value. Some values relevant to VoIP are shown below.
The last 2 bits of the 8-bit field, used for congestion notification, are set
dynamically by routers. You can assume 00 whenever you are obliged to
explicitly assign a value to them.
Wow. So what is that last point saying, and what do I mean by a named constant? The answer
comes from the fact that 6 bits allows for a lot of different values. The 6 bits can in fact be further
sub-divided into elements that describe the Class Selector and the Drop Probability. To make
these values more easily readable by humans, they are often described by an abbreviated name
indicating the Per Hop Behaviour, the Class Selector value and Drop Probability. Here are
some examples:
A selection of ToS IP Precedence values and their equivalent Class Selector value:
I am not going to attempt to provide a complete list or even a complete explanation of the DSCP
values there is plenty of existing material on the Internet that does this already. However, you
may want to try the following links:
Excellent overview of QoS includes a table of DSCP values: http://www.rhyshaden.com/qos.htm
List of commonly used DSCP values as binary, decimal and named constant:
http://www.cisco.com/en/US/docs/switches/datacenter/nexus1000/sw/4_0/qos/configuration/guide/qos_6dscp_val.pdf
A handy table for converting between DSCP and ToS; giving values in binary, hex and decimal:
http://bytesolutions.com/Support/Knowledgebase/KB_Viewer/ArticleId/34/DSCP-TOS-CoSPresidence-conversion-chart.aspx
What values should be used on VoIP equipment?
A sensible choice for SIP signalling messages is AF31. This requires the ToS field to be set to
0x68 (hex) or 104 (decimal). CS3 is also suitable for SIP.
A sensible choice for RTP and RTCP is EF. This requires the ToS field to be set to 0xB8 (hex) or
184 (decimal).
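These hex and decimal values follow mechanically from the bit layout described earlier: the 6-bit DSCP value occupies the top bits of the byte and the 2 ECN bits (assumed to be 00) the bottom. A quick Python check:

```python
# DSCP named constants relevant to VoIP, mapped to their 6-bit values.
# CSx = 8 * x; AFxy = 8 * x + 2 * y; EF is defined as 46.
DSCP = {"CS3": 24, "AF31": 26, "CS5": 40, "EF": 46}

def dscp_to_tos(name):
    """Whole 8-bit ToS/DS byte, assuming the 2 ECN bits are 00."""
    return DSCP[name] << 2

assert dscp_to_tos("AF31") == 0x68 == 104   # SIP signalling
assert dscp_to_tos("EF") == 0xB8 == 184     # RTP/RTCP media
```

This shift by two bits is also why some equipment wants the 6-bit DSCP value (26 or 46) while other equipment wants the whole 8-bit ToS byte (104 or 184), as the manufacturer-specific notes below show.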
Unfortunately, different manufacturers require different formats for entry of the data.
Linksys
On my Linksys devices, the parameters are located in the Line tab in a section entitled Network
Settings (you need to login as admin and select advanced viewing). The values are described as
SIP ToS/DiffServ and RTP ToS/DiffServ and they encompass the whole 8-bit ToS field.
They are entered as hex values and the defaults are 0x68 for SIP and 0xB8 for RTP, which are
equivalent to AF31 for SIP and EF for RTP.
Snom
On the Snom 360 the settings are available in the QoS/Security tab of the Advanced settings
page. The Layer 3 settings are shown as two TOS/Diffserv values, one for RTP and another for
SIP. The values are entered as decimal numbers which are internally converted to an 8-bit value
for the whole ToS field. The default on the Snom for both SIP and RTP is 160 which is
equivalent to 10100000 in binary. To see what DSCP value this equals, you must strip off the last
two zeroes (the ECN field), leaving the binary value 101000. From tables we can determine that
this binary value is called CS5 when using the named constants.
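That conversion is just a two-bit shift, which can be checked in one line of Python:

```python
tos = 160          # Snom default for the whole 8-bit ToS/DS field
dscp = tos >> 2    # discard the 2 ECN bits
assert dscp == 40  # 40 = 8 * 5, i.e. Class Selector 5 (CS5)
```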
Aastra
On the Aastra IP phones, the layer 3 settings are shown on the web GUI within the Network
Settings page under the heading Type of Service DSCP. The values are entered as decimal
numbers, but unlike the previous manufacturers they represent only a 6-bit DSCP field (the 2-bit
ECN field is not included). The default values are 26 for SIP and 46 for RTP/RTCP, equivalent
to AF31 and EF respectively.
Grandstream
The parameter is called Layer 3 QoS and it expects a value, entered as a decimal number, that
is equivalent to the 6-bit DSCP field. The default on my GXV3140 is set to 48 which is
equivalent to the named constant CS6, i.e. Class Selector 6. It does not allow different values to
be specified for SIP and RTP.
Asterisk settings
In recent versions of Asterisk, you can set the value for the layer 3 DSCP field by editing the
sip.conf file. The parameter names refer to the ToS field, but the values you specify are named
constants for the DSCP field, so even the developers who work on Asterisk have trouble
understanding the terminology! The values are set using the following parameters (here shown
with their default values):
tos_sip=cs3      ; Sets TOS for SIP packets.
tos_audio=ef     ; Sets TOS for RTP audio packets.
tos_video=af41   ; Sets TOS for RTP video packets.
tos_text=af41    ; Sets TOS for RTP text packets.
Further reading
If QoS is to make a difference, it requires a good deal more than just setting the DSCP, ToS or
CoS values on the IP telephony device or PBX. If QoS is an important requirement in your VoIP
projects, I strongly recommend that you read my follow-up article about QoS in practice.
RTP
RTP is Real-time Transport Protocol. It is a general purpose protocol for the streaming of audio,
video or any similar data over IP networks. In a VoIP call, each RTP packet carries a small
sample of audio (typically 20 or 30ms) which is constructed by the sending device from
analogue signals picked up by the microphone in the phone's handset.
Within the RTP protocol, each packet must be numbered and time-stamped. This has to be done
by the source device, the one that is sending the packets.
The presence of sequence numbers and time stamps allows the receiving device to inspect the
packet headers and determine if the packets are arriving in the correct sequence, with constant or
varying delay or if any are missing.
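The sequence number and timestamp live at fixed offsets in the 12-byte RTP header defined by RFC 3550, so they are easy to read. The following Python sketch builds a fabricated header and parses it back; the field values are invented for the example.

```python
import struct

def parse_rtp_header(data):
    """Read the fixed 12-byte RTP header (RFC 3550), network byte order."""
    v_p_x_cc, m_pt, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": v_p_x_cc >> 6,        # should be 2 for RTP
        "payload_type": m_pt & 0x7F,     # e.g. 0 = PCMU (G.711 u-law)
        "sequence": seq,                 # incremented by one per packet
        "timestamp": timestamp,          # sampling clock, not wall time
        "ssrc": ssrc,                    # identifies the sending source
    }

# A fabricated example header: version 2, payload type 0, sequence 7,
# timestamp 160 (one 20 ms frame at 8 kHz), SSRC 1.
header = struct.pack("!BBHII", 0x80, 0, 7, 160, 1)
info = parse_rtp_header(header)
assert info["version"] == 2 and info["sequence"] == 7
```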
RTCP
RTCP is RTP Control Protocol. The protocol is used alongside RTP to provide reporting of the
quality of the RTP stream being received at the far end of a connection. The RTCP packets are
sent from time to time in the reverse direction of the RTP packets. RTCP packets contain data
describing the quality of the RTP stream being received. They are sent to the sending equipment
so it can know how good or bad the audio quality is at the other end of the line. Asterisk has
some limited capabilities for users to view audio quality information at the command line. For
example, you can try the commands sip show channelstats and rtcp set stats on|off. The
following article talks a bit about call quality in the context of RTCP reports.
http://www.voip-info.org/wiki/view/Asterisk+RTCP
By the way, QoS (DSCP) is a way of marking packets so the intermediate network equipment is
aware of their relative importance. QoS packet tagging allows the network equipment to
prioritise one type of packet over another. For example, pushing newly received RTP packets
through to the output interface in preference to other data packets, even if the other packets
arrived first.
The following diagram illustrates what our original stream of RTP packets might look like after
they have traversed the network, become jittered and arrived at the receiving equipment.
The variation in packet delay is generally referred to as jitter, although a more accurate
description of this phenomenon is Packet Delay Variation (PDV).
http://en.wikipedia.org/wiki/Packet_delay_variation
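For what it is worth, RFC 3550 defines a standard running estimate of interarrival jitter: for each pair of packets, D is the difference between the spacing at the receiver and the spacing at the sender, and the estimate moves 1/16 of the way towards |D|. The timestamps below are invented to illustrate the calculation.

```python
# Send/receive times in milliseconds; packets are sent every 20 ms but
# arrive with varying delay. All figures are made up for the example.
send_times = [0, 20, 40, 60, 80]
recv_times = [50, 71, 89, 112, 130]

jitter = 0.0
for i in range(1, len(send_times)):
    # D: receiver spacing minus sender spacing for this packet pair.
    d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
    jitter += (abs(d) - jitter) / 16.0   # move 1/16 of the way towards |D|

print(round(jitter, 3))   # → 0.462
```

The divisor of 16 makes the estimate a smoothed average, so a single late packet does not swamp the reported jitter.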
The sequence numbering of RTP packets allows a receiving device at the far end to check if the
packets are still in the correct sequence or if any are missing. Packets can get out of sequence if
they take different routes over the network. Packets can be dropped if there is network
congestion somewhere along the route or if there are network errors.
Jitter buffers
A jitter buffer is used at the receiving equipment to store incoming RTP packets, re-align them in
terms of timing and check they are in the correct order. If some arrive slightly out-of-sequence
then, provided it is large enough, the jitter buffer can put them back into the right sequence.
However, for this to work the receiving device must delay the audio very slightly while it checks
and reassembles the packet stream.
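A toy version of this reordering can be sketched with a min-heap: packets are held until the buffer reaches a chosen depth, then released in sequence order. The depth and packet sequence here are invented; real jitter buffers also deal with timing, not just ordering.

```python
import heapq

DEPTH = 3                  # how many packets to hold before playing out
buffer, played = [], []

def receive(seq, payload):
    """Insert an arriving packet; release the lowest sequence once full."""
    heapq.heappush(buffer, (seq, payload))
    if len(buffer) > DEPTH:
        played.append(heapq.heappop(buffer))

# Packets 3/4 and 5/6 arrive swapped, as can happen over different routes.
for seq in [1, 2, 4, 3, 6, 5, 7]:
    receive(seq, b"20ms-audio")

while buffer:              # drain whatever is left at the end of the stream
    played.append(heapq.heappop(buffer))

print([seq for seq, _ in played])   # → [1, 2, 3, 4, 5, 6, 7]
```

The cost of the reordering is visible in the structure: nothing is played out until DEPTH packets have arrived, which is exactly the slight delay mentioned above.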
If a packet was dropped (or simply does not arrive in time) then the receiving device has
somehow to fill in the gap using a process known as Packet Loss Concealment or PLC.
Packet loss needs to be less than 1% if it is not to have too great an impact on call audio quality.
Greater than 3% would certainly be noticeable as a degradation of quality.
Even if the RTP packets remain in the correct sequence and there is zero packet loss, large
variations in the end-to-end transmission time for the packets may cause degradation of audio
quality that can only really be fixed through the use of a jitter buffer.
Latency
Latency is simply a measure of the delay and it is measured in milliseconds. Less than 140ms is
almost undetectable to the human ear. Somewhere between 150ms and 200ms it begins to
become perceptible and as the latency gets greater so it becomes more noticeable and more
annoying.
There are several potential causes for latency (one of which is the use of large jitter buffers).
Conversion between different codecs and the technology required to join SIP to TDM (or vice
versa) will introduce small delays. There will also be a measurable time delay for packets to
traverse the network: ping can be used to give a rough indication of the round trip time (time
for a packet to get there and back), but ping is only a crude measure because network latency is
influenced by packet type, packet size, QoS settings and by how congested the network is at any
given moment.
Echo
Echo occurs when a user hears their own speech coming back to them and the total latency
(including the return path) exceeds 150ms. If the time delay is low enough then the sound of
your own voice does not cause much of a problem even at relatively high amplitude. In fact,
some return sound in the earpiece of a phone is generally a good thing because it makes the
handset feel live. This acceptable feedback is called sidetone in the telephony industry.
There is a lot to say about echo and echo cancellation so I will publish a more comprehensive
discussion of the topic in a future article.
A noise cancelling microphone is subtly different again. It automatically reduces its sensitivity
when the user is not talking: in terms of gain characteristics it makes quiet things quieter and
loud things louder. The idea is that you would use a noise cancelling microphone in a noisy
environment like a call centre. When the agent stops speaking, the microphone rapidly adjusts
itself to a lower sensitivity so it won't pick up too much background noise. As soon as the agent
starts speaking again, the sensitivity will increase back to normal, thereby making both the
agent's voice and the background noise louder.
MOS
MOS is a measure of audio quality that directly relates to the caller experience. It stands for
Mean Opinion Score and was originally based on subjective opinions of people using a phone in
controlled conditions. Now, in the VoIP world, it is used as a pseudo-objective measure allowing
different levels of audio quality and speech clarity to be compared. A MOS of 3.0 is fair and of
4.0 is good. Wikipedia has a good article on this topic and the link is given below. It is interesting
to note that the G.729a codec can only deliver a MOS of about 3.9 whereas with G.711 it can be
as high as 4.5.
http://en.wikipedia.org/wiki/Mean_opinion_score