What is the difference between a router and a Layer 3 switch? When should I choose one over the
other?
Generally, Layer 3 switches are faster than routers, but they usually lack some of the advanced functionalities
of routers.
Specifically, a router is a device that routes the packets to their destination. What this means is that a router
analyzes the Layer 3 destination address of every packet, and devises the best next hop for it. This process
takes time, and hence every packet encounters some delay because of this.
In a Layer 3 switch, on the other hand, whenever the routing table is searched for a specific destination, a cache
entry is made in fast memory. This cache entry contains the source-destination pair and the next-hop address.
Once this cache entry is in place, the next packet with the same source and destination pair does not have to
go through the entire process of searching the routing table. Next hop information is directly picked up from
the cache. That's why this approach is called "route once, switch many." This way, a Layer 3 switch can route
packets much faster than a router.
Having explained the mechanism of both a router and a Layer 3 switch, let me also tell you that a router has
some advanced routing functionality that Layer 3 switches lack. Layer 3 switches are primarily used in LAN
environments where routing is needed; routers are used in WAN environments. That said, many organizations
have started using Layer 3 switches in WAN environments, such as MPLS networks.
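The "route once, switch many" mechanism can be sketched in a few lines of Python. This is a conceptual model, not vendor code; the prefixes, next hops and cache structure are all hypothetical:

```python
import ipaddress

# Hypothetical routing table: prefix -> next hop
ROUTING_TABLE = {
    ipaddress.ip_network("10.1.0.0/16"): "192.0.2.1",
    ipaddress.ip_network("10.2.0.0/16"): "192.0.2.2",
}
route_cache = {}  # (src, dst) -> next hop, held in "fast memory"

def slow_path_lookup(dst):
    """Full routing-table search with longest-prefix match ("route once")."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in ROUTING_TABLE if addr in net]
    if not matches:
        return None
    return ROUTING_TABLE[max(matches, key=lambda net: net.prefixlen)]

def forward(src, dst):
    """Later packets for the same source-destination pair hit the cache
    ("switch many") instead of repeating the full table search."""
    key = (src, dst)
    if key not in route_cache:
        route_cache[key] = slow_path_lookup(dst)
    return route_cache[key]
```

The first packet of a flow pays for the full lookup; every subsequent packet with the same source-destination pair is forwarded from the cache.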
Both circuit switching and packet switching are methods of transferring data between two nodes in a
network.
Circuit switching
In a circuit-switched network, when two nodes need to communicate, a direct and continuous connection is
established between them. It carries only the one conversation, for as long as it lasts.
For example, in the old, analog phone system, a strand of copper connected a telephone to a phone line to a
switching facility. A physical link would then be made to a line to another switching station. And, from there,
it would go through a strand of copper all the way to the other phone.
Packet switching
On a packet-switched network, data is divided into chunks, or packets. The packets have headers attached to
them to identify them -- e.g., by source, destination and sequence number -- and are intermingled with the
packets of many other conversations.
In a typical computer data network, computers send packets to a switch, which combines them with packets
from many other computers. The combined stream may go to another switch -- an aggregation switch --
bringing together traffic from many switches or to a router that will send them on their way across the
internet to other networks. Only the first connection, from computer to switch, is dedicated; all the rest are
shared.
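The division into numbered packets and their reassembly can be sketched as follows. The Packet fields mirror the header items named above (source, destination, sequence number); the three-byte chunk size is purely illustrative:

```python
from dataclasses import dataclass

@dataclass
class Packet:
    # Header fields called out above; real protocols carry many more.
    src: str
    dst: str
    seq: int
    payload: bytes

def packetize(src, dst, data, chunk=3):
    """Divide a message into numbered packets (chunk size is illustrative)."""
    n_chunks = -(-len(data) // chunk)  # ceiling division
    return [Packet(src, dst, i, data[i * chunk:(i + 1) * chunk])
            for i in range(n_chunks)]

def reassemble(packets):
    """Sequence numbers let the receiver restore the original order,
    even if packets arrive out of order on a shared network."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))

pkts = packetize("A", "B", b"hello world")
```

Because each packet is self-describing, packets from many conversations can share the same links and still be sorted out at the destination.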
Comparing circuit switching and packet switching
There are some key differences between the two models, as illustrated in this chart:
An optical (photonic) network transmits information as optical rather than electronic signals: It uses light to carry the data.
In a truly optical network, every router, switch and repeater would work with light only; conversion to and
from electrical impulses would only be done at a network packet's origin and destination.
Current commercial networks mix optical networking and electronic networking. Data signals are converted
from light to electronic or electronic to light multiple times. They traverse long links and high-capacity
connections within data centers and in campus and building backbones as light, but get converted to electronic signals inside most switches and routers along the path.
Optical networks have three important attributes: speed, range and capacity.
Optical networking reduces latency between endpoints on the network. Where an electric current moves data
at about 10% of the speed of light -- around 18,600 miles per second or 30,000 kilometers per second --
optical signals in fiber optic cable travel 10 times faster, at the speed of light -- 186,000 miles per second, or 300,000 kilometers per second.
Optical networks can also move more data across a cable at longer distances: Using electronics and copper,
speeds top out at around 100 Gbps over short distances. Fiber can move data at 100 Gbps over a single data
channel and across multi-kilometer distances -- and even further with amplification. Even greater speeds can be achieved by multiplexing many channels onto a single fiber.
Because light beams do not interfere with each other, a single strand of fiber-optic cable can carry optical
signals on multiple wavelengths simultaneously, with each light beam carrying its own data content. This is
known as wavelength division multiplexing (WDM). WDM networks can pack a single cable with anywhere
from two -- called coarse wavelength division multiplexing, or CWDM -- to 160 channels -- known as dense
wavelength division multiplexing, or DWDM -- with peak capacities at above 10 terabits per second (Tbps).
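The arithmetic behind those capacity figures is simply channel count times per-channel rate. A quick sketch, assuming 100 Gbps per wavelength (a common but not universal figure):

```python
def wdm_capacity_tbps(channels, per_channel_gbps=100):
    # Aggregate capacity = number of wavelengths x per-wavelength data rate,
    # converted from Gbps to Tbps.
    return channels * per_channel_gbps / 1000

print(wdm_capacity_tbps(2))    # CWDM with two channels: 0.2 Tbps
print(wdm_capacity_tbps(160))  # DWDM with 160 channels: 16.0 Tbps
```

At 100 Gbps per channel, a fully loaded 160-channel DWDM system lands above the 10 Tbps figure cited above.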
Dynamic provisioning, whereby new optical channels are lit up on an existing fiber, enables network
managers to rapidly increase the capacity of their optical network. Users can get more bandwidth in days --
hours, even -- rather than the weeks or months it might take a service provider to lay copper cable.
Another optical networking technology, free space optics (FSO), uses lasers without the optical fibers,
transmitting data through the air. Although it has lower capacity than fiber-based optical networking and is
subject to interference from certain types of precipitation, FSO can provide high-capacity wireless
connectivity with very little lead time. FSO can also transmit data over longer distances than Wi-Fi and, in
some use cases, it can do so for far less money than if fiber has to be pulled.
While it is relatively easy to tap into copper cables and read the data running over them, optical signals
running over fiber are more difficult to decipher. Many organizations that need secure networks, such as
government and defense installations, make extensive use of optical networks, sometimes connecting them
right to the desktop. Newer generations of quantum networking will make it impossible for attackers to tap optical links without being detected.
Performance, capacity, agility and security make optical fiber a widely used choice for network backbones on
campuses and even within buildings, where bandwidth demands are at their highest and where there is the
greatest likelihood of electromagnetic interference from other building services, such as high-voltage power lines.
Fiber is also increasingly attractive to network service providers for last-mile connectivity. In addition to
higher capacities and speeds, fiber easily accommodates dynamic bandwidth services, allowing service
providers to offer a portal through which their customers can determine the amount of bandwidth they need.
A client contacted us regarding slow network performance after a network speed check. It needed to transfer
large -- 0.5 GB to 1 GB -- computer-aided design files from its central storage system to remote workers, some
of whom were as far as 15 milliseconds (ms) away. Before spending a lot of money on 1 Gbps WAN links, it
wisely decided to verify the desired operation using a WAN simulator. The file storage system was connected
to the data center switch via a 10 Gbps Ethernet link. The client connection was a 1 Gbps Ethernet connection
from the switch to the WAN simulator, then via another 1 Gbps link to the client. See Figure 1.
The customer wanted to achieve 800 to 900 Mbps of delivered data, with no other significant traffic volume
competing for the link bandwidth. The WAN simulator was initially set up for 0 ms of latency. The file
transfers proceeded as expected, running at the desired throughput. However, when the anticipated 15 ms
of latency was introduced, the throughput was significantly reduced. The best throughput to the client was
about 420 Mbps and was frequently as low as 100 Mbps. Why was there such a significant difference in throughput?
We obtained packet captures at the file storage system and at the client, and we imported them
into Wireshark. Analysis on a per-packet basis was not going to be useful due to the number of packets in the
731 MB file transfer. Instead, we used the TCP sequence space graphing option of Wireshark -- select a packet
in the flow, then choose Statistics > TCP Stream Graphs > tcptrace in Wireshark. The overall sequence space graph looks
OK.
Overall transfer looks good, but it takes too much time
But looking closer, we find that there is a significant amount of packet loss a few seconds into the transfer.
Significant packet loss is found after transmission begins
The transfer starts off fine, but then encounters a bunch of packet loss. It takes about three quarters of a
second for the systems to recover and resume the transfer. The rate of transfer is lower after the packet loss,
as indicated by the change in slope of the sequence number graph. Our analysis showed no further packet
loss. The transfer of 731.5 MB took 17.28 seconds, or 42.33 MBps -- 338 Mbps of user data. We expected to
see the transfer complete in about seven seconds, not 17 seconds. This was one of the better throughput tests.
The storage system started sending data and thought it had a 10 Gbps path to the client, although with 15 ms
of latency. That's a bandwidth-delay product of about 18 MB. When the buffers in the network equipment fill,
a lot of packets are dropped. It took the storage system about 700 ms to retransmit the lost data. It should
then resume transferring data with slow-start and ramp back up. Further analysis was required.
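The figures above can be checked with two short formulas. This is a sketch using the case's own numbers (the 15 ms latency and 10 Gbps link from the test setup):

```python
def bdp_bytes(link_bps, delay_s):
    # Bandwidth-delay product: bits in flight on the path, expressed in bytes.
    return link_bps * delay_s / 8

def user_data_mbps(bytes_moved, seconds):
    # Delivered throughput in megabits per second.
    return bytes_moved * 8 / seconds / 1e6

# 10 Gbps path with the 15 ms of simulated latency
print(bdp_bytes(10e9, 0.015) / 1e6)    # ~18.75 MB, the "about 18 MB" above

# 731.5 MB transferred in 17.28 seconds
print(user_data_mbps(731.5e6, 17.28))  # ~338.7 Mbps of user data
```

The 18 MB in flight is far more than the 1 Gbps bottleneck and the switch buffers can absorb, which is why the loss was so heavy.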
We had the storage vendor look at the system while running another test. It reported the congestion window
parameter in the storage system's TCP code was reduced to a value of 1 as a result of the packet loss. It also
indicated the TCP stack in the storage system is based on the TCP Reno code, which is a very old
implementation. The internal congestion window gets set to 1 whenever significant packet loss occurs. The
congestion window is not transmitted as part of a packet, so it required monitoring the storage system
internals during a transfer to detect that this was happening. TCP Reno uses an additive slow-start algorithm
to ramp up the congestion window size, so the transmit window increased by one packet for every round-trip
time.
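The cost of that one-packet-per-RTT growth can be illustrated with a short simulation. This is a sketch, not the storage system's actual stack; the 1,460-byte segment size is an assumed typical Ethernet payload:

```python
MSS = 1460            # assumed bytes per segment (typical Ethernet payload)
RTT = 0.015           # the path's 15 ms round-trip time
FILE_BYTES = 731.5e6  # size of the transfer in this case

def additive_ramp_time(file_bytes=FILE_BYTES, mss=MSS, rtt=RTT):
    """Seconds to move file_bytes when the window starts at one segment
    and grows by exactly one segment per round trip."""
    sent, cwnd, rtts = 0.0, 1, 0
    while sent < file_bytes:
        sent += cwnd * mss  # at most one congestion window per round trip
        cwnd += 1
        rtts += 1
    return rtts * rtt

print(additive_ramp_time())  # about 15 s of ramp-up alone
```

Growing one segment per round trip, the window needs roughly 1,000 RTTs to move the file, which by itself accounts for most of the 17.28 seconds observed.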
With the 731 MB file, the storage system never encountered congestion again, simply because the remaining
file-transfer time was not large enough to grow the transmit window to the point where it would cause
further packet loss. Looking at Figure 3, there is a clear difference between the geometric ramp-up before the packet loss and the much slower, linear ramp-up after it.
The storage system thinks it is running over a 10 Gbps path as it ramps up. The switch then drops a bunch of
packets because its internal buffers fill. The storage system's TCP stack cuts the congestion window back to
one packet. It takes about 700 ms for the storage server to recognize and retransmit the dropped packets. The
slow-start mechanism to increase the transmit window size uses the additive mechanism in which one packet
is added to the transmit window for each successful round trip -- that is, the window advances by one packet
per RTT. For 700 MB files, it never reaches the point of congestive packet loss. Research on the Reno TCP
stack confirmed this behavior.
In this case, the problem revealed in the network speed check was not a network problem; rather, the culprit
was the old TCP code in the storage system controller. This problem is a simple congestive overload of a low-
speed link -- the 1 Gbps link to the remote client -- by a high-speed source, or the 10 Gbps interface between the storage system and the switch.
Adding buffers to the switch in the path would only delay the time at which the packet loss would occur. The
same problem can occur when several sources are congesting a single link. It is best to use quality of service
(QoS) to prioritize traffic and discard less important packets. If that's not possible, then use weighted random
early detection to begin discarding as soon as congestion starts to build, providing negative feedback to the
sources.
An interesting aspect of the problem is that transfers run at 800 Mbps if the 10 Gbps link is replaced with a 1
Gbps link. The storage system doesn't encounter significant packet loss and, therefore, doesn't reduce the
congestion window. A little packet loss occurs as the systems reach the link capacity, but not enough to cause
the storage system's Reno-based code to shut the congestion window and switch to additive slow-start.
What about workarounds? We postulated several different workarounds to the speed mismatch:
- Use 10 Gbps Ethernet pause frames to tell the storage system to delay sending.
- Configure a 1 Gbps interface and use policy routing, domain name system and Network Address
Translation to force the remote client traffic over the 1 Gbps path.
One of the vendors assembled its own tests that validated what we were seeing, and it reported the pause
frames and QoS didn't work. (This result was surprising, as we were sure that policing at 1 Gbps would work.
We want to obtain packet-capture files of QoS policing tests to try to understand why it didn't work.) Our
proposal to use a 1 Gbps interface between the storage system and the switch seems to be the only viable
option.
What next?
This was an interesting problem to diagnose. The customer is deciding how to proceed. We are continuing to
think about the problem and will analyze the QoS packet flows if we can get a packet-capture file.
Interestingly, we became aware of another engineering firm that had a similar problem, shortly after
delivering our analysis report. Its solution was to move its data from in-house storage systems to a cloud-
based storage vendor, with Panzura caching systems at each remote site. It was able to switch to lower-cost
internet connections with higher-bandwidth VPNs, which helped increase the throughput. We thought that
was a creative solution, but it required a change to the business processes and infrastructure, which took additional time and effort.
An optical interconnect framework can support high-speed DCI deployments. But it's important that
optical networks become programmable to ensure reliability.
Optical networking -- and optical interconnect -- is joining the rest of the networking world in becoming more software-defined and programmable.
With blazing speeds and enormous capacities, optical networks are favored in superdense situations, such as
carrier, hyperscale cloud service provider and large-enterprise data centers -- and especially for long-distance
data center interconnect (DCI). As optical gear evolves, and as enterprise use cases become denser and more
demanding, optical interconnect technologies will continue to increase their footprints by underpinning more
of these deployments.
Yet, in order to fit into modern data center and enterprise networks, optical networks have to become
more software-defined and programmable. Otherwise, an optical DCI -- an island of ultra-dense optical
networking equipment -- will limit the speed with which changes to a network can be fully implemented. An
optical interconnect framework also increases the likelihood it will be the source of network problems,
because it requires manual intervention, which is the leading cause of misconfiguration errors.
Fortunately, vendors are bringing optical networks into the brave new programmable world. They are
providing APIs, for example, or extending general software-defined networking (SDN) concepts of
disaggregation to optical equipment. This encompasses applying the SDN model of separating the control
plane from the data plane to optical gear, as well as extending OpenFlow -- for interplane communication -- to
accommodate optical equipment and concepts. Finally, this approach fuels the development of open source
controllers, which in turn lays the groundwork for managing virtual optical switch instances on a shared data
plane infrastructure.
By providing for virtualization and programmability, optical gear manufacturers are paving the way for
another major wave of change: multicarrier edge data centers. With software-reconfigurable optical
interconnects, carrier-hotel operators can build a new generation of data centers that will serve as localized,
distributed processing hubs.
New generations of the internet of things, augmented reality and virtual reality will demand real-time
responsiveness. Even optical networks and optical interconnect technologies can't circumvent the laws of
physics, and the speed of light in fiber determines how far away data processing can be and still achieve real-
time response. A 62-mile round trip takes a full millisecond of time, and each millisecond spent in transit
reduces the amount of time left for processing at the destination. By pushing some data storage and some
data processing closer to users in the form of edge data centers, the IT infrastructure will be able to provide
real-time event processing. At the same time, these shared edge data centers will require massive amounts of optical interconnect bandwidth.
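The 62-mile figure follows directly from fiber's propagation delay. A common rule of thumb, assumed here, is about 5 microseconds per kilometer one way (light in glass travels at roughly two-thirds of its vacuum speed):

```python
US_PER_KM = 5        # assumed one-way propagation delay in fiber, microseconds/km
KM_PER_MILE = 1.609

def round_trip_ms(one_way_miles):
    # Out-and-back propagation time, converted to milliseconds.
    km = one_way_miles * KM_PER_MILE
    return 2 * km * US_PER_KM / 1000

print(round_trip_ms(62))  # ~1.0 ms round trip to a site 62 miles away
```

Each additional 62 miles of distance to the processing site costs another millisecond of round-trip budget, which is the case for pushing processing to the edge.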
Add the emergence of 5G, along with the push for greater automation in the network, and the need to drive
optical networking to become more software-defined and more flexible becomes apparent.
Cisco believes the future of switching and routing is in Luxtera silicon photonics. As a result, the
networking giant plans to buy the company for $660 million in cash.
Cisco is acquiring silicon photonics company Luxtera to stay relevant in switching and routing, as data center traffic continues to soar.
Cisco said this week it would buy the transceiver maker, based in Carlsbad, Calif., for $660 million in cash.
Companies plug Luxtera modules into switches and routers to turn their electronic traffic into optical signals
capable of reaching speeds of 100 Gbps over optical fiber 2 kilometers in length.
Cisco, which expects to close the acquisition by the end of April 2019, resells Luxtera silicon photonics
transceivers to its customers. But the networking giant isn't buying Luxtera for its transceiver business.
What Cisco wants is the company's expertise to build technology that moves data over light beams instead of
the slower electrons over copper wiring used on chips today. The amount of traffic within web-scale data
centers is growing so fast that Cisco sees a day when its line cards and application-specific integrated circuits
become bottlenecks.
"That really implies an architectural shift for systems where you have to now think about coupling silicon
plus optics in order to get meaningful capacity increases," said Bill Gartner, general manager of Cisco's optical
systems group. "If we want to remain relevant in switching and routing, we believe we need to own the
silicon technology and the optics technology in order to continue advancing that capacity."
The amount of data cloud providers and telcos are coping with today is enormous, and Cisco expects it to
grow exponentially. In November, Cisco predicted annual global IP traffic would increase from 1.5 zettabytes in 2017 to 4.8 zettabytes by 2022.
Traffic drivers include the increasing number of mobile devices and the growing use of video in
communications, entertainment and security. Also contributing to the growth is the ever-increasing number of connected devices.
Cisco believes moving maximum amounts of data center traffic will require custom silicon, which Luxtera can
help it build. Also, many web-scale data center operators use Luxtera technology with white box and brite
box switches, giving Cisco an opening with those buyers.
"It's a chance for them to at least make their pitch," said Shamus McGillicuddy, an analyst at Enterprise
Management Associates, based in Boulder, Colo.
In the average enterprise, data center traffic is also ratcheting up, and Cisco expects 100 Gigabit Ethernet
ports to become commonplace in the next couple of years. That trend will likely lead to more sales in optical
fiber to connect server racks and the need for more technology like Luxtera transceivers.
Demand for 25/100 Gigabit Ethernet ports drove a 14% increase in port shipments in the second quarter,
according to research firm IHS Markit, based in London. Falling prices were a significant contributor to the increase.
With port prices falling, Cisco expects optic costs to comprise a more substantial portion of overall spending
on switching, Gartner said. "We see that trend with our customers, and we believe that it's important for us to
capture both the optic costs, as well as the port costs, as we think about market share."
Redundant data centers are common, providing organizations with business continuity. But how far apart can they be?
The consultancy I work for, NetCraftsmen, recently had a customer inquire about possible performance
problems in replicating data with two data centers located about 2,000 miles apart. One-way latency was
roughly 10 milliseconds per 1,000 miles, so these data centers were recording up to 40 ms of round-trip
latency between them. Can the customer expect the data replication system to function well at these
distances? Let's look at some of the key data center interconnect technologies that influence application
performance.
First, a little history: Top WAN speeds just a few years ago were 100 Mbps; it is now common to see 1 Gbps
WAN links, with some carriers offering 10 Gbps paths. Conceptually, it seems if you need better application
performance, increasing the WAN speed by a factor of 10 -- 100 Mbps to 1 Gbps or from 1 Gbps to 10 Gbps --
would result in a tenfold increase in application performance. But that rarely occurs. Other factors limit the
performance improvement.
Latency, congestion can plague DCI strategies
One consideration among data center interconnect technologies is the amount of data that is buffered in a
long-delay network pipe. I will skip the detailed explanation of how TCP works, as there are many good
explanations available. The brief summary is TCP throughput is limited by the round-trip latency and the amount of data that can be buffered in flight.
One way to measure TCP performance is the bandwidth delay product (BDP), which gauges how much data
TCP should have at one time to fully utilize the available channel capacity. It's the product of the link speed
times the round-trip latency. In our example above, the BDP is about 5 MB -- 0.04 seconds x 1 billion bps / 8 bits per byte.
Both the sending and receiving systems must buffer the 5 MB of data required to fill the pipe. At full speed, the
systems must transfer 5 MB every 40 ms. Older operating systems that used fixed buffers had to be
configured to buffer the desired amount of data for optimum performance over big BDP paths. These older
operating systems often had default buffers of 64 KB, which resulted in a maximum throughput of about 1.6
MBps over the 2,000-mile path. Fortunately, more modern operating systems automatically adjust their buffer sizes to match the path.
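Both numbers in this paragraph fall out of the same two formulas. A sketch, using the 1 Gbps, 40 ms example above:

```python
def bdp_bytes(link_bps, rtt_s):
    # Bandwidth-delay product: data needed in flight to keep the pipe full.
    return link_bps * rtt_s / 8

def window_limited_throughput(window_bytes, rtt_s):
    # A fixed window allows at most one window of data per round trip.
    return window_bytes / rtt_s

print(bdp_bytes(1e9, 0.04))                        # 5,000,000 bytes = 5 MB
print(window_limited_throughput(64 * 1024, 0.04))  # ~1.6 MB per second
```

With a 64 KB fixed buffer, throughput is capped at about 1.6 MBps no matter how fast the link is; only a window sized to the full 5 MB BDP can fill the pipe.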
One way to increase throughput is to reduce the round-trip latency. However, there have been a number
of research papers that describe system architectures where distributed data storage can be nearly as
effective as local storage. This is quite different from the prior thinking about locating data near the systems that use it.
Taking a look at some of the technical obstacles of data center interconnect technologies
Of course, other factors associated with data center interconnect technologies may come into play. Among
them:
Latency. Low latency is important for applications that use small transactions. The round-trip times are more
critical than the overall time as the number of packet exchanges increases. An application that relies on
hundreds of small transactions to perform a single user action would exhibit good performance in a LAN
environment, with latencies between 1 ms and 2 ms. However, performance would degrade in an
environment where those actions are run over a 50 ms round-trip path, and transaction time might take five
to 10 seconds.
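The degradation scales linearly with round-trip time when the transactions run serially. A sketch, assuming a hypothetical user action made up of 200 small request/response exchanges:

```python
def serial_action_time(transactions, rtt_s):
    # Each small request/response must complete before the next begins,
    # so total time is roughly transactions x round-trip time.
    return transactions * rtt_s

print(serial_action_time(200, 0.002))  # 0.4 s on a 2 ms LAN
print(serial_action_time(200, 0.050))  # 10 s over a 50 ms WAN path
```

The same application that feels instant on the LAN takes ten seconds over the WAN, even though no single exchange is slow.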
It is easy to overlook latency, particularly where the application comes from an external party and you don't
know its internal operation. I recommend validating an application in a high-latency WAN environment or
with a WAN simulator before you plan to migrate processing from a LAN to WAN environment.
Shared capacity and congestion. Most WAN carriers provide an IP link with the assumption you won't be
using all the bandwidth all the time. This allows them to multiplex multiple customers over a common
infrastructure, providing more competitive pricing than if it were a dedicated circuit. However, this means
there may be times when the traffic from multiple customers causes congestion at points along the path
between two data centers. Some packets must get delayed or dropped when the congestion is greater than
the available capacity. You may need to implement quality-of-service configurations to shape and police
traffic, dropping excess traffic before it overwhelms the shared path.
This problem becomes greater when the path is over the internet. There are clear internet congestion effects
coinciding with peak usage that occurs locally at lunchtime, after school and in the evening. Data paths
spanning multiple time zones may experience longer congestion times as the peak usage migrates from one time zone to the next.
SD-WAN may be able to help with overall throughput for some applications. These products can measure the
latency, bandwidth and packet loss of multiple links and allow you to direct different types of traffic over
links with specific characteristics. For example, important application traffic can take a reliable MPLS path, while less critical traffic uses a cheaper internet link.
TCP congestion avoidance. A significant factor in TCP performance is the vintage of the TCP/IP stack,
particularly the congestion control algorithm. Congestion can occur in network devices where link speeds
change and at aggregation points. For example, we once had a customer with a speed mismatch between its
10 Gb LAN connection and a 1 Gb WAN link. More modern TCP/IP stacks would have handled the congestion much more gracefully.
It is good to understand how key applications will use the network and how additional latency will affect
their performance. In some cases, an alternative deployment design may be needed. This means network
engineers need to be familiar with all of the concepts noted above and, more importantly, be able to explain
these issues to others who may not be familiar with the application performance issues.
Arista is a client of Kerravala's ZK Research, a consulting firm based in Westminster, Mass., that covers wired
and wireless networks.
Currently planned for release in 2019, the 802.11ax standard offers 10 Gbps speeds -- up to 40% faster than
Wave 2 802.11ac. Find out how this will be a game changer, and why anyone with skin in the game should pay attention.
Anyone who has used a wireless device has likely experienced a scenario where the device was connected to
the access point but no network services worked. Or perhaps the device was connected, got booted off, and
the user couldn't re-establish connectivity. These problems have been around as long as Wi-Fi and can affect any wireless deployment.
In the past, Wi-Fi flakiness was annoying, but it wasn’t business-critical because wireless was considered a
network of convenience. Today, however, that has changed. Many workers need Wi-Fi to do their jobs.
Also, Wi-Fi-connected IoT devices have proliferated. Consequently, wireless network outages or performance
problems can directly disrupt the business.
Network administrators have a hard time troubleshooting Wi-Fi problems. A recent ZK Research survey
found many network engineers spend about 20% of their time troubleshooting Wi-Fi issues. Often the
problem disappears before it’s fixed. But the root cause is still there, and the issue will likely re-emerge.
The Wi-Fi network is now mission-critical and arguably as important as the data center network.
Networking vendor Arista Networks, based in Santa Clara, Calif., is looking to address Wi-Fi issues. The
company announced this week its Cognitive Campus architecture — a suite of tools that unifies wired and
wireless networks by applying a software-driven approach to the campus. To date, Arista has found most of its success in the data center.
Cognitive Campus sheds some light on Arista’s planned acquisition of Mojo Networks. Earlier this year, Arista
said it would acquire Mojo, a company that sells its products at the campus edge, signaling it wants to be a bigger player in campus networking.
Arista has other campus products, but they’re targeted at the campus core where the requirements are
similar to the data center. As a result, Mojo is Arista’s first true campus edge offering.
Specifically, Arista is looking to use Mojo’s Cognitive WiFi to remove traditional bottlenecks created by Wi-Fi
controllers. Traditional Wi-Fi products have focused on ensuring connectivity rather than understanding the
client experience. Cognitive WiFi provides deeper insight into network performance so network engineers
can identify the source of a Wi-Fi problem before it affects
business. Arista has integrated the wireless edge information into CloudVision.
Mojo’s management model disaggregated the control and data planes so its cloud controller only handles
management and configuration updates. If the access points (APs) lost the connection to the controller, the
network would continue to operate. Most other APs would stop working if controller connectivity was lost.
As part of Cognitive Campus, Arista can aggregate data from the wired network and combine it with wireless data for end-to-end analytics.
Arista's planned acquisition of Mojo left some industry observers puzzled. On the surface, a data center networking vendor and a Wi-Fi vendor seem to have little in common.
However, the intersection of the two spawns a treasure trove of data. As a result, analytics of the
information can be used to transform the network. Arista's Cognitive software brings some of that visibility
and analytics to the campus.
Network professionals should rethink network operations and embrace the analytics and automation now
available to them.
For the past five years, my advice to engineers has been: If you’re doing something today that’s not strategic
to your company or resume, don't do it, and find a way to automate it. Wireless connectivity and performance troubleshooting falls squarely into that category.
I’ve never heard of engineers getting hired because they were really good at solving problems that shouldn’t
happen in the first place. Focus on software skills, data analytics and architecture, and understanding the user experience.