
Server-to-network edge technologies:

converged networks and virtual I/O

Technology brief

Table of contents
Executive Summary
What is the server-to-network edge?
Physical infrastructure challenges at the server-to-network edge
Solving server-to-network complexity with converged networks
  iSCSI
  Fibre Channel over Ethernet
Challenges at the server-to-network edge from virtual machines
  Virtual Ethernet Bridges
Solving complexity at the virtualized server-to-network edge with Edge Virtual Bridging
  Common EVB capabilities and benefits
  Virtual Ethernet Port Aggregator (VEPA)
  Multichannel technology
  VN-Tag
  Status of EVB and related standards
Summary
Appendix: Understanding SR-IOV and Virtual Connect Flex-10
For more information
Call to action
Executive Summary
This technology brief describes the “server-to-network edge” in a data center architecture and
technologies that affect the server-to-network edge in both the physical and virtual infrastructure.
The server-to-network edge is the link between the server and the first tier of the network infrastructure,
and it is becoming an increasingly important area of the infrastructure. Within the data center, the
server-to-network edge is the point within the topology that contains the most connections, switches,
and ports. The most common types of networks used in enterprise data centers are Ethernet for LANs
and Fibre Channel for SANs. Typically, these have different topologies, administrators, security,
performance requirements, and management structures.
Converged networking technologies such as iSCSI, Fibre Channel over Ethernet (FCoE), and
Converged Enhanced Ethernet (CEE) have the potential to simplify the networking infrastructure. CEE
is the Ethernet implementation of the Data Center Bridging (DCB) protocols that are being developed
by IEEE for any 802 MAC layer network. CEE and DCB are commonly used interchangeably. For this
technology brief, the term CEE refers to Ethernet protocols and Ethernet products that are DCB
compliant. With 10 Gb Ethernet becoming more common, combining multiple network streams over
a single fabric becomes more feasible. HP recommends that customers
implement converged networking first at the server-to-network edge. This area contains the most
complexity and thus offers the most opportunity for cost benefit by simplifying and reducing the
amount of infrastructure. By implementing first at the server-to-network edge, customers will also be
able to maintain existing administrator roles, network topologies, and security requirements, for their
LAN and SAN infrastructure. Customers who want to implement FCoE as a converged networking
fabric should be aware that some of the underlying standards are not finalized. By implementing first
at the server edge, customers will be able to take advantage of the standards that are in place while
avoiding the risk of implementing technology that will have to be replaced or modified as standards evolve.
The server-to-network edge is also becoming increasingly complex due to the sprawl of virtual
machines. Virtual machines add a new layer of virtual machine software and virtual networking that
dramatically impacts the associated network. Challenges from virtual machines include the
performance loss and management complexity of integrating software-based virtual switches, also
referred to as Virtual Ethernet Bridges (VEBs), into existing network management. In the near future,
hardware NICs will be available that use Single Root I/O Virtualization (SR-IOV) technology to move
software-based virtual switch functionality into hardware.
While hardware-based SR-IOV NICs will improve performance compared to traditional software-only
virtual NICs, they do not solve the other challenges of management integration and limited
management visibility into network traffic flows. To solve these challenges, the IEEE 802.1 Work
Group is creating a standard called Edge Virtual Bridging (EVB). The EVB standard is based on a
technology known as Virtual Ethernet Port Aggregator (VEPA). Using VEPA, VM traffic within a
physical server is always forwarded to an external switch and directed back when necessary to the
same host. This process is known as a “hair-pin turn”, as the traffic does a 180-degree turn. The
hair-pin turn provides external switch visibility into, and policy control over, the virtual machine traffic
flows, and it can be enabled in most existing switches with firmware upgrades and no hardware changes.
The HP approach to technologies at the server-to-network edge is based on using industry standards.
This ensures that new technologies will work within existing customer environments and
organizational roles, yet will preserve customer choice. The HP goal for customers is to enable a
simple migration to advanced technologies at the server-to-network edge without requiring an entire
overhaul strategy for the data center.

This paper assumes that the reader is relatively familiar with Ethernet and Fibre Channel networks,
and with existing HP ProLiant offerings such as BladeSystem and Virtual Connect. Readers can see the
“For more information” section and the Appendix for additional information about HP products.

What is the server-to-network edge?

In this technology brief, the server-to-network edge refers to the connection points between servers and
the first layer of both local area network (LAN) and storage area network (SAN) switches. The most
common networks used in enterprise data centers are Ethernet for LANs and Fibre Channel for SANs.
Figure 1 illustrates a typical multi-tiered Ethernet infrastructure. Ethernet network architecture is often
highly tree-like, can be over-subscribed, and relies on a layered structure:
• Centralized (core) switches at the top layer
• Aggregation (distribution) switches in the middle layer
• Access switches at the bottom, or server-to-network edge layer

Figure 1. Typical multi-tier Ethernet data center architecture (physical infrastructure)

The dotted line around the blade server/top-of-rack (ToR) switch indicates an optional layer, depending on
whether the interconnect modules replace the ToR or add a tier.

Figure 1 illustrates an architecture using both rack-based servers (such as HP ProLiant DL servers) and
blade enclosures (such as HP BladeSystem c-Class enclosures) along with a variety of switches such as
top-of-rack (ToR), end-of-row (EoR), distribution, and core switches:
• A single rack of servers may connect to top-of-rack switches (typically deployed in pairs for
redundancy) that uplink to a redundant aggregation network layer.
• Several racks of servers may connect to end-of-row switches that consolidate the network
connections at the edge before going into the aggregation network layer.
• Blade-based architectures may include server-to-network interconnect modules that are embedded
within the enclosure. If the interconnect modules are switches, they may either replace the existing
access switch or may add an extra tier. If the interconnect module is a pass-through, the
infrastructure usually maintains the ToR switch.

Figure 2 illustrates the two most common SAN architectures:

• SAN core architecture, also referred to as a SAN island, in which many servers connect directly to
a large SAN director switch
• Edge-core architecture, which uses SAN fabric or director switches before connecting to the SAN
director core switch

Figure 2. Typical single- or two-layer architecture of a Fibre Channel SAN (left: SAN island; right: edge-core).

The dotted line around the Blade server/FC switch indicates an optional layer, depending on whether the blade
enclosure uses interconnects that are embedded switches or pass-through modules

Customers who want to maintain a SAN island approach with blade enclosures can use Fibre
Channel pass-through modules to connect directly to the director switch.
With an edge-core SAN architecture, SAN designers that have blade enclosures can use one of two approaches:
• Use pass-through modules in the enclosure while maintaining an external edge switch
• Replace an existing external Fibre Channel edge switch with an embedded blade switch

Typically, LAN and SAN architectures have different topologies: LAN architectures tend to have multi-
layered, tree-like topologies while SAN architectures tend to remain flat. Often, a SAN architecture
differs from a LAN architecture in oversubscription ratios/cross-sectional bandwidth: a SAN is
typically lightly oversubscribed by 2:1 to 8:1 (at the server edge), while a LAN architecture may be
heavily oversubscribed¹ by 10:1 or 20:1. Different topologies are one of the reasons that HP is
focusing on the server-to-network edge. Administrators should be allowed to maintain similar
processes, procedures, data center structures, and organizational governance roles while reaping the
benefits of reduced infrastructure at the server-to-network edge.

Physical infrastructure challenges at the server-to-network edge

As illustrated in Figures 1 and 2, the server edge is the most physically complex part of the data
center networks; the server edge has the most connections, switches, and ports (especially in
environments using only rack-based servers). For most enterprise data centers that use Fibre Channel
SANs and Ethernet LANs, each network type has its own requirements:
• Unique network adapters for servers
• Unique switches
• Network management systems designed for each network
• Different organizations to manage the networks
• Unique security requirements
• Different topologies
• Different performance requirements

Each of these differences adds complexity and cost to the data center, especially for enterprise data
centers that have large installed Fibre Channel and Ethernet networks.
For some customers that can use non-Fibre Channel storage, iSCSI solves these challenges without
requiring new Ethernet infrastructure. However, iSCSI technology is still evolving and has experienced
a slow rate of adoption, especially for enterprise data centers. The iSCSI section included later in the
paper discusses iSCSI in more detail.

¹ Non-oversubscribed LAN topologies with large cross-sectional bandwidths are certainly possible, just not as common.

Solving server-to-network complexity with converged networks
Data center administrators can address complexity at the server-to-network edge by consolidating
server I/O in various ways:
• Combining multiple lower-bandwidth connections into a single, higher-bandwidth connection
• Converging (combining) different network types (LAN and SAN) to reduce the amount of physical
infrastructure required at the server edge

Administrators can already use HP Virtual Connect Flex-10 technology to aggregate multiple
independent server traffic streams into a single 10Gb Ethernet (10GbE) connection. For instance,
administrators can partition a 10GbE connection and replace multiple lower bandwidth physical NIC
ports with a single Flex-10 port. This reduces management requirements, reduces the amount of
physical infrastructure (number of NICs and number of interconnect modules), simplifies the amount of
cross-connect in the network, and reduces power and operational costs.
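The partitioning idea can be sketched as follows. This is an illustrative model only, not HP's actual Flex-10 tooling; the FlexNIC names and bandwidth figures are hypothetical.

```python
# Illustrative sketch (not HP's actual Flex-10 tooling): partitioning one
# 10 Gb physical port into several FlexNICs, replacing multiple lower
# bandwidth physical NICs. Names and sizes below are hypothetical.
def partition_port(total_gbps, partitions):
    """Validate that FlexNIC allocations fit within the physical port."""
    used = sum(partitions.values())
    if used > total_gbps:
        raise ValueError(f"over-subscribed: {used} Gb > {total_gbps} Gb")
    return dict(partitions)

# Four hypothetical FlexNICs sharing one 10GbE port
flexnics = partition_port(10, {"prod-lan": 4, "vmotion": 2,
                               "mgmt": 1, "backup": 3})
```

The point of the sketch is simply that bandwidth is carved administratively from a single physical link, so fewer NICs, interconnect modules, and cables are needed at the edge.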
The more common definition of converged networks refers to combining LAN and SAN protocols onto
a single fabric. The possibility of converging, or combining, data center networks has been proposed
for some time. InfiniBand, iSCSI, and other protocols have been introduced with the promise of
providing efficient transport on a single fabric for multiple traffic classes and data streams. One of the
goals of a converged network is to simplify and flatten the typical Ethernet topology by reducing the
number of physical components, leading to other opportunities to simplify management and improve
quality of service (QoS). The benefits of a converged network include reduction of host/server
adapters, cabling, switch ports, and costs; simply stated, it enables simpler, more centralized management.
Two technologies provide the most promise for use on a converged network: iSCSI and Fibre Channel
over Ethernet (FCoE).

iSCSI

One of the reasons that the Fibre Channel protocol became prevalent is that it is a very lightweight,
high-performance protocol that maps locally to the SCSI initiators and targets (Figure 3, far left).
But Fibre Channel protocols are not routable and can be used only within a relatively limited area,
such as within a single data center.
The iSCSI protocol sought to improve that by moving the SCSI packets along a typical Ethernet
network using TCP/IP (Figure 3, far right). The iSCSI protocol serves the same purpose as Fibre
Channel in building SANs, but iSCSI runs over the existing Ethernet infrastructure and thus avoids the
cost, additional complexity, and compatibility issues associated with Fibre Channel SANs.

Figure 3. Comparison of storage protocol stacks. (Figure labels: Operating Systems and Applications; SCSI Layer; FC, CEE, and Ethernet transports.)

An iSCSI SAN typically consists of software or hardware initiators on the host server connected
to an Ethernet network and some number of storage resources (targets). The iSCSI stack at both ends
of the path encapsulates SCSI block commands into Ethernet packets for transmission over IP
networks (Figure 4). Initiators include both software- and hardware-based initiators incorporated on
host bus adapters (HBAs) and NICs.
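The encapsulation described above can be sketched in a few lines. This is a conceptual illustration only: the placeholder headers are not real TCP/IP or Ethernet headers, and a real iSCSI PDU uses a 48-byte Basic Header Segment rather than the simplified header shown here.

```python
# Conceptual sketch of iSCSI layering: SCSI -> iSCSI -> TCP -> IP -> Ethernet.
# Header layouts are simplified stand-ins, not wire-accurate formats.
import struct

def build_iscsi_pdu(scsi_cdb: bytes) -> bytes:
    """Wrap a SCSI Command Descriptor Block in a simplified iSCSI header."""
    opcode = 0x01                        # SCSI Command opcode
    header = struct.pack("!B3x", opcode)
    return header + scsi_cdb.ljust(16, b"\x00")

def encapsulate(scsi_cdb: bytes) -> bytes:
    """Each layer prepends its (placeholder) header to the payload."""
    pdu = build_iscsi_pdu(scsi_cdb)
    tcp_segment = b"TCP" + pdu           # placeholder TCP header
    ip_packet   = b"IP"  + tcp_segment   # placeholder IP header
    return b"ETH" + ip_packet            # placeholder Ethernet header

read10 = bytes([0x28]) + bytes(9)        # SCSI READ(10) CDB, zeroed fields
frame = encapsulate(read10)
```

The nesting is the whole story: the SCSI command is untouched at the core of the frame, which is why iSCSI can ride any existing IP network.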

Figure 4. iSCSI is SCSI over TCP/IP

The first generations of iSCSI had some limitations that minimized its adoption:
• Traditional software-based iSCSI initiators require the server CPU to process the TCP/IP protocols
added onto the SCSI stack.
• iSCSI did not have a storage management system equivalent to the Fibre Channel SANs.
• Configuring iSCSI boot was a manual and cumbersome process; therefore, it was not scalable to
large numbers of servers.

The newer generations of iSCSI technology solve these issues:

• High-performance adapters are now available that fully offload the protocol management to a
hardware-based iSCSI HBA or NIC. This is called full iSCSI offload.
• Using 10Gb networks and 10Gb NICs with iSCSI SANs generates performance comparable to a
Fibre Channel SAN operating at 8 Gb.
• New centralized iSCSI boot management tools provide mechanisms for greater scalability when
deploying large numbers of servers

With 10Gb-based iSCSI products, iSCSI becomes a more viable solution for converged networks for
small-medium businesses as well as enterprise data centers. For customers with iSCSI-based storage
targets and initiators, iSCSI can be incorporated today with their existing Ethernet infrastructure.
Unlike FCoE, iSCSI does not require the new Converged Enhanced Ethernet (CEE) infrastructure (see
the Fibre Channel over Ethernet section for more information). But if present, iSCSI will benefit from
the QoS and bandwidth management offered by CEE. Because iSCSI administration requires the
same skill set as a TCP/IP network, using an iSCSI network minimizes the SAN administration skills
required, making iSCSI a good choice for green-field deployments or when there are smaller IT teams
and limited budgets. Using iSCSI storage can also play a role in lower, non-business critical tiers in
enterprise data centers.

Fibre Channel over Ethernet

FCoE takes advantage of 10Gb Ethernet’s performance while maintaining compatibility with existing
Fibre Channel protocols. Typical legacy Ethernet networks allow Ethernet frames to be dropped,
typically under congestion situations, and they rely on upper layer protocols such as TCP to provide
end-to-end data recovery.² Because FCoE is a lightweight encapsulation protocol and lacks the
reliable data transport of the TCP layer, it must operate on Converged Enhanced Ethernet (CEE) to
eliminate Ethernet frame loss under congestion conditions.
Because FCoE was designed with minimal changes to the Fibre Channel protocol, FCoE is a layer 2
(non-routable) protocol just like Fibre Channel, and can only be used for short-haul communication
within a data center. FCoE encapsulates Fibre Channel frames inside of Ethernet frames (Figure 5).
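The encapsulation can be sketched conceptually as follows. The 0x8906 ethertype is the registered FCoE ethertype, but the other header fields here are simplified placeholders rather than the exact FC-BB-5 layout.

```python
# Conceptual sketch of an FCoE frame: a complete Fibre Channel frame
# carried inside an Ethernet frame. Only the ethertype (0x8906) is
# wire-accurate; the FCoE header and trailer are simplified.
import struct

FCOE_ETHERTYPE = 0x8906

def fcoe_frame(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    eth_header  = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    fcoe_header = bytes(13) + bytes([0x0E])   # reserved bytes + SOF (simplified)
    eof_trailer = bytes([0x41]) + bytes(3)    # EOF + padding (simplified)
    return eth_header + fcoe_header + fc_frame + eof_trailer
```

Because the FC frame is carried whole, the Fibre Channel security and management model survives the trip across the Ethernet fabric, which is the property the advantages below depend on.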

² It is possible to create a lossless Ethernet network using existing 802.3x mechanisms. However, if the network is carrying
multiple traffic classes, the existing mechanisms can cause QoS issues, limit the ability to scale a network, and impact performance.

Figure 5. Illustration of an FCoE packet

The traditional data center model uses multiple HBAs and NICs in each server to communicate with
various networks. In a converged network, the converged network adapters (CNAs) can be deployed
in servers to handle both FC and CEE traffic, replacing a significant amount of the NIC, HBA, and
cable infrastructure (Figure 6).

Figure 6. Converged network adapter (CNA) architecture

FCoE uses a gateway device (an Ethernet switch with CEE, legacy Ethernet, and legacy FC ports) to
pass the encapsulated Fibre Channel frames between the server’s Converged Network Adapter and
the Fibre Channel-attached storage targets.

There are several advantages to FCoE:
• FCoE uses existing OS device drivers.
• FCoE uses the existing Fibre Channel security and management model with minor extensions for the
FCoE gateway and Ethernet attributes used by FCoE.
• Storage targets that are provisioned and managed on a native FC SAN can be accessed
transparently through an FCoE gateway.

However, there are also some challenges with FCoE:

• Must be deployed using a CEE network
• Requires converged network adapters and new Ethernet hardware between the servers and storage
targets (to accommodate CEE)
• Is a non-routable protocol and can only be used within the data center
• Requires an FCoE gateway device to connect the CEE network to the legacy Fibre Channel SANs
and storage
• Requires validation of a new fabric infrastructure that includes both Ethernet and Fibre Channel components

Converged Enhanced Ethernet (CEE)

An informal consortium of network vendors originally defined a set of enhancements to Ethernet to
provide enhanced traffic management and lossless operations. This consortium originally coined the
term Converged Enhanced Ethernet to describe this new technology. These proposals have now
become a suite of proposed standards from the Data Center Bridging (DCB) task group within the
IEEE 802.1 Work Group. The IEEE is defining the DCB standards so that they can apply to any IEEE
802 MAC layer network type, not just Ethernet. Administrators can therefore consider CEE to be the
application of the DCB draft standards to the Ethernet protocol. Quite commonly, the terms DCB and
CEE are used interchangeably. For this technology brief, the term CEE refers to Ethernet protocols and
Ethernet products that are DCB compliant.
There are four new technologies defined in the DCB draft standards:
• Priority-based Flow Control (PFC), 802.1Qbb – Ethernet flow control that discriminates between
traffic types (such as LAN and SAN traffic) and allows the network to selectively pause different
traffic classes
• Enhanced Transmission Selection (ETS), 802.1Qaz – formalizes the scheduling behavior of multiple
traffic classes, including strict priority and minimum guaranteed bandwidth capabilities. This formal
transmission handling should enable fair sharing of the link, better performance and metering.
• Quantized Congestion Notification (QCN), 802.1Qau – supports end-to-end flow control in a
switched LAN infrastructure and helps eliminate sustained, heavy congestion in an Ethernet fabric.
QCN must be implemented in all components in the CEE data path (CNAs, switches, and so on)
before the network can use QCN. QCN must be used in conjunction with PFC to completely avoid
dropping packets and guarantee a lossless environment.
• Data Center Bridging Exchange Protocol (DCBX), 802.1Qaz – supports discovery and
configuration of network devices that support the technologies described above (PFC, ETS, and QCN)

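As a rough illustration of the ETS behavior described above, the following sketch allocates a 10 Gb link among traffic classes with minimum bandwidth guarantees, redistributing unused bandwidth to busy classes in proportion to their guarantees. The class names and percentages are hypothetical, not values from any standard.

```python
# Illustrative model of Enhanced Transmission Selection (ETS): each class
# gets a minimum guaranteed share of the link; bandwidth unused by idle
# classes is redistributed to busy classes weighted by their guarantees.
def ets_allocate(link_gbps, guarantees, demand):
    """guarantees: class -> percent; demand: class -> offered load (Gbps)."""
    alloc = {c: min(demand[c], link_gbps * pct / 100)
             for c, pct in guarantees.items()}
    spare = link_gbps - sum(alloc.values())
    hungry = {c for c in demand if demand[c] > alloc[c]}
    while spare > 1e-9 and hungry:
        weight = sum(guarantees[c] for c in hungry)
        for c in list(hungry):
            alloc[c] += min(demand[c] - alloc[c], spare * guarantees[c] / weight)
        spare = link_gbps - sum(alloc.values())
        hungry = {c for c in demand if demand[c] - alloc[c] > 1e-9}
    return alloc

# SAN guaranteed 60% but only offering 3 Gb; LAN absorbs the slack.
shares = ets_allocate(10, {"LAN": 40, "SAN": 60}, {"LAN": 8, "SAN": 3})
```

Note how the SAN class keeps its guarantee available but the LAN class is allowed to borrow idle bandwidth, which is the "fair sharing of the link" behavior the standard formalizes.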
While three of the standards that define the protocols are nearing completion, the QCN standard is
the most complex and least understood, and it requires the most hardware support. The QCN standard will
be critical in allowing IT architects to converge networks “deeper” into the distribution and core
network layers. Also, it is important to note that QCN will require a hardware upgrade of almost all
existing CEE components before it can be implemented in data centers.

Transition to FCoE
HP advises that the transition to FCoE can be a graceful implementation with little disruption to
existing network infrastructures if it is deployed first at the server-to-network edge and migrated further
into the network over time. As noted in the previous section, the lack of fully finalized CEE standards
is another reason HP views the server-to-network edge as the best opportunity for taking advantage of
“converged network” benefits.
Administrators could also start by implementing FCoE only with those servers requiring access to FC
SAN targets. Not all servers need access to FC SANs. In general, more of a data center’s assets use
only LAN attach than use both LAN and SAN. CNAs should be used only with the servers that
actually benefit from them, rather than needlessly changing the entire infrastructure.
If administrators transition the server-to-network edge first to accommodate FCoE/CEE, this will
maintain the existing architecture structure and management roles, retaining the existing SAN and
LAN topologies. Updating the server-to-network edge offers the greatest benefit and simplification
without disrupting the data center’s architectural paradigm.

Challenges at the server-to-network edge from virtual machines

While converged network technology addresses the complexity issues of proliferating physical server-
to-network edge infrastructure, a different, but complementary, technology is required to solve the
complexity of the server-to-network edge resulting from the growing use of virtual machines. This new
layer of virtual machines and virtual switches within each “virtualized” physical server introduces new
complexity at the server edge and dramatically impacts the associated network.
Challenges from virtual machines include the issues of managing virtual machine sprawl (and the
associated virtual networking) and performance loss and management complexity of integrating
software-based virtual switches with existing network management. These are significant challenges
that are not fully addressed by any vendor today. HP is working with other industry leaders to
develop standards that will simplify and solve these challenges.
The following sections explain the types of virtualized infrastructure available today (or in the near
future), the terms used to describe these infrastructure components, and the related standardization efforts.

Virtual Ethernet Bridges

The term Virtual Ethernet Bridge (VEB) is used by industry groups to describe network switches that are
implemented within a virtualized server environment and support communication between virtual
machines, the hypervisor, and external network switches.³ In other words, a VEB is a virtual Ethernet
switch. It can be an internal, private, virtual network between virtual machines (VMs) within a single
physical server, or it can be used to connect VMs to the external network.
Today, the most common implementations of VEBs are software-based virtual switches, or
“vSwitches,” that are incorporated into all modern hypervisors. However, in the near future, VEB
implementations will include hardware-based switch functionality built into NICs that implement the
PCI Single Root I/O Virtualization (SR-IOV) standard. Hardware-based switch functionality will
improve performance compared to software-based VEBs.

³ The term “bridge” is used because IEEE 802.1 standards use the generic term “bridge” to describe what are commonly known as switches. A
more formal definition: a Virtual Ethernet Bridge (VEB) is a frame relay service within a physical end station that supports local bridging
between multiple virtual end stations and (optionally) the external bridging environment.

Software-based VEBs – Virtual Switches
In a virtualized server, the hypervisor abstracts and shares physical NICs among multiple virtual
machines, creating virtual NICs for each virtual machine. For the vSwitch, the physical NIC acts as
the uplink to the external network. The hypervisor implements one or more software-based virtual switches
that connect the virtual NICs to the physical NICs.
Data traffic received by a physical NIC is passed to a virtual switch that uses hypervisor-based
VM/Virtual NIC configuration information to forward traffic to the correct VMs.
When a virtual machine transmits traffic from its virtual NIC, a virtual switch uses its hypervisor-based
VM/virtual NIC configuration information to forward the traffic in one of two ways (see Figure 7):
• If the destination is external to the physical server or to a different vSwitch, the virtual switch
forwards traffic to the physical NIC.
• If the destination is internal to the physical server on the same virtual switch, the virtual switch uses
its hypervisor-based VM/virtual NIC configuration information to forward the traffic directly back to
another virtual machine.
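The forwarding decision above can be sketched as a toy model. The class, method, and port names are illustrative and do not correspond to any hypervisor's actual API; the "MAC table" stands in for the hypervisor's VM/virtual-NIC configuration information.

```python
# Toy model of a software-based VEB's forwarding decision: traffic to a
# locally attached virtual NIC stays inside the host; anything else goes
# out the physical NIC uplink. Names are hypothetical.
class VirtualSwitch:
    def __init__(self):
        self.mac_table = {}                  # MAC address -> local virtual port

    def attach(self, mac, vport):
        """Register a VM's virtual NIC with the switch."""
        self.mac_table[mac] = vport

    def forward(self, dst_mac):
        """Return the local vport for internal traffic, else the uplink."""
        if dst_mac in self.mac_table:
            return self.mac_table[dst_mac]   # VM-to-VM, never leaves the host
        return "uplink-pNIC"                 # external, out the physical NIC

vswitch = VirtualSwitch()
vswitch.attach("02:00:00:00:00:01", "vm1-vnic0")
vswitch.attach("02:00:00:00:00:02", "vm2-vnic0")
```

The first branch is precisely the traffic that external network tools never see, which is the visibility problem discussed in the disadvantages below.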

Figure 7. Data flow through a Virtual Ethernet Bridge (VEB) implemented as a software-based virtual switch

Using software-based virtual switches offers a number of advantages:

• Good performance between VMs – A software-based virtual switch typically uses only layer 2
switching and can forward internal VM-to-VM traffic directly, with bandwidth limited only by
available CPU cycles, memory bus bandwidth, or user/hypervisor-configured bandwidth limits.
• Can be deployed without an external switch – Administrators can configure an internal network
with no external connectivity, for example, to run a local network between a web server and a
firewall running on separate VMs within the same physical server.
• Support a wide variety of external network environments – Software-based virtual switches are
standards compliant and can work with any external network infrastructure.

Conversely, software-based virtual switches also experience several disadvantages:
• Consume valuable CPU bandwidth – The higher the traffic load, the greater the number of CPU
cycles required to move traffic through a software-based virtual switch, reducing the ability to
support larger numbers of VMs in a physical server. This is especially true for traffic that flows
between VMs and the external network.
• Lack network-based visibility – Virtual switches lack the standard monitoring capabilities such as
flow analysis, advanced statistics, and remote diagnostics of external network switches. When
network outages or problems occur, identifying the root cause of problems can be difficult in a
virtual machine environment. In addition, VM-to-VM traffic within a physical server is not exposed to
the external network, and the management of virtual systems is not integrated into the management
system for the external network, which again makes problem resolution difficult.
• Lack network policy enforcement – Modern external switches have many advanced features that are
being used in the data center: port security, quality of service (QoS), access control lists (ACL), and
so on. Software-based virtual switches often do not have, or have limited support for, these
advanced features. Even if virtual switches support some of these features, their management and
configuration is often inconsistent or incompatible with the features enabled on external networks.
This limits the ability to create end-to-end network policies within a data center.
• Lack management scalability – As the number of virtualized servers begins to dramatically expand
in a data center, the number of software-based virtual switches experiences the same dramatic
expansion. Standard virtual switches must each be managed separately. A new technology called
“distributed virtual switches” was recently introduced that allows up to 64 virtual switches to be
managed as a single device, but this only alleviates the symptoms of the management scalability
problem without solving the problem of lack of management visibility outside of the virtualized server.

Hardware VEBs — SR-IOV enabled NICs

Moving VEB functionality into the NIC hardware improves the performance issues associated with
software-based virtual switches. Single-Root I/O virtualization (SR-IOV) is the technology that allows a
VEB to be deployed into the NIC hardware.
The PCI Special Interest Group (SIG) has developed SR-IOV and Multi-Root I/O Virtualization
standards that allow multiple VMs running within one or more physical servers to natively share and
directly access and control PCIe devices (commonly called “direct I/O”). The SR-IOV standard
provides native I/O virtualization for PCIe devices that are shared on a single physical server. (Multi-Root
I/O Virtualization refers to direct sharing of PCIe devices between guest operating systems on
multiple servers.)
SR-IOV supports one or more physical functions (full-feature PCI functions) and one or more virtual
functions (lightweight PCI functions focused primarily on data movement). A capability known as
Alternative Routing-ID Interpretation (ARI) allows expansion to up to 256 PCIe physical or virtual
functions within a single physical PCIe device.
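The effect of ARI on PCIe function numbering can be illustrated with a short decoding sketch. This is an illustration of the routing-ID layouts, not production code.

```python
# Illustration of ARI: a conventional PCIe routing ID splits its low 8 bits
# into a 5-bit device number and 3-bit function number (max 8 functions per
# device); ARI reinterprets all 8 bits as a function number, allowing the
# 256 functions per device that SR-IOV NICs rely on.
def decode_routing_id(routing_id: int, ari: bool):
    bus = (routing_id >> 8) & 0xFF
    if ari:
        return {"bus": bus, "function": routing_id & 0xFF}
    return {"bus": bus,
            "device": (routing_id >> 3) & 0x1F,
            "function": routing_id & 0x07}

vf = decode_routing_id(0x01A5, ari=True)   # bus 1, function 0xA5
```

Without ARI the same 16-bit routing ID could address only function numbers 0-7, which is why ARI support in the platform is a prerequisite for large SR-IOV virtual-function counts.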
NICs that implement SR-IOV allow the VM’s virtual NICs to bypass the hypervisor vSwitch by
exposing the PCIe virtual (NIC) functions directly to the guest OS. By exposing the registers directly to
the VM, the NIC reduces latency significantly from the VM to the external port. The hypervisor
continues to allocate resources and handle exception conditions, but it is no longer required to
perform routine data processing for traffic between the virtual machines and the NIC. Figure 8
illustrates the architecture of an SR-IOV enabled NIC.

Figure 8. Virtual Ethernet Bridge (VEB) implemented as an SR-IOV enabled NIC

There are benefits to deploying VEBs as hardware-based SR-IOV enabled NICs:

• Reduces CPU utilization relative to software-based virtual switch implementations. With direct I/O,
soft virtual switches are no longer part of the data path.
• Increases network performance due to the direct I/O between a guest OS and the NIC. This is
especially true for traffic flowing between virtual machines and the external networks.
• Supports up to 256 functions per NIC, significantly increasing the number of virtual networking
functions for a single physical server.

While SR-IOV brings improvements over traditional software-only virtual NICs, there are still
challenges with hardware-based SR-IOV NICs:
• Lack of network-based visibility – SR-IOV NICs with direct I/O do not solve the network visibility
problem. In fact, because of the limited resources of cost-effective NIC silicon, they often have even
fewer capabilities than software-based virtual switches.
• Lack of network policy enforcement – Data flow patterns common to software-based virtual switches,
combined with the limited silicon resources of cost-effective SR-IOV NICs, result in no advanced
policy-enforcement features.
• Increased flooding onto external networks – SR-IOV NICs often have small address tables and do
not “learn,” so they may increase the amount of flooding, especially if there is a significant amount
of VM-to-VM traffic.
• Lack of management scalability – There are still a large number of VEBs (one per NIC port typically)
that will need to be managed independently from the external network infrastructure. Furthermore,
these devices are limited because they typically have one VEB per port; whereas, software-based
virtual switches can operate multiple NICs and NIC ports per VEB. Thus, the proliferation of VEBs to
manage is even worse with VEBs implemented in SR-IOV enabled NICs.
• SR-IOV requires the guest OS to have a paravirtualized driver that supports direct I/O to the PCI
virtual functions, as well as operating system support for SR-IOV, which market-leading hypervisors
do not yet provide as of this writing.
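The flooding behavior noted above follows directly from how a learning bridge works. The following simplified Python sketch, an illustration only and not an SR-IOV NIC implementation, contrasts unicasting to a learned destination with flooding to an unknown one:

```python
# Simplified learning bridge: source MACs are learned per port, known
# destinations are unicast, and unknown unicasts are flooded out every
# port except the ingress port. A device that does not learn (or whose
# small table is full) behaves as if every destination were unknown.

class LearningBridge:
    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # MAC address -> port

    def handle(self, src_mac, dst_mac, in_port):
        self.table[src_mac] = in_port        # learn the source
        if dst_mac in self.table:
            return {self.table[dst_mac]}     # known destination: unicast
        return self.ports - {in_port}        # unknown: flood

br = LearningBridge({"p1", "p2", "p3"})
print(br.handle("aa", "bb", "p1"))  # unknown "bb": flood to {'p2', 'p3'}
print(br.handle("bb", "aa", "p2"))  # "aa" was learned: unicast to {'p1'}
```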

In summary, to obtain higher performance, the hypervisor-based software virtual switch is moving into
hardware on the NIC using SR-IOV technology. Using NICs enabled with SR-IOV solves many of the
performance issues, but does not solve the issues of management integration and limited visibility into
network traffic flows. 4 Next-generation SR-IOV implementations that use Edge Virtual Bridging will
provide a better solution for the problem of management visibility, as described in the following section.

Solving complexity at the virtualized server-to-network edge

with Edge Virtual Bridging
Realizing the common challenges with both software-based and hardware-based VEBs, the IEEE
802.1 Work Group is creating a new set of standards called Edge Virtual Bridging (EVB), IEEE
802.1Qbg. These standards aim to resolve issues with network visibility, end-to-end policy
enforcement, and management scalability at the virtualized server-to-network edge. EVB changes the
way virtual bridging or switching is done to take advantage of the advanced features and scalable
management of external LAN switch devices. 5
Two primary candidates have been forwarded as EVB proposals:
• Virtual Ethernet Port Aggregator (VEPA) and Multichannel technology, created by an industry
coalition led by HP
• VN-Tag technology, created by Cisco

Both of these proposals solve similar problems and have many common characteristics, but the
implementation details and associated impact to data center infrastructure are quite different between
the two.

Common EVB capabilities and benefits

The primary goals of EVB are to solve the network visibility, policy management, and management
scalability issues faced by virtualized servers at the server-to-network edge.
EVB augments the methods used by traditional VEBs to forward and process traffic so that the most
advanced packet processing occurs in the external switches. Just like a VEB, an EVB can be
implemented as a software-based virtual switch, or in hardware-based SR-IOV NIC devices. Software-
or hardware-based EVB operations can be optimized to make them simple and inexpensive to
implement. At the same time, the virtual NIC and virtual switch information is exposed to the external
switches to achieve better visibility and manageability.
With an EVB, traffic from a VM to an external destination is handled just like in a VEB (Figure 9, top
traffic flow). Traffic between VMs within a virtualized server travels to the external switch and back
through a 180-degree turn, or “hairpin turn” (Figure 9, bottom traffic flow). Traffic is not sent directly
between the virtual machines, as it would be with a VEB.
This capability provides greater management visibility, control, and scalability. However, there is a
performance cost associated with EVB because the traffic is moved deeper into the network, and the
VM-to-VM traffic uses twice the bandwidth for a single packet – one transmission for outgoing and
one for incoming traffic. EVB is not necessarily a replacement for VEB modes of operation, but will be
offered as an alternative for applications and deployments that require the management benefits.

4 SR-IOV is an important technology for the future. For existing servers, HP has implemented a similar method for aggregating
virtual I/O with the HP Flex-10 technology. For more information about how they differ, see the Appendix.
5 EVB is defined as “the environment where physical end stations, containing multiple virtual end stations, all require the
services of adjacent bridges forming a local area network.” See the EVB tutorial on the 802.1 website.

Thus, IT architects will have the choice — VEB or EVB — to match the needs of their application
requirements with the design of their network infrastructure.

Figure 9. Basic architecture and traffic flow of an Edge Virtual Bridge (EVB)

Most switches are not designed to forward traffic received on one port back to the same port (doing
so breaks normal spanning tree rules). Therefore, external switches must be modified to allow the
hairpin turn if a port is attached to an EVB-enabled server. Most switches already support the hairpin
turn using a mechanism similar to that used for wireless access points (WAP). But rather than using the
WAP negotiation mode, an EVB-specific negotiation mode will need to be added into the switch
firmware. Importantly, no ASIC development or other new hardware is required for a network switch
to support these hairpin turns. The hairpin turn allows an external switch to apply all of its advanced
features to the VM-to-VM traffic flows, fully enforcing network policies.
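The VEB, VEPA, and hairpin forwarding rules described above can be sketched as simple decision functions. This Python fragment is a hedged illustration; the port names, table layout, and function signatures are assumptions made for clarity, not any product's logic:

```python
# Forwarding decisions at the virtualized edge:
#  - a VEB switches local VM-to-VM traffic internally,
#  - a VEPA sends every frame to the adjacent (external) switch,
#  - the external switch applies the hairpin exception for VEPA ports.

UPLINK = "uplink"

def veb_forward(src_vm, dst_mac, mac_table):
    """A VEB delivers local destinations directly; unknowns go uplink."""
    dst = mac_table.get(dst_mac, UPLINK)
    return dst if dst != src_vm else None  # never reflect to the sender

def vepa_forward(src_vm, dst_mac, mac_table):
    """A VEPA always forwards toward the external switch first."""
    return UPLINK

def external_switch_forward(ingress_port, dst_port, vepa_port):
    """A standard bridge never sends a frame back out its ingress port;
    a hairpin-enabled port attached to a VEPA is the one exception."""
    if dst_port == ingress_port:
        return dst_port if ingress_port == vepa_port else None
    return dst_port

macs = {"aa:aa": "vm1", "bb:bb": "vm2"}
print(veb_forward("vm1", "bb:bb", macs))                     # vm2 (stays internal)
print(vepa_forward("vm1", "bb:bb", macs))                    # uplink
print(external_switch_forward("p1", "p1", vepa_port="p1"))   # hairpin back out p1
```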
Another change required by EVBs is the way in which multicast or broadcast traffic received from the
external network is handled. Because traffic received by the EVB from the physical NIC can come
from the hairpin turn within the external switch (that is, it was sourced from a VM within a virtualized
server), traffic must be filtered to forbid a VM from receiving multicast or broadcast packets that it sent
onto the network. This would violate standard NIC and driver operation in many operating systems.
Using EVB-enabled ports addresses the goals of network visibility and management scalability. The
external switch can detect an EVB-enabled port on a server and can detect the exchange of
management information between the MAC addresses and VMs being deployed in the network. This
brings management visibility and control to the VM level rather than just to the physical NIC level as
with today’s networks. Managing the connectivity to numerous virtual servers from the external switch
and network eases the issues with management scalability.
There are many benefits to using an EVB:
• Improves virtualized network I/O performance and eliminates the need for complex features in
software-based virtual switches
• Allows the adjacent switch to incorporate the advanced management functions, thereby allowing
the NIC to maintain low-cost circuitry
• Enables a consistent level of network policy enforcement by routing all network traffic through the
adjacent switch with its more complete policy-enforcement capabilities

• Provides visibility of VM-to-VM traffic to network management tools designed for physical networks
• Enables administrators to gain visibility and access to external switch features from the guest OS,
such as packet processing (access control lists) and security features such as Dynamic Host
Configuration Protocol (DHCP) guard, Address Resolution Protocol (ARP) monitoring, source port
filtering, and dynamic ARP protection and inspection
• Allows administrators to configure and manage ports in the same manner for VEB and VEPA ports
• Enhances monitoring capabilities, including access to statistics like NetFlow, sFlow, RMON, port
mirroring, and so on

The disadvantage of EVB is that VM-to-VM traffic must flow to the external switch and then back
again, thus consuming twice the communication bandwidth. Furthermore, this additional bandwidth is
consumed on the network connections between the physical server and external switches.

Virtual Ethernet Port Aggregator (VEPA)

The IEEE 802.1 Work Group has agreed to base the IEEE 802.1Qbg EVB standard on VEPA
technology because of its minimal impact and minimal changes to NICs, bridges, existing standards,
and frame formats (which require no changes).
VEPA is designed to incorporate and modify existing IEEE standards so that most existing NIC and
switch products could implement VEPA with only a software upgrade. VEPA does not require new
tags and involves only slight modifications to VEB operation, primarily in frame relay support. VEPA
continues to use MAC addresses and standard IEEE 802.1Q VLAN tags as the basis for frame
forwarding, but changes the forwarding rules slightly according to the base EVB requirements. In
doing so, VEPA is able to achieve most of the goals envisioned for EVB without the excessive burden
of a disruptive new architecture such as VN-Tags.
Software-based VEPA solutions can be implemented as simple upgrades to existing software virtual
switches in hypervisors. As a proof point, HP Labs, in conjunction with the University of California,
developed a software prototype of a VEPA-enabled software switch and the adjacent external switch
that implements the hairpin turn. The prototype solution required only a few lines of code within the
Linux bridge module. The prototype performance, even though it was not fully optimized, showed the
VEPA to be 12% more efficient than the traditional software virtual switch in environments with
advanced network features enabled. 6
In addition to software-based VEPA solutions, SR-IOV NICs can easily be updated to support the VEPA
mode of operation. Wherever VEBs can be implemented, VEPAs can be implemented as well.
VEPA enables a discovery protocol, allowing external switches to discover ports that are operating in
VEPA mode and exchange information related to VEPA operation. This allows the full benefits of
network visibility and management of the virtualized server environment.
There are many benefits from using VEPA:
• A completely open (industry-standard) architecture without proprietary attributes or formats
• Tag-less architecture that achieves better bandwidth than software-based virtual switches, with less
overhead and lower latency (especially for small packet sizes)
• Easy to implement, often as a software upgrade
• Minimizes changes to NICs, software switches, and external switches, thereby promoting low cost

6 Paul Congdon, Anna Fischer, and Prasant Mohapatra, “A Case for VEPA: Virtual Ethernet Port Aggregator,” submitted for
conference publication and under review as of this writing.

Multichannel technology
During the EVB standards development process, scenarios were identified in which VEPA could be
enhanced with some form of standard tagging mechanism. To address these scenarios, an optional
“Multichannel” technology, complementary to VEPA, was proposed by HP and accepted by the IEEE
802.1 Work Group for inclusion into the IEEE 802.1Qbg EVB standard. Multichannel allows the
traffic on a physical network connection or port (like a NIC device) to be logically separated into
multiple channels as if they are independent, parallel connections to the external network. Each of the
logical channels can be assigned to any type of virtual switch (VEB, VEPA, and so on) or directly
mapped to any virtual machine within the server. Each logical channel operates as an independent
connection to the external network.
Multichannel uses existing Service VLAN tags (“S-Tags”) that were standardized in IEEE 802.1ad,
commonly referred to as the “Provider Bridge” or “Q-in-Q” standard. Multichannel technology uses
the extra S-Tag and incorporates VLAN IDs in these tags to represent the logical channels of the
physical network connection. This mechanism provides varied support:
• Multiple VEB and/or EVB (VEPA) virtual switches to share the same physical network connection to
external networks. Administrators may need certain virtualized applications to use VEB switches for
their performance and may need other virtualized applications to use EVB (VEPA) switches for their
network manageability, all in the same physical server. For these cases, multichannel capability lets
administrators establish multiple virtual switch types that share a physical network connection to the
external networks.
• Directly mapping a virtual machine to a physical network connection or port while allowing that
connection to be shared by different types of virtual switches. Many hypervisors support directly
mapping a physical NIC to a virtual machine, but then only that virtual machine may make use of
the network connection/port, thus consuming valuable network connection resources. Multichannel
technology allows external physical switches to identify which virtual switch, or direct mapped
virtual machine, traffic is coming from, and vice-versa.
• Directly mapping a virtual machine that requires promiscuous mode operation (such as traffic
monitors, firewalls, and virus detection software) to a logical channel on a network
connection/port. Promiscuous mode lets a NIC forward all packets to the application, regardless of
destination MAC addresses or tags. Allowing virtualized applications to operate in promiscuous
mode lets the administrator use the external switch’s more powerful promiscuous and mirroring
capabilities to send the appropriate traffic to the applications, without overburdening the resources
in the physical server.
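As an illustration of the S-Tag encoding, the following Python sketch builds the two stacked tags and recovers the logical channel from the outer S-Tag VID. The framing is simplified (MAC addresses and priority bits are omitted), and the channel-ID assignment shown is an assumption; in practice it is negotiated between the server and the adjacent switch:

```python
import struct

# Q-in-Q tag stacking per IEEE 802.1ad: an outer Service tag (S-Tag,
# TPID 0x88A8) carries the Multichannel logical channel, ahead of the
# customer 802.1Q tag (C-Tag, TPID 0x8100). PCP/DEI bits are left 0 and
# only the 12-bit VID field is populated, for brevity.

STAG_TPID = 0x88A8  # IEEE 802.1ad Service VLAN tag
CTAG_TPID = 0x8100  # IEEE 802.1Q Customer VLAN tag

def tag_frame(payload: bytes, channel_id: int, customer_vlan: int) -> bytes:
    """Prepend an S-Tag (logical channel) and a C-Tag (customer VLAN)."""
    stag = struct.pack("!HH", STAG_TPID, channel_id & 0x0FFF)
    ctag = struct.pack("!HH", CTAG_TPID, customer_vlan & 0x0FFF)
    return stag + ctag + payload

def channel_of(frame: bytes) -> int:
    """Recover the logical channel from the outer S-Tag VID."""
    tpid, tci = struct.unpack("!HH", frame[:4])
    assert tpid == STAG_TPID, "not an S-tagged frame"
    return tci & 0x0FFF

frame = tag_frame(b"payload", channel_id=5, customer_vlan=100)
print(channel_of(frame))  # 5
```

The external switch strips or inspects only the outer S-Tag to identify the channel, leaving the inner C-Tag for ordinary VLAN handling.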
Figure 10 illustrates how Multichannel technology supports these capabilities within a virtualized server.

Figure 10. Multichannel technology supports VEB, VEPA, and direct mapped VMs on a single NIC

The optional Multichannel capability requires S-Tags and “Q-in-Q” operation to be supported in the
NICs and external switches, and, in some cases, it may require hardware upgrades, unlike the basic
VEPA technology, which can be implemented in almost all current virtual and external physical
switches. Multichannel does not have to be enabled to take advantage of simple VEPA operation.
Multichannel merely enables more complex virtual network configurations in servers using virtual machines.
Multichannel technology allows IT architects to match the needs of their application requirements with
the design of their specific network infrastructure: VEB for performance of VM-to-VM traffic; VEPA/EVB
for management visibility of the VM-to-VM traffic; sharing physical NICs with direct mapped virtual
machines; and optimized support for promiscuous mode applications.

VN-Tag

VN-Tag technology from Cisco specifies a new Ethernet frame tag format that is not leveraged from,
or built on, any existing IEEE defined tagging format. A VN-Tag specifies the virtual machine source
and destination ports for the frame and identifies the frame’s broadcast domain. The VN-Tag is
inserted and stripped by either VN-Tag enabled software virtual switches, or by a VN-Tag enabled
SR-IOV NIC. External switches enabled with VN-Tag use the tag information to provide network
visibility to virtual NICs or ports in the VMs and to apply network policies.
Although using VN-Tags to create an EVB solves the problems of management visibility,
implementing VN-Tags has significant shortcomings:
• VN-Tags do not leverage other existing standard tagging formats (such as IEEE 802.1Q, IEEE
802.1ad, or IEEE 802.1X tags)
• Hardware changes to NIC and switch devices are required rather than simple software upgrades
for existing network devices. In other words, using VN-Tags requires new network products (NICs,
switches, and software) enabled with VN-Tags.

• EVB implemented with VN-Tag and traditional VEB cannot coexist in virtualized servers, as they can
with VEPA and Multichannel technology

M-Tag and port extension

One scenario is not addressed by VEPA and Multichannel technology: port extension. Port Extenders
are physical switches that have limited functionality and are essentially managed as a line card of the
upstream physical switch. Products such as the Cisco Nexus Fabric Extenders and Unified
Computing System (UCS) Fabric Extenders are examples of Port Extenders. Generally, the terms “Port
Extenders” and “Fabric Extenders” are synonymous. Port Extenders require tagged frames; they use
the information in these new tags to map the physical ports on the Fabric Extenders as virtual ports on
the upstream switches and to control how they forward packets to or from upstream Nexus or UCS
switches and replicate broadcast or multicast traffic.
Initially, the port extension technology was considered as part of the EVB standardization effort;
however, the IEEE 802.1 Work Group has decided that the EVB and port extension technologies
should be developed as independent standards. Cisco VN-Tag technology has been proposed to the
IEEE 802.1 Work Group as not only a solution for EVB, but also for use as the port extension tags.
However, as the issues with VN-Tags have been debated in the work group, Cisco has modified its
port extension proposal to use a tag format called “M-Tags” for the frames communicated between
the Port Extenders and their upstream controlling switches. At the time of this writing, it is unclear
whether these M-Tags will incorporate the VN-Tag format or other, more standard tag formats already
defined in the IEEE.

Status of EVB and related standards

As of this writing, the project authorization request (PAR) for the IEEE 802.1Qbg EVB standard has
been approved and the formal standardization process is beginning. The IEEE 802.1 Work Group
chose the VEPA technology proposal over the VN-Tag proposal as the base of the EVB standard
because of its use of existing standards and minimal impact and changes required to existing network
products and devices. Multichannel technology is also being included as an optional capability in the
EVB standard to address the cases in which a standard tagging mechanism would enhance basic
VEPA implementations.
Also as of this writing, the PAR for the IEEE 802.1Qbh Bridge Port Extension standard has been
approved and the formal standardization process is beginning. The IEEE 802.1 Work Group
accepted the M-Tag technology proposal as the base of the Port Extension standard. Again, it is
unclear whether M-Tags will resemble VN-Tags or a standard tag format.

It should be noted that there are a number of products on the market today, or that will be introduced
shortly, that rely on technologies that have not been adopted by the IEEE 802.1 Work Group.
Customers in the process of evaluating equipment should understand how a particular vendor’s
products will evolve to become compliant with the developing EVB and related standards.

Summary

The server-to-network edge is a complicated place in both the physical and virtual infrastructures.
Emerging converged network (iSCSI, FCoE, CEE/DCB) and virtual I/O (EVB, VEPA, and VEB)
technologies will have distinct but overlapping effects on data center infrastructure design. As the
number of server deployments with virtualized I/O continues to increase, converged physical network
architectures will need to interact seamlessly with virtualized I/O. HP expects these two seemingly
disparate technologies to become increasingly intertwined as data center architectures move forward.

The goal of converged networks is to simplify the physical infrastructure through server I/O
consolidation. HP has been working to solve the problems of complexity at the server edge for some
time, with, for example, these products and technologies:
• HP BladeSystem c-Class – eliminates server cables through the use of mid-plane technology between
server blades and interconnect modules.
• HP Virtual Connect – simplifies connection management by putting an abstraction layer between the
servers and their external networks, allowing administrators to wire LAN / SAN connections once
and streamline the interaction between server, network, and storage administrative groups.
• HP Virtual Connect Flex-10 – consolidates Ethernet connections, allowing administrators to partition
a single 10 Gb Ethernet port into multiple individual server NIC connections. By consolidating
physical NICs and switch ports, Flex-10 helps organizations make more efficient use of available
network resources and reduce infrastructure costs.
• HP BladeSystem Matrix – implements the HP Converged Infrastructure strategy by using a shared
services model to integrate pools of compute, storage, and network resources. The Matrix
management console, built on HP Insight Dynamics, combines automated provisioning, capacity
planning, disaster recovery, and a self-service portal to simplify the way data center resources are
managed. HP BladeSystem Matrix embodies HP’s vision for data center management: management
that is application-centric, with the ability to provision and manage the infrastructure to support
business-critical applications across an enterprise.

FCoE, CEE, and current-generation iSCSI are standards that aim to achieve the long-unrealized potential
of a single unified fabric. With respect to FCoE in particular, there are important benefits to beginning
implementation of a converged fabric at the edge of the server network. By implementing converged
fabrics at the server-to-network edge first, customers gain benefits from reducing the physical number
of cables, adapters, and switches at the edge where there is the most traffic congestion and the most
associated network equipment requirements and costs. In the future, HP plans to broaden Virtual
Connect Flex-10 technology to provide solutions for converging different network protocols with its HP
FlexFabric technology. HP plans to deliver the FlexFabric vision by converging the technology,
management tools, and partner ecosystems of the HP ProCurve and Virtual Connect network portfolios
into a virtualized fabric for the data center.
Virtual networking technologies are becoming increasingly important (and complex) because of the
growing virtual networking infrastructure supporting virtual machine deployment and management.
Virtual I/O technologies can be implemented in the software space by sharing physical I/O among
multiple virtual servers (vSwitches). They can also be implemented in hardware by abstracting and
partitioning I/O among one or more physical servers. New virtual I/O technologies to consider
should be based on industry standards like VEPA and VEB, and should work within existing customer
frameworks and organizational roles, not disrupt them. Customers will have a choice when it comes
to implementation – for performance (VEB) or for management ease (VEPA/EVB).
Customers should understand how these standards and technologies intertwine at the edge of the
network. HP is not only involved in the development of standards to make the data center
infrastructure better. HP has already developed products that work to simplify the server-to-network
edge by reducing cables and physical infrastructure (BladeSystem), making the management of
network connectivity easier (Virtual Connect) and aggregating network connectivity (VC Flex-10). HP
FlexFabric technology will let an IT organization exploit the benefits of server, storage, and network
virtualization going forward. As the number of server deployments using virtual machines continues to
increase, the nature of I/O buses and adapters will continue to change. HP is positioned well to
navigate these changes because of the company’s skill set and intellectual property it holds in servers,
compute blades, networking, storage, virtualized I/O, and management.
As the data center infrastructure changes with converged networking and virtual I/O, HP provides the
mechanisms to manage a converged infrastructure by simplifying configuration and operations.

Through management technologies such as those built into the HP BladeSystem Matrix, customers can
use consistent processes to configure, provision, and manage the infrastructure, regardless of the type
of technologies at the server-to-network edge: FCoE, iSCSI, VEB virtual switches, or EVB virtual switch
technologies. Customers can choose the converged infrastructure and virtualization technologies that
best suit their needs, and manage them with consistent software tools across the data center.

Appendix: Understanding SR-IOV and Virtual Connect Flex-10
The following information is taken from “HP Flex-10 and SR-IOV—What is the difference?” ISS
Technology Update, Volume 8, Number 3, April 2009:

Improving I/O performance

As virtual machine software enables higher efficiencies in CPU use, these same efficiency enablers
place more overhead on physical assets. HP Virtual Connect Flex-10 Technology and Single Root I/O
Virtualization (SR-IOV) both share the goal of improving I/O efficiency without increasing the
overhead burden on CPUs and network hardware. Flex-10 and SR-IOV technologies accomplish this
goal through different approaches. This article explores the differences in architecture and
implementation between Flex-10 and SR-IOV.

HP Virtual Connect Flex-10 Technology

As described in the body of this technology brief, HP Virtual Connect Flex-10 technology is a
hardware-based solution that enables users to partition a 10 gigabit Ethernet (10GbE) connection and
control the bandwidth of each partition in increments of 100 Mb.
Administrators can configure a single BladeSystem 10 Gb network port to represent multiple physical
network interface controllers (NICs), also called FlexNICs, with a total bandwidth of 10 Gbps. These
FlexNICs appear to the operating system (OS) as discrete NICs, each with its own driver. While the
FlexNICs share the same physical port, traffic flow for each one is isolated with its own MAC address
and virtual local area network (VLAN) tags between the FlexNIC and VC Flex-10 interconnect
module. Using the VC interface, an administrator can set and control the transmit bandwidth
available to each FlexNIC.
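As a rough model of that partitioning scheme, the following Python sketch validates a set of FlexNIC allocations. The constants mirror the description above, but the validation logic itself is an illustrative assumption, not the actual Virtual Connect firmware behavior:

```python
# Flex-10 style partitioning: one 10 Gb port split into up to four
# FlexNICs, each with transmit bandwidth allocated in 100 Mb increments.

PORT_BW_MB = 10_000   # one 10GbE port, expressed in Mb
INCREMENT_MB = 100    # allocation granularity
MAX_FLEXNICS = 4      # FlexNICs per 10 Gb port (original PCIe limit)

def partition_port(allocations_mb):
    """Validate FlexNIC transmit-bandwidth allocations for one port."""
    if len(allocations_mb) > MAX_FLEXNICS:
        raise ValueError("too many FlexNICs for one port")
    if any(a % INCREMENT_MB for a in allocations_mb):
        raise ValueError("allocations must be multiples of 100 Mb")
    if sum(allocations_mb) > PORT_BW_MB:
        raise ValueError("allocations exceed port bandwidth")
    return sum(allocations_mb)

# Four FlexNICs sharing the full 10 Gb of a single physical port:
print(partition_port([2_000, 4_000, 3_000, 1_000]))  # 10000
```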

Single Root I/O Virtualization (SR-IOV)

The ability of SR-IOV to scale the number of functions is a major advantage. The initial Flex-10
offering is based on the original PCIe definition that is limited to 8 PCI functions per given device (4
FlexNICs per 10Gb port on a dual-port device). With SR-IOV, a capability called Alternative
Routing-ID Interpretation (ARI) allows expansion of up to 256 PCIe functions. The scalability inherent in
the SR-IOV architecture has the potential to increase server consolidation and performance.
Another SR-IOV advantage is the prospect of performance gains achieved by removing the hypervisor’s
vSwitch from the main data path. Hypervisors are essentially another operating system providing
CPU, memory, and I/O virtualization capabilities to accomplish resource management and data
processing functions. The data processing functionality places the hypervisor squarely in the main
data path. In current I/O architecture, data communicated to and from a guest OS is routinely
processed by the hypervisor. Both outgoing and incoming data is transformed into a format that can
be understood by a physical device driver. The data destination is determined, and the appropriate
send or receive buffers are posted. All of this processing requires a great deal of data copying, and
the entire process creates serious performance overhead.
With SR-IOV, the hypervisor is no longer required to process, route, and buffer both outgoing and
incoming packets. Instead, the SR-IOV architecture exposes the underlying hardware to the guest OS,
allowing its virtual NIC to transfer data directly between the SR-IOV NIC hardware and the guest OS
memory space. This removes the hypervisor from the main data path and eliminates a great deal of
performance overhead. The hypervisor continues to allocate resources and handle exception
conditions, but it is no longer required to perform routine data processing. Since SR-IOV is a
hardware I/O implementation, it also uses hardware-based security and quality of service (QoS)
features incorporated into the physical host.

Advantage of using Virtual Connect Flex-10
Flex-10 technology offers significant advantages over other 10 Gb devices that provide large
bandwidth but do not provide segmentation. The ability to adjust transmit bandwidth by partitioning
data flow makes 10GbE more cost effective and easier to manage. It is easier to aggregate multiple
1 Gb data flows and fully utilize 10 Gb bandwidth. The fact that Flex-10 is hardware based means
that multiple FlexNICs are added without the additional processor overhead or latency associated
with server virtualization (virtual machines). Significant infrastructure savings are also realized since
additional server NIC mezzanine cards and associated interconnect modules may not be needed.
See for more
detailed information on Flex-10. It is important to note that Flex-10 is an available HP technology for
ProLiant BladeSystem servers, while SR-IOV is a released PCI-SIG specification at the beginning of the
execution and adoption cycle. As such, SR-IOV cannot operate in current environments without
significant changes to I/O infrastructure and the introduction of new management software. Once
accomplished, these changes in infrastructure and software would let SR-IOV data handling operate
in a much more native and direct manner, reducing processing overhead and enabling highly
scalable PCI functionality.
For more information on the SR-IOV standard and industry support for the standard, go to the
Peripheral Component Interconnect Special Interest Group (PCI-SIG) site:

For more information
For additional information, refer to the resources listed below.

Resource description Web address

HP Industry Standard Server technology

HP Virtual Connect technology

IEEE 802.1 Work Group website

IEEE Data Center Bridging standards:

• 802.1Qbb – Priority-based Flow Control
• 802.1Qau – Quantized Congestion Notification
• 802.1Qaz – Data Center Bridging Exchange Protocol, Enhanced Transmission Selection
IEEE Edge Virtual Bridging standard
INCITS T11 Home Page (FCoE standard)

Call to action
Send comments about this paper to


© 2010 Hewlett-Packard Development Company, L.P. The information contained herein is subject to
change without notice. The only warranties for HP products and services are set forth in the express
warranty statements accompanying such products and services. Nothing herein should be construed as
constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions
contained herein.
Linux is a U.S. registered trademark of Linus Torvalds.
TC100301TB, March 2010