Section 2
• VXLAN Communication Modes
Section 3
• VTEP Control Plane and Packet Walks
VXLAN CONFIDENTIAL 2 | 38
Introduction to Virtual
Extensible LAN
Section 1
NSX Logical Switching
Logical Switch 1 Logical Switch 2 Logical Switch 3
VMware NSX
Challenges
• Per Application/Multi-tenant segmentation
• VM Mobility requires L2 everywhere
• Large L2 Physical Network Sprawl – STP Issues
• HW Memory (MAC, FIB) Table Limits
Benefits
• Scalable Multi-tenancy across data center
• Enabling L2 over L3 Infrastructure
• Overlay Based with VXLAN, STT, GRE, etc.
• Logical Switches span across Physical Hosts and Network Switches
VXLAN Version Dependencies
• vCloud Networking and Security 5.5 uses the existing VXLAN
implementation from 5.1
• This presentation is focused on NSX for vSphere
– vSphere 5.5 is required to leverage new capabilities introduced with NSX for
vSphere
– ESXi 5.1 is supported by NSX for vSphere using the previous
implementation
– Upgrades from 5.1 are supported
Virtual Extensible LAN
• Virtual Extensible LAN, VXLAN, is an IP overlay technology that
eliminates virtual network segmentation
– Allows network boundary devices to extend virtual network boundaries over
physical IP networks
– Expands the number of available logical Ethernet segments from 4094 to
2^24, or over 16 million logical segments
– Encapsulates the source Ethernet frame in a new UDP packet with a
destination port of 8472*
– VXLAN is transparent to virtual machines
– Adds 50 bytes to original frame
– Developed in conjunction with Arista, Broadcom, Cisco, Citrix, Red Hat, and
others
– Submitted to IETF for standardization
*In April of 2013 IANA reserved UDP port 4789 for VXLAN
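The segment count and the 50-byte figure above can be checked with a few lines of arithmetic. This is a minimal sketch; the per-header sizes assume an untagged outer Ethernet header and an outer IPv4 header without options, and the 1514-byte example frame is an illustrative value, not one taken from the slides.

```python
# Segment ID space: a 12-bit VLAN ID versus a 24-bit VXLAN Network Identifier.
VLAN_ID_BITS = 12
VNI_BITS = 24

usable_vlans = 2 ** VLAN_ID_BITS - 2   # VLAN IDs 0 and 4095 are reserved
vxlan_segments = 2 ** VNI_BITS

# Bytes added in front of the original Ethernet frame (assumed sizes, see lead-in).
OUTER_ETHERNET = 14
OUTER_IPV4 = 20
OUTER_UDP = 8
VXLAN_HEADER = 8
overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER


def encapsulated_size(original_frame_bytes: int) -> int:
    """Size of the frame on the wire after VXLAN encapsulation."""
    return original_frame_bytes + overhead


print(usable_vlans, vxlan_segments, overhead, encapsulated_size(1514))
```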
VXLAN Frame Format
VXLAN Encapsulated Frame
Outer Ethernet header (14 bytes) | Outer IP header (20 bytes) | Outer UDP header (8 bytes) | VXLAN header (8 bytes) | Inner Ethernet Frame
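As an illustration of the 8-byte VXLAN header, the sketch below packs and unpacks it using the standard layout (8 flag bits, 24 reserved bits, a 24-bit VNI, 8 reserved bits). The function names are hypothetical, and the flag value 0x08 is the "VNI present" bit from the published VXLAN specification rather than something shown on this slide.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags in the top byte, 24 reserved bits.
    # Word 1: VNI in the top 24 bits, 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vxlan_header(header: bytes) -> tuple:
    """Return (flags, vni) from the first 8 bytes of a VXLAN header."""
    word0, word1 = struct.unpack("!II", header[:8])
    return word0 >> 24, word1 >> 8


header = pack_vxlan_header(5001)
print(header.hex(), unpack_vxlan_header(header))
```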
Data Plane – NSX VXLAN Enhancements
• Support for multiple VXLAN vmknics per host
– Provides additional options for uplink load balancing
• Copying DSCP and CoS tag information from the inner Ethernet frame to
the outer headers of the VXLAN frame
– Allows physical network to enforce QoS policy based on the inner Ethernet
frame’s marking
• Guest VLAN tagging
• vMotion callback
• Dedicated TCP/IP stack for VXLAN
• VXLAN offloading on capable network adapters
– VXLAN offloading is supported with NICs that provide it
Control Plane – NSX VXLAN Enhancements
• Highly available and secure control plane to distribute VXLAN network
information to ESXi hosts
• Can operate in a non-Multicast configured physical network
– NSX can leverage multicast if it is configured in the physical network
VXLAN Terms
• A VXLAN Tunnel End Point, VTEP, is an entity that encapsulates an
Ethernet frame inside a VXLAN frame or decapsulates a VXLAN frame
and then forwards the inner Ethernet frame
• A VTEP Proxy is a VTEP that receives VXLAN traffic from a VTEP in a
remote segment and forwards it within its local segment
• A Transport Zone defines the members or VTEPs of the VXLAN
overlay
– Can include multiple vSphere host clusters
– A cluster can be part of multiple transport zones
Logical Switch
• The logical switch is a virtual network segment that has been identified
with a VNI
– Each logical switch gets its own unique VNI
– A VXLAN VDS portgroup gets created on all the VTEPs in the
transport zone in which the logical switch is created
– Virtual machines' vNICs get connected to logical switches
VDS Enhancements for NSX
• Preparing hosts for VXLAN on NSX Manager installs kernel and user
space modules
NSX Manager
• Pushes VIBs
• UI / API end point
• Controller info
• VXLAN info to …
Controller Cluster
• VTEP Table
• MAC Table
• ARP Table
• UTEP/MTEP per VXLAN
[Figure: ESXi Host architecture. Labels: Controller, TCP over SSL, netcpa (uwa), Controller Connections, JSON over SSL, vsfwd, socket, vmklink, User, Kernel, VDS, Core, VXLAN, Routing.]
Questions
VXLAN Communication
Modes
Section 2
VXLAN Control Plane Modes
• Three control plane modes are supported in NSX for vSphere for
handling broadcast, unknown unicast, and multicast (BUM) traffic:
– Unicast
– Hybrid
– Multicast
• The Controller selects one VTEP per remote segment from the VTEP table to
act as a proxy
– In Unicast Mode, this VTEP is referred to as a UTEP (Unicast Tunnel End Point)
– In Hybrid Mode, this VTEP is referred to as an MTEP (Multicast Tunnel End
Point)
• If a UTEP or MTEP leaves a VNI the Controller will select a new proxy
within the segment and update the participating VTEPs
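The proxy selection described above can be sketched as follows. The slides do not say how the Controller picks a proxy within a segment, so choosing the lowest VTEP IP is purely an assumption for illustration, as are the function names, the /24 segment size, and the host addresses (taken from the transport subnets used in the later examples).

```python
import ipaddress


def segment_of(vtep_ip: str, prefix_len: int = 24):
    """Transport subnet (segment) a VTEP IP belongs to."""
    return ipaddress.ip_network(f"{vtep_ip}/{prefix_len}", strict=False)


def select_proxies(vteps, local_vtep: str, prefix_len: int = 24) -> dict:
    """Pick one proxy VTEP per remote segment, as seen from local_vtep.

    Assumed policy: the lowest IP in each remote segment becomes the proxy.
    """
    local_segment = segment_of(local_vtep, prefix_len)
    proxies = {}
    for ip in sorted(vteps, key=ipaddress.ip_address):
        segment = segment_of(ip, prefix_len)
        if segment != local_segment and segment not in proxies:
            proxies[segment] = ip
    return proxies


members = ["10.20.10.10", "10.20.10.11", "10.20.11.20", "10.20.11.21"]
print(select_proxies(members, "10.20.10.10"))

# If the current proxy leaves the VNI, running the selection again over the
# remaining members yields the replacement proxy for that segment.
members.remove("10.20.11.20")
print(select_proxies(members, "10.20.10.10"))
```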
Unicast Mode
• Source UTEP:
– Replicates encapsulated frame to each local VTEP via Unicast
– Replicates encapsulated frame to each remote UTEP via Unicast
• Destination UTEP:
– Receives the encapsulated frame from the source VTEP
– Replicates encapsulated frame to each local VTEP via Unicast
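The replication steps above can be traced with a small model that lists every unicast copy of a BUM frame as a (sender, receiver) pair. This is a sketch only: the function name, the segment labels "A" and "B", and the host addresses are hypothetical.

```python
def unicast_bum_copies(source: str, segments: dict, uteps: dict) -> list:
    """List the unicast copies generated for one BUM frame.

    segments: segment name -> list of VTEP IPs in that segment
    uteps:    segment name -> VTEP chosen as UTEP for that segment
    """
    source_segment = next(
        name for name, members in segments.items() if source in members
    )
    copies = []
    # The source replicates to every other VTEP in its own segment.
    for peer in segments[source_segment]:
        if peer != source:
            copies.append((source, peer))
    # The source sends one copy to the UTEP of each remote segment, and
    # that UTEP replicates to the other VTEPs in its own segment.
    for name, members in segments.items():
        if name == source_segment:
            continue
        utep = uteps[name]
        copies.append((source, utep))
        for peer in members:
            if peer != utep:
                copies.append((utep, peer))
    return copies


segments = {
    "A": ["10.20.10.10", "10.20.10.11"],
    "B": ["10.20.11.10", "10.20.11.11"],
}
for sender, receiver in unicast_bum_copies("10.20.10.10", segments, {"B": "10.20.11.10"}):
    print(sender, "->", receiver)
```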
Unicast Mode Example
[Figure: VM1, VM2, VM3, and VM4 on VXLAN 5001 across four vSphere hosts in VXLAN Transport Subnet A (10.20.10.0/24) and VXLAN Transport Subnet B (10.20.11.0/24), with the Controller Cluster.]
Multicast Mode
• Source VTEP:
– Replicates encapsulated frame to each remote VTEP via Multicast
– Replicates encapsulated frame to each local VTEP via Multicast
Multicast Mode Example
[Figure: VM1, VM2, VM3, and VM4 on VXLAN 5001 across four vSphere hosts in VXLAN Transport Subnet A (10.20.10.0/24) and VXLAN Transport Subnet B (10.20.11.0/24), with the Controller Cluster.]
Hybrid Mode
• Source MTEP:
– Replicates encapsulated frame to each remote MTEP via Unicast
– Replicates encapsulated frame to each local VTEP via Multicast
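One way to compare the three modes is to count how many packets the source has to transmit for a single BUM frame. The sketch below is a simplification under stated assumptions: a multicast send counts as one transmission because the physical network performs the fan-out, and the function name and example counts are hypothetical.

```python
def source_transmissions(mode: str, local_peers: int, remote_segments: int) -> int:
    """Packets the source VTEP sends for one BUM frame (simplified model).

    local_peers:     other VTEPs in the source's own segment
    remote_segments: remote segments that contain members of the VNI
    """
    if mode == "unicast":
        # One unicast copy per local peer plus one per remote UTEP.
        return local_peers + remote_segments
    if mode == "hybrid":
        # One multicast send covers all local peers; one unicast per remote MTEP.
        return (1 if local_peers else 0) + remote_segments
    if mode == "multicast":
        # A single multicast send reaches local and remote VTEPs.
        return 1 if (local_peers or remote_segments) else 0
    raise ValueError(f"unknown mode: {mode}")


for mode in ("unicast", "hybrid", "multicast"):
    print(mode, source_transmissions(mode, local_peers=3, remote_segments=2))
```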
Hybrid Mode Example
[Figure: VM1, VM2, VM3, and VM4 on VXLAN 5001 across four vSphere hosts in VXLAN Transport Subnet A (10.20.10.0/24) and VXLAN Transport Subnet B (10.20.11.0/24), with the Controller Cluster.]
Replicate Locally Bit
• The VXLAN header has been updated for NSX for vSphere control
plane mode support
– The new REPLICATE_LOCALLY bit is used when Unicast or Hybrid Modes
are selected
– The source UTEP or MTEP sets the REPLICATE_LOCALLY bit to 1
– The receiving UTEP or MTEP sets the REPLICATE_LOCALLY bit to 0 and
replicates the frame in the local segment towards all local VTEPs
• Local replication is done in software based on
the control plane mode configured
– If the proxy is a UTEP, the replication is
done locally by Unicast
– If the proxy is an MTEP, the replication is done
locally by Multicast
VXLAN Header (8 bytes): VXLAN Flags (8 bits) | RSVD (24 bits) | VXLAN NI (24 bits) | RSVD (8 bits)
VXLAN Flags: RRRRILRR
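The bit handling described above can be sketched as follows. The masks assume that the "RRRRILRR" layout lists the flag bits from most significant to least significant and that "L" is the REPLICATE_LOCALLY bit; both are readings of the slide rather than confirmed values, and the function names are hypothetical.

```python
FLAG_VNI_VALID = 0x08          # "I" in RRRRILRR (assumed MSB-first)
FLAG_REPLICATE_LOCALLY = 0x04  # "L" in RRRRILRR (assumed MSB-first)


def mark_for_proxy(flags: int) -> int:
    """Source side: set REPLICATE_LOCALLY before sending to a remote proxy."""
    return flags | FLAG_REPLICATE_LOCALLY


def proxy_receive(flags: int) -> tuple:
    """Proxy side: clear the bit and report whether local replication is needed."""
    must_replicate = bool(flags & FLAG_REPLICATE_LOCALLY)
    cleared = flags & ~FLAG_REPLICATE_LOCALLY & 0xFF
    return cleared, must_replicate


sent = mark_for_proxy(FLAG_VNI_VALID)
print(f"{sent:08b}", proxy_receive(sent))
```

Clearing the bit before local replication means the copies delivered inside the segment are not replicated a second time by their receivers.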
Questions
VTEP Control Plane and
Packet Walks
Section 3
VTEP Report
• Each VTEP informs the NSX Controller of
each VNI that it is a member of; this is called
the VTEP Report
– The NSX Controller sends a copy of the full
VTEP table per VNI to each VTEP
– The NSX Controller uses the VTEP table
[Figure: Controller VXLAN Directory Service with its MAC table]
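A minimal model of the VTEP Report is sketched below: the controller records each reporting VTEP under its VNI and returns the full per-VNI table that would be sent to every member. The class and method names are hypothetical and do not correspond to any NSX API; the addresses are the VTEP IPs used in the examples that follow.

```python
from collections import defaultdict


class VtepDirectory:
    """Toy controller-side VTEP table, keyed by VNI."""

    def __init__(self):
        self.vtep_table = defaultdict(set)

    def vtep_report(self, vni: int, vtep_ip: str) -> dict:
        """Record that vtep_ip is a member of vni, then build the updates."""
        self.vtep_table[vni].add(vtep_ip)
        return self.updates_for(vni)

    def updates_for(self, vni: int) -> dict:
        """Full VTEP table for the VNI, addressed to each member VTEP."""
        table = sorted(self.vtep_table[vni])
        return {vtep: table for vtep in table}


directory = VtepDirectory()
for ip in ("10.20.10.10", "10.20.20.11", "10.20.30.12"):
    updates = directory.vtep_report(5001, ip)
print(updates)
```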
Controller Based VXLAN – VTEP Report (1)
[Figure, step 1: VM1 (10.10.10.101/24, MAC1), VM2 (10.10.10.102/24, MAC2), and VM3 (10.10.10.103/24, MAC3) are attached to Logical Switch VXLAN 5001 on three vSphere hosts with VTEPs 10.20.10.10/24 (MACA), 10.20.20.11/24 (MACB), and 10.20.30.12/24 (MACC). Each host holds a VNI/VTEP table entry for VNI 5001 with its own VTEP IP. NSX Manager and the Controllers are shown alongside the hosts; MACRT1 and MACRT2 sit in the Physical Network.]
Controller Based VXLAN – VTEP Report (2)
[Figure, step 2: same VMs, Logical Switch VXLAN 5001, and VTEPs as in step 1. The Controller maintains a copy of the VTEP Table; for VNI 5001 it lists 10.20.10.12, 10.20.10.11, and 10.20.10.10. A host table shows VNI 5001 with VTEP 10.20.10.10.]
Controller Based VXLAN – VTEP Report (3)
[Figure, step 3: same VMs, Logical Switch VXLAN 5001, and VTEPs as in step 1, with NSX Manager and the Controllers. Each host table lists VNI 5001 with its own VTEP IP (10.20.10.10, 10.20.20.11, 10.20.30.12); a further VNI/VTEP table for VNI 5001 lists 10.20.30.12 and 10.20.10.10.]
MAC Report
• VTEPs will send a copy of every learned MAC address in each VNI
segment to the NSX Controller; this is called a MAC Report
– The MAC report includes the VNI, MAC
address and the VTEP IP that reported it
– If an unknown unicast frame is received by a
VTEP, the VTEP will send a MAC table
request to the NSX Controller for the
destination MAC address
• If the NSX Controller has the MAC address in
the MAC table, it will reply to the VTEP with the
information on where to forward the frame
• If the NSX Controller does not have the MAC
address in the MAC table, the VTEP will then
flood the frame to other VTEPs
[Figure: Controller VXLAN Directory Service: MAC table, ARP table, VTEP table]
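The lookup-then-flood behaviour above can be sketched with a small table keyed by (VNI, MAC). The class and function names are hypothetical, and the model ignores timing, retries, and entry ageing.

```python
class MacDirectory:
    """Toy controller-side MAC table: (VNI, MAC) -> reporting VTEP IP."""

    def __init__(self):
        self.mac_table = {}

    def mac_report(self, vni: int, mac: str, vtep_ip: str) -> None:
        self.mac_table[(vni, mac)] = vtep_ip

    def lookup(self, vni: int, mac: str):
        return self.mac_table.get((vni, mac))


def forward_unknown_unicast(directory, vni, dst_mac, other_vteps) -> list:
    """Return the VTEPs that should receive the frame."""
    vtep = directory.lookup(vni, dst_mac)
    if vtep is not None:
        return [vtep]           # controller knows where the MAC lives
    return list(other_vteps)    # no entry: flood to the other VTEPs


directory = MacDirectory()
directory.mac_report(5001, "MAC1", "10.20.10.10")
peers = ["10.20.10.10", "10.20.30.12"]
print(forward_unknown_unicast(directory, 5001, "MAC1", peers))
print(forward_unknown_unicast(directory, 5001, "MAC3", peers))
```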
Controller Based VXLAN – MAC Report
[Figure: a host sends the VNI, VM MAC mapping, and VTEP IP to the Controller. The table with columns VNI, VM MAC, and VTEP gains the entry 5001, MAC1, 10.20.10.10. MACRT1 and MACRT2 sit in the Physical Network.]
IP Report
• The VTEPs will send a copy of each MAC address and IP mapping
they have; this is called the IP Report
– The NSX Controller will create an ARP table
with the information in the IP report
– The ARP table will include the MAC to IP
mapping and the IP of the VTEP that reported it
– The VTEP intercepts all ARP requests from
virtual machines
• If the VTEP does not have the MAC address in its
local ARP table, the VTEP queries the NSX
Controller for the information
• If the NSX Controller does not have the MAC
address in the ARP table, the logical switch will then
broadcast the frame to all virtual machines in the
same VNI that are running on the VTEP and to all
other VTEPs in the same VNI
[Figure: Controller VXLAN Directory Service: MAC table, ARP table, VTEP table]
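The ARP handling above amounts to a three-step decision: answer from the local ARP table, otherwise ask the Controller, otherwise broadcast. The sketch below models only that decision; the function name and the dictionary-based tables are hypothetical stand-ins for the real tables.

```python
def resolve_arp(vni: int, target_ip: str, local_arp: dict, controller_arp: dict) -> tuple:
    """Decide how an intercepted ARP request is answered.

    Both tables map (VNI, IP) -> MAC. Returns (source_of_answer, mac).
    """
    key = (vni, target_ip)
    if key in local_arp:
        return "local", local_arp[key]
    if key in controller_arp:
        return "controller", controller_arp[key]
    # Neither table has the mapping: fall back to broadcasting the request.
    return "broadcast", None


local_arp = {(5001, "10.10.10.101"): "MAC1"}
controller_arp = {
    (5001, "10.10.10.101"): "MAC1",
    (5001, "10.10.10.102"): "MAC2",
}
print(resolve_arp(5001, "10.10.10.101", local_arp, controller_arp))
print(resolve_arp(5001, "10.10.10.102", local_arp, controller_arp))
print(resolve_arp(5001, "10.10.10.103", local_arp, controller_arp))
```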
Controller Based VXLAN – IP Report
[Figure: a host sends the VM MAC, IP mapping, and VNI to the Controller. Tables with columns VNI, VM IP, and VM MAC gain the entry 5001, IP1, MAC1. MACRT1 and MACRT2 sit in the Physical Network.]
Controller Based VXLAN – ARP Request
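[Figure: a table with columns VNI, VM IP, and VM MAC holds the entry 5001, IP1, MAC1; another table has no MAC yet for IP2 in VNI 5001. The host sends an ARP query to the Controller, and the Controller sends an ARP query response that supplies MAC2. MACRT1 and MACRT2 sit in the Physical Network.]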
Communication After ARP Resolution (1)
[Figure, step 1: VM1 (10.10.10.101/24, MAC1) sends a frame to VM2 (10.10.10.102/24, MAC2) on Logical Switch VXLAN 5001. The VTEPs are 10.20.10.10/24 (MACA), 10.20.20.11/24 (MACB), and 10.20.30.12/24 (MACC); MACRT1 and MACRT2 sit in the Physical Network.]
Frame: DST MAC MAC2 | SRC MAC MAC1 | DST IP 10.10.10.102 | SRC IP 10.10.10.101
Communication After ARP Resolution (2)
[Figure, step 2: VM1 (10.10.10.101/24, MAC1) and VM2 (10.10.10.102/24, MAC2) on Logical Switch VXLAN 5001. vSphere Host A has VTEP 10.20.10.10/24 (MACA), vSphere Host B has VTEP 10.20.20.11/24 (MACB), and vSphere Host C has VTEP 10.20.30.12/24 (MACC); MACRT1 and MACRT2 sit in the Physical Network.]
Outer header: DST MAC MACRT1 | SRC MAC MACA | DST IP 10.20.20.11 | SRC IP 10.20.10.10 | VXLAN NI 5001
Original Frame: DST MAC MAC2 | SRC MAC MAC1 | DST IP 10.10.10.102 | SRC IP 10.10.10.101
Host A receives the frame and does a MAC Table and VTEP
Table lookup. Host A updates the MAC field of the original
frame, puts the original frame inside a VXLAN frame, and the
new frame gets sent to the Physical network.
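The lookup and encapsulation step on Host A can be sketched as below, using the addresses from this example. The function name and dictionary layout are hypothetical, MACRT1 is assumed to be the next-hop MAC towards the remote VTEP, and the sketch models only the table lookup and the outer header, carrying the original frame as the payload.

```python
def encapsulate(inner: dict, mac_table: dict, vni: int,
                local_vtep_ip: str, local_vtep_mac: str, next_hop_mac: str) -> dict:
    """Look up the destination VTEP and wrap the original frame."""
    # MAC table: (VNI, VM MAC) -> IP of the VTEP that reported the MAC.
    remote_vtep_ip = mac_table[(vni, inner["dst_mac"])]
    return {
        "dst_mac": next_hop_mac,
        "src_mac": local_vtep_mac,
        "dst_ip": remote_vtep_ip,
        "src_ip": local_vtep_ip,
        "vni": vni,
        "inner": inner,
    }


original = {
    "dst_mac": "MAC2", "src_mac": "MAC1",
    "dst_ip": "10.10.10.102", "src_ip": "10.10.10.101",
}
mac_table = {(5001, "MAC2"): "10.20.20.11"}
outer = encapsulate(original, mac_table, 5001, "10.20.10.10", "MACA", "MACRT1")
print(outer)
```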
Communication After ARP Resolution (3)
[Figure, step 3: VM1 (10.10.10.101/24, MAC1) and VM2 (10.10.10.102/24, MAC2) on Logical Switch VXLAN 5001. The VTEPs are 10.20.10.10/24 (MACA), 10.20.20.11/24 (MACB), and 10.20.30.12/24 (MACC); MACRT1 and MACRT2 sit in the Physical Network.]
Questions