
Logical Switching

© 2014 VMware Inc. All rights reserved.


Sections
Section 1
• Introduction to Virtual Extensible LAN

Section 2
• VXLAN Communication Modes

Section 3
• VTEP Control Plane and Packet Walks

VXLAN CONFIDENTIAL 2 | 38
Introduction to Virtual Extensible LAN
Section 1
NSX Logical Switching
[Figure: Logical Switch 1, Logical Switch 2, and Logical Switch 3 running on VMware NSX]

Challenges
• Per Application/Multi-tenant segmentation
• VM Mobility requires L2 everywhere
• Large L2 Physical Network Sprawl – STP Issues
• HW Memory (MAC, FIB) Table Limits

Benefits
• Scalable Multi-tenancy across data center
• Enabling L2 over L3 Infrastructure
• Overlay Based with VXLAN, STT, GRE, etc.
• Logical Switches span across Physical Hosts and Network Switches

LOGICAL SWITCHING – Using VXLAN to scale the Network

VXLAN Version Dependencies
• vCloud Networking and Security 5.5 uses the existing VXLAN
implementation from 5.1
• This presentation is focused on NSX for vSphere
– vSphere 5.5 is required to leverage new capabilities introduced with NSX for
vSphere
– ESXi 5.1 is supported by NSX for vSphere using the previous
implementation
– Upgrades from 5.1 are supported

Virtual Extensible LAN
• Virtual Extensible LAN, VXLAN, is an IP overlay technology that
eliminates virtual network segmentation
– Allows network boundary devices to extend virtual network boundaries over
physical IP networks
– Expands the number of available logical Ethernet segments from 4094 to
2^24, or over 16 million logical segments
– Encapsulates the source Ethernet frame in a new UDP packet with a
destination port of 8472*
– VXLAN is transparent to virtual machines
– Adds 50 bytes to original frame
– Developed in conjunction with Arista, Broadcom, Cisco, Citrix, Red Hat, and
others
– Submitted to IETF for standardization

*In April of 2013 IANA reserved UDP port 4789 for VXLAN
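
The segment counts quoted above can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only; the constant names are illustrative and do not come from the slides.

```python
# 802.1Q carries a 12-bit VLAN ID; IDs 0 and 4095 are reserved,
# which leaves 4094 usable segments.
VLAN_ID_BITS = 12
usable_vlans = 2**VLAN_ID_BITS - 2

# VXLAN carries a 24-bit segment identifier.
VNI_BITS = 24
vxlan_segments = 2**VNI_BITS

print(usable_vlans)    # 4094
print(vxlan_segments)  # 16777216
```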
VXLAN Frame Format
VXLAN Encapsulated Frame
• Outer Ethernet Header (14 bytes): Outer Dest MAC, Outer Source MAC, Optional 802.1Q EtherType, Optional Outer 802.1Q, EtherType
• Outer IP Header (20 bytes): IP Header Data*, Protocol, Header Checksum, Outer Source IP, Outer Dest IP
• Outer UDP Header (8 bytes): Source Port, Dest Port, UDP Length, UDP Checksum
• VXLAN Header (8 bytes): VXLAN Flags, RSVD, VXLAN NI (VNI), RSVD
• Inner Ethernet Frame: Inner Dest MAC, Inner Source MAC, Optional EtherType, Optional Inner 802.1Q, Original Ethernet Payload
• FCS

*IP Header Data = Version, IHL, TOS, Length, ID
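
As a rough illustration of the layout above, the sketch below packs the 8-byte VXLAN header and adds up the encapsulation overhead. The function and constant names are hypothetical, and the default flags value 0x08 (the "VNI valid" bit) is taken from the VXLAN standard rather than from this slide.

```python
import struct

OUTER_ETHERNET = 14  # bytes, without the optional outer 802.1Q tag
OUTER_IP = 20        # bytes, IPv4 header without options
OUTER_UDP = 8        # bytes


def vxlan_header(vni: int, flags: int = 0x08) -> bytes:
    """Pack flags (8 bits), reserved (24), VNI (24), reserved (8)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in 8 bits")
    # First 32-bit word: flags in the top byte, reserved bits left at zero.
    # Second 32-bit word: VNI in the top 24 bits, reserved byte left at zero.
    return struct.pack("!II", flags << 24, vni << 8)


header = vxlan_header(5001)
overhead = OUTER_ETHERNET + OUTER_IP + OUTER_UDP + len(header)

print(header.hex())  # 0800000000138900
print(overhead)      # 50
```

The 50-byte total matches the overhead quoted elsewhere in the deck; an outer 802.1Q tag would add 4 more bytes.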

Data Plane – NSX VXLAN Enhancements
• Support for multiple VXLAN vmknics per host
– Provides additional options for uplink load balancing

• Copying DSCP and CoS tag information from the inner Ethernet frame to
the outer headers of the VXLAN frame
– Allows physical network to enforce QoS policy based on the inner Ethernet
frame’s marking
• Guest VLAN tagging
• vMotion callback
• Dedicated TCP/IP stack for VXLAN
• Network adapter VXLAN offloading
– Supports VXLAN offloading on NICs that are capable of it

Control Plane – NSX VXLAN Enhancements
• Highly available and secure control plane to distribute VXLAN network
information to ESXi hosts
• Can operate in a non-Multicast configured physical network
– NSX can leverage multicast if it is configured in the physical network

• Suppression of Broadcast traffic


– Has ARP directory service and cache

VXLAN Terms
• A VXLAN Tunnel End Point, VTEP, is an entity that encapsulates an
Ethernet frame inside a VXLAN frame or decapsulates a VXLAN frame
and then forwards the inner Ethernet frame
• A VTEP Proxy is a VTEP that receives VXLAN traffic from another VTEP
in a remote segment and forwards it in its local segment
• A Transport Zone defines the members or VTEPs of the VXLAN
overlay
– Can include multiple vSphere host clusters
– A cluster can be part of multiple transport zones

• A VXLAN Network Identifier, VNI, is a 24-bit number that gets added to
the VXLAN frame
– It uniquely identifies the segment the inner Ethernet frame belongs to
– Multiple different VNIs can exist in the same transport zone
– NSX for vSphere starts with VNI 5000

Logical Switch
• The logical switch is a virtual network segment that has been identified
with a VNI
– Each logical switch gets its own unique VNI
– A VXLAN VDS portgroup gets created in all the VTEPs in the same
transport zone where the logical switch is created
– Virtual machines’ vNICs get connected to logical switches

• Logical switches support mobility and availability features in vSphere
such as
– vMotion
– VM HA
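
The relationship between transport zones, logical switches, and VNIs can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the `SegmentPool` helper, and the zone and cluster names are hypothetical, and the real segment ID range is whatever the administrator configures.

```python
from dataclasses import dataclass, field
from itertools import count

FIRST_VNI = 5000       # NSX for vSphere starts with VNI 5000
LAST_VNI = 2**24 - 1   # largest value a 24-bit VNI can hold


@dataclass
class TransportZone:
    name: str
    clusters: set = field(default_factory=set)


@dataclass
class LogicalSwitch:
    name: str
    vni: int
    zone: TransportZone


class SegmentPool:
    """Hands out one unique VNI per logical switch (hypothetical helper)."""

    def __init__(self, first=FIRST_VNI, last=LAST_VNI):
        self._ids = count(first)
        self._last = last

    def create_switch(self, name, zone):
        vni = next(self._ids)
        if vni > self._last:
            raise RuntimeError("segment ID pool exhausted")
        return LogicalSwitch(name, vni, zone)


zone = TransportZone("tz-1", {"cluster-a", "cluster-b"})
pool = SegmentPool()
web = pool.create_switch("web", zone)
app = pool.create_switch("app", zone)
print(web.vni, app.vni)  # 5000 5001
```

Both switches share one transport zone but carry different VNIs, which is what keeps their traffic separate on the same set of VTEPs.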

VDS Enhancements for NSX
• Preparing hosts for VXLAN on NSX Manager installs kernel and user
space modules
• NSX Manager
– Pushes VIBs
– UI / API end point
– Controller info
– VXLAN info to Controller
• Controller Cluster
– VTEP Table
– MAC Table
– ARP Table
– UTEP/MTEP per VXLAN
[Figure: ESXi host with a VDS, split into user and kernel space. Labels shown in user space: netcpa (uwa), Controller Connections, vsfwd, Core, VXLAN, Routing. Labels shown in the kernel: VXLAN, Routing. Link labels: TCP over SSL, JSON over SSL, socket, vmklink.]



vSphere Cluster – VXLAN Preparation
• The preparation of a vSphere cluster to support VXLAN is divided into
two steps:
– Install
• The NSX Manager pushes the hypervisor kernel modules to the cluster’s ESXi
hosts
• The hypervisor kernel modules enable the ESXi hosts to support VXLAN, the logical
switch, the distributed router and the distributed firewall
– Configure
• Creates the VXLAN vmknic interfaces that will be used as VTEP interfaces
– It is recommended that an MTU size of 1600 be used to allow for the VXLAN overhead of 50
bytes
• Segment ID or VNI
• Transport Zone defines the members or VTEPs of the VXLAN overlay
– Can include ESXi hosts from different vSphere clusters
– A cluster can be part of multiple transport zones
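
The MTU recommendation above comes down to simple arithmetic, sketched below. The function name is hypothetical and the accounting is the simplified "guest MTU plus 50 bytes" model used on the slide; it ignores optional 802.1Q tags.

```python
VXLAN_OVERHEAD = 50               # bytes added by the encapsulation
RECOMMENDED_TRANSPORT_MTU = 1600  # value recommended on the slide


def fits_without_fragmentation(guest_mtu: int,
                               transport_mtu: int = RECOMMENDED_TRANSPORT_MTU) -> bool:
    """True if an encapsulated full-size guest packet fits the transport MTU."""
    return guest_mtu + VXLAN_OVERHEAD <= transport_mtu


print(fits_without_fragmentation(1500))        # True: 1550 <= 1600
print(fits_without_fragmentation(1500, 1500))  # False: 1550 > 1500
```

With the default guest MTU of 1500, a transport MTU of 1600 leaves 50 bytes of headroom beyond the encapsulation overhead.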

Questions

VXLAN Communication Modes
Section 2
VXLAN Control Plane Modes
• Three Control Plane modes are supported in NSX for vSphere for
broadcast, unknown unicast and multicast (BUM) traffic:
– Unicast
– Hybrid
– Multicast

• Controller selects one VTEP per remote segment from VTEP table to
implement proxy
– In Unicast Mode the proxy VTEP is referred to as a UTEP (Unicast Tunnel End Point)
– In Hybrid Mode the proxy VTEP is referred to as an MTEP (Multicast Tunnel End
Point)
• If a UTEP or MTEP leaves a VNI the Controller will select a new proxy
within the segment and update the participating VTEPs
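
The proxy selection described above can be sketched as grouping VTEPs by transport subnet and picking one per group. The slides do not say how the Controller chooses, so the "lowest IP" rule here is an arbitrary stand-in, and the /24 prefix length and function name are assumptions. The addresses are borrowed from the example slides in this deck.

```python
import ipaddress
from collections import defaultdict


def select_proxies(vteps, prefix_len=24):
    """Pick one proxy VTEP per segment (lowest IP: a placeholder policy)."""
    by_segment = defaultdict(list)
    for ip in vteps:
        network = ipaddress.ip_interface(f"{ip}/{prefix_len}").network
        by_segment[network].append(ipaddress.ip_address(ip))
    return {str(net): str(min(members)) for net, members in by_segment.items()}


vteps = ["10.20.10.10", "10.20.10.11", "10.20.11.10", "10.20.11.11"]
proxies = select_proxies(vteps)
print(proxies)
# {'10.20.10.0/24': '10.20.10.10', '10.20.11.0/24': '10.20.11.10'}

# A proxy leaves the VNI: re-run the selection on the remaining VTEPs.
vteps.remove("10.20.10.10")
proxies_after = select_proxies(vteps)
print(proxies_after["10.20.10.0/24"])  # 10.20.10.11
```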

Unicast Mode
• Source VTEP:
– Replicates encapsulated frame to each local VTEP via Unicast
– Replicates encapsulated frame to each remote UTEP via Unicast

• Destination UTEP:
– Receives the encapsulated frame from the source VTEP
– Replicates encapsulated frame to each local VTEP via Unicast

• Unicast Mode Considerations:


– No Multicast configuration needed on the physical network
– Higher overhead on the Source VTEP and UTEP
– Configurable per VNI during Logical Switch provisioning
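
The replication rules for Unicast Mode can be sketched as two small functions: one for the VTEP where the frame originates and one for a UTEP that receives it. The function names and the segment labels "A" and "B" are hypothetical; the addresses match the example slide for this mode.

```python
def local_segment(vtep, segments):
    """Return the name of the segment that contains the given VTEP."""
    return next(name for name, members in segments.items() if vtep in members)


def source_targets(source, segments, uteps):
    """Unicast copies sent by the VTEP where the frame originates."""
    local = local_segment(source, segments)
    targets = [v for v in segments[local] if v != source]           # each local VTEP
    targets += [u for name, u in uteps.items() if name != local]    # one UTEP per remote segment
    return targets


def utep_targets(utep, segments, sender):
    """Unicast copies a receiving UTEP sends inside its own segment."""
    local = local_segment(utep, segments)
    return [v for v in segments[local] if v not in (utep, sender)]


segments = {
    "A": ["10.20.10.10", "10.20.10.11"],
    "B": ["10.20.11.10", "10.20.11.11"],
}
uteps = {"A": "10.20.10.10", "B": "10.20.11.10"}

print(source_targets("10.20.10.11", segments, uteps))
# ['10.20.10.10', '10.20.11.10']
print(utep_targets("10.20.11.10", segments, sender="10.20.10.11"))
# ['10.20.11.11']
```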

Unicast Mode Example
[Figure: VM1, VM2, VM3, and VM4 on VXLAN 5001, connected through a vSphere Distributed Switch across four vSphere hosts, with a Controller Cluster and the VXLAN physical transport network. Legend: Unicast Traffic.]
• VXLAN Transport Subnet A 10.20.10.0/24
– VTEP1 10.20.10.10 (UTEP)
– VTEP2 10.20.10.11 (VTEP)
• VXLAN Transport Subnet B 10.20.11.0/24
– VTEP3 10.20.11.10 (UTEP)
– VTEP4 10.20.11.11 (VTEP)

Multicast Mode
• Source VTEP:
– Replicates encapsulated frame to each remote VTEP via Multicast
– Replicates encapsulated frame to each local VTEP via Multicast

• No UTEP or MTEP roles


• Multicast Mode Considerations:
– IGMP and IGMP Snooping configuration needed on the physical network
– Multicast address required over physical network
– Lowest overhead on the Source VTEP
– Configurable per VNI during Logical Switch provisioning

Multicast Mode Example
[Figure: VM1, VM2, VM3, and VM4 on VXLAN 5001, connected through a vSphere Distributed Switch across four vSphere hosts, with a Controller Cluster and the VXLAN physical transport network. Legend: Multicast Traffic.]
• VXLAN Transport Subnet A 10.20.10.0/24
– VTEP1 10.20.10.10 (VTEP)
– VTEP2 10.20.10.11 (VTEP)
• VXLAN Transport Subnet B 10.20.11.0/24
– VTEP3 10.20.11.10 (VTEP)
– VTEP4 10.20.11.11 (VTEP)

Hybrid Mode
• Source MTEP:
– Replicates encapsulated frame to each remote MTEP via Unicast
– Replicates encapsulated frame to each local VTEP via Multicast

• Destination MTEP Role:


– Receives the encapsulated frame from the source MTEP
– Replicates encapsulated frame to each local VTEP via Multicast

• Hybrid Mode Considerations:


– IGMP Snooping configuration needed on the physical network
– VTEPs will send IGMP Joins and IGMP Reports
– Multicast address required over physical network
– Configurable per VNI during Logical Switch provisioning
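
One way to compare the three modes is to count how many copies of a BUM frame the originating host itself has to transmit. The sketch below is a simplified model, not a measurement: it assumes the physical network handles all multicast replication, and the function and parameter names are hypothetical.

```python
def source_transmissions(mode: str, local_peers: int, remote_segments: int) -> int:
    """Copies the originating host sends for one BUM frame (simplified model).

    local_peers: other VTEPs in the source's own segment.
    remote_segments: segments other than the source's that contain VTEPs.
    """
    if local_peers == 0 and remote_segments == 0:
        return 0
    if mode == "unicast":
        # One unicast copy per local VTEP plus one per remote UTEP.
        return local_peers + remote_segments
    if mode == "hybrid":
        # One multicast copy for the local segment plus one unicast per remote MTEP.
        return (1 if local_peers else 0) + remote_segments
    if mode == "multicast":
        # A single multicast copy; the physical network replicates it.
        return 1
    raise ValueError(f"unknown mode: {mode}")


for mode in ("unicast", "hybrid", "multicast"):
    print(mode, source_transmissions(mode, local_peers=9, remote_segments=3))
# unicast 12
# hybrid 4
# multicast 1
```

The numbers line up with the considerations listed for each mode: Unicast puts the most work on the source, Multicast the least, and Hybrid sits in between.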

Hybrid Mode Example
[Figure: VM1, VM2, VM3, and VM4 on VXLAN 5001, connected through a vSphere Distributed Switch across four vSphere hosts, with a Controller Cluster and the VXLAN physical transport network. Legend: Unicast Traffic, Multicast Traffic.]
• VXLAN Transport Subnet A 10.20.10.0/24
– VTEP1 10.20.10.10 (MTEP)
– VTEP2 10.20.10.11 (VTEP)
• VXLAN Transport Subnet B 10.20.11.0/24
– VTEP3 10.20.11.10 (MTEP)
– VTEP4 10.20.11.11 (VTEP)

Replicate Locally Bit
• The VXLAN header has been updated for NSX for vSphere control
plane mode support
– The new REPLICATE_LOCALLY bit is used when Unicast or Hybrid Modes
are selected
– The source UTEP or MTEP sets the REPLICATE_LOCALLY bit to 1
– The receiving UTEP or MTEP sets the REPLICATE_LOCALLY bit to 0 and
replicates the frame in the local segment towards all local VTEPs
• Local replication is done in software based on the control plane mode
configured
– If the proxy is a UTEP, the replication is done locally by Unicast
– If the proxy is an MTEP, the replication is done locally by Multicast

VXLAN Header (8 bytes)
• VXLAN Flags (8 bits)
• RSVD (24 bits)
• VXLAN NI (24 bits)
• RSVD (8 bits)
VXLAN Flags bits: RRRRILRR
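
The handling of the REPLICATE_LOCALLY bit can be sketched with plain bit operations. This assumes the flag string RRRRILRR is written most significant bit first, so that "I" is 0x08 and "L" is 0x04; the function names are hypothetical.

```python
VNI_VALID = 0x08          # "I" in RRRRILRR (assumed MSB-first)
REPLICATE_LOCALLY = 0x04  # "L" in RRRRILRR (assumed MSB-first)


def mark_for_proxy(flags: int) -> int:
    """Sender side: ask the receiving proxy to replicate in its segment."""
    return flags | REPLICATE_LOCALLY


def clear_before_local_replication(flags: int) -> int:
    """Proxy side: reset the bit so local VTEPs do not replicate again."""
    return flags & ~REPLICATE_LOCALLY & 0xFF


def must_replicate(flags: int) -> bool:
    return bool(flags & REPLICATE_LOCALLY)


sent = mark_for_proxy(VNI_VALID)
relayed = clear_before_local_replication(sent)
print(f"{sent:08b}", must_replicate(sent))        # 00001100 True
print(f"{relayed:08b}", must_replicate(relayed))  # 00001000 False
```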

Questions

VTEP Control Plane and Packet Walks
Section 3
VTEP Report
• Each VTEP informs the NSX Controller of each VNI that it is a member of;
this is called the VTEP Report
– The NSX Controller sends a copy of the full VTEP table per VNI to each VTEP
– The NSX Controller uses the VTEP table generated from the VTEP report to
select the proxy VTEPs
• Based on the control plane mode configured, the proxies will be UTEPs or MTEPs
[Figure: Controller VXLAN Directory Service holding the MAC table, ARP table, and VTEP table]
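
The VTEP table kept by the Controller can be pictured as a mapping from VNI to the set of member VTEPs. The class below is a minimal sketch with hypothetical names; it returns the full per-VNI table on every report, standing in for the copy the Controller sends out.

```python
from collections import defaultdict


class VtepTable:
    """Controller-side view of which VTEPs belong to which VNI (illustrative)."""

    def __init__(self):
        self._members = defaultdict(set)  # vni -> set of VTEP IPs

    def report(self, vni: int, vtep_ip: str) -> list:
        """Record a VTEP report and return the full table for that VNI."""
        self._members[vni].add(vtep_ip)
        return sorted(self._members[vni])

    def leave(self, vni: int, vtep_ip: str) -> list:
        """Remove a VTEP from a VNI and return the updated table."""
        self._members[vni].discard(vtep_ip)
        return sorted(self._members[vni])


table = VtepTable()
table.report(5001, "10.20.10.10")
table.report(5001, "10.20.20.11")
full = table.report(5001, "10.20.30.12")
print(full)  # ['10.20.10.10', '10.20.20.11', '10.20.30.12']
```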

Controller Based VXLAN – VTEP Report (1)
[Figure: three vSphere hosts on the VXLAN 5001 logical switch, the NSX Manager and NSX Controllers, and routers MACRT1 and MACRT2 in the physical network]

VM    VM IP             VM MAC   VTEP IP          VTEP MAC
VM1   10.10.10.101/24   MAC1     10.20.10.10/24   MACA
VM2   10.10.10.102/24   MAC2     10.20.20.11/24   MACB
VM3   10.10.10.103/24   MAC3     10.20.30.12/24   MACC

Step 1: VTEPs register their VNIs with the controller. Each host reports its own entry:

VNI    VTEP
5001   10.20.10.10
5001   10.20.20.11
5001   10.20.30.12

Controller Based VXLAN – VTEP Report (2)
[Figure: same hosts, VMs, and addresses as in VTEP Report (1)]

Step 2: Controller maintains a copy of the VTEP Table.

VNI    VTEP
5001   10.20.10.10
5001   10.20.20.11
5001   10.20.30.12

Controller Based VXLAN – VTEP Report (3)
[Figure: same hosts, VMs, and addresses as in VTEP Report (1)]

Step 3: Controller sends a copy of the VTEP table to each VTEP.

Controller table
VNI    VTEP          Proxy
5001   10.20.10.10   Yes
5001   10.20.20.11   Yes
5001   10.20.30.12   Yes

Copy held by each VTEP
VNI    VTEP
5001   10.20.10.10
5001   10.20.20.11
5001   10.20.30.12

MAC Report
• VTEPs will send a copy of every learned MAC address in each VNI
segment to the NSX Controller; this is called a MAC Report
– The MAC report includes the VNI, MAC address and the VTEP IP that reported it
– If an unknown unicast frame is received by a VTEP, the VTEP will send a
MAC table request to the NSX Controller for the destination MAC address
• If the NSX Controller has the MAC address in the MAC table, it will reply to
the VTEP with the information on where to forward the frame
• If the NSX Controller does not have the MAC address in the MAC table, the
VTEP will then flood the frame to other VTEPs
[Figure: Controller VXLAN Directory Service holding the MAC table, ARP table, and VTEP table]
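
The lookup-then-flood behaviour described above can be sketched as follows. The class and function names are hypothetical, and the table is reduced to a plain dictionary keyed by VNI and MAC address.

```python
class MacTable:
    """Controller-side MAC table built from MAC reports (illustrative)."""

    def __init__(self):
        self._entries = {}  # (vni, mac) -> IP of the VTEP that reported it

    def mac_report(self, vni, mac, vtep_ip):
        self._entries[(vni, mac)] = vtep_ip

    def lookup(self, vni, mac):
        return self._entries.get((vni, mac))


def handle_unknown_unicast(controller, vni, dst_mac):
    """Ask the controller first; flood only if it has no entry."""
    vtep_ip = controller.lookup(vni, dst_mac)
    if vtep_ip is not None:
        return ("unicast", vtep_ip)
    return ("flood", None)


controller = MacTable()
controller.mac_report(5001, "MAC1", "10.20.10.10")
controller.mac_report(5001, "MAC2", "10.20.20.11")

print(handle_unknown_unicast(controller, 5001, "MAC2"))  # ('unicast', '10.20.20.11')
print(handle_unknown_unicast(controller, 5001, "MAC9"))  # ('flood', None)
```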

Controller Based VXLAN – MAC Report
[Figure: VM1 (MAC1) and VM2 (MAC2) on VXLAN 5001; vSphere hosts with VTEPs 10.20.10.10, 10.20.20.11, and 10.20.30.12 on a vSphere Distributed Switch; Controller; routers MACRT1 and MACRT2 in the physical network; numbered steps 1 to 8]

Each VTEP: Send VNI, VM MAC Mapping and VTEP IP to Controller

Host tables
VTEP          VNI    VM MAC
10.20.10.10   5001   MAC1
10.20.20.11   5001   MAC2

Controller MAC table
VNI    VM MAC   VTEP
5001   MAC1     10.20.10.10
5001   MAC2     10.20.20.11

IP Report
• The VTEPs will send a copy of each MAC address and IP mapping
they have; this is called the IP report
– The NSX Controller will create an ARP table with the information in the IP report
– The ARP table will include the MAC to IP mapping and the VTEP’s IP that reported it
– The VTEP intercepts all ARP requests from virtual machines
• If the VTEP does not have the MAC address in its local ARP table, the VTEP
queries the NSX Controller for the information
• If the NSX Controller does not have the MAC address in the ARP table, the
logical switch will then broadcast the frame to all virtual machines in the
same VNI that are running on that VTEP and to all other VTEPs in the same VNI
[Figure: Controller VXLAN Directory Service holding the MAC table, ARP table, and VTEP table]
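
The ARP handling above follows a three-step fallback: local table, then Controller, then broadcast. The sketch below models that order with hypothetical names; caching the Controller's answer locally is an assumption suggested by the "ARP directory service and cache" bullet earlier in the deck.

```python
class ArpDirectory:
    """Controller-side ARP table built from IP reports (illustrative)."""

    def __init__(self):
        self._entries = {}  # (vni, ip) -> (mac, reporting VTEP IP)

    def ip_report(self, vni, ip, mac, vtep_ip):
        self._entries[(vni, ip)] = (mac, vtep_ip)

    def query(self, vni, ip):
        return self._entries.get((vni, ip))


def handle_arp_request(local_cache, directory, vni, target_ip):
    """Decide how a VTEP answers an intercepted ARP request."""
    mac = local_cache.get((vni, target_ip))
    if mac is not None:
        return ("local", mac)
    entry = directory.query(vni, target_ip)
    if entry is not None:
        mac, _vtep_ip = entry
        local_cache[(vni, target_ip)] = mac  # assumed: cache for later requests
        return ("controller", mac)
    return ("broadcast", None)


directory = ArpDirectory()
directory.ip_report(5001, "10.10.10.102", "MAC2", "10.20.20.11")
cache = {}

first = handle_arp_request(cache, directory, 5001, "10.10.10.102")
second = handle_arp_request(cache, directory, 5001, "10.10.10.102")
miss = handle_arp_request(cache, directory, 5001, "10.10.10.199")
print(first, second, miss)
# ('controller', 'MAC2') ('local', 'MAC2') ('broadcast', None)
```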

Controller Based VXLAN – IP Report
[Figure: VM1 (IP1, MAC1) and VM2 (IP2, MAC2) on VXLAN 5001; vSphere hosts with VTEPs 10.20.10.10, 10.20.20.11, and 10.20.30.12 on a vSphere Distributed Switch; Controller; routers MACRT1 and MACRT2 in the physical network; numbered steps 1 to 8]

Each VTEP: Send VM MAC, IP Mapping and VNI to Controller

Host tables
VNI    VM IP   VM MAC
5001   IP1     MAC1
5001   IP2     MAC2

Controller ARP table
VNI    VM IP   VM MAC
5001   IP1     MAC1
5001   IP2     MAC2

Controller Based VXLAN – ARP Request
[Figure: VM1 (IP1, MAC1) and VM2 (IP2, MAC2) on VXLAN 5001; vSphere hosts with VTEPs 10.20.10.10, 10.20.20.11, and 10.20.30.12 on a vSphere Distributed Switch; Controller; routers MACRT1 and MACRT2 in the physical network; numbered steps 1 to 6]

Captions: Send ARP Query to Controller; Send ARP Query response

Controller ARP table
VNI    VM IP   VM MAC
5001   IP1     MAC1
5001   IP2     MAC2

Requesting host's table
VNI    VM IP   VM MAC
5001   IP2     ??? (MAC2 after the query response)

Communication After ARP Resolution (1)
[Figure: VM1 (10.10.10.101/24, MAC1) and VM2 (10.10.10.102/24, MAC2) on the VXLAN 5001 logical switch; vSphere Host A (VTEP 10.20.10.10/24, MACA), vSphere Host B (VTEP 10.20.20.11/24, MACB), vSphere Host C (VTEP 10.20.30.12/24, MACC); routers MACRT1 and MACRT2 in the physical network]

Step 1: VM1 sends a frame to VM2, which is in the same Logical Switch as VM1.
Note: Assumes VM1 already knows VM2’s MAC.

Frame sent by VM1
DST MAC   SRC MAC   DST IP         SRC IP
MAC2      MAC1      10.10.10.102   10.10.10.101

Communication After ARP Resolution (2)
[Figure: VM1 (10.10.10.101/24, MAC1) and VM2 (10.10.10.102/24, MAC2) on the VXLAN 5001 logical switch; vSphere Host A (VTEP 10.20.10.10/24, MACA), vSphere Host B (VTEP 10.20.20.11/24, MACB), vSphere Host C (VTEP 10.20.30.12/24, MACC); routers MACRT1 and MACRT2 in the physical network]

Step 2: Host A receives the frame and does a MAC Table and VTEP
Table lookup. Host A updates the MAC field of the original
frame, puts the original frame inside a VXLAN frame, and the
new frame gets sent to the Physical network.

VXLAN frame sent by Host A
Outer header     DST MAC   SRC MAC   DST IP         SRC IP         VXLAN NI
                 MACRT1    MACA      10.20.20.11    10.20.10.10    5001
Original frame   DST MAC   SRC MAC   DST IP         SRC IP
                 MAC2      MAC1      10.10.10.102   10.10.10.101
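
The encapsulation step in this walk can be sketched with two small record types. Everything here is illustrative: the class and function names are hypothetical, the MAC table is a plain dictionary, and the next-hop MAC is passed in directly instead of being resolved. The values are the ones shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_mac: str
    dst_ip: str
    src_ip: str


@dataclass(frozen=True)
class VxlanFrame:
    outer: Frame
    vni: int
    inner: Frame


def encapsulate(inner, vni, mac_table, local_vtep_ip, local_vtep_mac, next_hop_mac):
    """Wrap a VM frame for transport to the VTEP that owns the destination MAC."""
    remote_vtep_ip = mac_table[(vni, inner.dst_mac)]  # MAC table lookup
    outer = Frame(
        dst_mac=next_hop_mac,    # router MAC, since the VTEPs sit in different subnets
        src_mac=local_vtep_mac,
        dst_ip=remote_vtep_ip,
        src_ip=local_vtep_ip,
    )
    return VxlanFrame(outer=outer, vni=vni, inner=inner)


mac_table = {(5001, "MAC2"): "10.20.20.11"}
vm_frame = Frame("MAC2", "MAC1", "10.10.10.102", "10.10.10.101")
packet = encapsulate(vm_frame, 5001, mac_table,
                     local_vtep_ip="10.20.10.10",
                     local_vtep_mac="MACA",
                     next_hop_mac="MACRT1")
print(packet.outer)
# Frame(dst_mac='MACRT1', src_mac='MACA', dst_ip='10.20.20.11', src_ip='10.20.10.10')
```

In this sketch the inner frame is carried unchanged, which matches the MAC2 and MAC1 values shown for the original frame in the figure.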

Communication After ARP Resolution (3)
[Figure: VM1 (10.10.10.101/24, MAC1) and VM2 (10.10.10.102/24, MAC2) on the VXLAN 5001 logical switch; vSphere Host A (VTEP 10.20.10.10/24, MACA), vSphere Host B (VTEP 10.20.20.11/24, MACB), vSphere Host C (VTEP 10.20.30.12/24, MACC); routers MACRT1 and MACRT2 in the physical network]

Step 3: The physical switch has the destination MAC address, MACRT1, in its
MAC Table and forwards it to the router.

VXLAN frame in transit
Outer header     DST MAC   SRC MAC   DST IP         SRC IP         VXLAN NI
                 MACRT1    MACA      10.20.20.11    10.20.10.10    5001
Original frame   DST MAC   SRC MAC   DST IP         SRC IP
                 MAC2      MAC1      10.10.10.102   10.10.10.101

Questions
