2 Table of Contents
  Data Center IP Fabric: ‘Building a Strong Foundation’
  What is ‘Network Virtualization’?
  VXLAN Overview
  VXLAN Packet Details
  VXLAN Terminology
  VXLAN Host Discovery
  VXLAN BUM Traffic Handling
  VXLAN Layer 2 & Layer 3 Terminologies
  VXLAN Arista Architecture & Vision
  VXLAN Roadmap
  VXLAN Visibility
3 Data Center – ‘IP Fabric’ Building A Strong Foundation
4 Challenges with current network architecture
  [Figure: Legacy Data Center Model, North to South, four Layer 2 domains]
  Oversubscription
    Ports on devices are oversubscribed ~ 8:1
    Higher oversubscription as traffic traverses north ~ 20:1
  Scalability
    Scales up, not out
    Dependent on specific hardware (mix & match)
    Not scalable to 40GbE / 100GbE
  Cost
    With multiple layers, it can get $$$
  Mobility
    What happens if my “IP” changes?
    What happens if the traffic pattern changes?
  Latency
    High latency
    Low predictability
  Multiple points of management, rampant oversubscription, wasteful cost model
5 Data Center ‘IP Fabric’
  Support for an 80:20 East/West traffic pattern
  Scale up to 64-way ECMP spine designs
  All uplinks from the ToR are Active/Active
  Support 100,000s of host ports
  Non-blocking / non-oversubscribed architecture
  Deploy L3 routing protocols between leaf & spine, e.g. BGP, OSPF, or IS-IS
  Everything is only 3 hops away!
  Provide network mobility via an ‘Overlay Network’
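To make "non-oversubscribed" concrete, the sketch below computes a leaf switch's oversubscription ratio as host-facing bandwidth divided by uplink bandwidth. The port counts and speeds are hypothetical examples, not figures from these slides.

```python
from fractions import Fraction


def oversubscription(host_ports: int, host_gbps: int,
                     uplinks: int, uplink_gbps: int) -> Fraction:
    """Ratio of host-facing bandwidth to uplink bandwidth on one leaf."""
    downstream = host_ports * host_gbps
    upstream = uplinks * uplink_gbps
    return Fraction(downstream, upstream)


# Hypothetical leaf: 48 x 10GbE host ports, 4 x 40GbE uplinks (one per spine).
ratio = oversubscription(48, 10, 4, 40)
print(f"{ratio.numerator}:{ratio.denominator}")  # 3:1

# Hypothetical non-blocking leaf: host bandwidth equals uplink bandwidth.
print(oversubscription(16, 10, 4, 40) == 1)  # True
```

A ratio of 1:1 corresponds to the non-blocking case; the legacy figures of 8:1 and 20:1 quoted earlier are the same ratio measured at lower and higher tiers.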
6 Arista – Spine/Leaf “IP Fabric” Architecture
  [Figure: IP Fabric with Spine Tier and Leaf Tier; VTEP1 to VTEP4; Hypervisor 1, Hypervisor 2, bare-metal servers, storage; VMs A1, B1, A2, B2]
  Network core is an IP fabric laid out in a leaf-spine architecture running ECMP between the two tiers
  Leaf switches: Arista 7150-x or 7050Q-x models are deployed at the ToR, connecting virtualized servers, bare-metal servers, storage arrays and other devices
  Spine switches: Arista 7500s are deployed at the core
  Routing protocol: either an EGP (BGP) or an IGP (OSPF / IS-IS) is run in the IP fabric
8 What is Network Virtualization?
  Network Virtualization is not the same as Server Virtualization!
  As the above figure demonstrates, server virtualization is the partitioning of physical server resources, such as memory, I/O, storage and CPU. These resources are confined to the physical construct of a single device and share little or no distributed state.
  A network switch's CPU, ASIC, TCAM, and forwarding plane cannot be partitioned in the same way, because a network switch shares distributed state information with other devices in order to build an efficient forwarding path through all of them.
9 Overlays v Underlays
  [Figure: Overlay Network on top of Physical Infrastructure, i.e. Underlay Network]
  Network virtualization: the ability to separate, abstract and decouple the physical topology from a ‘logical’ or ‘virtual’ topology by using encapsulated tunneling. This logical network topology is often referred to as an ‘Overlay Network’.
  VXLAN disassociates workloads from physical networks, allowing for a possible transition to cloud-based providers
10 Types of ‘Overlay’ Technologies
  Any overlay technology uses Location & Identity separation
  Technologies compared: Fabric Path, VXLAN, OTV, LISP
  Underlay Protocol: IS-IS | BGP, OSPF, IS-IS
  Location: Switch-ID | IP address
  Identity: Client MAC | Client IP / MAC
  Identity Learning: Flooding | Flooding / Dynamic learning | Mapping DB
  Vendor Proprietary: Yes | No
  Intra and/or Inter DC: Intra | Both | Inter
12 Virtual Extensible Local Area Network (VXLAN)
  Ethernet-in-IP overlay network
    Entire L2 frame encapsulated in UDP
    50 bytes of overhead
  Includes a 24-bit VXLAN Identifier
    16 M logical networks
  VXLAN can cross Layer 3
  Tunnel between ESX hosts
    VMs do NOT see the VXLAN ID
  IP multicast used for L2 broadcast/multicast and unknown unicast
  Technology submitted to the IETF for standardization
    With Arista, VMware, Red Hat, Citrix, Cisco, and others
  [Figure: VXLAN Encapsulation: Outer MAC DA | Outer MAC SA | Outer 802.1Q | Outer IP DA | Outer IP SA | Outer UDP | VXLAN ID (24 bits) | Original Ethernet Frame | CRC]
  [Figure: Original Ethernet Frame: Inner MAC DA | Inner MAC SA | Optional Inner 802.1Q | Original Ethernet Payload]
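The 50-byte overhead and the 16 M logical networks both follow from simple arithmetic. The sketch below adds up the outer headers, assuming an IPv4 outer header without options; the constant and function names are illustrative only.

```python
# Header sizes in bytes for the IPv4 outer encapsulation.
OUTER_ETHERNET = 14  # destination MAC (6) + source MAC (6) + EtherType (2)
OUTER_DOT1Q = 4      # optional outer 802.1Q VLAN tag
OUTER_IPV4 = 20      # IPv4 header without options (assumption)
OUTER_UDP = 8
VXLAN_HEADER = 8
VNI_BITS = 24


def vxlan_overhead(outer_vlan_tag: bool = False) -> int:
    """Bytes added in front of the original Ethernet frame."""
    total = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
    if outer_vlan_tag:
        total += OUTER_DOT1Q
    return total


def segment_count(vni_bits: int = VNI_BITS) -> int:
    """Number of distinct logical networks a VNI of this width can name."""
    return 2 ** vni_bits


print(vxlan_overhead())      # 50
print(vxlan_overhead(True))  # 54
print(segment_count())       # 16777216
```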
13 Virtual eXtensible LAN: How does it work?
  [Figure: VM-1 and VM-2 (/24) in one Layer 2 domain, vWire VNI 10; software VTEP and hardware VTEP performing VXLAN encap/decap; Subnet-A and Subnet-B; VXLAN frames: MAC & IP are UDP encapsulated]
  VXLAN can also be described as a tunneling scheme that overlays Layer 2 networks on top of Layer 3 networks.
  The VXLAN tunnels are stateless, so each frame is encapsulated according to a set of rules.
  The end point of the tunnel (VTEP) is located within the hypervisor on the server which houses the VM.
  The VNI and the VXLAN-related tunnel/outer header encapsulation are known only to the VTEP; the VM never sees them.
  Note: a VTEP can also reside on a physical switch or a physical ESX server, and can be implemented in software or hardware.
  Outbound
    Consider a VM within a VXLAN overlay network. This VM is unaware of VXLAN.
    To communicate with a VM on a different host, it sends a MAC frame destined to the target as before.
    The VTEP on the physical host looks up the VNI with which this VM is associated. It then determines whether the destination MAC is on the same segment.
    If so, an outer header comprising an outer MAC header, an outer IP header, an outer UDP header and a VXLAN header is inserted in front of the original MAC frame.
    The final packet is sent to the IP address of the remote VTEP that connects the destination VM addressed by the inner MAC destination address.
  Inbound
    Upon reception, the remote VTEP verifies that the VNI is valid and is used by the destination VM. If so, the packet is stripped of its outer header and passed on to the destination VM.
    The destination VM never knows about the VNI or that the frame was transported with a VXLAN encapsulation.
    In addition to forwarding the packet to the destination VM, the remote VTEP learns the inner source MAC to outer source IP address mapping.
    It stores this mapping in a table so that when the destination VM sends a response packet, the response does not need to be flooded as an "unknown destination".
  Encapsulation at the VTEP node is transparent to the IP ECMP fabric
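The outbound and inbound steps above can be sketched as a small in-memory model. This is an illustration only, not how any particular VTEP is implemented: the `Vtep` class, its dictionaries, and the addresses are hypothetical, packets are plain dicts rather than real headers, and an unknown destination is signalled by `outer_dst_ip` being `None` (meaning "flood").

```python
from dataclasses import dataclass, field


@dataclass
class Vtep:
    ip: str
    local_vms: dict  # inner MAC -> VNI of each locally attached VM
    remote: dict = field(default_factory=dict)  # (VNI, inner MAC) -> VTEP IP

    def encapsulate(self, src_mac, dst_mac, payload):
        """Outbound: look up the sender's VNI, then the remote VTEP."""
        vni = self.local_vms[src_mac]
        remote_ip = self.remote.get((vni, dst_mac))  # None => must flood
        return {"outer_src_ip": self.ip, "outer_dst_ip": remote_ip,
                "vni": vni, "inner": (src_mac, dst_mac, payload)}

    def decapsulate(self, packet):
        """Inbound: validate the VNI, learn the sender, strip the outer header."""
        src_mac, dst_mac, payload = packet["inner"]
        vni = packet["vni"]
        if self.local_vms.get(dst_mac) != vni:
            return None  # VNI not used by the destination VM: drop
        # Learn inner source MAC -> outer source IP for the reply path.
        self.remote[(vni, src_mac)] = packet["outer_src_ip"]
        return (src_mac, dst_mac, payload)


vtep_a = Vtep("10.0.0.1", {"aa:aa": 10})
vtep_b = Vtep("10.0.0.2", {"bb:bb": 10})

first = vtep_a.encapsulate("aa:aa", "bb:bb", b"hello")
print(first["outer_dst_ip"])   # None: destination unknown, would be flooded
vtep_b.decapsulate(first)      # vtep_b learns where aa:aa lives

reply = vtep_b.encapsulate("bb:bb", "aa:aa", b"hi")
print(reply["outer_dst_ip"])   # 10.0.0.1: no flooding needed for the reply
```

The VM-side frames never carry the VNI; it appears only in the packet built by `encapsulate`, which mirrors the point that the VM never sees the encapsulation.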
14 VXLAN Benefits
  Feature Benefits
  Eliminates the networking challenges that stand in the way of an on-demand, virtual environment:
    VLAN sprawl
    Single fault domains
    Scalability beyond 4096 segments
    Proprietary fabric solutions
    IP mobility
    Physical cluster size and locality
  Enables multi-tenancy at scale
  Decouples logical networks from physical infrastructure so that applications can be deployed without worrying about physical rack location, IP address or VLAN
  Based on open and well-known standards
15 VXLAN Use Cases
  Physical-to-virtual internetworking
  Multi-hypervisor connectivity and integration
  Multi-tenant cloud environments
  HA clusters across failure domains
  Dynamic growth
  Dynamic resource management
17 VXLAN Packet
  VXLAN is a MAC-in-IP encapsulation
  VNI: 24 bits, so there can be about 16 million (2^24) VXLAN segments within the same administrative domain
  The VNI scopes the inner MAC frame originated by the individual VM. Thus, MAC addresses can overlap across segments without traffic ever "crossing over", because the traffic is isolated using the VNI qualifier.
  This qualifier is carried in an outer header that envelops the inner MAC frame originated by the VM.
  VXLAN frame format
    Outer MAC header
    Optional outer 802.1Q VLAN tag
    Outer IP header
    Outer UDP header
    VXLAN header (8 bytes)
    Inner Ethernet header
    Optional inner 802.1Q VLAN tag
    Payload
  This is the IPv4 frame format only. IPv6 will be addressed in the future.
18 VXLAN Header
  The VXLAN header is an 8-byte field comprising:
    Flags (8 bits): the I flag is set to 1 for a valid VXLAN Network Identifier (VNI). The remaining 7 bits (designated "R") are reserved and set to zero.
    Reserved (24 bits): always set to zero.
    VXLAN Network Identifier (VNI) (24 bits): identifies the individual VXLAN overlay network on which the communicating VMs are situated. VMs in different VXLAN overlay networks cannot communicate with each other.
    Reserved (8 bits): always set to zero.
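The 8-byte layout can be sketched with Python's `struct` module. The slide does not say which of the 8 flag bits is the I flag; the value `0x08` used here follows RFC 7348 and is an assumption relative to this deck. Function names are illustrative.

```python
import struct

VXLAN_FLAG_I = 0x08  # I flag position per RFC 7348 (assumption; not on the slide)


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits, zero).
    # Word 2: VNI (24 bits) + reserved (8 bits, zero).
    return struct.pack("!II", VXLAN_FLAG_I << 24, vni << 8)


def unpack_vxlan_header(header: bytes) -> int:
    """Return the VNI, rejecting headers whose I flag is not set."""
    if len(header) != 8:
        raise ValueError("VXLAN header is exactly 8 bytes")
    word1, word2 = struct.unpack("!II", header)
    if not (word1 >> 24) & VXLAN_FLAG_I:
        raise ValueError("I flag not set: VNI is not valid")
    return word2 >> 8


header = pack_vxlan_header(10)
print(header.hex())                 # 0800000000000a00
print(unpack_vxlan_header(header))  # 10
```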
22 VXLAN Terminology Explained
  VTEP: VXLAN Tunnel End Point
    VXLAN encapsulation and decapsulation happen at the VTEP
  VXLAN Gateway
    A device which bridges traffic between VXLAN and non-VXLAN environments
    VXLAN gateways allow physical and non-virtualized devices to communicate with VXLAN networks
    A VXLAN gateway can be either a hardware or a software device
  VNI: Virtual Network Identifier, a 24-bit number also called the VXLAN segment ID. The system uses the VNI, along with the VLAN ID, to identify the appropriate tunnel.
  VXLAN Header: an 8-byte header that contains the 24-bit VNI value. It sits between the UDP header and the inner MAC frame being carried over the VTI.
  VTI: VTEP Tunnel Interface, a switchport linked to a UDP socket that can be shared between many VLANs.
    Packets bridged through a VLAN into the VTI are sent out the UDP socket with a VXLAN header including a VNI.
    The socket is bound to a fixed local port, but is not connected to any particular destination port or IP address; logically, we use sendto() (not send()) to transmit VXLAN-encapsulated frames on the socket.
    Packets arriving on the VTI (via the UDP socket, based on their UDP destination port) are demultiplexed into a VLAN for bridging. The 24-bit VNI within the packet determines which VLAN the packet is mapped to for bridging.
  VXLAN Segment: a Layer 2 overlay network over which VMs communicate. Only VMs within the same VXLAN segment can communicate with each other.
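The VTI description maps fairly directly onto an unconnected UDP socket. The sketch below is a user-space illustration, not the switch's implementation: the `Vti` class and its VLAN-to-VNI table are hypothetical, the default port 4789 is the IANA-assigned VXLAN port (the slide names no port), and the I flag value `0x08` follows RFC 7348.

```python
import socket
import struct

VXLAN_PORT = 4789  # IANA-assigned VXLAN UDP port (assumption; not on the slide)


class Vti:
    """One UDP socket shared by many VLANs, each mapped to a VNI."""

    def __init__(self, vlan_to_vni, bind_ip="0.0.0.0", port=VXLAN_PORT):
        self.vlan_to_vni = dict(vlan_to_vni)
        self.vni_to_vlan = {vni: vlan for vlan, vni in self.vlan_to_vni.items()}
        # Bound to a fixed local port but never connect()ed to a peer.
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind((bind_ip, port))

    def encapsulate(self, vlan, frame):
        """Prefix the frame with a VXLAN header carrying the VLAN's VNI."""
        vni = self.vlan_to_vni[vlan]
        return struct.pack("!II", 0x08 << 24, vni << 8) + frame

    def demux(self, datagram):
        """Map the VNI in a received datagram back to a local VLAN."""
        _word1, word2 = struct.unpack("!II", datagram[:8])
        return self.vni_to_vlan[word2 >> 8], datagram[8:]

    def send(self, vlan, frame, remote_ip, remote_port=VXLAN_PORT):
        # sendto(), not send(): the destination is chosen per frame.
        self.sock.sendto(self.encapsulate(vlan, frame), (remote_ip, remote_port))

    def receive(self):
        datagram, _addr = self.sock.recvfrom(65535)
        return self.demux(datagram)

    def close(self):
        self.sock.close()
```

Note that the VLAN ID is local to each VTI: two endpoints can map different VLANs to the same VNI, and only the VNI travels in the packet.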
27 Where is my VM now?
  [Figure: spine0, leaf1, leaf2, esx10, esx11; VNI ‘Test’: Aubie, WarEagle, vshield, vm-tiger]
  spine0: show vmtracer vxlan
  VNI-Name VNI #VTEPs Learning Mcast Group Status Subnet
  Auburn Flood Up /24
  foo Flood Up /24
  bar Flood Down /20
  spine0: show vmtracer vxlan vni Auburn
  VNI Name: Auburn
  VNI Segment ID:
  VTEP Type Status Inside Outside Learning Mcast Grp PIM-RP Switch Port Model
  ESX1 VMware Up VNICs Flood ar16 eth S
  ar24 Arista Up/GW Flood ar24 loop S
  ar22 Arista Up/Up 1 MAC/IPs Flood ar22 eth S
  ESX4 VMware Up VNICs Flood ar2 eth T
28 Where is my VM now?
  [Figure: spine0, leaf1, leaf2, esx1, esx11; subnets 128.218.10.x and 128.218.11.x; VNI ‘Test’: Aubie, WarEagle, vshield, vm-tiger]
  spine0: show vmtracer interface vxlan Auburn
  VTEP: ESX1 Role: vSwitch Switch/Port: ar16.foo.com/eth15
  Name VNIC Status State IP Address
  Aubie Network Interface 1 Up/Up vMotion
  WarEagle Network Interface 2 Up/Up VM-FT-A
  BooBama Network Interface 1 Up/Down
  VTEP: ar24 Role: Router Switch/Port: ar24.foo.com/loopback0
  NAT/PAT Status #ARPs IP Address
  No Up/Up
  VTEP: ar22 Role: Port-VTEP Switch/Port: ar22.foo.com/eth2
  FQDN IP MAC VLAN Status
  isilon16.foo.com ab-12-fe Up/Up