2 Cisco Live 2013
Virtual Machine Fabric Extension (VM-FEX): Bringing the Virtual Machines Directly on the Network
BRKCOM-2005
Dan Hanson, Technical Marketing Manager, Data Center Group, CCIE #4482
Timothy Ma, Technical Marketing Engineer, Data Center Group
3 The Session Will Cover
- FEX Overview & History
- VM-FEX Introduction
- VM-FEX Operational Model
- VM-FEX General Baseline on UCS
- VM-FEX with VMware on UCS
- VM-FEX with Hyper-V on UCS
- VM-FEX with KVM on UCS
- VM-FEX General Details on Nexus 5500
- Summary
5 Fabric Extender Evolution: One Network, Parent Switch to Top of Rack
- FEX architecture consolidates network management: the FEX is managed as a line card of the parent switch, under the network administrator
- Many applications require multiple interfaces
- Uses pre-standard IEEE 802.1BR
6 Fabric Extender Evolution: Adapter FEX, One Network, Parent Switch to Adapter
- Adapter FEX consolidates multiple 1 Gb interfaces into a single 10 Gb interface and extends the network into the server
- Uses pre-standard IEEE 802.1BR
7 Fabric Extender Evolution: VM-FEX, One Network, Virtual Same as Physical
- VM-FEX consolidates the virtual and physical network: the VM network, formerly managed by the server administrator inside the hypervisor, is now managed by the network administrator
- Each VM gets a dedicated port on the switch
- Uses pre-standard IEEE 802.1BR
8 Fabric Extender Evolution: One Network, Parent Switch to Application, Single Point of Management
- FEX architecture: consolidates network management; the FEX is managed as a line card of the parent switch
- Adapter FEX: consolidates multiple 1 Gb interfaces into a single 10 Gb interface; extends the network into the server
- VM-FEX: consolidates the virtual and physical network; each VM gets a dedicated port on the switch
- Manage the network all the way to the OS interface, physical and virtual (pre-standard IEEE 802.1BR)
9 Key Architectural Component #1: VNTAG, an "Intra-Chassis" Bus Header for the FEX Architecture
The VNTAG is carried in the Ethernet frame (with its own Ethertype) between the switch and the FEX, and mimics the forwarding vectors inside a switch. Policy is associated with the virtual interface, NOT the port: VLAN membership, QoS, MTU, rate limits, etc.
Fields (D: direction, P: unicast/multicast pointer, L: looped):
- Ethertype: the type field used to determine that a frame is carrying a VNTAG. This value was still undefined at the time, awaiting assignment from the IEEE.
- direction: 0 indicates the frame is sourced from an Interface Virtualizer (adapter) toward the Virtual Interface Switch; 1 indicates the frame is sourced from the Virtual Interface Switch toward one or more Interface Virtualizers.
- pointer: indicates whether Destination (v)interface is an index into a (v)interface list table. When direction is 0, both fields must be 0.
- Destination (v)interface: the DVIF selects the downlink interface(s) that receive a frame sent from the Nexus 5000 to the Fabric Extender or Interface Virtualizer; it is evaluated when direction is 1. With pointer set to 0, Destination (v)interface selects a single (v)interface, typically for unicast delivery; with pointer set to 1, it indexes a virtual interface list table, used for multicast delivery.
- looped: indicates the frame is headed back toward the interface it originated from (injected on the same L2 network). In this situation the Fabric Extender or Interface Virtualizer must ensure that the (v)interface indicated by Source (v)interface does not receive the frame; this exclusion occurs only when both direction and looped are set to 1.
- reserved: reserved for future use.
- version: the version of the VNTAG protocol carried by this header; all implementations of future protocol extensions must be compatible with all those prior. The current, and only, defined version is 0.
- Source (v)interface: the SVIF indicates the (v)interface that sourced the frame. It must be set appropriately for all frames from a Fabric Extender or Interface Virtualizer to the Nexus 5000 (direction set to 0), and may be set for frames headed toward the Fabric Extender or Interface Virtualizer (direction set to 1) as indicated by looped.
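For reference, a sketch of the 6-byte VNTAG layout implied by the field list above. The field widths shown follow the pre-standard VNTag proposal and should be treated as illustrative; the Ethertype value was unassigned when this deck was written:

     0                   1
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
    +-------------------------------+
    |      VNTAG Ethertype (16)     |
    +-+-+---------------------------+
    |d|p|   destination_vif (14)    |  d = direction, p = pointer
    +-+-+---+-----------------------+
    |l|r|ver|    source_vif (12)    |  l = looped, r = reserved, ver = version (0)
    +-+-+---+-----------------------+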
10 FEX Data Forwarding: Revisiting Traditional Modular Switches (Example: Catalyst 6500)
- The Constellation bus had a 32-byte header for fabric switching; the vast majority of modular switch vendors have an internal "tag" for fabric communications
- Originally, forwarding used centralized ASICs, with line cards feeding into those ASICs directly
- When we needed higher performance, we added faster switch fabrics and distributed forwarding capabilities to the system
- What this really meant: adding more ASIC forwarding capacity to the system to minimize the number of devices a flow had to traverse
11 FEX Data Forwarding: Decoupling the Modular Switch
- Think of the original C6k satellite program for VSL and RSL
- The Constellation bus header is now smaller: the 6-byte VNTag header, core to FEX technology and being standardized as 802.1BR
- This is NOT a 1:1 mapping to VEPA/802.1Qbg, which is designed to offer an enhanced forwarding mechanism between peer devices via a single upstream device
- Keep the ASIC counts needed for high performance, but put the ASICs on the central controlling switch instead of on every line card: latency and bandwidth were more a function of the layers of ASICs a flow had to traverse in a tree than of the physical location of those ASICs (the fiber/copper paths a packet propagates over)
- Add protocols for configuration and firmware management of these remote cards (Satellite Control Protocol, Satellite Discovery Protocol), which does away with manual firmware management per (remote) line card
- Move from store-and-forward behavior to cut-through switching to actually improve latency
12 Fabric Extension (FEX) Concept: Virtualising the Network Port
- The switch port is extended over the Fabric Extender: the multi-tier architecture (LAN, switch, switch) collapses into the FEX architecture, where switch plus FEX form one logical switch
- Collapse network tiers; fewer network management points
13 FEX Technology for Unified I/O: Virtual Switch Ports, Cables, and NIC Ports
- Mapping of Ethernet and FC wires over Ethernet, with service-level enforcement, multiple data types (jumbo, lossless, FC) and individual link states
- Fewer cables: multiple Ethernet traffic types co-exist on the same cable; fewer adapters needed; overall less power
- Interoperates with existing models: management remains constant for system admins and LAN/SAN admins; these links can be taken further upstream for aggregation
- Over DCB Ethernet: blade management channels (KVM, USB, CD-ROM, adapters), individual Ethernets, and individual storage (iSCSI, NFS, FC)
14 Key Architectural Component #2: UCS VIC
- 256 PCIe devices; devices can be vNICs or vHBAs, and each device has a corresponding switch interface (vEth or vFC)
- Bandwidth: dual 4x10 Gb, using a 4x10 EtherChannel; the hardware is 40 Gb capable, and vNICs/vHBAs are NOT limited to 10 Gb
- PCIe Gen 2 x16; mezzanine and PCIe form factors
16 Server Virtualization Issues
1. When VMs move across physical ports (live migration), the network policy (port profile) must follow
2. Must be able to view or apply network/security policy to locally switched traffic
3. Need to maintain segregation of duties (security admin, server admin, network admin) while ensuring non-disruptive operations
18 UCS VM-FEX Distributed Modular System: Removing the Virtual Switching Infrastructure to a FEX
- The UCS Fabric Interconnect is the parent switch (LAN and SAN uplinks to N7000/C6500 and MDS); the UCS IOM-FEX plus the UCS 6100 form the access layer of a distributed modular system, extended to the server by the Cisco UCS VIC
- VM-FEX: a single virtual-physical access layer that collapses virtual and physical switching into one access layer
- VM-FEX is a virtual line card to the parent switch; the parent switch maintains all management and configuration
- Virtual and physical traffic are treated the same
19 Extending FEX Architecture to the VMs: Cascading of Fabric Extenders
- In a virtualized deployment, the switch port is extended over cascaded Fabric Extenders all the way to the virtual machine
- The switch, the FEX, and VM-FEX in the hypervisor (replacing the vSwitch) together form one logical switch
20 Nexus 5000/2000 VM-FEX Distributed Modular System: Removing the Virtual Switching Infrastructure to a FEX
- The Nexus 5500 is the parent switch (LAN and SAN uplinks to N7000/C6500 and MDS); Nexus 2000 FEXes plus the Nexus 5500 form the access layer of a distributed modular system, extended to the server by the Cisco UCS VIC
- VM-FEX: a single virtual-physical access layer that collapses virtual and physical switching into one access layer
- VM-FEX is a virtual line card to the parent switch; the parent switch maintains all management and configuration
- Virtual and physical traffic are treated the same
21 Nexus 5000 + Fabric Extender: Single Access Layer
- The Nexus 5000 is the parent switch (LAN and SAN uplinks to N7000/C6500 and MDS); Cisco Nexus 2000 FEXes (N2232) extend it as a distributed modular system
- The Nexus 2000 FEX is a virtual line card to the Nexus 5000; the Nexus 5000 maintains all management and configuration
- No Spanning Tree between the FEX and the Nexus 5000
- Over 6000 production customers; over 5 million Nexus 2000 ports deployed
22 IEEE 802.1BR vs. IEEE 802.1Qbg
FEX (including VM-FEX) is based on IEEE 802.1BR; VEPA is based on IEEE 802.1Qbg.
VEPA (802.1Qbg):
- Management complexity: each VEPA is an independent point of management
- Does not support cascading; Reflective Relay is used in basic VEPA
- Vulnerable: ACLs are based on source MAC, which can be spoofed
- Resource intensive: the hypervisor component consumes CPU cycles
- Inefficient bandwidth: a separate copy of each multicast and broadcast packet on the wire
FEX (802.1BR):
- Ease of management: one switch manages all port extenders (adapters, switches, virtual interfaces)
- Supports cascading of port extenders (multi-tier, single point of management)
- Virtual machine aware
- Secure: ACLs are based on the VN-TAG
- Scalable: multicast and broadcast replication is performed in hardware at line rate
- Efficient: no impact on the server CPU
Speaker notes: Virtual Embedded Bridge (VEB) is common terminology for the interaction between virtual switching environments in a hypervisor and the first layer of the physical switching infrastructure. The separate IEEE 802.1Qbh PAR was withdrawn; that work continues, renamed IEEE 802.1BR, and is well on its way to becoming a standard. VEPA began as an HP-driven implementation. The EVB enhancements follow two parallel paths, 802.1Qbg and 802.1BR: both are in the sponsor ballot phase, very close to being finished, both will become standards, and both are optional for any product being IEEE compliant. They are expected to be published early next year.
23 Deployments of Cisco's FEX Technology
1. Rack FEX: Nexus 5000/5500/7000 parent switch plus Nexus 2200 FEX, cabled to rack servers in the server rack
2. Chassis FEX: UCS 6100 Fabric Interconnect plus IOM 2xxx in the blade server chassis (also the B22H FEX for HP blade chassis with a Nexus 5500)
3. Adapter FEX: UCS 6100 with VIC (Gen 1 or 2), or Nexus 5500 with VIC P81E, extending the FEX into the blade/rack server adapter down to the OS
4. VM-FEX: UCS 6100 with VIC plus a VM management link (UCS Manager to vCenter/VMM, e.g. Red Hat KVM), or Nexus 5500 with VIC P81E, extending the FEX through the hypervisor to the VMs (management plane integration)
24 VM-FEX Operational Model: Pre-Boot Configuration
Step 1: Preboot. UCS defines the PCIe devices and enumerations; the host discovers the PCIe devices.
25 VM-FEX Operational Model: Defining "Port Profiles" on the UCS or Nexus 5000
Step 1: Preboot. UCS defines the PCIe devices and enumerations; the host discovers the PCIe devices.
Step 2: Port Profile. A folder of network policy is defined on UCSM or the Nexus 5500, for example VLAN Web, VLAN HR, VLAN DB and VLAN Comp for the Web Apps, HR, DB and Compliance networks.
26 VM-FEX Operational Model: Pushing Port Profiles to the Hypervisor System
Steps 1 and 2 as above.
Step 3: Port Profile Export. UCSM or the Nexus 5500 exports the port profile name list to the virtualization (hypervisor) manager.
27 VM-FEX Operational Model: Mapping Port Profiles to VM Virtual Adapters
Steps 1 through 3 as above.
Step 4: VM Definition. The named policy (port profile) is attached to the VM's virtual adapters via the hypervisor manager and network manager.
28 VM-FEX Operational Model: Simplifying the Access Infrastructure
VM-FEX basics:
- A fabric extender for VMs: the hypervisor vSwitch is removed
- Each VM is assigned a PCIe device (vNIC) and gets a virtual port (vEth) on the physical switch
VM-FEX: one network:
- Unifies the virtual and physical network; the same port profiles serve various hypervisors and bare-metal servers
- Consistent functions, performance and management
- Collapses the virtual and physical switching layers
- Dramatically reduces network management points by eliminating the per-host vSwitch
- Virtual and physical traffic are treated the same
Host CPU relief:
- Host CPU cycles are relieved from VM switching; I/O throughput improves
Speaker notes: Even though a distributed virtual switch addresses some scaling concerns of virtual networking (instead of a network management point within each physical host, you manage groups of hosts together; in the Nexus 1000V case, up to 64 hosts), it adds new complications: virtual network management is opaque to physical network management, and features available in the physical network infrastructure may not be consistent with those in the virtual network infrastructure. VM-FEX collapses the physical and virtual network into ONE network. All VMs are assigned a PCIe device on the Virtual Interface Card, and virtual switching inside the hypervisor is removed. The virtual port (vEth) that used to live inside the hypervisor's vSwitch moves to the physical switch, so the VM gets its own presence on the physical network; traffic between VMs on the same host now goes through the physical switch. Network management points are dramatically reduced: instead of managing a switch per host (or one per 64 hosts with distributed switches), you manage a single switch responsible for both physical and virtual networking, and you no longer need two sets of policies. The second benefit is that switching is completely offloaded to the physical switch's ASIC: the host CPU is relieved, and we see improvements of up to 12% in standard (emulated) mode and up to 30% in high-performance (VMDirectPath, i.e. PCIe pass-through) mode. Next, the different modes of VM-FEX.
29 VM-FEX Operational Model: Traffic Forwarding
- Removes performance dependencies from VM location
- Offloads software switching functionality from the host CPU (more on this in upcoming slides)
31 VM-FEX Modes of Operation: VMware ESX
Emulated mode:
- Each VM gets a dedicated PCIe device (dynamic vNIC)
- Appears as a distributed virtual switch to the hypervisor
- vMotion supported
High-performance mode (VMDirectPath, vSphere 5):
- Co-exists with standard mode; bypasses the hypervisor layer
- ~30% improvement in I/O performance
- Appears as a distributed virtual switch to the hypervisor
- Currently supported with ESXi 5.0 and later
- vMotion supported
Speaker notes: A standard vSwitch or regular DVS is an emulated switch/device. VM-FEX in emulated mode does not emulate the switch; it is a pass-through: a PCIe device discovered as a VM-FEX interface passes through the hypervisor. Emulated mode has existed since ESX 4. Each VM gets a dedicated PCIe device, which really means each VM-FEX interface gets dedicated hardware resources, including queue counts, interrupt numbers and ring sizes. Emulated mode presents itself as a standard DVS to ESXi, but underneath it is just doing PCIe mapping. Customers who want a performance boost benefit from DirectPath I/O, but the usual trade-off is that those VMs lose all hypervisor features, such as DRS, FT and even basic vMotion, and become isolated. With VM-FEX we are the only vendor that supports vMotion with DirectPath I/O, simply by allowing the virtual machine to drop from bypass mode to emulated mode without service interruption so the hypervisor can complete the vMotion. All the vNICs discussed here are dynamic vNICs. Standard mode: each VM is associated one-to-one with a PCIe device presented by the VIC, and all VM traffic is sent to the upstream Fabric Interconnect for switching; in our tests, the performance improvement over industry-standard NICs was 12% to 15%. Functionally, VM-FEX presents itself as a distributed network solution to the hypervisor and integrates with it seamlessly: standard functionality such as vMotion is fully supported alongside the performance benefits. High-performance mode: the hypervisor is completely bypassed and VM I/O is placed directly on the VIC's descriptors; this is possible because the VIC's descriptors are a perfect match for the descriptors of the emulated NIC in the VM. The performance benefit comes primarily from eliminating one extra memory copy: buffers are normally copied from the VM's memory to the hypervisor before being placed on the physical NIC's descriptors, whereas in VMDirectPath mode the buffers land directly on the VIC's descriptors. In our testing, I/O performance improves by close to 30% and CPU performance by up to 50% for I/O-bound applications. Again, this mode appears as a first-class distributed virtual switch to the hypervisor, and from ESX 4.1 U1 onward VMware enables vMotion with VMDirectPath, enabling dynamic workload management.
32 VMDirectPath: How It Works
Dynamic VIC device regions:
- Config: used by the PCI management layer
- Mgmt BAR: used by the VMkernel and PTS
- Data BAR: vmxnet3-compliant rings and registers
The control path handles emulation-to-pass-through transitions, port events and PCIe events; the data path goes straight from the VM's vmxnet3 driver to the VIC, with the Cisco DVS and the OS PCI subsystem coordinating.
Speaker notes: Several components make up the VMware VM-FEX architecture. It leverages the ESX native driver, VMXNET3, and the DirectPath I/O feature, with user-configurable parameters such as RQ/TQ counts, available interrupts and ring size; the recommendation varies with the guest OS type and configuration, for example whether RSS is enabled. A dedicated Cisco DVS module is installed on each ESX host in the form of a vib file; its main purpose is to override the default DVS function and enable VM traffic to bypass the hypervisor stack. As mentioned on the previous slides, in emulated mode VM traffic still passes through the hypervisor stack but offloads the policy management piece to the bridge-controller hardware, whereas in bypass mode VM traffic completely bypasses the hypervisor stack, eliminating an additional layer of memory copy. Dynamic vNIC devices are configured through the UCSM service profile and presented to the host OS as individual PCIe devices; only the virtual machine interfaces use the dynamic interfaces.
33 UCS VM-FEX Modes of Operation: Windows Hyper-V and Red Hat KVM with SR-IOV
Emulated mode:
- Each VM gets a dedicated PCIe device
- Appears as a Virtual Function to the guest OS
- Live Migration supported
High-performance mode (hypervisor bypass):
- Co-exists with standard mode; bypasses the hypervisor layer
- ~30% improvement in I/O performance
- Appears as a Virtual Function to the guest OS
- Currently supported through SR-IOV with Hyper-V 2012 and RHEL KVM 6.3
- Live Migration supported
Speaker notes: Starting with Windows Server 2012 and Red Hat 6.3, we support both emulated mode and hypervisor-bypass mode for VM-FEX. The architecture is different, however: the VM-FEX interface no longer presents itself as a DVS, but as a Virtual Function within the guest OS. We leverage the industry SR-IOV standard to implement Hyper-V and Red Hat VM-FEX, and the goal is the same: hypervisor bypass for VM traffic, with low packet latency and low host CPU utilization, while preserving hypervisor functionality such as Live Migration. The same methodology is used: the VM converts back to emulated mode to complete the Live Migration, then is promoted back to hypervisor-bypass mode without service interruption. SR-IOV primer: SR-IOV is a PCI-SIG standard that allows creation of lightweight PCI devices (VFs, Virtual Functions) "under" a parent PCI device (the PF, Physical Function). Hyper-V and KVM support SR-IOV; ESX VM-FEX does not use it. A VF is associated with a VM vNIC; the PF is the host uplink. Intel VT-d provides hardware-assisted techniques that allow direct DMA transfers to and from the VM (hypervisor bypass). VM-FEX extends VM vNIC connectivity to the FI via the VIC protocol.
34 SR-IOV: How It Works
(Figure: two Hyper-V hosts compared. Without SR-IOV, the network I/O path runs from the physical NIC through the Hyper-V switch, which performs switching, VLAN filtering and a data copy over VMBUS to the virtual NIC. With SR-IOV, a Virtual Function on the SR-IOV Physical Function feeds the virtual machine directly.)
Speaker notes: In the traditional network data path, when a packet arrives at the underlying physical adapter, it is memory-copied into the virtual machine manager, in this case Hyper-V and its vSwitch; based on destination information such as MAC address or VLAN, the hypervisor then performs another memory copy to place the packet in the proper virtual machine queue for further processing. A lot of host resource is wasted on this double memory copy. SR-IOV introduces two components. Physical Functions (PFs) are full PCIe devices that include the SR-IOV capabilities; they are discovered, managed and configured as normal PCIe devices, and are configured as static vNICs in UCSM. Virtual Functions (VFs) are lightweight PCIe functions that contain the minimal, dedicated resources necessary for data movement, such as queues and interrupts; a set of VFs is managed by a given PF, and one or more VFs can be assigned to a virtual machine, configured as dynamic vNICs in UCSM. With the pairing of PF and VF, the virtual machine queue moves onto the SR-IOV capable physical adapter, such as the Palo or VIC 1280, and with Intel's directed-I/O virtualization technology each data packet is transferred directly from the SR-IOV enabled physical adapter into the virtual machine's adapter. The performance difference comes from eliminating the additional memory copy in the virtual machine manager / hypervisor, achieving higher I/O throughput, lower CPU utilization and lower data latency.
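To make the PF/VF pairing concrete, here is a minimal, generic Windows Server 2012 PowerShell sketch of enabling SR-IOV for a VM. This illustrates the concepts only, not the full UCSM/SCVMM-integrated VM-FEX workflow; the switch, adapter and VM names are placeholders:

    # Create an external vSwitch bound to the SR-IOV capable PF (the static vNIC)
    New-VMSwitch -Name "SRIOV-Switch" -NetAdapterName "Ethernet 2" -EnableIov $true

    # Request a Virtual Function for the VM's adapter (IovWeight > 0 enables SR-IOV)
    Set-VMNetworkAdapter -VMName "WebVM01" -IovWeight 100

    # Verify that a VF was assigned, and inspect the host's SR-IOV capable adapters
    Get-VMNetworkAdapter -VMName "WebVM01" | Format-List Name, IovWeight, Status
    Get-NetAdapterSriov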
35 VM-FEX Operational Model: Live Migration with Hypervisor Bypass
- Temporary transition to standard (emulated) I/O during the Live Migration to the secondary host, then back to bypass
- Roughly a 1-second silent period
- Live Migration can always be used; test setup: a VM sending a TCP stream (1500 MTU) on UCS B200 M2 blades with the UCS VIC card
36 VM-FEX Modes of Operation: Emulation vs. Hypervisor Bypass

Emulation:
  VM-FEX Mode              | VMware             | Hyper-V             | KVM
  Emulation                | Pass Through (PTS) | Hyper-V Switch      | SR-IOV with MacVTap
  Hypervisor Version       | ESX 4.0 U1+        | Windows Server 2012 | RHEL 6.2+
  UCSM Version             | 1.4+               | 2.1                 | 2.1
  VMotion / Live Migration | Supported          | Supported           | Supported

Hypervisor bypass:
  VM-FEX Mode              | VMware             | Hyper-V             | KVM
  Hypervisor Bypass        | VMDirectPath       | SR-IOV              | SR-IOV with PCI Passthrough
  Hypervisor Version       | ESX 5.0+           | Windows Server 2012 | RHEL 6.3
  UCSM Version             | 1.4+               | 2.1                 | 2.1
  VMotion / Live Migration | Supported          | Supported           | N/A
37 UCS VM-FEX System View: Deploying on a UCS B or C Series Infrastructure
(Figure: end-to-end system view. A UCS B or C series server with a Cisco VIC presents static vNICs and vHBAs to the hypervisor layer, which is the ESX kernel module, libvirt, or the Hyper-V extensible switch. The VIC's VM-FEX server ports ride over chassis IO Modules A and B to UCS 6x00 Fabric Interconnects A and B, where the virtual interface control logic terminates each vNIC/vHBA as a veth/vfc against the configured port profiles, alongside the management, Ethernet uplink and Fibre Channel uplink ports.)
Speaker notes: This slide gives a system view of how the individual components work and connect together to realize a VM-FEX deployment. At the very bottom is the server view before VM-FEX: the bare-metal components, including the adapter, and the hypervisor component, which could be ESX, libvirt (the KVM virtualization library), or the Hyper-V extensible switch. One layer up is the IO Module, where the VNTag comes into play between the Fabric Interconnect and the individual server adapter. At the Fabric Interconnect level, the user configures the desired number of static vNICs for the host OS and dynamic vNICs for the VM interfaces through the service profile. During server association and boot, the Fabric Interconnect instructs the adapter processor to enumerate the corresponding interface configuration via the VIC protocol over the internal connections. vHBA interfaces can be generated through the same process to enable FCoE capability.
38 UCS VM-FEX System View: Deploying on a UCS B or C Series Infrastructure (continued)
(Figure: the same system view, now with dynamic vNICs d-vNIC1 through d-vNIC4 enumerated on the VIC alongside the static vNICs and vHBAs.)
Speaker notes: One thing to note is that dynamic vNICs differ slightly from static vNICs in terms of failover. For a static vNIC, fabric failover is optional, but for a dynamic vNIC fabric failover is a requirement, since we want to guarantee policy consistency for a given virtual machine no matter where it moves within the UCS fabric. The pin-group feature is also available to virtual machines with VM-FEX: it is defined in the port profile attached to the dynamic interface, so you can traffic-engineer the VM traffic and pin it to a specific border port on the fabric interconnect toward the external network, no matter where the VM live-migrates within the domain. The figure shows a one-to-one mapping between virtual machines and dynamic interfaces, but in fact you can assign multiple dynamic vNICs to a single VM and configure the VM-FEX mode of operation for each individually.
39 VM-FEX Scalability: Number of VIFs Supported per Hypervisor

Cisco UCS 6100 / 6200 Series:
  Hypervisor                      | Half-Width Blade, Single VIC | Full-Width Blade, Dual VIC
  ESX 4.0 - 4.1 (DirectPath I/O)  | 56 (54 vNIC + 2 vHBA)        | 56 (54 vNIC + 2 vHBA)
  ESXi 5.0 - 5.1 (DirectPath I/O) | 116 (114 vNIC + 2 vHBA)      | 116 (114 vNIC + 2 vHBA)*
  Windows 2012 (SR-IOV)           | 116 (114 vNIC + 2 vHBA)      | 232 (228 vNIC + 4 vHBA)
  KVM 6.1 - 6.3 (SR-IOV)          | 116 (114 vNIC + 2 vHBA)      | 232 (228 vNIC + 4 vHBA)
* An additional VIC will NOT increase the total VIF count for ESXi, due to an OS limitation. Multiple VICs are supported on full-width blades and the B200 M3.

Nexus series:
  Hypervisor         | Adapter FEX / VM-FEX
  ESX 4.1 - ESXi 5.1 | 96 vNIC
Only one VIC (P81E/VIC 1225) is supported per C series rack server.

Speaker notes: This slide provides the basic scalability information for VM-FEX deployments. These numbers are not hardware or software limitations; they reflect QA testing effort. As newer UCSM versions release, we are looking to increase the scalability matrix accordingly.
40 VM-FEX Advantage: Simpler Deployments, Robustness, Performance, Security
Simplicity:
- Unifies the virtual and physical network
- Consistency in functionality, performance and management
Robustness:
- Programmability of the infrastructure
- Troubleshooting and traffic engineering of virtual and physical together
Performance:
- Near bare-metal I/O performance
- Improved jitter, latency, throughput and CPU utilization
Security:
- Centralized security policy enforcement at the controller bridge
- Visibility down to VM-to-VM traffic
Speaker notes: Why did we develop VM-FEX, compared to a virtualized switch layer? Performance is the main driver, but we are not just boosting VM throughput to bare-metal level; we also improve packet jitter and latency and lower host CPU utilization. With VM-FEX you also get a programmable access layer that no longer depends on the physical port: you decide the size of the DVS and how individual VMs utilize those access ports through port profile definitions. Being able to see VM-to-VM traffic is another security advantage of deploying VM-FEX, and being able to manage your virtual infrastructure like the physical one gives consistent functionality, performance and policy management.
42 UCS General Baseline #1: Dynamic vNIC Policy
Setting up a dynamic adapter policy:
- Policies automatically provision dynamic vNICs on servers
- The count depends on the number of Fabric Interconnect to IO Module connections:
  (# IOM-to-FI links x 63) - 2 for Gen 2 hardware (62xx, 22xx and VIC 12xx)
  (# IOM-to-FI links x 15) - 2 for Gen 1 hardware (61xx, 21xx and Palo)
Speaker notes: In the next several slides, we walk through the VM-FEX configuration baseline that is independent of the hypervisor; with the baseline settings understood, we then dive into hypervisor-specific configuration in more detail. To get the benefits of VM-FEX there are some trade-offs, and one of them is scalability: compared to software switching, there is a limit on the hardware resources available for VM-FEX within a single UCS domain, and the number of available dynamic vNICs per server is one of them. As shown on this slide, based on the number of physical links between FI and IOM and the hardware generation, these formulas give the interface count. For Gen 1 hardware the maximum number of dynamic interfaces is 58; for Gen 2 it is 502, which actually exceeds the current support matrix of 116 per adapter and 232 per server.
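Worked examples of the two formulas, using the maximum link counts behind the figures cited in the notes:

    Gen 1 (61xx FI, 21xx IOM, Palo), 4 IOM-to-FI links:     (4 x 15) - 2 = 58 dynamic vNICs per server
    Gen 2 (62xx FI, 22xx IOM, VIC 12xx), 8 IOM-to-FI links: (8 x 63) - 2 = 502 dynamic vNICs per server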
43 UCS General Baseline #1: Creating Dynamic vNICs
Fabric Interconnect VIF calculation:
- In port-channel mode, when you cable between the FEX and the FI, the available virtual interface (VIF) namespace varies depending on where the uplinks are connected on the FI ports
- When the port-channel uplinks from the FEX are connected only within a set of eight ports managed by a single chip, Cisco UCS Manager maximizes the number of VIFs usable by service profiles deployed on the servers
- If uplink connections are distributed across ports managed by separate chips, the VIF count decreases. For example, if you connect seven members of the port channel to ports 1 through 7 but the eighth member to port 9, the port channel can only support VIFs as though it had one member
44 UCS General Baseline #2: Building the Service Profile
Adding the dynamic policy and static adapters:
- 2 static vNICs, one to each UCS fabric
- Change the dynamic vNIC connection policy to set up the dynamics
Speaker notes: You will also notice there is no MAC address associated with a dynamic vNIC during service profile configuration. The reason is that each dynamic vNIC is assigned dedicated hardware resources but presents itself as a vanilla interface to the hypervisor; the virtual machine manager, such as vCenter, actually provides the MAC address during VM creation.
45 UCS General Baseline #2: Building the Service Profile
Static and dynamic adapter policies:
  Hypervisor      | Static vNIC | Dynamic vNIC
  VMware ESXi     | VMware      | VMwarePassThru
  Windows Hyper-V | SRIOV       | Windows
  Red Hat KVM     | SRIOV       | Linux
Speaker notes: The location and adapter policy for dynamic vNICs vary with the hypervisor. In the VMware scenario there is a parallel relationship between static and dynamic vNICs, whereas for Hyper-V and KVM deployments the PF/VF relationship is formed in the service profile.
46 UCS General Baseline #3: Building Port Profiles
Creating folders of network access attributes. A port profile includes:
- VLAN(s): native and/or tagging allowed
- QoS weights and flow rates
- Upstream ports to always use
Speaker notes: As discussed earlier, a port profile is a logical container for network and security policy. This slide shows some of the attributes you can configure within a profile, including which VLAN or VLANs the virtual machine can access, whether CDP is enabled, the QoS policy (line rate or rate-limited bandwidth), the maximum number of ports that can be associated with the port profile and, last but not least, how to traffic-engineer the traffic.
47 UCS General Baseline #4: Building Port Profiles
Enhanced options like VMDirectPath with VM-FEX:
- Selecting High Performance only affects VMware deployments today
- No problem if it is selected and used on other hypervisors
Speaker notes: We only toggle Host Network I/O Performance for VMware deployments today. A port profile with high performance enabled operates in bypass mode; otherwise it operates in emulated mode.
48 UCS General Baseline #5: Communication with the Manager
Establishing communication to the hypervisor manager:
- A tool discussed later simplifies the whole integration process
Speaker notes: Once the port profiles and network infrastructure are laid out in UCSM, we need a way to propagate that information to the virtual machine manager. For VMware in particular, we currently have a built-in wizard that guides you through the integration steps. If you are familiar with the Nexus 1000V, we use the same mechanism to build a secure communication channel between UCSM and vCenter: exporting the UCSM extension file, which is in XML format and registers in vCenter as a plug-in.
49 UCS General Baseline #5: Communication with the Manager (supported scale)
  Hypervisor Manager             | 6100                                        | 6200
  ESXi                           | 1 vCenter per UCS domain; 1 DVS per vCenter | 1 vCenter per UCS domain; 4 DVS per vCenter
  Hyper-V                        | 1 MMC instance; 5 DVS per MMC instance      | 1 MMC instance; 5 DVS per MMC instance
  KVM                            | N/A                                         | 1 DVS per UCS domain
  Port profiles per UCS domain   | 512                                         | 512
  Dynamic ports per port profile | 4096                                        | 4096
  Dynamic ports per DVS          | 4096                                        | 4096
Speaker notes: This slide summarizes the current support matrix for the various virtual machine managers. These numbers are by no means hardware or software limitations; they reflect QA testing effort with the current UCSM release. We understand there is high demand for clustered communication between UCSM and the VMM: for example, multiple vCenter instances connecting to a single UCSM to leverage the same set of port profiles across your VMware infrastructure, or multiple UCSM domains connecting to a single vCenter with vMotion across UCSM domains. We support both scenarios today, and customers are actively running both configurations in production. If you have a specific need for clustered communication, reach out after the session and we can discuss how we achieve and support it.
50 UCS General Baseline #6: Publishing Port Profiles
- Export port profiles to the hypervisor manager
- Publish port profiles to the hypervisors and the virtual switches within them
52 VMware VM-FEX: Infrastructure Requirements
Versions, licenses, etc.:
- VMware VM-FEX is available on B series and on integrated and standalone C series
- Each VIC card supports up to 116 virtual machine interfaces (an OS limitation)
- An Enterprise Plus license is required on the host (as for any DVS); a Standard license or above is required for vCenter
- Hypervisor features are supported in both emulated and hypervisor-bypass modes: vMotion, HA, DRS, snapshots, hot add/remove of virtual devices, suspend/resume
- VMDirectPath (hypervisor bypass) with VM-FEX is supported with ESXi 5.0 and later
- VM-FEX upgrade from ESXi 4.x to ESXi 5.x is supported with the customized ISO and VMware Update Manager
53 VM-FEX and VMware SR-IOV Comparison
- VM-FEX is the hypervisor-bypass solution with vMotion capability; VMware SR-IOV is incompatible with hypervisor features including vMotion, HA and DRS
- VM-FEX has the highest virtual machine interface count per host: with the UCSM 2.1 release, each ESXi host supports up to 116 VM interfaces, while with ESX 5.1, SR-IOV supports up to 64 VFs with Emulex and 41 VFs with Intel adapters
- VM-FEX is available on both UCS blade and rack servers: blade servers, integrated rack servers with UCSM, and standalone rack servers with the Nexus 5500; with ESX 5.1, SR-IOV is only available on PCIe adapters in standalone rack servers
- VM-FEX enables centralized network management and policy enforcement: network policy is configured as a port profile in UCSM / N5K and pushed to vCenter as a network label, with clean separation between network and server responsibilities
- VM-FEX configuration is fully automated (the Easy VM-FEX tool)
- VM-FEX provides inter-VM traffic visibility through network tools
Speaker notes: Before we get into the configuration workflow, some context on the SR-IOV feature in the ESX 5.1 release. As with other hypervisor vendors implementing SR-IOV, the goal is to provide hardware-like performance to your virtual machines. However, the whole idea of managing virtual switch ports like physical ones and converging virtual and physical infrastructure does not exist with standalone SR-IOV: there is no concept of a port profile or a bridge controller, and there is no guarantee of policy consistency throughout your infrastructure. There are also SR-IOV limitations on scale and supported form factors, though we expect these to change with future ESX releases. A few caveats worth pointing out with VMware SR-IOV: first and foremost, as of today SR-IOV is incompatible with the majority of hypervisor features, including basic functions such as vMotion, HA and DRS. The moment you enable SR-IOV, your VM becomes isolated on its hypervisor, unable to vMotion anywhere. You would have to choose between performance and hypervisor features; with VM-FEX you enjoy both.
54 VMware VM-FEX Configuration Workflow
UCS administrator (network side):
1. Configure the service profile and adapter
2. Create the port profiles and cluster in UCSM
3. Run the UCSM VMware integration wizard (UCS exports the port profiles to vCenter)
Server administrator (vCenter side):
4. Install the Cisco VEM software bundle on the hosts and the plug-in in vCenter
5. Add the ESX hosts into the DVS cluster in vCenter
6. Configure the virtual machine settings to enable hypervisor bypass
7. Verify the VM-FEX configuration in both UCSM and vCenter (see the spot check below)
Speaker notes: Up to this point we have touched on many VM-FEX features and individual steps; let's connect the dots and see what the complete workflow looks like. It divides into two parts: the UCSM side, which generally falls under network administration, and the vCenter side, which is the server team's responsibility. The entire workflow is a collaboration between the network and server teams, with defined tasks for each. Of these seven steps, only step 2 (configuring port profiles) and step 6 (attaching port profiles) are day-to-day operations for the administrator; the remaining steps are executed only during setup. Use service profile templates where possible, and know what is bring-up versus day-to-day operation.
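For step 7, a possible host-side spot check, assuming the Cisco VEM vib from step 4 is installed (these commands come with the VEM tooling; exact output varies by version):

    # On the ESXi host
    esxcli software vib list | grep cisco   # confirm the VEM bundle is installed
    vem status                              # is the VEM attached to the VM-FEX DVS?
    vemcmd show port                        # per-VM veth and pass-through state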
57 VMware VM-FEX Best Practices
- Pre-provision the number of dynamic vNICs for future use: changing the quantity or the adapter policy requires a server reboot
- Select "High Performance Mode" in the port profile to enable hypervisor bypass
- Utilize the ESX native VMXNET3 driver; queue, interrupt and ring-size parameters are user-configurable through policy
- Recommended: Num(vCPU) = Num(TQ) = Num(RQ) to enable DirectPath I/O
- Other considerations when deploying VM-FEX: ESX heap memory size vs. MTU size; ESX available interrupt vectors vs. guest OS and adapter policy
- A dedicated spreadsheet is available for VM-FEX calculation and sizing
58 Easy VM-FEX Tool
- A VMware-only solution today, for quick UCS system bring-ups
- Assumes 1 management interface per ESX host; optional vMotion / FT logging is also handled
- Works with all VMware versions that VM-FEX supports (Enterprise Plus or evaluation license)
- Some defaults can be defined in a text file
- The server needs dynamic vNICs on its service profile (the tool will check)
- The deployment name is limited to 8 characters in the tool
- Uses a UCSM repository for the ESX kernel module, or a separate tool can pull it from VMware online into a dedicated local directory
60 Nexus VM-FEX System View: Deploying on a UCS C Series with Nexus 5500 Infrastructure
(Figure: system view. A UCS C series server with a Cisco VIC connects through a pair of Nexus 2232 FEXes to Nexus 55xx switches A and B, joined by vPC connections; the virtual interface control logic on each Nexus 5500 terminates the vNICs and vHBAs as veth/vfc interfaces against the configured port profiles, alongside the management, uplink and Fibre Channel uplink ports. On the server, the ESX kernel pass-through module sits above the static vNICs and vHBAs.)
Speaker notes: Nexus VM-FEX follows the same pattern as UCS. The minor difference is that the Nexus pair uses the vPC peer link to synchronize the virtual Ethernet and port-profile database across the Nexus 5500 cluster, rather than the L1/L2 links on the fabric interconnects. The vPC domain and peer link are purely for database synchronization and do not actively participate in forwarding virtual machine traffic. The virtual interface control logic is now on the Nexus 5500, but the mechanism, in terms of VNTag and the VIC control protocol, works exactly the same as on UCS.
61 Nexus VM-FEX System View: Deploying on a UCS C Series with Nexus 5500 Infrastructure (continued)
(Figure: the same system view, now with dynamic vNICs d-vNIC1 through d-vNIC4 enumerated on the VIC alongside the static vNICs and vHBAs; note that the veths are not part of a vPC at FCS.)
62 Nexus 5500 VM-FEX Demo Topology: Nexus 5548 / C22 M3 / VIC 1225 / ESXi 5.1
- Nexus 5548-A is pre-configured; the demo focuses on 5548-B, which gets the same configuration
- Uplink redundancy: each static vNIC is configured active/standby, so no OS teaming software is needed (required for the hypervisor uplink)
- Each dynamic vNIC attaches to an uplink in round-robin fashion
- A vPC domain and peer link are configured to synchronize veth numbering for the same VM; they are not used for the forwarding plane
63 Nexus 5500 VM-FEX Configuration Workflow
Network administrator (Nexus 5500):
0. Set up the connection
2. Install the N5K feature license
3. Configure the static binding interfaces and enable VNTag on the host interfaces
4. Configure the port profiles
5. Configure the DVS and extension
Server administrator (UCS C-Series CIMC):
1. Provision the VM vNICs and enable VNTag mode on the adapter
Server administrator (vCenter):
6. Download the extension and register the plug-in
7. Install the VEM on the ESXi host
8. Add the server into the DVS cluster
9. Create VMs and attach port profiles
10. Verify the VM-FEX status
Speaker notes: The Nexus VM-FEX workflow follows much the same pattern as UCS, except that three components are involved instead of just UCSM and vCenter: the Nexus 5500, the CIMC and vCenter. Since the Nexus 5500 does not have the capability to manage C series rack servers directly, you connect directly to the server's CIMC to provision the dynamic policy; think of the CIMC as a stripped-down version of UCSM for an individual rack server. You need to enable the VM-FEX license (the license itself is free and included with the system, but the feature is not enabled by default), and VNTag capability must be enabled on the host interfaces. In more detail: Step 1, enable NIV (Network Interface Virtualization) on the P81E adapter in the CIMC, the management interface for the Cisco C2xx servers, and choose the number of dynamics to configure. Step 2, enable the vNICs and then view the VM-FEX tab in the CIMC; UCS standalone CIMC version 1.4 or greater is required, with a minimum of 2 static vNICs defined; the number of VM-FEX dynamic vNICs depends on the links from the 5500 to the 2232 when using FEX (limited to 96 today), and Nexus 5500 version 5.1(3)N1(1) or later is required. Step 3, configure a static profile for the fixed interfaces, one to each N5k in the pair: select the vNIC and the port profile to assign to it (these will initially be in the vSwitch in the out-of-box VMware configuration). Step 4 onward, download and register the plug-in as in the other VM-FEX topologies.
65 Nexus 5500 VM-FEX Best Practices
- VM-FEX supports a single-N5k topology, but a vPC topology is recommended
- In a vPC topology, ensure both N5ks have the same port-profile configuration
- In a vPC topology, configure the same SVS connection on both N5ks; only the primary switch holds the active connection to vCenter. When the secondary switch takes over the primary role, it seamlessly activates the connection to vCenter
- Enable the "vethernet auto-create" feature: vEth ports for dynamic vNICs are then created automatically during server boot (auto-created vEths are numbered >= 32769)
- DirectPath I/O is activated with the "high-performance host-netio" command in the port profile (see the configuration sketch below)
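Pulling these pieces together, a minimal Nexus 5500 configuration sketch covering the commands named above (the VLAN, interface, profile and DVS names and the vCenter address are placeholders; treat this as a sketch against the NX-OS VM-FEX feature set, not a complete deployment):

    install feature-set virtualization
    feature-set virtualization
    vethernet auto-create
    !
    ! FEX host interface facing the server's VIC
    interface Ethernet101/1/1
      switchport mode vntag
    !
    port-profile type vethernet VM-Web
      switchport access vlan 55
      high-performance host-netio
      state enabled
    !
    svs connection vc1
      protocol vmware-vim
      remote ip address 10.10.10.10 port 80 vrf management
      dvs-name VMFEX-DVS
      vmware dvs datacenter-name DC1
      connect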
67 Hyper-V Scale Comparison
                            | VMware vSphere 5.1                    | Windows Server 2008 R2                  | Windows Server 2012
  HW logical processors     | 160 LPs                               | 64 LPs                                  | 320 LPs
  Physical memory           | 2 TB                                  | 1 TB                                    | 4 TB
  Cluster scale             | 32 nodes, up to 4000 VMs              | 16 nodes, up to 1000 VMs                | 64 nodes, up to 4000 VMs
  VM processor support      | Up to 64 VPs                          | Up to 4 VPs                             | Up to 64 VPs
  VM memory                 | Up to 1 TB                            | Up to 64 GB                             | Up to 1 TB
  Live migration            | Concurrent vMotion, 128 per datastore | Yes, one at a time                      | Yes, with no limits: as many as hardware will allow
  Live storage migration    | Concurrent Storage vMotion, 8 per datastore, 2 per host | No; Quick Storage Migration via SCVMM | Yes, with no limits: as many as hardware will allow
  Servers in a cluster      | 32                                    | 16                                      | 64
  VP:LP ratio               | 8:1                                   | 8:1 for server, 12:1 for client (VDI)   | No limits: as many as hardware will allow
68 SR-IOV Overview
(Figure: the same Hyper-V host comparison as slide 34, contrasting the network I/O path without SR-IOV, through the Hyper-V switch and VMBUS, with the path with SR-IOV, through a Virtual Function on the physical NIC's SR-IOV Physical Function.)
69 Hyper-V Extensible Switch Architecture
- The Hyper-V extensible switch architecture is an open API model that enhances vSwitch features
- Hyper-V defines three types of extension: capture extensions, filtering extensions (Windows Filtering Platform), and forwarding extensions (VM-FEX)
- Multiple extensions are allowed, but still verify compatibility with the vendors; several extensions are incompatible with WFP
- Extension state is unique to each vSwitch; leverage SCVMM to configure extensions centrally
- Cisco provides both PF and VF drivers
Speaker notes: The previous slide mentioned the VM-FEX forwarding extension. Hyper-V introduces several types of switch extension as plug-ins to the vSwitch; compare VMware, where we must install a VEM (Virtual Ethernet Module) that completely replaces the vSwitch/DVS functionality. The Hyper-V extension architecture is an open API model that enhances vSwitch features without replacing or removing the underlying switch function; it is a more flexible architecture, and different extension types can plug into the same vSwitch. Extension state and configuration are unique to each switch instance, so the VM-FEX forwarding extension must be installed on each participating vSwitch. The extension types: capture extensions can inspect traffic and generate new traffic for reporting purposes, but cannot modify traffic; a single vSwitch can host multiple capture extensions. Filtering extensions can inspect, drop, modify and insert packets, but cannot direct a packet; the Windows Filtering Platform is an example. Forwarding extensions can direct traffic, defining the destination(s) of each packet; they can also capture and filter, but only one forwarding extension is allowed per vSwitch. In short: extensible, not replaceable (added features do not remove other features); a pluggable switch (extensions process all network traffic, including VM-to-VM); a first-class citizen of the system (Live Migration and offloads just work, and extensions work together); an open, public API model with a large ecosystem of extensions; and logo certification plus a rich OS framework for high-quality extensions.
70 SCVMM Management of Switch Extensions
(Figure: the SCVMM server runs the VMM service with the UCSM provider plug-in, which talks to the UCS Manager API; the VMM agent in the root partition manages the capture, filtering and forwarding extensions of the Hyper-V switch, including the VM-FEX forwarding extension sitting over the UCS VIC's PF driver, with VF drivers in the guests.)
Speaker notes: Install the Cisco provider DLL on the SCVMM host (install the provider DLL and set up the registry keys). Create a VSEM (Virtual Switch Extension Manager), assign one or more host groups, and specify a connect-string. Upon successful connection, it fetches the FND, FN, uplink port profiles, VPPs, IP pools etc. from UCS (the Apache/LUA NSM component). Create a logical switch, associate it to a VSE, and create a native VPP (alias to a VPP) and a native UPP (alias to a UPP). The service profile carries the network and security policy database.
71 Hyper-V VM-FEX: Infrastructure Requirements
- VM-FEX for Hyper-V is available on UCS B series and integrated C series; standalone C series support is on the roadmap
- Each VIC card supports up to 116 virtual machine interfaces; installing an additional VIC doubles the count (B200 M3 with 2 VICs -> 232 per host)
- Windows Server 2012 is required for both host and guest OS (same kernel level); Windows Server Core and the standalone Hyper-V server are not supported
- VM-FEX with Live Migration is fully supported, with various options for shared storage: failover cluster, SMB, shared-nothing storage
- Full PowerShell library support for automation: PowerShell cmdlets for UCSM, Hyper-V and SCVMM
72 Hyper-V VM-FEX: Infrastructure Requirements (continued)
Two management approaches are supported:
- Microsoft Management Console (MMC) for standalone Hyper-V deployments
- System Center Virtual Machine Manager (SCVMM SP1) for integrated Hyper-V deployments
UCS Manager full integration with SCVMM SP1:
- Expected to release with UCSM 2.2
- The UCS provider plug-in includes the VM-FEX switch forwarding extension; SCVMM uses the plug-in to pull information from UCSM
- Cisco provides both the VIC drivers and the VM-FEX switch forwarding extension
- A VIC utility tool is provided (MSI); the same Windows driver serves both the Physical Function (host) and the Virtual Function (guest)
73 Hyper-V VM-FEX: SCVMM Network Definitions
- Fabric Network (FN): a network abstraction representing a logical network composed of network segments (VLANs) spanning multiple sites
- Fabric Network Definition (FND): a network abstraction composed of site-specific network segments
- VM Network Definition (VMND): a sub-network abstraction composed of a single network segment (and an IP pool) at a specific site
- VM Network (VMN): a sub-network abstraction composed of network segments spanning multiple sites; used by a tenant's VM
- Uplink Port Profile (UPP): carries the list of allowed FNDs for a pNIC
- Virtual Port Profile (VPP): defines the QoS/SLA characteristics for a vNIC
- Logical Switch: Microsoft's native DVS; defines the Live Migration boundary
(Figure: host groups SJ and NYC share fabric network PUBLIC through site-specific FNDs PUBLIC-SJC and PUBLIC-NYC; VMND WEB maps to VLAN 55 in SJC and VLAN 155 in NYC; uplink port profiles per site plus Gold/Silver/Bronze VPPs feed a logical switch spanning the hosts, which forms the Live Migration boundary for VMs on VM network WEB.)
Speaker notes: A Fabric Network represents a logical network composed of network segments across multiple sites. A Fabric Network Definition holds the site-specific networking, such as VLAN and subnet information. The VMND is the smallest network segment defined by SCVMM; each virtual machine interface is assigned to a VMND with specific subnet and IP information. A group of VMNDs forms a Fabric Network Definition, and a group of FNDs forms a Fabric Network. The uplink port profile is one of the two port profiles assigned by SCVMM and specifies the allowed Fabric Network Definitions (sites) for an individual physical NIC; the virtual port profile defines the QoS and SLA information for an individual VM NIC. Finally, the logical switch defines the VM Live Migration boundary and the available switch extensions for each Hyper-V host. A VM Network spans a VMND segment across multiple sites (FNDs).
74 Hyper-V VM-FEX Configuration Workflow
1. Configure the service profile
2. Set up SCVMM and create the port profiles
3. Install the UCSM provider plug-in
4. Configure the SCVMM switch extension manager
5. Configure the SCVMM logical switch
6. Associate the native VM network with the externally provided VM network
7. Assign the Hyper-V host to the logical switch and attach the port classification
8. Verify VM-FEX connectivity in UCSM
Speaker notes: The workflow for Hyper-V VM-FEX is similar to what we saw for VMware. Two parties are involved: the network administrator, who mostly performs tasks directly on the FI, and the server administrator, who focuses on SCVMM configuration. First, configure the service profile with the appropriate number of dynamic vNICs. Next, the network admin lays out the infrastructure based on the SCVMM architecture from the previous slide, including the detailed FN, FN sites, VMNDs (smallest subnets, with IP pools), the two port profile types (uplink and virtual) and, most importantly, the switch extension. Then the server admin installs the UCSM provider plug-in DLL on the SCVMM server and enables communication between SCVMM and UCSM.
77 Hyper-V VM-FEX Best Practices
- Always use SCVMM to configure Hyper-V and virtual machine properties
- Use NTTTCP as the performance benchmark tool on the Windows platform: NTTTCP is Microsoft's internal test tool; the latest version, NTTTCP v5.28, shipped with Windows Server 2012 (April 2013) and is optimized for 10GE interfaces
- Enable Receive Side Scaling (RSS) via PowerShell: Set-VMNetworkAdapter -VMName "Server" -IovQueuePairsRequested 4; the VM must be shut down to apply the RSS change (see the sketch below)
- iSCSI boot is NOT supported for a PF carrying an overlay iSCSI vNIC
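A sketch of the RSS-related tuning above (the VM and adapter names are placeholders; the VM must be off for the queue-pair change to take effect):

    # On the Hyper-V host: request multiple VF queue pairs so the guest can use RSS
    Stop-VM -Name "Server"
    Set-VMNetworkAdapter -VMName "Server" -IovQueuePairsRequested 4
    Start-VM -Name "Server"

    # Inside the guest: enable RSS on the VF-backed adapter
    Enable-NetAdapterRss -Name "Ethernet"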
79 RHEL KVM VM-FEX: Infrastructure Requirements
- VM-FEX for KVM is available on UCS B series and integrated C series; standalone C series support is on the roadmap
- Each VIC card supports up to 116 virtual machine interfaces; installing an additional VIC doubles the count (B200 M3 with 2 VICs -> 232 per host)
- UCS Manager release 2.1 is required to support SR-IOV with KVM
- Install Red Hat as a virtualization host: RHEL 6.2 for VM-FEX emulated mode (SR-IOV with MacVTap), RHEL 6.3 for VM-FEX hypervisor-bypass mode (SR-IOV with PCI passthrough). MacVTap direct (private) mode is no longer supported as of UCSM release 2.1, and live migration is only supported in emulated mode
- Guest operating system: RHEL 6.3 is required to support SR-IOV with PCI passthrough; the RHEL 6.3 inbox driver supports it
- Configuration is script-driven at FCS: there is currently no RHEV-M for RHEL KVM 6.x, and virtual machine interfaces are managed by editing the VM domain XML file
80 VM-FEX with KVM Architecture
- Libvirt: the open-source management tool used for managing virtual machines; it provides a generic framework supporting various virtualization technologies. A virtual machine in libvirt is represented as a domain XML file, stored under the QEMU user space. Virsh is an interface built on top of the libvirt API
- MacVTap: the Linux macvtap driver provides a mechanism to connect a VM interface directly to a host physical device; libvirt uses macvtap to provide direct attachment of a VM NIC to the host physical device
- In hypervisor-bypass mode, VM-FEX bypasses the MacVTap interface
(Figure: two guests on KVM. Guest 1 uses SR-IOV with macvtap, its virtio-net traffic passing through vhost-net and a macvtap interface onto a VF of the Cisco VIC; Guest 2 uses SR-IOV PCI passthrough straight to a VF. Port profiles 1 and 2 (QoS1/VLAN1 and QoS2/VLAN2) map to veth 1 and veth 2 on the UCS switch.)
Speaker notes: KVM uses the x86 hardware virtualization extensions (Intel VT or AMD-V); the KVM kernel driver manages the virtualization hardware, and KVM uses the QEMU user space for device emulation, so a virtual machine is simply a process and all process management commands (kill, etc.) work on VMs. Libvirt runs as a service named libvirtd and supports management of various virtualization technologies such as KVM, Xen and VMware ESX; the virsh tool built on top of the libvirt API provides an interface to manage virtual machines. Guest 1 is configured with SR-IOV macvtap, where all VM traffic passes through vhost-net and macvtap; VM migration is supported in this mode. Guest 2 is configured with SR-IOV PCI passthrough, where VM traffic is copied directly between the VM's enic and the VIC hardware, achieving better performance by bypassing macvtap; VM migration is not supported in this mode.
81 VM-FEX on KVM Configuration Steps
1. Upgrade UCSM to 2.1+ firmware (the Del Mar release supports SR-IOV)
2. Configure the service profile with static (PF) and dynamic (VF) adapter policies
3. Create the port profile and port profile client (only a single default DVS is supported)
4. Install the VM OS: RHEL 6.3 to support SR-IOV
5. Modify the virtual machine domain XML file to enable the VM-FEX function (see the sketch below)
6. Connect to the VM with Virtual Machine Manager (the GUI interface to virsh)
7. Verify the VM-FEX configuration in both UCSM and the RHEL host
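For step 5, a sketch of the interface element added to the VM's domain XML. Libvirt models VM-FEX port profiles through its 802.1Qbh virtualport type; the source device and profile name here are placeholders. The example shows emulated mode (SR-IOV with MacVTap, which supports live migration); hypervisor bypass uses an interface of type 'hostdev' with the same virtualport element instead:

    <interface type='direct'>
      <source dev='eth1' mode='passthrough'/>
      <virtualport type='802.1Qbh'>
        <parameters profileid='VM-Web'/>  <!-- UCSM port profile name -->
      </virtualport>
      <model type='virtio'/>
    </interface>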
83 VM-FEX Customer Benefits
Simplicity:
- One infrastructure for virtual and physical resource provisioning, management, monitoring and troubleshooting
- Consistent features, performance and management for virtual and physical infrastructure
Performance:
- VMDirectPath and SR-IOV enable near bare-metal I/O performance
- Line-rate traffic to the virtual machine
Robustness:
- Troubleshooting and traffic engineering of VM traffic holistically, from the physical network
- Programmability: the ability to renumber VLANs without disruptive changes
In summary:
- FEX technologies can reduce the managed device count and will greatly reduce cabling overhead
- VM-FEX terminates these virtual links directly on the VMs and closely maps to the physical server model for operations and management
- Multiple hypervisors are supported with advanced features
- Bandwidth can be engineered identically to physical infrastructures today
- Latency can surpass local virtualized switching by moving away from virtual switching's store-and-forward buffering and trees of ASIC traversals to a uniform port controller and switch fabric model
84 Recommended Viewing: UCS Advantage Videos on YouTube (BRKCOM-2005)
- Playlist: UCS Technical Videos
- Overview: Cisco UCS Advantage
85 BRKCOM-2005 Recommended Viewing: UCS Server
Service Profiles and Templates; Organizations and Roles; Extended Memory Technology; Server Pre-Provisioning; BIOS Policies; RAID Policies; Firmware Policies; Server Pools and Qualification Policies; Maintenance Policies; High Availability During Upgrades; Monitoring with BMC BPPM; Microsoft Hyper-V on UCS
86 BRKCOM-2005 Recommended Viewing: UCS I/O
Adapter Templates; Network Interface Virtualization; Adapter Fabric Failover; Extend the Network to the Virtual Machine; Traffic Analysis of All Servers; Ethernet Switching Modes; Fibre Channel and Switch Modes; FC Port Channels and Trunking
87 BRKCOM-2005 Recommended Viewing: UCS Infrastructure
Lights-Out Management; Easy VM-FEX Deployment; Server Power Grouping; Blade and Rack-Mount Management; Manager Platform Emulator; Cisco Developer Network and Sandbox