虛擬化技術 Virtualization Techniques

Slides:



Advertisements
Similar presentations
Communication Networks Recitation 3 Bridges & Spanning trees.
Advertisements

Transitioning to IPv6 April 15,2005 Presented By: Richard Moore PBS Enterprise Technology.
CMPE 150- Introduction to Computer Networks 1 CMPE 150 Fall 2005 Lecture 19 Introduction to Computer Networks.
CSCI 465 D ata Communications and Networks Lecture 20 Martin van Bommel CSCI 465 Data Communications & Networks 1.
Implementing PCI I/O Virtualization Standards
Traffic Management - OpenFlow Switch on the NetFPGA platform Chun-Jen Chung( ) SriramGopinath( )
What's inside a router? We have yet to consider the switching function of a router - the actual transfer of datagrams from a router's incoming links to.
1 ITC242 – Introduction to Data Communications Week 12 Topic 18 Chapter 19 Network Management.
t Popularity of the Internet t Provides universal interconnection between individual groups that use different hardware suited for their needs t Based.
1 Chapter 8 Local Area Networks - Internetworking.
ATM COMPONENTS Presented by: ANG BEE KEEWET CHONG SIT MEIWET LAI YIN LENGWET LEE SEANG LEIWET
1 Chapter 8 Local Area Networks - Internetworking Data Communications and Computer Networks: A Business User’s Approach.
1 K. Salah Module 4.3: Repeaters, Bridges, & Switches Repeater Hub NIC Bridges Switches VLANs GbE.
Windows Server Scalability And Virtualized I/O Fabric For Blade Server
 The Open Systems Interconnection model (OSI model) is a product of the Open Systems Interconnection effort at the International Organization for Standardization.
Gursharan Singh Tatla Transport Layer 16-May
(part 3).  Switches, also known as switching hubs, have become an increasingly important part of our networking today, because when working with hubs,
PCIe 2.0 Base Specification Protocol And Software Overview
虛擬化技術 Virtualization Techniques
I/O Virtualization And Sharing PCI-SIG IO Virtualization Michael Krause (HP, co-chair) Renato Recio (IBM, co-chair) Michael Krause (HP, co-chair) Renato.
Virtual LANs. VLAN introduction VLANs logically segment switched networks based on the functions, project teams, or applications of the organization regardless.
Connecting LANs, Backbone Networks, and Virtual LANs
Network Management Concepts and Practice Author: J. Richard Burke Presentation by Shu-Ping Lin.
虛擬化技術 Virtualization Techniques
VLAN Trunking Protocol (VTP) W.lilakiatsakun. VLAN Management Challenge (1) It is not difficult to add new VLAN for a small network.
Mahesh Wagh Intel Corporation Member, PCIe Protocol Workgroup.
LAN Overview (part 2) CSE 3213 Fall April 2017.
© 2008 Cisco Systems, Inc. All rights reserved.Cisco ConfidentialPresentation_ID 1 Chapter 3: Implementing VLAN Security Routing And Switching.
© 2008 Cisco Systems, Inc. All rights reserved.Cisco ConfidentialPresentation_ID 1 Chapter 4: Frame Relay Connecting Networks.
The University of New Hampshire InterOperability Laboratory Introduction To PCIe Express © 2011 University of New Hampshire.
Section 4 : The OSI Network Layer CSIS 479R Fall 1999 “Network +” George D. Hickman, CNI, CNE.
VLAN Trunking Protocol (VTP)
Topics covered: Memory subsystem CSE243: Introduction to Computer Architecture and Hardware/Software Interface.
Protocols and the TCP/IP Suite
Basic LAN techniques IN common with all other computer based systems networks require both HARDWARE and SOFTWARE to function. Networks are often explained.
Traffic Management - OpenFlow Switch on the NetFPGA platform Chun-Jen Chung( ) Sriram Gopinath( )
SpaceWire Plug and Play Glenn Rakow – NASA-GSFC, Greenbelt, MD Pat McGuirk – Micro-RDC, Albuquerque, NM Cliff Kimmery – Honeywell Inc., Clearwater FL Paul.
TELE202 Lecture 5 Packet switching in WAN 1 Lecturer Dr Z. Huang Overview ¥Last Lectures »C programming »Source: ¥This Lecture »Packet switching in Wide.
Univ. of TehranAdv. topics in Computer Network1 Advanced topics in Computer Networks University of Tehran Dept. of EE and Computer Engineering By: Dr.
© 2002, Cisco Systems, Inc. All rights reserved..
1 © 2003, Cisco Systems, Inc. All rights reserved. CCNA 3 v3.0 Module 4 Switching Concepts.
© 2008 Cisco Systems, Inc. All rights reserved.Cisco ConfidentialPresentation_ID 1 Chapter 3: Implementing VLAN Security Routing And Switching.
I/O Computer Organization II 1 Interconnecting Components Need interconnections between – CPU, memory, I/O controllers Bus: shared communication channel.
1 © 2003, Cisco Systems, Inc. All rights reserved. CCNA 3 v3.0 Module 9 Virtual Trunking Protocol.
1 Data Link Layer Lecture 23 Imran Ahmed University of Management & Technology.
SECURING SELF-VIRTUALIZING ETHERNET DEVICES IGOR SMOLYAR, MULI BEN-YEHUDA, AND DAN TSAFRIR PRESENTED BY LUREN WANG.
Intel Open Source Technology Center Lu Baolu 2015/09
Chapter 4 Version 1 Virtual LANs. Introduction By default, switches forward broadcasts, this means that all segments connected to a switch are in one.
W&L Page 1 CCNA CCNA Training 2.5 Describe how VLANs create logically separate networks and the need for routing between them Jose Luis.
VLAN Trunking Protocol
Internet Protocol Storage Area Networks (IP SAN)
VLAN Trunking Protocol (VTP)
+ Routing Concepts 1 st semester Objectives  Describe the primary functions and features of a router.  Explain how routers use information.
12006 MAPLD International ConferenceSpaceWire 101 Seminar SpaceWire Plug and Play (PnP) 2006 MAPLD International Conference Washington, D.C. September.
+ Lecture#2: Ethernet Asma ALOsaimi. + Objectives In this chapter, you will learn to: Describe the operation of the Ethernet sublayers. Identify the major.
Exploration 3 Chapter 4. What is VTP? VTP allows a network manager to configure a switch so that it will propagate VLAN configurations to other switches.
InterVLAN Routing 1. InterVLAN Routing 2. Multilayer Switching.
Lecture 15: IO Virtualization
Introduction to Networks v6.0
Testing PCI Express Generation 1 & 2 with the RTO Oscilloscope
Memory Hierarchy Ideal memory is fast, large, and inexpensive
Instructor Materials Chapter 5: Ethernet
Chapter 4 Data Link Layer Switching
Virtual LANs.
SCSI over PCI Express (SOP) use cases
Chapter 4: Switched Networks
Chapter 3: Implementing VLAN Security
Chapter 2: Scaling VLANs
Chapter 13: I/O Systems.
Presentation transcript:

虛擬化技術 Virtualization Techniques Hardware Support Virtualization MR-IOV

MR-IOV Introduction Multiple servers & VMs sharing one I/O adapter Bandwidth of the I/O adapter is shared among the servers The I/O adapter is placed into a separate chassis Bus extender cards are placed into the servers

MR-IOV Topology MR components group to create Virtual Hierarchies (VH) Virtual Hierarchy = a logical PCIe hierarchy within a MR topology. Each VH typically contains at least one PCIe Switch. Extends from a RP to all its EPs Each VH may contain any mix of Multi-Root Aware (MRA) Devices, SR-IOV Devices, Non-IOV Devices, or PCIe to PCI/PCI-X Bridges. The MR-IOV topology typically contains at least one MRA Switch

MR-IOV Topology MRA Switch MRA Switch PCIe Switch PCIe to PCI Bridge Root Complex (RC) Root Complex (RC) Root Complex (RC) Root Complex (RC) Root Port (RP) Root Port (RP) Root Port (RP) Root Port (RP) MRA Switch MRA Switch PCIe Switch PCIe to PCI Bridge MRA PCIe Device SR-IOV PCIe Device PCIe Device PCI/PCI-X Device

Topology Overview and Terms SR Topology Multi-Root Topology Terms Single Root (SR) IOV Overview, Only has one Root. Switches only need to support PCIe base functionality. To make full use of IOV, EP must support SR-IOV capabilities. SR-PCIM configures the EP. Multi-Root (MR) IOV Overview, One or more Roots. Switches with Multi-Root Aware (MRA) functionality are needed. To make full use of IOV, EP must support SR & MR-IOV capabilities. MR-PCIM assigns Virtual Endpoints (VEs) to RCs and manages PCIe components. SR-PCIM configures its VEs.

Multi-Root IOV function Types and Terms MR Topology MR Topology Terms Virtual Endpoint (VE) is the set of physical and virtual functions assigned to an RC. Each VE is assigned to a Virtual Hierarchy (VH). Virtual Hierarchy (VH) is a fully functional PCIe hierarchy that is assigned to an RC or MR-PCIM. Note, all PFs and VFs in a VE are assigned the same VH. Base Function (BF) only 1 per EP and is used by MR-PCIM to manage an MR aware EP (e.g. assigning functions to Virtual Endpoints).

MRA Components Multi-Root Aware Root Port (MRA RP) Multi-Root Aware Device (MRA Device) Multi-Root PCI Manager (MR-PCIM) Multi-Root Aware Switch (MRA Switch)

MRA Components Multi-Root Aware Root Port (MRA RP) Maintain state to delineate each VH. At a high level, this amounts to a set of resource mapping tables to translate the I/O function associated with each SI into a VH and MR I/O function identifier. Participate in the MR transaction encapsulation protocol to enable an MRA Switch to derive the VH and associated routing information. Emit an MR Link. An MR Link is identical to the physical layer of a PCIe Link as defined in the PCI Express Base Specification.

MRA Components Multi-Root Aware Root Port (MRA RP) Implement MRA congestion management. It does not forward TLPs for VHs where it is not the root. If this functionality is required, the MRA Root Complex may contain an embedded MRA Switch. A Translation Agent (TA) parses the contents of a PCIe DMA request transaction (TLP) PCIe RP and MRA RP and MRA RP Functional Block Comparison

MRA Components Multi-Root Aware Device(MRA Device) Must support the new MR-IOV DLLP protocol. Non-IOV Devices and SR-IOV Devices do not support the MR-IOV capability and therefore are unable to participate in this protocol. An MRA Switch must assume all responsibility for forwarding transactions and event handling on behalf of these devices through the MR-IOV topology. The MRA Switch performs all encapsulation or de-encapsulation as appropriate.

MRA Components Multi-Root Aware Device(MRA Device) Must support the MR-IOV transaction encapsulation protocol. The MR-IOV encapsulation protocol provides VH identification information to the MRA Switch to enable the transaction to be transparently forwarded through the MR-IOV topology without requiring modification to the PCI Express Base Specification TLP protocol or contents.

MRA Components Multi-Root Aware Device(MRA Device) It is composed of a set of Functions in each VH. There are a variety of Function types: BF PF VF Non-IOV Function

MRA Components A BF is a function compliant with this specification that includes the MR-IOV Capability. A BF shall not contain an SR-IOV Capability. A PF is a Function compliant with the PCI Express Base Specification that includes the SR-IOV Extended Capability. Every PF is associated with a BF. The Function Offset fields in a BF’s Function Table point to the PFs.

MRA Components A VF is a Function associated with a PF and is described in the Single-Root I/O Virtualization and Sharing Specification. VFs are associated with a PF and are thus indirectly as asociated with a BF. A Non-IOV Function is a Function that is not a BF, PF, or VF. Non-IOV Functions may or may not be associated with a BF.

MRA Components Non-IOV, SR-IOV, and MRA Device Functional Block Comparison

MRA Components Multi-Root PCI Manager (MR-PCIM) Each MRA component must support a corresponding MR-IOV capability. This capability is accessed and configured by the Multi Root PCI Manager (MR-PCIM). MR-PCIM can be implemented anywhere within the MR-IOV topology. Alternatively, MR-PCIM can manage the MR-IOV topology through a private interface provided by an MRA Switch.

MRA Components MRA PCIM in an MR-IOV Topology

MRA Components Multi-Root Aware Switch (MRA Switch) In contrast to a PCIe Switch, an MRA Switch is as follows: An MRA Switch is composed of zero or more upstream Ports attached to a PCIe RP or an MRA RP or the downstream Port of an Switch. An MRA Switch is composed of zero or more downstream Ports attached to PCIe Devices, MRA Devices, PCIe Switch upstream Ports, or PCIe to PCI/PCI-X Bridges. An MRA Switch is composed of zero or more bidirectional Ports attached to other MRA Switches or MRA RPs. A set of logical P2P bridges that constitute a VH. Each VH represents a separate address space.

MRA Components Virtual Switch per VH VP protocol support PCIe Type1 CFG space per VH Virtual Hot plug control, Power management tracking, Isolation, Error containment & processing per VH VP protocol support Insert/Remove VP label & Process Reset DLLP PCIM interface SW registers model for MR PCIM control of shared resources

MRA Endpoint VP protocol support Virtual Endpoint(s) per VH PHY PHY VP protocol support Insert/Remove VP label at DLL Process DLLP for VP RESET Virtual Endpoint(s) per VH Type 0 CFG space per VH IOV capabilities within VH for SR IOV support PCIM interface Registers for MR PCIM control Included in IOV CFG space of PF DLL DLL PF0 PF0 PF1 PF2 Device Specific Functionality Device Specific Functionality PCIe 1.X Endpoint MRA Endpoint

Base-SR-MR EP Progression MR Endpoints function as SR or Base Endpoints MR capabilities unused by non-MRA SW ALL SR capabilities included in MR device SR Endpoints function as Base Endpoints IOV capability register ignored by PCIe Base SW Goal is to minimize cost of new functions Minimal CFG space replication Minimal logic impact of new functions

Virtual Plane Protocol TLP Tag Inserted/removed at DLL TLP is unmodified TLP processing uses PCIe 1.x rules Targeted for support of 256 VP Room for expansion to more VP or more functions Devices may support from 1 to 256 VP RESET DLLP Per plane RESET DLLP Guaranteed progress under all topology conditions TLP messages can stall due to FC congesion Propagates using RESET logical rules Provides a mirror of PCIe RESET within a plane

Tagging TLPs Header for all TLPs on MR link Not included on PCIe 1.X links Header included on all TLPs Stable during retransmissions Located between Sequence # and TLP Hdr ACK/Sequence # concept remains per link Not affected by Virtual Plane changes Like VC today Header covered by LCRC VP# bits are not part of ECRC

RESET DLLP Provides RESET assert/clear for up to 16 VP in single DLLP RESET function remains a level event RESET state is stored and maintained by each link partner Handshake for complete reliability of RESET propagation Guarantees buffer flush of VH within intermediate switches Utilizes currently RSVD DLLP

Virtual Plane Operation Initialization & Enumeration MR PCIM discovers MR, SR, and Base devices MR PCIM devices and PFs to RPs MR PCIM programs MR switch and EP tables with MR assignments SR SW enumerates within its Virtual Hierarchies Traffic Flow Base RP & Base/SR EP utilize PCIe Base protocol MR EP inserts VP tag at DLL for appropriate PF on Base TLP Switch utilizes VP tag to index correct Type 1 CFG headers PCIe Base routing rules utilized within a Virtual Hierarchy RESET Base RP asserts RESET as TS1 or Fundamental RESET Switch propagates on MR link as RESET DLLP within a VP Switch propagates on Base link as TS1 MR Switch or EP receiving RESET DLLP must flush according to PCIe rules

Protocol Selection MRA protocol usage must be enabled per link Base, SR, and MR devices all coexist Base devices must work unchanged New protocol should either be ignored or Enabled only after both link partners are determined as capable Two options under consideration Auto-negotiation at DLL Does not modify the PHY layer training SW enable

Protocol Selection Auto negotiation modifies the DLL training SM New DLLPs used during DLL training to indicate capabilities of link partners New DLLPs dropped by base devices during training Once dropped, link reverts to PCIe Base operation SW enable controlled by MR PCIM MR PCIM uses PCIe Base protocol for initial discovery MRA enabled during discovery and initialization by MR PCIM

MR PCIM initialization

MR PCIM Assignment

SR PCIM Assignment

Non-Posted (NP) Request from PCIe Base to PCIe Base

NP from PCIe Base to MRA

Posted from MRA to PCIe Base

RESET Propagation RP Reset completely resets VH Equivalent to PCIe 1.X RESET functionality

Hot Plug in MR MRA Switches have two Hot Plug Controllers (HPC) Physical controller (1 per link) Controls events on the link Owned by MR PCIM Virtual controller (1 per VP per link) Controls events within a VH Owned by SR PCIM, coordinated with MR PCIM New registers added for MR PCIM access point Hot Plug in MR consists of two event types Physical events coordinated by MR PCIM and physical HPC (pHPC) Virtual events coordinated by MR PCIM, SR PCIM, and virtual HPC (vHPC) HPC SW interface same a PCIe Base Utilizes PCIe Hot Plug specification within switches

MR Hot Plug Interfaces

Hot Plug Event Steps User event initiated Example: Slot Attention Button Press for device removal pHPC notifies MR PCIM of event Uses MR PCIM pHPC interface (#3) MR PCIM initiated virtual HP events through vHPC Uses MR PCIM vHPC interface (#2) vHPC notifies SR PCIM(s) of event Uses SR PCIM vHPC interface (#1)

Hot Plug Event Steps SR PCIM processes event and updates state of vHPC Uses SR PCIM vPHC interface (#1) vHPC notifies MR PCIM of state change for that vHPC Uses MR PCIM vHPC interface (#2) MR PCIM processes all state changes and updates state of pHPC Uses MR PCIM pHPC interface (#3) pHPC indicates device is safe for removal Example: Slot power removed & LED state change

Hot Removal Event Initiated

Hot Removal Coordination

Hot Removal Completion

Reference Multi-Root I/O Virtualization and Sharing Specification Revision 1.0 Dennis Martin, “Innovations in storage networking: Next-gen storage networks for next-gen data centers,” in Storage Decisions Chincago presentation titled, 2012. http://www.pcisig.com/developers/main/training_materials/get_document?doc_id=e3da4046eb5314826343d9df18b60f083880bf7b http://www.pcisig.com/developers/main/training_materials/get_document?doc_id=ee6c699074c0b2440bfac3abdecb74b3d89821a8 http://www.pcisig.com/developers/main/training_materials/get_document?doc_id=656dc1d4f27b8fdca34f583bdc9437627bc3249f