
Slide 1: InfiniBand: Today and Tomorrow
Jamie Riotto, Sr. Director of Engineering, Cisco Systems (formerly Topspin Communications)
jriotto@cisco.com
© 2005 Cisco Systems, Inc. All rights reserved. Cisco Public

Slide 2: Agenda
InfiniBand Today
  – State of the market
  – Cisco and InfiniBand
  – InfiniBand products available now
  – Open source initiatives
InfiniBand Tomorrow
  – Scaling InfiniBand
  – Future Issues
Q&A

Slide 3: InfiniBand Maturity Milestones
High adoption rates
  – Currently shipping > 10,000 IB ports / Qtr
Cisco acquisition will drive broader market adoption
End-to-end price points of <$1000
New Cluster scalability proof-points
  – 1000 to 4000 nodes

Slide 4: Cisco Adopts InfiniBand
Cisco acquired Topspin on May 16, 2005
Adds InfiniBand to Switching Portfolio
  – Network Switches, Storage Switches, now Server Switches
  – Creates independent Business Unit to promote InfiniBand & Server Virtualization
New Product line of Server Fabric Switches (SFS)
  – SFS 7000 Series InfiniBand Server Switches
  – SFS 3000 Series Multifabric Server Switches

Slide 5: Cisco and InfiniBand: The Server Fabric Switch
[Diagram: three switch types and what they connect]
Network Switch: Clients, Network Resources (Internet, Printer, Server)
Storage Switch: Server, Storage (SAN)
Server Switch: Servers, Storage, Network

Slide 6: Cisco HPC Case Studies

Slide 7: Real Deployments Today: Wall Street Bank with 512 Node Grid
Core Fabric: 2 96-port TS-270
Edge Fabric: 23 24-port TS-120
512 Server Nodes
GRID I/O: 2 TS-360 w/ Ethernet and Fibre Channel Gateways
Existing Networks: SAN, LAN
Fibre Channel and GigE connectivity built seamlessly into the cluster

Slide 8: NCSA (National Center for Supercomputing Applications) Tungsten 2: 520 Node Supercomputer
520 Dual CPU Nodes, 1,040 CPUs
Core Fabric: 6 72-port TS270
Edge Fabric: 29 24-port TS120
174 uplink cables
512 1m cables
18 Compute Nodes
Parallel MPI codes for commercial clients
Point to point 5.2us MPI latency
Deployed: November 2004

Slide 9: D.E. Shaw Bio-Informatics: 1,066 Node Super Computer
Fault Tolerant Core Fabric: 12 96-port TS-270
Edge Fabric: 89 24-port TS-120
1,068 5m/7m/10m/15m uplink cables
1,066 1m cables
12 Compute Nodes
1,066 Fully Non-Blocking Fault Tolerant IB Cluster

Slide 10: Large Government Lab
World's Largest Commodity Server Cluster – 4096 nodes
Application: High Performance Super Computing Cluster
Environment:
  – 4096 Dell Servers
  – 50% Blocking Ratio
  – 8 TS-740s
  – 256 TS-120s
Benefits:
  – Compelling Price/Performance
  – Largest Cluster Ever Built (by approx. 2X)
  – Expected to be 2nd Largest Supercomputer in the world by node count
Core Fabric: 8x SFS TS740, 288 ports each
Edge: 256x TS120, 24 ports each
18 Compute Nodes
8192 Processor 60TFlop SuperCluster
2048 uplinks (7m/10m/15m/20m)
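
The switch and cable counts in the three cluster case studies (slides 8 to 10) can be cross-checked with a little arithmetic. The sketch below is only an illustration, not a Cisco sizing tool: it assumes a two-tier fabric in which every port of a 24-port edge switch is either a node port or an uplink, and it reads the "18 Compute Nodes" and "12 Compute Nodes" labels as nodes per edge switch. For the 4096-node cluster the per-switch figure of 16 is derived from 4096 nodes / 256 edge switches rather than taken from a slide label. The function name is hypothetical.

```python
import math


def size_two_tier_fabric(nodes, edge_ports, nodes_per_edge):
    """Return (edge_switches, uplinks_per_edge, total_uplinks).

    Assumes every edge-switch port is used either for a compute node
    or for an uplink to the core fabric.
    """
    edge_switches = math.ceil(nodes / nodes_per_edge)
    uplinks_per_edge = edge_ports - nodes_per_edge
    return edge_switches, uplinks_per_edge, edge_switches * uplinks_per_edge


# (nodes, ports per edge switch, nodes per edge switch)
cases = {
    "NCSA Tungsten 2": (520, 24, 18),
    "D.E. Shaw": (1066, 24, 12),
    "Large Government Lab": (4096, 24, 16),  # 16 = 4096 / 256, an inference
}

for name, (nodes, ports, per_edge) in cases.items():
    edges, uplinks_each, uplinks_total = size_two_tier_fabric(nodes, ports, per_edge)
    print(f"{name}: {edges} edge switches, {uplinks_each} uplinks each, "
          f"{uplinks_total} uplinks total, {per_edge}:{uplinks_each} nodes per uplink")
```

Under these assumptions the results line up with the slides: 29 edge switches and 174 uplinks for Tungsten 2, 89 and 1,068 for D.E. Shaw (a 12:12 split, consistent with "fully non-blocking"), and 256 and 2048 for the government lab (a 16:8 split, consistent with the stated 50% blocking ratio).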

Slide 11: InfiniBand Products Available Today

Slide 12: InfiniBand Switches and HCAs
Fully non-blocking switch building blocks available in sizes from 24 up to 288 ports
Blade servers offer integrated switches and pass-through modules
HCAs available in PCI-X and PCI-Express
IP & Fibre-Channel Gateway Modules

Slide 13: Integrated InfiniBand for Blade Servers
Create “wire-once” fabric
Integrated 10Gbps InfiniBand switches provide unified “wire-once” fabric
Optimize density, cooling, space, and cable management
Option of integrated InfiniBand switch (ex: IBM BC) or pass-thru module (ex: Dell 1855)
Virtual I/O provides shared Ethernet and Fibre Channel ports across blades and racks
[Diagram: Blade Chassis with InfiniBand Switches; labels: HCA, IB Switch, 10Gbps, 30Gbps]

Slide 14: Ethernet and Fibre Channel Gateways
Unified “wire-once” fabric
Fibre Channel to InfiniBand gateway for storage access
Ethernet to InfiniBand gateway for LAN access
Single InfiniBand link for:
  – Storage
  – Network
[Diagram labels: Server Cluster, Server Fabric, SAN, LAN/WAN]

Slide 15: InfiniBand Price / Performance
Columns: InfiniBand PCI-Express | 10GigE | GigE | Myrinet D | Myrinet E
Data Bandwidth (Large Messages): 950MB/s | 900MB/s | 100MB/s | 245MB/s | 495MB/s
MPI Latency (Small Messages): 5us | 50us | 6.5us | 5.7us
HCA Cost (Street Price): $550 | $2K-$5K | Free | $535 | $880
Switch Port: $250 | $2K-$6K | $100-$300 | $400
Cable Cost (3m Street Price): $100 | $25 | $175
* Myrinet pricing data from Myricom Web Site (Dec 2004)
** InfiniBand pricing data based on Topspin avg. sales price (Dec 2004)
*** Myrinet, GigE, and IB performance data from June 2004 OSU study
Note: MPI Processor to Processor latency – switch latency is less
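
As a rough check on the "end-to-end price points of <$1000" claim from slide 3, the InfiniBand figures on this slide can be added up per attached node. This is a back-of-the-envelope sketch: it assumes the first value in each row belongs to the InfiniBand PCI-Express column, and that one node needs exactly one HCA, one switch port, and one 3m cable. The function and constant names are illustrative only.

```python
# InfiniBand street prices and bandwidth as quoted on the slide (Dec 2004).
HCA_COST = 550          # dollars per HCA
SWITCH_PORT_COST = 250  # dollars per switch port
CABLE_COST = 100        # dollars per 3m cable
BANDWIDTH_MB_S = 950    # large-message data bandwidth, MB/s


def end_to_end_cost(hca, switch_port, cable):
    """Cost to attach one node: one HCA, one switch port, one cable."""
    return hca + switch_port + cable


ib_cost = end_to_end_cost(HCA_COST, SWITCH_PORT_COST, CABLE_COST)
print(f"InfiniBand per-node cost: ${ib_cost}")
print(f"Dollars per MB/s: {ib_cost / BANDWIDTH_MB_S:.2f}")
```

With these inputs the per-node total comes to $900, which is consistent with the sub-$1000 figure.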

Slide 16: InfiniBand Cabling
CX4 Copper (15m)
Flexible 30-Gauge Copper (3m)
Fiber Optics up to 150m

Slide 17: Host Drivers for Standard Protocols
Open source strategy = reliability at low cost
IPoIB: legacy TCP/IP applications
SDP: reliable socket connections (optional RDMA)
MPI: leading edge HPCC applications (RDMA)
SRP: block storage access (RDMA)
uDAPL: User level RDMA

Slide 18: OS Support
Operating Systems Available:
  – Linux (Red Hat, SuSE, Fedora, Debian, etc.)
  – Windows 2000 and 2003
  – HP-UX (Via HP)
  – Solaris (Via Sun)

Slide 19: The InfiniBand Driver Architecture
[Diagram: layered host driver stack with User and Kernel regions, connected through switches and gateways to LAN/WAN, SERVER FABRIC, and SAN]
Labels: APPLICATION, NETWORK, BSD Sockets, FS API, SAN API, UDAPL, DAT, TS, TCP, IP, SDP, IPoIB, NFS-RDMA, FILE SYSTEM, SCSI, SRP, FCP, Drivers, VERBS, ETHER, INFINIBAND HCA, FC, ETHER SWITCH, INFINIBAND SWITCH, FC SWITCH, ETH GW, FC GW

Slide 20: Open Software Initiatives
OpenIB.org
  – Topspin primary authors of major portions including IPoIB, SDP, SRP and TS-API. Cisco will continue to invest.
  – Current protocol development nearing production quality code. Expect release by end of year.
  – Charter has been expanded to include Windows and iWarp
  – MPI will be available in the near future (MVAPICH 0.96)
OpenSM
OpenMPI

Slide 21: InfiniBand Tomorrow

Slide 22: Looking into the future
Cost
Speed
Distance Limitations
Cable Management
Scalability
IB and Ethernet

Slide 23: Speed: InfiniBand DDR / QDR, 4X / 12X
DDR
  – Available end of 2005
  – Doubles wire speeds to ? (ok, still working on this one)
  – PCI-Express DDR
  – Distances of 5-10m using copper
  – Distances of 100m using fiber
QDR
  – Available WHEN?
12X (30 Gb/s) available for over one year!!
  – Not interesting until 12X HCA
  – Not interesting until > 16X PCIe
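
The link-speed names on this slide follow a simple rule that is easy to tabulate. The sketch below assumes the standard InfiniBand figures of 2.5 Gb/s signalling per lane at SDR, with DDR and QDR doubling and quadrupling that rate, and 8b/10b encoding (8 data bits per 10 signalled bits). First-generation PCI Express uses the same per-lane rate and encoding, which is what makes the host-bus comparison straightforward. The function names are illustrative, not part of any library.

```python
LANE_SIGNALLING_GBPS = 2.5                       # SDR signalling rate per lane
RATE_MULTIPLIER = {"SDR": 1, "DDR": 2, "QDR": 4}


def link_rate_gbps(width, rate):
    """Return (signalling rate, data rate) in Gb/s for a link of `width` lanes.

    The data rate accounts for 8b/10b encoding overhead.
    """
    raw = width * LANE_SIGNALLING_GBPS * RATE_MULTIPLIER[rate]
    return raw, raw * 8 / 10


def pcie_gen1_rate_gbps(lanes):
    """First-generation PCIe: 2.5 GT/s per lane, 8b/10b, per direction."""
    raw = lanes * 2.5
    return raw, raw * 8 / 10


for width in (4, 12):
    for rate in ("SDR", "DDR", "QDR"):
        raw, data = link_rate_gbps(width, rate)
        print(f"{width}X {rate}: {raw:g} Gb/s signalling, {data:g} Gb/s data")

for lanes in (8, 16):
    raw, data = pcie_gen1_rate_gbps(lanes)
    print(f"PCIe x{lanes}: {raw:g} Gb/s signalling, {data:g} Gb/s data")
```

On these assumptions 12X SDR comes out at 30 Gb/s signalling, matching the slide, and its 24 Gb/s data rate exceeds what a first-generation x8 PCIe slot (16 Gb/s) can carry, which fits the remark that 12X is of limited interest without a wider host interface.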

Slide 24: Future InfiniBand Cables
InfiniBand over CAT5 / CAT6 / CAT7
Shielded cable distances up to ???
Leverage existing 10-GigE cabling
10-GigE too expensive?

Slide 25: IB Distance Scaling
IB Short Haul
  – New Copper drivers
  – 25-50 Meters (KeyEye)
  – 75-100 Meters (IEEE 10Ge)
IB WAN
  – Same Subnet over distance (300 KM target)
  – Buffer / Credit / Timeout issues
  – Applications: Disaster Recovery, Data Mirroring
IB Long Haul
  – IB over IP (over SONET?)
  – utilizes existing public plant (WDM, Debugging, etc)
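
The "Buffer / Credit" issue for IB WAN can be made concrete with a bandwidth-delay estimate. InfiniBand links use credit-based flow control, so a sender can only have as much data in flight as the receiver has advertised buffer space for; to keep a long link busy, that buffer has to cover a full round trip. The sketch below is a rough estimate only. It assumes about 5 microseconds of propagation delay per kilometre of fiber and a 4X SDR data rate of 8 Gb/s, and it ignores switch and gateway latency. The function name is hypothetical.

```python
FIBER_DELAY_US_PER_KM = 5.0  # assumption: roughly 5 us of propagation per km of fiber


def buffer_bytes_for_full_rate(distance_km, data_rate_gbps):
    """Bytes in flight over one round trip, i.e. the receive buffering
    needed for credit-based flow control to keep the link fully utilized."""
    round_trip_s = 2 * distance_km * FIBER_DELAY_US_PER_KM * 1e-6
    bytes_per_s = data_rate_gbps * 1e9 / 8
    return round_trip_s * bytes_per_s


for km in (0.015, 0.1, 300):
    needed = buffer_bytes_for_full_rate(km, data_rate_gbps=8)
    print(f"{km:g} km at 8 Gb/s: about {needed:,.0f} bytes of buffering")
```

At the 300 km target this works out to roughly 3 MB of buffering per link under the stated assumptions, compared with a few hundred bytes to a kilobyte for in-room cable lengths.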

Slide 26: Scaling InfiniBand
Subnet Management
Host-side Drivers
  – MPI
  – IPoIB
  – SRP
Memory Utilization

Slide 27: IB Subnet Manager
Subnets are getting bigger
  – 4,000 -> 10,000 nodes
  – Topology convergence times
Topology disturbance times
Topology disturbance minimization

Slide 28: Subnet Management Challenges
Cluster Cold Start times
  – Template Routing
  – Persistent Routing
Cluster Topology Change Management
  – Intentional Change - Maintenance
  – Unintentional Change – Dealing with Faults
How to impact minimum number of connections
Predetermine fault reaction strategy?
Topology Diagnostic Tools
  – Link/Route Verification
  – Built-in BERT testing
Partition Management

Slide 29: Multiple Routing Models
Minimum Latency Routing:
  – Load-Balanced Shortest-Path Routing
Minimum Contention Routing:
  – Lowest-Interference Divergent-Path Routing
Template Driven Routing:
  – Supports Pre-Determined Routing Topology
  – For example: Clos Routing, Matrix Row/Column, etc
  – Automatic Cabling Verification for Large Installations
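
One way to picture "Automatic Cabling Verification" under template-driven routing is as a comparison between the links a template expects and the links actually discovered on the fabric. The sketch below is a minimal illustration of that idea, not the subnet manager's implementation: the link representation (pairs of `(switch, port)` tuples), the switch names, and the function names are all hypothetical.

```python
def normalize(link):
    """Treat a cable as undirected: (A, B) and (B, A) are the same link."""
    a, b = link
    return (a, b) if a <= b else (b, a)


def verify_cabling(template_links, discovered_links):
    """Return (missing, unexpected) links relative to the template."""
    expected = {normalize(link) for link in template_links}
    actual = {normalize(link) for link in discovered_links}
    return sorted(expected - actual), sorted(actual - expected)


# Template: leaf1 ports 19 and 20 go to port 1 of spine1 and spine2.
template = [
    (("leaf1", 19), ("spine1", 1)),
    (("leaf1", 20), ("spine2", 1)),
]
# Discovered: the two uplink cables were swapped during installation.
discovered = [
    (("spine2", 1), ("leaf1", 19)),
    (("spine1", 1), ("leaf1", 20)),
]

missing, unexpected = verify_cabling(template, discovered)
for link in missing:
    print("missing:   ", link)
for link in unexpected:
    print("unexpected:", link)
```

A swapped pair of cables shows up as two missing and two unexpected links, which is the kind of report an installer can act on before routes are computed.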

Slide 30: IB Routing Challenges
Static / Dynamic Routing
  – IB implements Static Routing through Linear Forwarding Tables at each chip
  – Multi-LID Routing enables Dynamic Routing
Credit Loops
Cost-Based Routing
  – Speed mismatches cause Store & Forward (vs. cut through)
  – SDR <> DDR <> QDR
  – 4X <> 12X
  – Short Haul <> Long Haul
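
Static routing through Linear Forwarding Tables can be illustrated with a toy model: each switch holds a table indexed by destination LID that yields a single output port, so the path for a given destination is fixed until the subnet manager reprograms the tables. The fabric below (two leaf switches, one spine, four hosts with LIDs 1 to 4) and all names in it are made up for illustration.

```python
# Each switch's linear forwarding table: destination LID -> output port.
lfts = {
    "leaf1": {1: 1, 2: 2, 3: 24, 4: 24},
    "leaf2": {1: 24, 2: 24, 3: 1, 4: 2},
    "spine": {1: 1, 2: 1, 3: 2, 4: 2},
}

# Cabling: (switch, port) -> neighbor, either a switch name or a host LID.
links = {
    ("leaf1", 1): 1, ("leaf1", 2): 2, ("leaf1", 24): "spine",
    ("leaf2", 1): 3, ("leaf2", 2): 4, ("leaf2", 24): "spine",
    ("spine", 1): "leaf1", ("spine", 2): "leaf2",
}


def trace_route(first_switch, dlid, max_hops=8):
    """Follow the LFTs hop by hop and return the list of (switch, out_port)."""
    path = []
    node = first_switch
    for _ in range(max_hops):
        port = lfts[node][dlid]        # one table lookup per switch
        path.append((node, port))
        node = links[(node, port)]
        if isinstance(node, int):      # reached a host port
            if node != dlid:
                raise RuntimeError(f"packet for LID {dlid} delivered to LID {node}")
            return path
    raise RuntimeError("route did not terminate; possible forwarding loop")


print(trace_route("leaf1", 3))
```

Because each lookup returns exactly one port, every packet addressed to a given LID takes the same path; the hop limit in the sketch is just a guard against misprogrammed tables that would otherwise loop.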

Slide 31: Multi-LID Source-Based Routing Support
Applications can implement “Dynamic” Routing for Contention Avoidance, Failover, Parallel Data Transfer
[Diagram: Spine Switches and Leaf Switches, with paths labeled 1, 2, 3, 4]
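
The idea behind multi-LID routing can be sketched as follows. With a LID Mask Control (LMC) value of n, an end port answers to 2^n consecutive LIDs, and the forwarding tables can send each of those LIDs through a different spine. The source then selects a path simply by choosing which destination LID to use. The code below is a simplified illustration assuming LMC = 2 and four spines, in line with the four paths in the diagram; the table layout, names, and the way the source learns about a failed spine are all assumptions.

```python
LMC = 2  # each end port answers to 2**LMC = 4 consecutive LIDs
SPINES = ["spine1", "spine2", "spine3", "spine4"]


def lids_for_port(base_lid, lmc=LMC):
    """All LIDs assigned to an end port whose base LID is `base_lid`."""
    return [base_lid + i for i in range(2 ** lmc)]


def build_leaf_uplink_table(remote_base_lids, lmc=LMC):
    """Map each remote destination LID to a spine so that the i-th LID
    of every port is routed through the i-th spine."""
    table = {}
    for base in remote_base_lids:
        for offset, lid in enumerate(lids_for_port(base, lmc)):
            table[lid] = SPINES[offset % len(SPINES)]
    return table


def pick_dlid(base_lid, table, avoid=frozenset(), lmc=LMC):
    """Source-side path selection: first LID whose spine is not in `avoid`."""
    for lid in lids_for_port(base_lid, lmc):
        if table[lid] not in avoid:
            return lid
    raise RuntimeError("no usable path to destination")


table = build_leaf_uplink_table([8, 12])   # two remote ports, base LIDs 8 and 12
print(pick_dlid(8, table))                  # default path via the first spine
print(pick_dlid(8, table, avoid={"spine1"}))  # fail over to the next LID
print([table[lid] for lid in lids_for_port(12)])  # four parallel paths
```

The forwarding tables themselves stay static; failover, contention avoidance, and parallel transfers all come from the source choosing among the destination's LIDs.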

Slide 32: New IB Peripherals
CPUs?
Storage
  – SAN
  – NFS-RDMA
Memory (coherent / non-coherent)
Purpose built Processors?
  – Floating Point Processors
  – Graphics Processors
  – Pattern Matching Hardware
  – XML Processor

Slide 33: THANK YOU! Questions & Answers

