We think you have liked this presentation. If you wish to download it, please recommend it to your friends in any social system. Share buttons are a little bit lower. Thank you!
Presentation is loading. Please wait.
Published byShaun Smock
Modified over 2 years ago
1NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. High-end Routers & Modern Supercomputers Bob Newhall & Dan Lenoski Cisco Systems, Routing Technology Group NORDUnet 2003, Reykjavik – August 2003
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 222 Agenda Traditional Routers and Supercomputers Modern Routers and Supercomputers Comparison of Subsystems Conclusions
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 333 What’s a Router? Traditionally… PCI Bus1 PCI Bus2 PA-6 PA-4 PA-2 PA-5 PA-3 PA-1 I/O Bus PCI Bus0 ROM Flas h NVRAM Con/Aux PB FE PCMCIA-2 CPU Bus PB System Controller System Controller SDRAM (256 MB) SDRAM (256 MB) CPU MIPS CPU MIPS Secondary Cache SRAM Secondary Cache SRAM PCMCIA-1 Architecturally, routers have been like normal computers except: - Mechanical form factors, especially for IO - Embedded forwarding and routing SW
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 444 What’s a Supercomputer? Traditionally… Cray Y-MP 250 Gbyte/sec of interconnect bandwidth Cray Y-MP C90
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 555 Evolution of High-End Routers Increasing bandwidth of external connections: T1 -> DS3 -> OC3 -> OC12 -> OC48 -> OC192 -> OC768 1mbit/sec -> 40 gbit/sec Line speed increases require changes in router architecture to remove the central memory bottleneck and replace with distributed memories and central interconnect fabric
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 666 Evolution of High-End Routers Increased computational power for routing, forwarding and feature processing Larger systems (more line cards) desired by end customers to exploit DWDM capabilities and simplify operation of POPs
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 777 What’s a High-End Router today? Switch Fabric Route Processor(s) Linecards (8-16) T1 to OC-192 Interfaces Distributed Architecture with Crossbar Switch Fabric Multi-Gigabit Switching Capacity
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 888 The next-generation of High-End Routers Switch Fabric Route Processor(s) Linecards (100’s to 1000’s) T1 to OC-768 Interfaces Multi-Terabit Switching Capacity Multi-Chassis, Distributed Architecture with Multi-Stage Switch Fabric
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 999 Evolution of Supercomputers Move from globally clocked, ECL vector processors to distributed-memory uP based multiprocessors 250MHz C90 to 1-2GHz Pentium 4, Alpha, Power3 This architecture change driven by: Complexity and economics of building highest performance processors Commoditization of smaller-scale computers Not driven by programming desires of end-users Note that state-of-the-art processors can generate less than 10Gbit/sec of communication data
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 10 What’s a Supercomputer today? ASCII White at LLNL 8K processors in 512 nodes, 12TFLOPS Interconnect has connection BW of 1TByte/Sec Diagram and photo from LLNL ASCII webpage
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 11 Major components of a Router Distributed Control Plane Used to run routing protocols (= dist. computer) Distributed Data Plane Packet Processing: Examine L2-L7 protocol information (Determine QoS, VPN ID, policy, etc.) Packet Forwarding: Make appropriate routing, switching, and queuing decisions System Interconnect Control Plane – can be combined with data plane or dedicated Data Interconnect – at least sum of external BW required
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 12 Major components of a Supercomputer Distributed Control / Computational nodes Small number of processor nodes (4-16) with local memory Distributed IO Subsystem Typically tied to subset of nodes, but if fully distributed these can be viewed as sync/source of external bandwidth similar to router external connections System interconnect BW driven primarily by data sharing requirements and often limited by CPU’s ability to generate data
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 13 Router – Supercomputer Analogy High-End RouterSupercomputer Route ProcessorsCPU Nodes Line CardsI/O Nodes Switch FabricInterconnection Network
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 14 Route Processors ~ CPU Nodes Route Processors execute routing protocols and maintain routing and forwarding information bases Large networks dictate gigabytes of memory to hold routing and interface database Also require high-peak computation rates to reconverge network topology and download table updates to line cards 1000 MIPs per eight 40Gbit/sec interfaces for control plane CPU nodes in supercomputer run applications and source and sync processor communication traffic 1-2 Gflops and 1000 MIPs per processor 1-2 Gbytes of memory per processor
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 15 Router Line Card ~ SC I/O Node Packet forwarding, classification and feature processing require complex look-ups and queuing decisions be made on a per packet basis Even with HW assist (TCAMs, etc.) approximately 500 instructions per packet At 40Gbps and minimum size packet => 100MPPS Total of 50,000 MIPS / 40Gbps line rate Queuing and TCP/IP congestion semantics imply 200millisec of buffering on ingress and egress .2sec x 40Gbps x 2 = 16Gbits = 2Gbyte / 40Gbps line rate Fragmentation usually typically requires 4x BW queuing 40Gbps => 160Gpbs per queue x 2 (I & E) => 320Gbps
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 16 Table SRAM Fwd/Class TCAMs RTT Buffer Mem (1GB)+ pointer SRAM Distributed Memory Router Line Card Input Queuing Receive Fwd Engine Control CPU Mem Control Linecard Control CPU Fabric Re-Assem. Transmit Fwd Engine Output Queuing L2 Buffering Optics To Fabric From Fabric Framer RTT Buffer Mem (1GB)+ pointer SRAM Table SRAM Fwd/Class TCAMs 512+MB DRAM
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 17 Supercomputer I/O Nodes Disk and network attachment dominate requirements Computational requirements on data typically limits effective throughput 52 nodes of 512 on ASCII-White each with appox. 1-2Gbyte/sec per node of IO BW Data must be moved from IO to local node memory and then IPC’d to other computational nodes Limited by node to interconnect BW limits
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 18 Router Switch Fabric ~ SC Interconnect Network Critical design parameters are: Throughput Traffic Isolation Fault-Tolerance Router switch fabric must have over-speed of fabric BW to line BW to provide traffic isolation and deal with packet fragmentation Minimum 1.5x with at least 2x line rate desirable 60-100Gbps per 40Gbps line rate Depending size of system – topology varies from Crossbar Multistage Network (e.g., Benes, Clos) Must be symmetric – all-to-all (like old-style Supercomputer)
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 19 Supercomputer Interconnect Network Critical parameters are: Throughput Latency (end-to-end) Actual supercomputers interconnects vary substantially, but usually <1Gbyte/sec per processor Topology Varies, but generally exploits locality Hypercube Torus or Mesh Multi-stage networks
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 20 Overall Comparison Feature512 Linecard 40Gpbs/LC Router 512 node, 8K ASCII-White SuperComputer Control MIPS64 GIPS8000 GIPS Data MIPS25600 GIPSN/A Total Memory Storage 1024 Gbytes4096 Gbytes Total Memory Bandwidth 20 Tbyte/sec8 Tbyte/sec Interconnect Bandwidth 4 Tbyte/sec2 Tbyte/sec
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 21 Overall Technology Required Traditionally, networking equipment exploited off- the-shelf silicon, FPGA, standard ASIC technology High-end routers with OC-192 support approaching supercomputers 0.25u and 0.18u ASICs shipped in early 2001 High-end routers with OC-768 support require the leading edge of technology ASICs using 0.13u technology and >1500pin packages Latest memory technology Rambus, FCRAM and RLDRAM, QDR SRAM Power per rack comparable to the 9.5KW for IBM’s SP2
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. 22 Conclusions Explosive data rates and optics capabilities have pushed router technology tremendously in the last decade From embedded single-board computers in the 80’s To distributed-memory computers with specialized forwarding, queuing and feature processing capabilities In nearly every metric of system technology, today’s high-end routers match or exceed the capability of an equivalent supercomputer In addition, high-end routers have a critical requirement of system fault-tolerance Going forward, advances in high-end routers and supercomputers are technology-limited
23NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved. Thank you! Bob Newhall, firstname.lastname@example.org
Router Architectures An overview of router architectures.
CS 4396 Computer Networks Lab Router Architectures.
ECE 526 – Network Processing Systems Design
1 A quick tutorial on IP Router design Optics and Routing Seminar October 10 th, 2000 Nick McKeown
EE 122: Router Design Kevin Lai September 25, 2002.
Router Architecture : Building high-performance routers Ian Pratt
What's inside a router? We have yet to consider the switching function of a router - the actual transfer of datagrams from a router's incoming links to.
Lecture Note on Switch Architectures. Function of Switch.
Application Layer 2-1 Chapter 4 Network Layer Computer Networking: A Top Down Approach 6 th edition Jim Kurose, Keith Ross Addison-Wesley March 2012 A.
Data Center Networks and Basic Switching Technologies Hakim Weatherspoon Assistant Professor, Dept of Computer Science CS 5413: High Performance Systems.
Router Internals CS 4251: Computer Networking II Nick Feamster Fall 2008.
Router Internals CS 4251: Computer Networking II Nick Feamster Spring 2008.
Network layer (addendum) Slides adapted from material by Nick McKeown and Kevin Lai.
4: Network Layer4b-1 Router Architecture Overview Two key router functions: r run routing algorithms/protocol (RIP, OSPF, BGP) r switching datagrams from.
1 BGL Photo (system) BlueGene/L IBM Journal of Research and Development, Vol. 49, No. 2-3.
Network Layer4-1 Chapter 4 Network Layer All material copyright J.F Kurose and K.W. Ross, All Rights Reserved Computer Networking: A Top Down.
Chapter 4 Queuing, Datagrams, and Addressing
048866: Packet Switch Architectures Dr. Isaac Keslassy Electrical Engineering, Technion Introduction.
Cluster Computers. Introduction Cluster computing –Standard PCs or workstations connected by a fast network –Good price/performance ratio –Exploit existing.
Computer Organization & Assembly Language © by DR. M. Amer.
CCNA 2 Week 1 Routers and WANs. Copyright © 2005 University of Bolton Welcome Back! CCNA 2 deals with routed networks You will learn how to configure.
Department of Computer and IT Engineering University of Kurdistan Computer Networks II Router Architecture By: Dr. Alireza Abdollahpouri.
Gigabit Routing on a Software-exposed Tiled-Microprocessor
ECE 526 – Network Processing Systems Design Network Processor Architecture and Scalability Chapter 13,14: D. E. Comer.
Interconnect Networks Basics. Generic parallel/distributed system architecture On-chip interconnects (manycore processor) Off-chip interconnects (clusters.
IP Router Architectures. Outline Basic IP Router Functionalities IP Router Architectures.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 High Speed Router Design Shivkumar Kalyanaraman Rensselaer Polytechnic Institute
© 2006 Cisco Systems, Inc. All rights reserved. MPLS v2.2—1-1 MPLS Concepts Introducing Basic MPLS Concepts.
Univ. of TehranIntroduction to Computer Network1 An Introduction to Computer Networks University of Tehran Dept. of EE and Computer Engineering By: Dr.
Spring 2000CS 4611 Router Construction Outline Switched Fabrics IP Routers Extensible (Active) Routers.
XStream: Rapid Generation of Custom Processors for ASIC Designs Binu Mathew * ASIC: Application Specific Integrated Circuit.
Forwarding1. 2 Key Network-Layer Functions r forwarding: move packets from router’s input to appropriate router output r routing: determine the path taken.
Chassis Architecture Brandon Wagner Office of Information Technology
IBM RS6000/SP Overview Advanced IBM Unix computers series Multiple different configurations Available from entry level to high-end machines. POWER (1,2,3,4)
Jon Turner (and a cast of thousands) Washington University Design of a High Performance Active Router Active Nets PI Meeting - 12/01.
A Scalable, Cache-Based Queue Management Subsystem for Network Processors Sailesh Kumar, Patrick Crowley Dept. of Computer Science and Engineering.
10 - Network Layer. Network layer r transport segment from sending to receiving host r on sending side encapsulates segments into datagrams r on rcving.
Today’s topics Single processors and the Memory Hierarchy
Storage area network and System area network (SAN)
SGI’2000Parallel Programming Tutorial Supercomputers 2 With the acknowledgement of Igor Zacharov and Wolfgang Mertz SGI European Headquarters.
Lecture 13 Parallel Processing. 2 What is Parallel Computing? Traditionally software has been written for serial computation. Parallel computing is the.
1 | © 2016 Infinera Copyright 3D: Future Transport Network Architectures.
Company and Product Overview Company Overview Mission Provide core routing technologies and solutions for next generation carrier networks Founded 1996.
CS 6401 Routing, Routers, Switching Fabrics Outline Link state routing Link weights Router Design / Switching Fabrics.
Spring 2002CS 4611 Router Construction Outline Switched Fabrics IP Routers Tag Switching.
Computer Networks Switching Professor Hui Zhang
CS 268: Lecture 12 (Router Design) Ion Stoica March 18, 2002.
© 2017 SlidePlayer.com Inc. All rights reserved.