Published by Shaun Smock
Modified about 1 year ago
NORDUnet 2003 © 2003, Cisco Systems, Inc. All rights reserved.
High-end Routers & Modern Supercomputers
Bob Newhall & Dan Lenoski
Cisco Systems, Routing Technology Group
NORDUnet 2003, Reykjavik – August 2003
Agenda
- Traditional Routers and Supercomputers
- Modern Routers and Supercomputers
- Comparison of Subsystems
- Conclusions
What’s a Router? Traditionally…
[Block diagram. Labeled components: MIPS CPU, secondary cache SRAM, system controller, SDRAM (256 MB), CPU bus, PB, PCI Bus0, PCI Bus1, PCI Bus2, port adapters PA-1 to PA-6, I/O bus, ROM, Flash, NVRAM, Con/Aux, FE, PCMCIA-1, PCMCIA-2]
Architecturally, routers have been like normal computers except:
- Mechanical form factors, especially for IO
- Embedded forwarding and routing SW
What’s a Supercomputer? Traditionally…
[Photos: Cray Y-MP and Cray Y-MP C90]
- 250 Gbyte/sec of interconnect bandwidth
Evolution of High-End Routers
- Increasing bandwidth of external connections:
  - T1 -> DS3 -> OC3 -> OC12 -> OC48 -> OC192 -> OC768
  - 1 Mbit/sec -> 40 Gbit/sec
- Line speed increases require changes in router architecture: the central memory bottleneck is removed and replaced with distributed memories and a central interconnect fabric
Evolution of High-End Routers
- Increased computational power for routing, forwarding and feature processing
- Larger systems (more line cards) desired by end customers to exploit DWDM capabilities and simplify operation of POPs
What’s a High-End Router today?
[Diagram labels: Switch Fabric, Route Processor(s), Linecards (8-16)]
- T1 to OC-192 Interfaces
- Distributed Architecture with Crossbar Switch Fabric
- Multi-Gigabit Switching Capacity
The Next Generation of High-End Routers
[Diagram labels: Switch Fabric, Route Processor(s), Linecards (100’s to 1000’s)]
- T1 to OC-768 Interfaces
- Multi-Terabit Switching Capacity
- Multi-Chassis, Distributed Architecture with Multi-Stage Switch Fabric
Evolution of Supercomputers
- Move from globally clocked, ECL vector processors to distributed-memory, microprocessor-based multiprocessors
  - 250 MHz C90 to 1-2 GHz Pentium 4, Alpha, Power3
- This architecture change was driven by:
  - Complexity and economics of building the highest-performance processors
  - Commoditization of smaller-scale computers
- It was not driven by the programming desires of end-users
- Note that state-of-the-art processors can generate less than 10 Gbit/sec of communication data
What’s a Supercomputer today?
- ASCI White at LLNL
- 8K processors in 512 nodes, 12 TFLOPS
- Interconnect has connection BW of 1 TByte/sec
- Diagram and photo from LLNL ASCI webpage
Major components of a Router
- Distributed Control Plane
  - Used to run routing protocols (= distributed computer)
- Distributed Data Plane
  - Packet Processing: examine L2-L7 protocol information (determine QoS, VPN ID, policy, etc.)
  - Packet Forwarding: make appropriate routing, switching, and queuing decisions
- System Interconnect
  - Control Plane: can be combined with data plane or dedicated
  - Data Interconnect: at least the sum of external BW required
Major components of a Supercomputer
- Distributed Control / Computational nodes
  - Small number of processor nodes (4-16) with local memory
- Distributed IO Subsystem
  - Typically tied to a subset of nodes, but if fully distributed these can be viewed as a sink/source of external bandwidth, similar to router external connections
- System interconnect
  - BW driven primarily by data sharing requirements and often limited by the CPU’s ability to generate data
Router – Supercomputer Analogy
High-End Router   | Supercomputer
Route Processors  | CPU Nodes
Line Cards        | I/O Nodes
Switch Fabric     | Interconnection Network
Route Processors ~ CPU Nodes
- Route Processors execute routing protocols and maintain routing and forwarding information bases
  - Large networks dictate gigabytes of memory to hold the routing and interface databases
  - They also require high peak computation rates to reconverge the network topology and download table updates to line cards
  - 1000 MIPS per eight 40 Gbit/sec interfaces for the control plane
- CPU nodes in a supercomputer run applications and source and sink processor communication traffic
  - 1-2 Gflops and 1000 MIPS per processor
  - 1-2 Gbytes of memory per processor
Router Line Card ~ SC I/O Node
- Packet forwarding, classification and feature processing require complex look-ups and queuing decisions to be made on a per-packet basis
  - Even with HW assist (TCAMs, etc.), approximately 500 instructions per packet
  - At 40 Gbps and minimum-size packets => 100 MPPS
  - Total of 50,000 MIPS per 40 Gbps line rate
- Queuing and TCP/IP congestion semantics imply 200 millisec of buffering on ingress and egress
  - 0.2 sec x 40 Gbps x 2 = 16 Gbits = 2 Gbyte per 40 Gbps line rate
- Fragmentation typically requires 4x BW for queuing
  - 40 Gbps => 160 Gbps per queue x 2 (ingress & egress) => 320 Gbps
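The line-card arithmetic above can be reproduced with a short calculation. This is only a sketch: the constants are the slide's rounded figures (500 instructions per packet, 100 MPPS, 200 ms of buffering, 4x queuing bandwidth), and the function and key names are hypothetical, chosen for illustration.

```python
# Per-line-card budget for one 40 Gbit/s interface, using the slide's
# rounded figures. All names here are illustrative, not from the talk.
LINE_RATE_BPS = 40e9        # 40 Gbps line rate
INSTR_PER_PACKET = 500      # approximate, even with hardware assist
PACKETS_PER_SEC = 100e6     # slide's figure for minimum-size packets
BUFFER_SECONDS = 0.2        # 200 ms of buffering per direction
QUEUE_BW_FACTOR = 4         # queuing bandwidth multiple due to fragmentation
DIRECTIONS = 2              # ingress and egress


def line_card_budget():
    """Return compute, buffer, and queue-bandwidth needs for one line card."""
    mips = PACKETS_PER_SEC * INSTR_PER_PACKET / 1e6
    buffer_gbits = BUFFER_SECONDS * LINE_RATE_BPS * DIRECTIONS / 1e9
    buffer_gbytes = buffer_gbits / 8
    queue_bw_gbps = LINE_RATE_BPS * QUEUE_BW_FACTOR * DIRECTIONS / 1e9
    return {
        "mips": mips,
        "buffer_gbits": buffer_gbits,
        "buffer_gbytes": buffer_gbytes,
        "queue_bw_gbps": queue_bw_gbps,
    }


if __name__ == "__main__":
    for name, value in line_card_budget().items():
        print(f"{name}: {value:g}")
```

Keeping the inputs as named constants makes it easy to see how each requirement scales if the line rate or the buffering assumption changes.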
Distributed Memory Router Line Card
[Block diagram. Labeled blocks: Optics, Framer, L2 Buffering, Receive Fwd Engine, Input Queuing, To Fabric, From Fabric, Fabric Re-Assem., Output Queuing, Transmit Fwd Engine, Linecard Control CPU, Control CPU Mem (512+ MB DRAM). Shown twice: Table SRAM, Fwd/Class TCAMs, RTT Buffer Mem (1 GB) + pointer SRAM]
Supercomputer I/O Nodes
- Disk and network attachment dominate requirements
- Computational requirements on data typically limit effective throughput
- 52 nodes of 512 on ASCI White, each with approx. 1-2 Gbyte/sec of IO BW
- Data must be moved from IO to local node memory and then IPC’d to other computational nodes
  - Limited by node-to-interconnect BW limits
Router Switch Fabric ~ SC Interconnect Network
- Critical design parameters are:
  - Throughput
  - Traffic Isolation
  - Fault-Tolerance
- Router switch fabric must have over-speed of fabric BW relative to line BW to provide traffic isolation and deal with packet fragmentation
  - Minimum 1.5x, with at least 2x line rate desirable
  - Gbps per 40Gbps line rate
- Depending on the size of the system, topology varies from:
  - Crossbar
  - Multistage Network (e.g., Benes, Clos)
- Must be symmetric: all-to-all (like old-style Supercomputer)
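As a rough illustration of the over-speed requirement, the stated factors can simply be multiplied by the line rate. This is a sketch only; the function name is hypothetical and the results are derived from the 1.5x and 2x factors, not quoted from the talk.

```python
# Fabric bandwidth per line card implied by the stated over-speed factors.
# Derived arithmetic only; names are illustrative.
LINE_RATE_GBPS = 40
MIN_OVERSPEED = 1.5      # stated minimum
DESIRED_OVERSPEED = 2.0  # stated as desirable


def fabric_bw_gbps(line_rate_gbps, overspeed):
    """Fabric bandwidth (Gbps) a line card needs at a given over-speed."""
    return line_rate_gbps * overspeed


if __name__ == "__main__":
    low = fabric_bw_gbps(LINE_RATE_GBPS, MIN_OVERSPEED)
    high = fabric_bw_gbps(LINE_RATE_GBPS, DESIRED_OVERSPEED)
    print(f"{low:g} to {high:g} Gbps of fabric BW per 40 Gbps line card")
```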
Supercomputer Interconnect Network
- Critical parameters are:
  - Throughput
  - Latency (end-to-end)
- Actual supercomputer interconnects vary substantially, but usually provide <1 Gbyte/sec per processor
- Topology varies, but generally exploits locality:
  - Hypercube
  - Torus or Mesh
  - Multi-stage networks
Overall Comparison
Feature                 | 512 Linecard, 40 Gbps/LC Router | 512 node, 8K ASCI White Supercomputer
Control MIPS            | 64 GIPS                         | 8000 GIPS
Data MIPS               | 25600 GIPS                      | N/A
Total Memory Storage    | 1024 Gbytes                     | 4096 Gbytes
Total Memory Bandwidth  | 20 Tbyte/sec                    | 8 Tbyte/sec
Interconnect Bandwidth  | 4 Tbyte/sec                     | 2 Tbyte/sec
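The router column can be approximated by scaling the per-line-card figures from the earlier slides to 512 line cards. This is a sketch under stated assumptions: one 40 Gbps interface per line card, and the interconnect row computed at the 1.5x minimum over-speed (which yields about 3.84 Tbyte/sec, close to the 4 Tbyte/sec shown). The names are hypothetical.

```python
# Scale per-line-card figures to a 512-line-card router.
# Assumptions (not stated in the table): one 40 Gbps interface per line
# card, and fabric bandwidth at the 1.5x minimum over-speed.
LINECARDS = 512
LINE_RATE_GBPS = 40
DATA_MIPS_PER_LC = 50_000        # 500 instr/packet x 100 MPPS
CONTROL_MIPS_PER_8_LC = 1_000    # 1000 MIPS per eight 40 Gbps interfaces
BUFFER_GBYTES_PER_LC = 2         # 200 ms ingress + egress buffering
QUEUE_BW_GBPS_PER_LC = 320       # 4x queuing BW, ingress + egress
FABRIC_OVERSPEED = 1.5           # assumption: stated minimum


def router_totals(n=LINECARDS):
    """Aggregate compute, memory, and bandwidth for n line cards."""
    return {
        "control_gips": n / 8 * CONTROL_MIPS_PER_8_LC / 1000,
        "data_gips": n * DATA_MIPS_PER_LC / 1000,
        "memory_gbytes": n * BUFFER_GBYTES_PER_LC,
        "memory_bw_tbytes_per_sec": n * QUEUE_BW_GBPS_PER_LC / 8 / 1000,
        "fabric_bw_tbytes_per_sec":
            n * LINE_RATE_GBPS * FABRIC_OVERSPEED / 8 / 1000,
    }


if __name__ == "__main__":
    for name, value in router_totals().items():
        print(f"{name}: {value:g}")
```

The computed memory bandwidth (20.48 Tbyte/sec) and fabric bandwidth (3.84 Tbyte/sec) round to the table's 20 and 4 Tbyte/sec.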
Overall Technology Required
- Traditionally, networking equipment exploited off-the-shelf silicon, FPGA, and standard ASIC technology
- High-end routers with OC-192 support are approaching supercomputers
  - 0.25 µm and 0.18 µm ASICs shipped in early 2001
- High-end routers with OC-768 support require the leading edge of technology
  - ASICs using 0.13 µm technology and >1500-pin packages
  - Latest memory technology: Rambus, FCRAM and RLDRAM, QDR SRAM
  - Power per rack comparable to the 9.5 kW for IBM’s SP2
Conclusions
- Explosive data rates and optics capabilities have pushed router technology tremendously in the last decade
  - From embedded single-board computers in the 80’s
  - To distributed-memory computers with specialized forwarding, queuing and feature processing capabilities
- In nearly every metric of system technology, today’s high-end routers match or exceed the capability of an equivalent supercomputer
  - In addition, high-end routers have a critical requirement of system fault-tolerance
- Going forward, advances in high-end routers and supercomputers are technology-limited
Thank you! Bob Newhall,