Academic year 2013 (Heisei 25), second semester, Tuesdays, 2nd period (10:40-12:10), Tsutomu Yoshinaga (UEC)


Network Computing II (ネットワークコンピューティング論Ⅱ, abbreviated NC論2)
Academic year 2013 (Heisei 25), second semester, Tuesdays, 2nd period (10:40-12:10)
Tsutomu Yoshinaga (吉永 努), UEC, yosinaga@is.uec.ac.jp

Contents
Study interconnection networks for distributed and parallel computers, together with the message-routing techniques used on them.
Materials:
  http://comp.is.uec.ac.jp/yoshinagalab/yoshinaga/dp2.html
  http://ceng.usc.edu/smart/presentations/archives/AppendixE.ppt (253 slides, 13 MB)
  http://booksite.mkp.com/9780123838728/references/appendix_f.pdf (118 pages, 2 MB)
TA: 重信 裕政, shige@comp.is.uec.ac.jp

References
T. M. Pinkston and J. Duato: Interconnection Networks, Appendix E in Computer Architecture: A Quantitative Approach, 4th Edition, Morgan Kaufmann (2006); Appendix F in the 5th Edition, Morgan Kaufmann (2011).
J. Duato, S. Yalamanchili, L. Ni: Interconnection Networks: An Engineering Approach, 2nd edition, Morgan Kaufmann (2003).
S. Tomita (富田眞治): Parallel Computers (並列コンピュータ), Shokodo (昭晃堂) (1996), in Japanese.
W. J. Dally, B. Towles: Principles and Practices of Interconnection Networks, Morgan Kaufmann (2003).

What is an interconnection network?
It is a programmable system that transports data between terminals, such as processors and memory.
It is programmable in the sense that it makes different connections at different points in time.
It is a system because it is composed of many components: buffers, channels, switches, and controls that work together to deliver data.

Interconnection Network (1/2)
[Figure: multicomputer, with processors (P) and memories (M)]

Interconnection Network (2/2)
[Figure: processors (P) and memories (M) connected through an interconnection network]
UMA-type shared-memory multiprocessor, also called a dance-hall architecture.

Trend
Interconnection network performance is increasing with processor performance, at a rate of 50% per year.
Communication is a limiting factor in the performance of many modern systems.
Buses have been unable to keep up with the bandwidth demand, and point-to-point interconnection networks are rapidly taking over.

Computer Classifications (%)
           2013/06   2012/06   2011/06
MPP           16.6      18.6      17.4
Cluster       83.4      81.4      82.2
Others         0.0                 0.4
Share of the TOP500, June 2011 to June 2013 (http://www.top500.org/)

Examples of clusters
Tianhe-2 (天河2号), China, 2013
  Processors: Intel Xeon E5-2692 12C 2.2 GHz ×2 ×16K
  Accelerator: Xeon Phi 31S1P (57 cores) ×3 ×16K
  Interconnect: TH Express-2 (proprietary), fat tree
Tsubame 2.5, Tokyo Tech.
  Processors: Xeon X5670 2.93 GHz ×2 ×1,408
  Accelerator: NVIDIA Kepler K20x ×3 ×1,408
  Interconnect: InfiniBand QDR (40 Gbps) ×2

Examples of MPPs
K computer @RIKEN (Fujitsu, 2011)
  Node: SPARC64 VIIIfx 2 GHz (16 GFlops × 8 cores)
  Topology: 6D mesh/3D torus, Tofu interconnect
  #cores: 80K nodes × 8 cores = 640K cores
  Rmax: 10.51 PFlops, 7,890 kW
Titan @ORNL (Cray XK7, 2012)
  Node: AMD Opteron 16C 2.2 GHz + NVIDIA K20x
  Topology: Gemini interconnect
  #cores: 18,688 nodes (200 cabinets)
  Rmax: 27.11 PFlops, 8,209 kW

Other Networks of Supercomputers
Sequoia (2011): 5D torus, proprietary IBM SeaStar
Pleiades / NASA (2011): partial 11D hypercube topology with IB QDR/DDR
Red Sky / Sandia National Lab. (2010): 3D torus (12 bristled nodes) with IB QDR switches
IBM Roadrunner (2009): fat tree with IB DDR
Earth Simulator 2 / NEC SX-9E (2009): fat tree (64 GB/s per CPU, 8 CPUs/node, 160 nodes)
IBM Blue Gene/L (2004): proprietary 3D torus (64 x 32 x 32 = 64K nodes)

Architecture vs. software
Architecture   Memory                     Programming
UMA (SMP)      shared                     OpenMP
NUMA (MPP)     distributed (not shared)   MPI (Message Passing Interface)
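
As a rough analogy only, the two programming styles can be contrasted in plain Python. Threads updating one variable stand in for the shared-memory (OpenMP) style, and workers that send partial results through a queue stand in for the message-passing (MPI) style. Neither OpenMP nor MPI is actually used here, and the function names `sum_shared` and `sum_message_passing` are hypothetical.

```python
import queue
import threading


def sum_shared(data, n_workers=4):
    """Shared-memory style: every worker updates one variable visible to all."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # shared data must be protected by synchronization
            total += partial

    threads = [threading.Thread(target=worker, args=(data[i::n_workers],))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_message_passing(data, n_workers=4):
    """Message-passing style: workers own their data and send results explicitly."""
    inbox = queue.Queue()  # stands in for the network between nodes

    def worker(chunk):
        inbox.put(sum(chunk))  # explicit "send" of the partial result

    threads = [threading.Thread(target=worker, args=(data[i::n_workers],))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # One explicit "receive" per worker.
    return sum(inbox.get() for _ in range(n_workers))
```

Both functions return the same result; the difference is whether data is exchanged through a common variable or through explicit messages.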

Network Design (1/3)
Performance: latency and throughput (bandwidth)
Scalability: number of processors vs. network, memory, and I/O bandwidth
Incremental expandability: from a small system up to the maximum size
Partitionability: the network may be partitioned among several users

Network Design (2/3)
Simplicity: simple design, higher clock frequency, easy to use
Distance span: a physically smaller system is preferred, to limit noise, cable delay, etc.
Physical constraints: packaging (pin count), wiring (wire length), and maintenance (power consumption) should meet physical limitations

Network Design (3/3)
Reliability: fault tolerance, reliable communication, hot swap
Expected workload: robust performance over a wide range of traffic conditions
Cost: trade-offs between cost and performance

Classification of Interconnection Networks
Shared-medium networks
  Local area networks (Ethernet, token ring)
  Backplane bus (e.g. SUN Gigaplane)
Direct networks (router-based)
  Mesh, torus, hypercube, tree, etc.
Indirect networks (switch-based)
Hybrid networks

Shared-Medium Networks (LAN)
Arbitration is needed to decide which node becomes master of the shared medium and so resolve conflicting access requests.
The best-known protocol is carrier-sense multiple access with collision detection (CSMA/CD).
In token bus and token ring networks, a token is passed from node to node and only its current owner may access the bus or ring, which removes the nondeterministic waiting time.
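
To illustrate why token passing gives a bounded waiting time, here is a minimal sketch under simplifying assumptions that are not from the slide: each node may hold the token for at most `max_hold_time`, and forwarding the token to the next node takes `hop_time`. The function name and both parameters are hypothetical.

```python
def worst_case_token_wait(n_nodes, max_hold_time, hop_time=0.0):
    """Upper bound on how long a node waits for the token in a ring.

    Worst case: the node has just released the token, so the token must
    travel all n_nodes hops around the ring, and each of the other
    n_nodes - 1 nodes keeps it for the full max_hold_time.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return (n_nodes - 1) * max_hold_time + n_nodes * hop_time


# Example: 8 nodes, each holding the token for at most 1.0 time unit.
print(worst_case_token_wait(8, 1.0))       # 7.0
print(worst_case_token_wait(8, 1.0, 0.5))  # 11.0
```

With CSMA/CD there is no such fixed bound, because repeated collisions can delay a sender an unpredictable number of times.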

Shared-Medium Networks (Backplane bus)
A backplane bus is commonly used to interconnect processor(s) and memory modules in an SMP (symmetric multiprocessor) architecture.
It is realized by printed lines on a circuit board or by discrete wiring.
Example: Gigaplane in the SUN Enterprise x000 server (1996): 2.6 GB/s, 256-bit data, 42-bit address, 83.8 MHz clock.
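
The Gigaplane bandwidth figure can be roughly reproduced from the data width and the clock rate. The sketch below assumes one data transfer per clock cycle, which is an assumption and not stated on the slide; the function name is hypothetical.

```python
def peak_bus_bandwidth_gb_per_s(data_width_bits, clock_mhz):
    """Peak bandwidth in GB/s (10**9 bytes/s), assuming one transfer per clock."""
    bytes_per_transfer = data_width_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 / 1e9


# Gigaplane: 256-bit data path, 83.8 MHz clock.
gigaplane = peak_bus_bandwidth_gb_per_s(256, 83.8)
print(f"{gigaplane:.2f} GB/s")  # prints "2.68 GB/s"
```

That is 32 bytes × 83.8 MHz ≈ 2.68 GB/s, close to the 2.6 GB/s quoted for the bus.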

Direct (static) Networks
A direct network consists of a set of nodes, each directly connected to a subset of the other nodes in the network.
Examples:
  2D mesh (Intel Paragon), 3D mesh (MIT J-Machine)
  2D torus (Fujitsu AP3000), 3D torus (Cray T3D, T3E)
  Hypercube (CM1, CM2, nCUBE)

Mesh topology
[Figure: 2D and 3D meshes of nodes]

Torus topology
[Figure: 2D torus (4-ary 2-cube) and 3D torus (3-ary 3-cube)]
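
The difference between a mesh and a torus (k-ary n-cube) is only the wraparound links at the edges. A minimal sketch, assuming nodes are addressed by a tuple of n coordinates in the range 0 to k-1 (the function name and this addressing are illustrative choices):

```python
def neighbors(node, k, wrap):
    """Neighbors of `node` in a k-ary n-dimensional mesh or torus.

    node: tuple of n coordinates, each in range(k).
    wrap: True for a torus (edges wrap around), False for a mesh.
    Assumes k >= 3 when wrap is True, so that the two neighbors in
    each dimension are distinct.
    """
    result = []
    for dim, coord in enumerate(node):
        for step in (-1, 1):
            c = coord + step
            if wrap:
                c %= k              # wraparound link of the torus
            elif not 0 <= c < k:
                continue            # a mesh has no link past the edge
            result.append(node[:dim] + (c,) + node[dim + 1:])
    return result


# Corner node of a 4 x 4 network:
print(neighbors((0, 0), 4, wrap=False))  # mesh: [(1, 0), (0, 1)]
print(neighbors((0, 0), 4, wrap=True))   # torus: [(3, 0), (1, 0), (0, 3), (0, 1)]
```

In the torus every node has the same degree (2n), while mesh nodes on the boundary have fewer links.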

Hypercube (binary n-cube)
[Figure: 4D hypercube (2-ary 4-cube)]
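
In a binary n-cube, nodes can be numbered with n-bit addresses so that two nodes are linked exactly when their addresses differ in one bit. A short sketch of this standard addressing (the function names are illustrative):

```python
def hypercube_neighbors(node, n):
    """Neighbors of `node` in a binary n-cube: flip each of the n address bits."""
    return [node ^ (1 << bit) for bit in range(n)]


def hypercube_distance(a, b):
    """Minimum hop count between a and b: the number of differing address bits."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0b0000, 4))      # [1, 2, 4, 8]
print(hypercube_distance(0b0000, 0b1111))  # 4
```

The second call shows that the farthest pair in a 4D hypercube is 4 hops apart, one hop per dimension.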

Tree
[Figure: X-tree, binary tree, and fat tree]

Hierarchical topology (1/2)
[Figure: pyramid (hierarchical 2D mesh) and hierarchical ring]

Hierarchical topology (2/2)
[Figure: cube-connected cycles (CCC) and RDT (Recursive Diagonal Torus)]
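
Cube-connected cycles replace each vertex of an n-dimensional hypercube with a cycle of n nodes, so that every node needs only three links. A minimal sketch of the construction, assuming a node is addressed as a pair (i, v) of cycle position and hypercube vertex (this addressing and the function names are illustrative; RDT is not covered here):

```python
def ccc_nodes(n):
    """All nodes (i, v) of the CCC built on an n-dimensional hypercube."""
    return [(i, v) for v in range(2 ** n) for i in range(n)]


def ccc_neighbors(node, n):
    """The three neighbors of a CCC node, assuming n >= 3.

    Two links follow the local cycle; the third crosses hypercube
    dimension i to the node at the same position in the adjacent cycle.
    """
    i, v = node
    return [((i - 1) % n, v), ((i + 1) % n, v), (i, v ^ (1 << i))]


print(len(ccc_nodes(3)))         # 24 nodes, i.e. n * 2**n
print(ccc_neighbors((0, 0), 3))  # [(2, 0), (1, 0), (0, 1)]
```

The degree stays at 3 no matter how large n becomes, in contrast to the hypercube, whose degree grows with n.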

Hypermesh (spanning-bus hypercube)
[Figure: hypermesh with single or multiple buses]

Base-m n-cube (hyper-crossbar)
[Figure: base-8 3-cube (Toshiba Prodigy) built from 8x8 crossbars; corner nodes labeled 000, 007, 070, 077, 707, 770, 777]
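
The node labels in the figure (000, 007, 770, ...) can be read as three base-8 digits. The sketch below assumes that all nodes along one dimension, that is, nodes whose addresses differ in exactly one digit, share a single m x m crossbar and can therefore reach each other in one hop. The function names are illustrative.

```python
def hyper_crossbar_neighbors(node, m):
    """Nodes reachable in one hop in a base-m n-cube.

    node: tuple of n digits, each in range(m).
    A neighbor differs from `node` in exactly one digit.
    """
    result = []
    for dim, digit in enumerate(node):
        for other in range(m):
            if other != digit:
                result.append(node[:dim] + (other,) + node[dim + 1:])
    return result


def hyper_crossbar_distance(a, b):
    """Minimum hop count: the number of digit positions in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


origin = (0, 0, 0)
print(len(hyper_crossbar_neighbors(origin, 8)))    # 21 = 3 dimensions * 7 nodes
print(hyper_crossbar_distance(origin, (7, 7, 7)))  # 3
```

Under this assumption a base-8 3-cube has 8**3 = 512 nodes, and any pair is at most 3 hops apart.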

Diameter and degrees (1/2)
Topology        #nodes     Diameter   Degree
2D mesh         N          2√N        4
2D torus        N          √N         4
3D torus        N          3∛N/2      6
binary n-cube   N = 2^n    log N      log N
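
The diameters of small instances of these topologies can be checked by brute force. The sketch below runs a breadth-first search from every node and reports the longest shortest path; the helper names and the choice of N = 64 are illustrative.

```python
from collections import deque
from itertools import product


def diameter(nodes, neighbors):
    """Longest shortest path in the network, by BFS from every node."""
    best = 0
    for start in nodes:
        dist = {start: 0}
        frontier = deque([start])
        while frontier:
            u = frontier.popleft()
            for v in neighbors(u):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    frontier.append(v)
        best = max(best, max(dist.values()))
    return best


def grid_neighbors(k, wrap):
    """Neighbor function for a k x k mesh (wrap=False) or torus (wrap=True)."""
    def fn(node):
        out = []
        for dim in range(2):
            for step in (-1, 1):
                c = node[dim] + step
                if wrap:
                    c %= k
                elif not 0 <= c < k:
                    continue
                out.append(node[:dim] + (c,) + node[dim + 1:])
        return out
    return fn


k = 8  # N = k * k = 64 nodes
grid = list(product(range(k), repeat=2))
mesh_d = diameter(grid, grid_neighbors(k, wrap=False))
torus_d = diameter(grid, grid_neighbors(k, wrap=True))
cube_d = diameter(range(64), lambda v: [v ^ (1 << b) for b in range(6)])

print(mesh_d, torus_d, cube_d)  # 14 8 6
```

For N = 64 the torus and hypercube match √N = 8 and log2 N = 6 exactly, while the mesh gives 2(√N - 1) = 14, which 2√N approximates for large N.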

Diameter and degrees (2/2)
Topology        #nodes      Diameter   Degree
Base-m n-cube   N = m^n     log_m N
CCC             N = n·2^n   3n/2       3
Binary tree     N           2 log N    3
Ring            N           N/2        2
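
The ring and binary-tree diameters can be checked on small instances without building the graph. The sketch assumes a complete binary tree whose nodes are numbered heap-style (root 1, children of node i are 2i and 2i+1); that numbering and the function names are illustrative choices.

```python
def ring_distance(a, b, n):
    """Minimum hops between nodes a and b on a bidirectional ring of n nodes."""
    d = abs(a - b) % n
    return min(d, n - d)


def tree_distance(a, b):
    """Hops between nodes a and b of a complete binary tree (heap numbering).

    The node with the larger index is never closer to the root, so moving
    it to its parent (index // 2) until both meet counts the path length.
    """
    hops = 0
    while a != b:
        if a > b:
            a //= 2
        else:
            b //= 2
        hops += 1
    return hops


n = 16
ring_diameter = max(ring_distance(0, b, n) for b in range(n))

levels = 4                       # complete tree with 2**4 - 1 = 15 nodes
tree_nodes = range(1, 2 ** levels)
tree_diameter = max(tree_distance(a, b) for a in tree_nodes for b in tree_nodes)

print(ring_diameter, tree_diameter)  # 8 6
```

The ring gives N/2 = 8. The tree gives 2(levels - 1) = 6, the leaf-to-leaf path through the root, which is roughly 2 log2 N for large N.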