Heisei 25 (2013) academic year, second semester, Tuesday 2nd period (10:40-12:10), Tsutomu Yoshinaga (UEC)

Presentation transcript:

Network Computing II (ネットワークコンピューティング論Ⅱ, ＮＣ論２)
Heisei 25 (2013) academic year, second semester, Tuesday 2nd period (10:40-12:10)
Tsutomu Yoshinaga (UEC), yosinaga@is.uec.ac.jp

Contents
Study interconnection networks for distributed and parallel computers, and the message-routing techniques used on them.
Materials
http://comp.is.uec.ac.jp/yoshinagalab/yoshinaga/dp2.html
http://ceng.usc.edu/smart/presentations/archives/AppendixE.ppt (253 slides, 13MB)
http://booksite.mkp.com/9780123838728/references/appendix_f.pdf (118 pages, 2MB)
TA: 重信裕政 (shige@comp.is.uec.ac.jp)

References
T. M. Pinkston and J. Duato: Interconnection Networks, Appendix E in Computer Architecture: A Quantitative Approach, 4th Edition, Morgan Kaufmann Publishers (2006); 5th Edition, Morgan Kaufmann Publishers (2011).
J. Duato, S. Yalamanchili, L. Ni: Interconnection Networks: An Engineering Approach, 2nd edition, Morgan Kaufmann Publishers (2003).
S. Tomita (富田眞治): Parallel Computers (並列コンピュータ), Shokodo (昭晃堂) (1996).
W. D. Dally, B. Towles: Principles and Practices of Interconnection Networks, Morgan Kaufmann Publishers (2003).

What is an Interconnection Network?
It is a programmable system that transports data between terminals, such as processors and memory.
It is programmable in the sense that it makes different connections at different points in time.
It is a system because it is composed of many components (buffers, channels, switches, and controls) that work together to deliver data.

Interconnection Network (1/2)
Multicomputer
[Figure: nodes labeled P (processor) and M (memory)]

Interconnection Network (2/2)
UMA-type shared-memory multiprocessor
[Figure: processors (P) on one side of the interconnection network, memories (M) on the other]
It is also called the dance-hall architecture.

Trend
The demand on interconnection network performance grows with processor performance, at a rate of 50% per year.
Communication is a limiting factor in the performance of many modern systems.
Buses have been unable to keep up with the bandwidth demand, and point-to-point interconnection networks are rapidly taking over.

Computer Classifications (%)
           2013/06   2012/06   2011/06
MPP          16.6      18.6      17.4
Cluster      83.4      81.4      82.2
Others        0.0       0.0       0.4
Share of the TOP500, June 2011 to June 2013 (http://www.top500.org/)

Examples of clusters
Tianhe-2 (天河2号), China, 2013
  Processors: Intel Xeon E5-2692 12C 2.2 GHz ×2 ×16K
  Accelerator: Xeon Phi 31S1P (57 cores) ×3 ×16K
  Interconnect: TH Express-2 (proprietary), fat tree
Tsubame 2.5, Tokyo Tech.
  Processors: Xeon X5670 2.93 GHz ×2 ×1,408
  Accelerator: NVIDIA Kepler K20x ×3 ×1,408
  Interconnect: InfiniBand QDR (40 Gbps) ×2

Examples of MPPs
K computer @RIKEN (Fujitsu, 2011)
  Node: SPARC64 VIIIfx 2 GHz (16 GFlops × 8 cores)
  Topology: 6D mesh/3D torus (Tofu interconnect)
  #core: 80K nodes × 8 cores = 640K cores
  Rmax: 10.51 PFlops, 7,890 kW
Titan @ORNL (Cray XK7, 2012)
  Node: AMD Opteron 16C 2.2 GHz + NVIDIA K20x
  Topology: Gemini interconnect
  #core: 18,688 nodes (200 cabinets)
  Rmax: 27.11 PFlops (this figure is Rpeak), 8,209 kW

Other Networks of Supercomputers
Sequoia (2011): 5D torus, proprietary IBM SeaStar
Pleiades / NASA (2011): partial 11D hypercube topology with IB QDR/DDR
Red Sky / Sandia National Lab. (2010): 3D torus (12 bristled node) with IB QDR switches
IBM Roadrunner (2009): fat tree with IB DDR
Earth Simulator 2 / NEC SX-9E (2009): fat tree (64GB/s/cpu, 8-CPU/node, 160 nodes)
IBM Blue Gene/L (2004): 3D torus, proprietary (64 x 32 x 32 = 64K nodes)

Architecture vs. software
Architecture   Memory                     Programming
UMA (SMP)      shared                     OpenMP
NUMA (MPP)     distributed (not shared)   MPI (Message Passing Interface)

Network Design (1/3)
Performance: latency and throughput (bandwidth).
Scalability: network, memory, and I/O bandwidth should scale with the number of processors.
Incremental expandability: the system can grow from a small configuration to its maximum size.
Partitionability: the network may be partitioned among several users.

Network Design (2/3)
Simplicity: a simple design allows a higher clock frequency and is easy to use.
Distance span: a physically smaller system is preferred because of noise, cable delay, etc.
Physical constraints: packaging (pin count), wiring (wire length), and maintenance (power consumption) must stay within physical limits.

Network Design (3/3)
Reliability: fault tolerance, reliable communication, hot swap.
Expected workload: robust performance over a wide range of traffic conditions.
Cost: trade-offs between cost and performance.

Classification of Interconnection Networks
Shared-medium networks
  Local area networks (Ethernet, token ring)
  Backplane bus (e.g. SUN Gigaplane)
Direct networks (router-based)
  mesh, torus, hypercube, tree, etc.
Indirect networks (switch-based)
Hybrid networks

Shared-Medium Networks (LAN)
Arbitration is needed to determine which node holds mastership of the shared medium and so resolve conflicting network accesses.
The best-known protocol is carrier-sense multiple access with collision detection (CSMA/CD).
Token bus and token ring pass a token from owner to owner; only the current owner has the right to access the bus/ring, which resolves the nondeterministic waiting time.
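
The bounded waiting time of token passing can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not on the slide: every station always has something to send, holds the token for a fixed `hold_time`, and then passes it to the next station. The names `simulate_token_ring` and `hold_time` are hypothetical, and the model does not follow any real token-ring standard.

```python
def simulate_token_ring(num_stations, hold_time, rounds):
    """Return the longest interval between two token arrivals at any station.

    Assumed (hypothetical) model: each station holds the token for exactly
    `hold_time` time units, then hands it to the next station on the ring.
    """
    last_arrival = [None] * num_stations
    worst_interval = 0
    clock = 0
    owner = 0
    for _ in range(rounds * num_stations):
        if last_arrival[owner] is not None:
            worst_interval = max(worst_interval, clock - last_arrival[owner])
        last_arrival[owner] = clock
        clock += hold_time                    # the token owner uses the medium
        owner = (owner + 1) % num_stations    # token moves to the next station
    return worst_interval


# Under these assumptions the interval is always num_stations * hold_time.
worst = simulate_token_ring(num_stations=8, hold_time=5, rounds=3)
```

In this model no station ever waits longer than one full token rotation, in contrast to CSMA/CD, where a station can collide repeatedly and has no fixed upper bound on its wait.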

Shared-Medium Networks (Backplane bus)
A backplane bus is commonly used to interconnect processor(s) and memory modules to provide an SMP (symmetric multiprocessor) architecture.
It is realized by printed lines on a circuit board or by discrete wiring.
Gigaplane in SUN Enterprise x000 server (1996): 2.6GB/s, 256 bits data, 42 bits address, 83.8MHz clock.
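
As a rough cross-check of the Gigaplane numbers, the peak bandwidth can be estimated from the data width and the clock rate. The sketch below assumes one full-width data transfer per clock cycle, which the slide does not state; `peak_bus_bandwidth` is a hypothetical helper name.

```python
def peak_bus_bandwidth(data_width_bits, clock_hz):
    """Peak bytes per second, assuming one data word is moved every clock."""
    return data_width_bits / 8 * clock_hz


# Gigaplane figures from the slide: 256-bit data path, 83.8 MHz clock.
gigaplane = peak_bus_bandwidth(256, 83.8e6)
print(f"{gigaplane / 1e9:.2f} GB/s")
```

Under that assumption the result is about 2.68 GB/s, which is consistent with the 2.6GB/s quoted on the slide.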

Direct (static) Networks
A direct network consists of a set of nodes, each directly connected to a subset of the other nodes in the network.
Examples:
2D mesh (Intel Paragon), 3D mesh (MIT J-Machine)
2D torus (Fujitsu AP3000), 3D torus (Cray T3D, T3E)
Hypercube (CM-1, CM-2, nCUBE)

Mesh topology
[Figure: 2D mesh and 3D mesh; each vertex is a node]

Torus topology
[Figure: 2D torus (4-ary 2-cube) and 3D torus (3-ary 3-cube)]
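
The k-ary n-cube naming can be made concrete with a neighbor function: a node is a tuple of n coordinates, each in the range 0 to k-1, and every dimension contributes a +1 and a -1 link with wraparound. This is a minimal sketch; `torus_neighbors` is a hypothetical name, not taken from the lecture.

```python
def torus_neighbors(node, k):
    """Neighbors of `node` (a tuple of n coordinates) in a k-ary n-cube."""
    result = set()
    for dim in range(len(node)):
        for step in (-1, 1):
            coords = list(node)
            coords[dim] = (coords[dim] + step) % k   # wraparound link
            result.add(tuple(coords))
    result.discard(node)   # only matters for the degenerate case k = 1
    return result


# 2D torus (4-ary 2-cube): four neighbors per node.
corner_2d = torus_neighbors((0, 0), 4)
# 3D torus (3-ary 3-cube): six neighbors per node.
corner_3d = torus_neighbors((0, 0, 0), 3)
```

Using a set means the k = 2 case, where the +1 and -1 links reach the same node, yields one link per dimension.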

Hypercube (binary n-cube)
[Figure: 4D hypercube (2-ary 4-cube)]
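
In a binary n-cube, node addresses are n-bit numbers and two nodes are linked when their addresses differ in exactly one bit. The following sketch shows this with bit operations; the function names are hypothetical.

```python
def hypercube_neighbors(node, n):
    """Neighbors of `node` in a binary n-cube: flip each of the n address bits."""
    return [node ^ (1 << bit) for bit in range(n)]


def hypercube_distance(a, b):
    """Hop count between two nodes: the number of differing address bits."""
    return bin(a ^ b).count("1")


neighbors_of_0 = hypercube_neighbors(0b0000, 4)       # 4D hypercube
farthest = hypercube_distance(0b0000, 0b1111)         # opposite corners
```

In the 4D case each node has four neighbors, and the two opposite corners are four hops apart.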

Tree
[Figure: x-tree, binary tree, and fat tree]

Hierarchical topology (1/2)
[Figure: pyramid (hierarchical 2D mesh) and hierarchical ring]

Hierarchical topology (2/2)
[Figure: cube-connected cycles and RDT (Recursive Diagonal Torus)]

Hypermesh (spanning-bus hypercube)
[Figure: nodes connected by single or multiple buses]

Base-m n-cube (hyper-crossbar)
[Figure: base-8 3-cube (Toshiba Prodigy) built from 8x8 crossbars; node addresses shown include 000, 007, 070, 077, 707, 770, 777]
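
The node addresses in the base-8 3-cube are three octal digits, one per dimension. Assuming that one crossbar traversal can change one digit to any other value (the usual reading of a hyper-crossbar, not spelled out on the slide), a message needs at most one hop per differing digit. The sketch below uses hypothetical helper names and a fixed dimension order chosen only for illustration.

```python
def to_digits(address, m, n):
    """Split an integer address into n base-m digits, most significant first."""
    digits = []
    for _ in range(n):
        digits.append(address % m)
        address //= m
    return digits[::-1]


def hyper_crossbar_route(src, dst):
    """Addresses visited when one differing digit is corrected per hop."""
    path = [tuple(src)]
    current = list(src)
    for dim in range(len(src)):
        if current[dim] != dst[dim]:
            current[dim] = dst[dim]   # one crossbar traversal in this dimension
            path.append(tuple(current))
    return path


src = to_digits(0o070, 8, 3)    # node 070
dst = to_digits(0o707, 8, 3)    # node 707
route = hyper_crossbar_route(src, dst)
```

Here all three digits differ, so the route takes three hops: 070, 770, 700, 707.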

Diameter and degrees (1/2)
            2D mesh   2D torus   3D torus       binary n-cube
#node       N         N          N              N = 2^n
Diameter    2√N       √N         (3/2)N^(1/3)   log N
degree      4         4          6              log N
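
The diameter entries can be checked on small instances by running a breadth-first search from every node. The sketch below is an illustration with hypothetical helper names (`diameter`, `grid`); the sizes are chosen arbitrarily. Note that the exact diameter of a √N × √N mesh is 2(√N - 1), which the table rounds to 2√N.

```python
from collections import deque
from itertools import product


def diameter(nodes, neighbors):
    """Largest shortest-path hop count over all node pairs (BFS from each node)."""
    worst = 0
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbors(u):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


def grid(k, n, wrap):
    """Nodes and neighbor function of a k-ary n-dimensional mesh or torus."""
    nodes = list(product(range(k), repeat=n))

    def neighbors(u):
        out = set()
        for d in range(n):
            for step in (-1, 1):
                c = u[d] + step
                if wrap:
                    c %= k                      # torus: wraparound link
                if 0 <= c < k:                  # mesh: no link past the edge
                    out.add(u[:d] + (c,) + u[d + 1:])
        out.discard(u)
        return out

    return nodes, neighbors


mesh_2d = diameter(*grid(4, 2, wrap=False))   # N = 16, 2(sqrt(N) - 1) = 6
torus_2d = diameter(*grid(4, 2, wrap=True))   # N = 16, sqrt(N) = 4
torus_3d = diameter(*grid(4, 3, wrap=True))   # N = 64, (3/2) * 4 = 6
cube_4d = diameter(*grid(2, 4, wrap=True))    # binary 4-cube, log2(16) = 4
```

A binary n-cube is treated here as a 2-ary n-dimensional torus, which gives the same graph.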

Diameter and degrees (2/2)
            Base-m n-cube   CCC           Binary tree   ring
#node       N = m^n         N = n·2^n     N             N
Diameter    log_m N         about 5n/2    2 log N       N/2
degree      n               3             3             2
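
The ring and binary-tree entries can be verified on small examples with a breadth-first search. This is a sketch with hypothetical helper names; the tree is assumed to be a complete binary tree numbered like a heap (node i has parent i // 2 and children 2i and 2i + 1). For such a tree with N = 2^h - 1 nodes the exact diameter is 2(h - 1), which the table approximates as 2 log N.

```python
from collections import deque


def diameter_and_degree(nodes, neighbors):
    """Return (diameter, maximum degree) of a connected graph using BFS."""
    nodes = list(nodes)
    worst = 0
    for start in nodes:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbors(u):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    degree = max(len(set(neighbors(u))) for u in nodes)
    return worst, degree


def ring(n):
    """Nodes 0..n-1, each linked to its two ring neighbors."""
    return range(n), lambda u: [(u - 1) % n, (u + 1) % n]


def binary_tree(n):
    """Complete binary tree with heap numbering 1..n (assumed layout)."""
    return range(1, n + 1), lambda u: [
        v for v in (u // 2, 2 * u, 2 * u + 1) if 1 <= v <= n
    ]


ring_result = diameter_and_degree(*ring(16))          # N/2 = 8 hops, degree 2
tree_result = diameter_and_degree(*binary_tree(15))   # leaf to leaf via root
```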