IBM RS/6000 SP Overview


IBM RS/6000 SP Overview
- Advanced IBM Unix computer series
- Multiple configurations available, from entry-level to high-end machines
- POWER (1, 2, 3, 4) processors (current high-end configurations use POWER3)
- Architecture: MIMD, with distributed memory between nodes and shared memory within each SMP node
- Operating system: AIX, a 64-bit Unix system; each node runs its own copy of the operating system

Overview
- Distributed-memory, multi-node server designed for demanding technical and commercial workloads
- Versatile system running serial, symmetric multiprocessor (SMP) and parallel workloads, all managed from a central point of control
- Flexible configurability:
 - node types (thin, wide, high)
 - up to 512 nodes per system (by special order)

IBM POWER3 processor Block Diagram

Node architectures
Three kinds of node architecture: thin, wide and high nodes. Currently the most commonly used is the SP POWER3 SMP high node architecture:
- up to 16 POWER3 processors per node, with up to 64 GB of memory
- Scalability: same technology, system of nodes

High node architecture
- Up to 4 processor cards, each with up to 4 processors
- Node Controller chips: 4 GB/s of bandwidth per processor, 16 GB/s of bandwidth to the Active Backplane Planar
- Memory and I/O functions also have 16 GB/s of bandwidth
- Inside the node: tree topology

Node architecture- Processor-to-memory connection

Communication network
- The SP Switch is used to interconnect nodes
- Two basic components:
 - communications adapter (node-to-switch-board connection)
 - switch board
- Two generations: SP Switch and SP Switch2 (used on high nodes)

Communication network
- SP Switch2 is used to connect nodes into a supercomputer
- The Communication Subsystem (CSS) consists of hardware and software: the communication path, monitoring of the switch hardware, control of the network, and error detection and recovery actions
- Multistage switching technology

SP Switch
- 16 + 16 = 32 links per switch board (16 for nodes, 16 for other switches)
- For large networks, switch boards have to be connected together
- An 8-node switch board is available when no more than 8 nodes are needed

SP Switch2 connection
- 2-80 nodes: at most 5 switch boards in a star topology (data passes through at most 2 switch boards)
- Larger systems: at least 6 switch boards, so a star topology is no longer possible; additional boards are used as intermediate switch boards (ISBs)
- Largest systems: 2 frames of switch boards (32 node switch boards (NSBs), each serving 16 nodes, support 512 nodes, interconnected by 16 ISBs)

The IBM SP switch board

Parallel programming on the RS/6000 SP
- Recommended choices for writing parallel programs: MPI and OpenMP
- If high performance is desired and code portability is not an issue, the Low-level Application Programming Interface (LAPI) can be used
- PVM and the data-parallel language HPF are not recommended for program development because of problems with portability and performance
- The natural programming model is message passing (within a node, shared-memory programming is also possible)

Example system
Blackforest at NCAR (National Center for Atmospheric Research): a cluster system with hundreds of 4-processor nodes running AIX.

Example system: hardware
- 293 WinterHawk II RS/6000 nodes for batch jobs
- 4 identical WinterHawk II nodes dedicated to interactive login sessions
- 2 NightHawk II RS/6000 nodes
- a NightHawk II RS/6000 node dedicated to data analysis
- spare WinterHawk II nodes
- L1 cache: 32 KB, 128-way instruction cache and 64 KB, 128-way data cache per processor
- L2 cache: 8 MB unified instruction and data cache per processor

Example system (continued)
- WinterHawk II memory: 2 GB per node (512 MB per processor), 586 GB of distributed memory across the WinterHawk II compute nodes
- NightHawk II memory: 24 GB per node, 1.5 GB per processor
- Disk capacity: 13 TB total
- Clock speed: 375 MHz
- HiPPI connection to the Mass Storage System, plus 100BaseT and Gigabit Ethernet network connections