BLUE GENE Sunitha M. Jenarius. What is Blue Gene A massively parallel supercomputer using tens of thousands of embedded PowerPC processors supporting.

Presentation transcript:

BLUE GENE Sunitha M. Jenarius

What is Blue Gene?
A massively parallel supercomputer using tens of thousands of embedded PowerPC processors
Supports a large memory space
Uses standard compilers and a message-passing environment

Why the name “Blue Gene”?
“Blue”: the corporate color of IBM
“Gene”: the intended use of the Blue Gene clusters
– Computational biology, specifically protein folding

History
Dec ’99: IBM Research announced a US $100M effort to build a petaflop-scale supercomputer
Two goals of the Blue Gene project:
– Massively parallel machine architecture and software
– Biomolecular simulation: advance the field by orders of magnitude
November 2001: partnership with Lawrence Livermore National Laboratory (LLNL), which resulted in …

Results
[Figure: Linpack results, Top 500 Supercomputers]

Blue Gene Projects
Four Blue Gene projects:
– BlueGene/L
– BlueGene/C
– BlueGene/P
– BlueGene/Q

Blue Gene/L
The first computer in the Blue Gene series
Sept. 29, 2004: IBM announced that a Blue Gene/L prototype had overtaken NEC's Earth Simulator as the fastest computer in the world on the Linpack benchmark
Final configuration was launched in October 2005

Blue Gene/L - Unsurpassed Performance
Designed to deliver the most performance per kilowatt of power consumed
Theoretical peak performance of 360 TFLOPS
Final configuration (Oct. ’05) sustained over 280 TFLOPS on the Linpack benchmark
Nov 14, ’06: at Supercomputing 2006, Blue Gene/L won the prize in every HPC Challenge award class
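The two figures on this slide can be related with one line of arithmetic. The sketch below is only an illustration: it treats "over 280 TFLOPS" as exactly 280, and the function name `linpack_efficiency` is made up for this example.

```python
# Figures quoted on the slide, in TFLOPS.
# Assumption: "over 280" is rounded down to exactly 280 for this illustration.
PEAK_TFLOPS = 360.0
LINPACK_TFLOPS = 280.0


def linpack_efficiency(sustained: float, peak: float) -> float:
    """Return sustained performance as a fraction of theoretical peak."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return sustained / peak


if __name__ == "__main__":
    eff = linpack_efficiency(LINPACK_TFLOPS, PEAK_TFLOPS)
    # Roughly 78% of the theoretical peak is sustained on Linpack.
    print(f"Linpack efficiency: {eff:.1%}")
```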

Blue Gene/L Architecture
Can be scaled up to 65,536 compute or I/O nodes, with 131,072 processors
Each node is a single ASIC with associated DRAM memory chips
Each ASIC has two 700 MHz IBM PowerPC processors
PowerPC processors
– Low-frequency, low-power embedded processors, superior in performance per watt to today's high-frequency, high-power microprocessors by a factor of 2 or more

Blue Gene/L Architecture contd…
– Double-pipeline, double-precision Floating Point Unit
– A cache sub-system with built-in DRAM controller
Node CPUs are not cache coherent with one another
FPUs and CPUs are designed for low power consumption
– Using transistors with low leakage current
– Local clock gating
– Putting the FPU or CPU/FPU pair to sleep

Blue Gene/L Architecture contd…
[Figure: System Overview, showing a 1024-node rack]

Blue Gene/L Architecture contd…
1 rack holds 1024 nodes, or 2048 processors
Nodes optimized for low power consumption
ASIC based on system-on-a-chip technology
– Using large numbers of low-power system-on-a-chip nodes allows it to outperform commodity clusters while saving on power
– Aggressive packaging of processors, memory and interconnect
– Power efficient and space efficient
– Allows for latencies and bandwidths that are significantly better than those for nodes typically used in ASC-scale supercomputers
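The node, processor and rack counts quoted in the architecture slides are consistent with one another, which a few lines of arithmetic can confirm. This is a sketch only: the constants come from the slides, and the helper names are invented for the example.

```python
NODES_PER_RACK = 1024      # from the slide: 1 rack holds 1024 nodes
PROCESSORS_PER_NODE = 2    # implied by 2048 processors per 1024-node rack
MAX_NODES = 65_536         # from the slide: maximum system size


def racks_needed(nodes: int, nodes_per_rack: int = NODES_PER_RACK) -> int:
    """Number of racks required to hold `nodes` nodes (ceiling division)."""
    if nodes < 0 or nodes_per_rack <= 0:
        raise ValueError("nodes must be >= 0 and nodes_per_rack > 0")
    return -(-nodes // nodes_per_rack)


def total_processors(nodes: int) -> int:
    """Total processor count for a system with `nodes` nodes."""
    return nodes * PROCESSORS_PER_NODE


if __name__ == "__main__":
    print("racks:", racks_needed(MAX_NODES))
    print("processors:", total_processors(MAX_NODES))
```

A full-size system therefore works out to 64 racks and 131,072 processors, matching the figure given earlier in the deck.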

Blue Gene/L Networks
Each node is attached to 3 main parallel communication networks
– 3D torus network: peer-to-peer communication between compute nodes
– Collective network: collective and global communication
– Ethernet network: I/O and management (such as access to any node for configuration, booting and diagnostics)
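To make the 3D torus idea concrete: every node has six nearest neighbours, and the links wrap around at the edges of the grid. The sketch below assumes an 8 x 8 x 16 grid (1024 nodes) purely for illustration; the slides do not give the torus dimensions, and the function names are hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]

# Assumed dimensions for illustration only (8 * 8 * 16 = 1024 nodes).
EXAMPLE_DIMS: Coord = (8, 8, 16)


def _check(coord: Coord, dims: Coord) -> None:
    """Reject coordinates that fall outside the torus."""
    if any(not (0 <= c < d) for c, d in zip(coord, dims)):
        raise ValueError(f"{coord} is outside a torus of size {dims}")


def torus_neighbors(coord: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbours of `coord`, with wraparound."""
    _check(coord, dims)
    neighbors: List[Coord] = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % dims[axis]  # wrap at the edges
            neighbors.append((n[0], n[1], n[2]))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimum number of link traversals between two nodes."""
    _check(a, dims)
    _check(b, dims)
    hops = 0
    for p, q, size in zip(a, b, dims):
        d = abs(p - q)
        hops += min(d, size - d)  # go whichever way round is shorter
    return hops


if __name__ == "__main__":
    print(torus_neighbors((0, 0, 0), EXAMPLE_DIMS))
    print(torus_hops((0, 0, 0), (7, 7, 15), EXAMPLE_DIMS))
```

Because of the wraparound links, the opposite corner of the grid is only three hops from the origin in this example rather than 7 + 7 + 15.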

Blue Gene/L System Software
System software supports efficient execution of parallel applications
Compiler support for DFPU (C, C++, Fortran)
Compute nodes use a minimal operating system called “BlueGene/L compute node kernel”
– A lightweight, single-user operating system
– Supports execution of a single dual-threaded application compute process
– Kernel provides a single, static virtual address space to one running compute process
– Because of its single-process nature, no context switching is required

Blue Gene/L System Software contd…
To allow multiple programs to run concurrently
– Blue Gene/L system can be partitioned into electronically isolated sets of nodes
– The number of nodes in a partition must be a positive integer power of 2
– To run a program, first reserve a partition
– No other program can use the partition until the current program has finished
– With so many nodes, component failures are inevitable. The system is able to electrically isolate faulty hardware to allow the machine to continue to run
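The power-of-2 rule for partition sizes is easy to check in code. The sketch below is not the Blue Gene/L control system; it only illustrates the rule, reading "positive integer power of 2" as 2, 4, 8, and so on (so a single node does not qualify). Both function names are hypothetical.

```python
def is_valid_partition_size(nodes: int) -> bool:
    """True if `nodes` is 2**k for some integer k >= 1.

    Assumption: "positive integer power of 2" excludes 2**0 == 1.
    """
    # A power of two has exactly one bit set, so n & (n - 1) is zero.
    return nodes >= 2 and (nodes & (nodes - 1)) == 0


def smallest_partition_for(nodes_requested: int) -> int:
    """Smallest valid partition size that holds `nodes_requested` nodes."""
    if nodes_requested < 1:
        raise ValueError("at least one node must be requested")
    size = 2
    while size < nodes_requested:
        size *= 2
    return size


if __name__ == "__main__":
    for request in (1, 500, 512, 513):
        print(request, "->", smallest_partition_for(request))
```

Under this reading, a job that needs 513 nodes would have to reserve a 1024-node partition.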

Blue Gene/L System Software contd…
Parallel programming model
– Message passing, supported through an implementation of MPI
– Only a subset of POSIX calls is supported
– Green threads are also used to simulate local concurrency
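Green threads are threads scheduled in user space rather than by the operating system kernel. The sketch below shows the general idea with Python generators and a round-robin scheduler; it is an illustration of cooperative scheduling in general, not of how the Blue Gene/L runtime implements it, and the names `worker` and `run_round_robin` are invented for this example.

```python
from collections import deque
from typing import Deque, Generator, Iterable, List

Task = Generator[None, None, None]


def worker(name: str, steps: int, log: List[str]) -> Task:
    """A cooperative task: does one unit of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntarily hand control back to the scheduler


def run_round_robin(tasks: Iterable[Task]) -> None:
    """Run tasks in turn until all have finished; no kernel threads involved."""
    ready: Deque[Task] = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task until its next yield
        except StopIteration:
            continue            # task finished, drop it
        ready.append(task)      # otherwise put it at the back of the queue


if __name__ == "__main__":
    log: List[str] = []
    run_round_robin([worker("a", 2, log), worker("b", 3, log)])
    print(log)
```

The two tasks interleave on a single processor because each one gives up control explicitly, which fits a kernel that does no context switching of its own.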

Blue Gene/C
Sister project to BlueGene/L
Renamed to Cyclops64
Massively parallel, supercomputer-on-a-chip cellular architecture
Cellular architecture gives the programmer the ability to run large numbers of concurrent threads within a single processor

Blue Gene/P
Architecturally similar to BlueGene/L
Expected to operate at around one petaflop
Expected around 2008

Blue Gene/Q
Last known supercomputer in the Blue Gene series
Expected to reach 3-10 petaflops
