Exascale Evolution
Brad Benton, IBM
March 15, 2010 (www.openfabrics.org)

Agenda
- Exascale Challenges
- On the Path to Exascale: A Look at Blue Waters

Exascale Challenges

Exascale Challenges
Challenges at every level of system design:
- Managing 500M to 1B (most likely heterogeneous) cores
- Programming models to exploit multi-core + accelerators
- Interconnect
  - How will IB/RC scale to exascale? (see the sketch below)
  - How do we “get off the bus”?
  - How can we put more capability in the interconnect?
- Power Management
  - Power vs. performance tradeoffs
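
To give a sense of the IB/RC scaling question, here is a rough back-of-the-envelope sketch: with one reliably connected (RC) queue pair per communicating peer, per-node connection state grows with the size of the job. The node count, ranks per node, and per-QP footprint below are illustrative assumptions, not figures from the talk.

```c
/* Rough sketch: memory for fully connected IB RC queue pairs per node.
 * All sizes are illustrative assumptions, not measured values. */
#include <stdio.h>

int main(void)
{
    const double nodes          = 100000;  /* assumed exascale-class node count */
    const double ranks_per_node = 32;      /* assumed MPI ranks per node        */
    const double qp_bytes       = 4096;    /* assumed per-QP context + queues   */

    /* Each rank keeps one RC QP to every other rank in the job. */
    double peers_per_rank = nodes * ranks_per_node - 1;
    double bytes_per_node = ranks_per_node * peers_per_rank * qp_bytes;

    printf("QP state per node: ~%.1f GB\n", bytes_per_node / 1e9);
    return 0;
}
```

With these assumed numbers the per-node connection state alone runs to hundreds of gigabytes, which is why fully connected RC does not scale without techniques such as connection sharing or unreliable/dynamic transports.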

Exascale Challenges
Challenges at every level of system design:
- Resilience/Fault-Tolerance
  - At this scale, something will always be broken or in the process of breaking (see the failure-rate sketch below)
- Development Environment/Performance Tuning
- Workflow Management/Process Steering
- Data Management/Storage/Visualization
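
A quick illustration of the "something will always be broken" point: with independent failures, the system-wide mean time between failures shrinks roughly linearly with component count. The per-node MTBF and node count below are assumptions chosen only to show the scale of the problem.

```c
/* Rough sketch: system MTBF shrinks linearly with component count.
 * The per-node MTBF and node count are assumptions for illustration. */
#include <stdio.h>

int main(void)
{
    const double node_mtbf_hours = 50000.0;  /* assumed per-node MTBF (~5.7 years) */
    const double node_count      = 100000.0; /* assumed exascale-class node count  */

    /* With independent, exponentially distributed failures,
     * system MTBF ~= per-node MTBF / number of nodes.        */
    double system_mtbf_hours = node_mtbf_hours / node_count;

    printf("System-wide MTBF: ~%.1f minutes\n", system_mtbf_hours * 60.0);
    return 0;
}
```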

Exascale Challenges
Resiliency/Fault-Tolerance:
- F/T model
  - Fault detection
  - Fault isolation
  - Fault containment
  - Fault recovery
  - Re-integration
- Software resiliency
  - More than just checkpoint/restart (see the checkpoint sketch below)
  - Containers/virtualization: suspend/migrate/resume
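
One way to see why resiliency has to go beyond plain checkpoint/restart is to look at how often an exascale machine would have to checkpoint. The sketch below uses Young's approximation for the optimal checkpoint interval; this rule of thumb is not from the slides, and the checkpoint cost and MTBF values are assumptions for illustration (compile with -lm).

```c
/* Rough sketch: optimal checkpoint interval via Young's approximation,
 *   T_opt ~= sqrt(2 * checkpoint_cost * MTBF).
 * The cost and MTBF values are assumptions for illustration only. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    const double checkpoint_cost_s = 600.0;   /* assumed time to write one checkpoint */
    const double system_mtbf_s     = 1800.0;  /* assumed system MTBF (30 minutes)     */

    double t_opt = sqrt(2.0 * checkpoint_cost_s * system_mtbf_s);

    /* Fraction of each compute+checkpoint cycle spent writing checkpoints,
     * ignoring rework after failures.                                      */
    double overhead = checkpoint_cost_s / (t_opt + checkpoint_cost_s);

    printf("Checkpoint every ~%.0f s, ~%.0f%% of time spent checkpointing\n",
           t_opt, overhead * 100.0);
    return 0;
}
```

With these assumed numbers roughly a third of the machine's time goes to writing checkpoints, which motivates richer approaches such as container- or virtualization-based suspend/migrate/resume.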

Programming Models
- MPI
  - Will it survive in an exascale world? (Its demise was predicted at petascale, but it seems to be doing okay.)
- Evolve hybrid language models: MPI + “What?” (see the hybrid sketch below)
  - OpenMP
  - GPU accelerators (CUDA, OpenCL)
  - PGAS languages
- Greater exploitation of autotuning, i.e., programs that write programs
  - ATLAS
  - FFTW
  - The IBM HPC Toolkit has some of this
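
As a concrete (and deliberately generic) example of the "MPI + X" hybrid model, the sketch below combines MPI across nodes with OpenMP within a node. It is a minimal illustration of the programming pattern, not code from the talk.

```c
/* Minimal MPI + OpenMP hybrid sketch: MPI ranks across nodes,
 * OpenMP threads fill the cores within a node. Illustrative only. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank, nranks;

    /* Ask for an MPI library that tolerates threaded callers. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double local = 0.0;

    /* Intra-node parallelism: OpenMP threads share the rank's memory. */
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < 1000000; i++)
        local += 1.0 / (1.0 + i);

    /* Inter-node parallelism: combine per-rank results over MPI. */
    double global = 0.0;
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("ranks=%d threads/rank=%d sum=%f\n",
               nranks, omp_get_max_threads(), global);

    MPI_Finalize();
    return 0;
}
```

A CUDA/OpenCL or PGAS variant would replace the OpenMP region with device kernels or global-address-space accesses, respectively.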

On the Path to Exascale: A Look at Blue Waters

NCSA Blue Waters
- Joint effort between NCSA and the University of Illinois
- First deliverable of a system based on PERCS technology (2011)
- Will be the world’s first sustained-petascale system for open scientific research
- For more detailed information: http://

Blue Waters Overview
- Approximately 10 PF/s peak
- More than 300,000 cores (homogeneous)
- More than 1 petabyte of memory
- More than 10 petabytes of disk storage
- More than 0.5 exabyte of archival storage
- More than 1 PF/s sustained on scientific applications

Building Blue Waters
Blue Waters is built from components that can also be used to build systems with a wide range of capabilities, from deskside to beyond Blue Waters. Blue Waters will be the most powerful computer in the world for scientific research when it comes on line.

Power7 Chip
- 8 cores, 32 threads
- L1, L2, L3 cache (32 MB)
- Up to 256 GF (peak)
- 45 nm technology

Multi-chip Module (MCM)
- 4 Power7 chips
- 128 GB memory
- 512 GB/s memory bandwidth
- 1 TF (peak)

Router
- 1,128 GB/s bandwidth

IH Server Node
- 8 MCMs (256 cores)
- 1 TB memory
- 8 TF (peak)
- Fully water cooled

Blue Waters Building Block
- 32 IH server nodes
- 32 TB memory
- 256 TF (peak)
- 4 storage systems
- 10 tape-drive connections

Blue Waters
- ~1 PF sustained
- >300,000 cores
- >1 PB of memory
- >10 PB of disk storage
- ~500 PB of archival storage
- >100 Gbps connectivity
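
The peak-performance numbers above roll up the hierarchy directly (MCM to server node to building block). A minimal sketch that reproduces the arithmetic from the figures on this slide, rounding 4 x 256 GF to 1 TF per MCM as the slide does:

```c
/* Reproduce the peak-flops roll-up from the Building Blue Waters slide.
 * Values come straight from the slide; only the multiplication is new. */
#include <stdio.h>

int main(void)
{
    const double mcm_tf        = 1.0; /* quad-chip MCM: 4 x 256 GF, rounded to 1 TF */
    const int    mcms_per_node = 8;   /* IH server node                             */
    const int    nodes_per_bb  = 32;  /* Blue Waters building block                 */

    double node_tf = mcm_tf * mcms_per_node;   /* -> 8 TF (peak)   */
    double bb_tf   = node_tf * nodes_per_bb;   /* -> 256 TF (peak) */

    printf("node: %.0f TF peak, building block: %.0f TF peak\n", node_tf, bb_tf);
    return 0;
}
```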

Power7 Chip: Computational Heart of Blue Waters
Base technology:
- 45 nm, 576 mm² die
- 1.2 B transistors
Chip:
- 8 cores
- 12 execution units per core
- 1-, 2-, and 4-way SMT per core
- Up to 4 FMAs per cycle
- Caches: 32 KB I- and D-cache and 256 KB L2 per core; 32 MB L3 (private/shared)
- Dual DDR3 memory controllers: 128 GB/s peak memory bandwidth (1/2 byte/flop)
- Clock range of 3.5 to 4 GHz
(Figure: Power7 chip and quad-chip MCM)
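
The 256 GF peak and the 1/2 byte/flop ratio quoted above follow directly from the per-core issue rate, the clock, and the memory bandwidth: 8 cores x 4 FMAs/cycle x 2 flops per FMA x 4 GHz = 256 GF, and 128 GB/s divided by 256 GF = 0.5 byte/flop. A minimal check of that arithmetic:

```c
/* Check the quoted peak-flops and bytes/flop figures for one Power7 chip. */
#include <stdio.h>

int main(void)
{
    const double cores         = 8;
    const double fma_per_cycle = 4;    /* up to 4 fused multiply-adds per cycle */
    const double flops_per_fma = 2;    /* an FMA counts as two flops            */
    const double clock_ghz     = 4.0;  /* top of the 3.5 to 4 GHz range         */
    const double mem_bw_gbs    = 128;  /* peak memory bandwidth, GB/s           */

    double peak_gf = cores * fma_per_cycle * flops_per_fma * clock_ghz;

    printf("peak: %.0f GF, memory: %.2f bytes/flop\n",
           peak_gf, mem_bw_gbs / peak_gf);
    return 0;
}
```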

High-End Server Resilience

Feeds and Speeds per MCM
- 32 cores
- 8 flops/cycle per core
- 4 threads per core (max)
- 3.5 to 4 GHz
- 1 TF/s (peak)
- 32 MB L3
- 512 GB/s memory bandwidth (0.5 byte/flop)
- 800 W (0.8 W per GF/s)

First-Level Interconnect (L-Local)
- HUB-to-HUB copper wiring
- 256 cores
One drawer: 8 MCMs, 32 chips, 256 cores

Interconnect: 1.1 TB/s HUB
- 192 GB/s host connection
- 336 GB/s to the 7 other local nodes in the same drawer
- 240 GB/s to local-remote nodes in the same supernode (4 drawers)
- 320 GB/s to remote nodes
- 40 GB/s to general-purpose I/O
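
The 1.1 TB/s headline figure is simply the sum of the per-destination link bandwidths listed above (192 + 336 + 240 + 320 + 40 = 1,128 GB/s). A one-line check:

```c
/* The hub's 1.1 TB/s figure is the sum of its per-class link bandwidths. */
#include <stdio.h>

int main(void)
{
    double host = 192, local = 336, local_remote = 240, remote = 320, io = 40;
    printf("total: %.3f TB/s\n",
           (host + local + local_remote + remote + io) / 1000.0);
    return 0;
}
```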

Second-Level Interconnect (L-Remote)
- Optical ‘L-Remote’ links from the HUB
- Construct a supernode (4 CECs): 1,024 cores
One supernode: 4 drawers, 32 MCMs, 128 chips, 1,024 cores

Rack Components (compute, storage, switch)

BPA
- 200 to 480 Vac / 370 to 575 Vdc
- Redundant power
- Direct site power feed
- PDU elimination

WCU
- Facility water input
- 100% heat to water
- Redundant cooling
- CRAH eliminated

Storage Unit
- 4U
- 0 to 6 per rack
- Up to 384 SFF DASD per unit
- File system

CECs
- 2U
- 1 to 12 CECs per rack
- 256 cores per CEC
- 128 SN DIMM slots per CEC; 8, 16, (32) GB DIMMs
- 17 PCI-e slots
- Embedded switch
- Redundant DCA
- NW fabric
- Up to 3,072 cores and 24.6 TB (49.2 TB) per rack

Rack
- 39"w x 72"d x 83"h (990.6 mm wide)
- ~2,948 kg (~6,500 lbs)
- 100% cooling; PDU eliminated
- Input: 8 water lines, 4 power cords
- Output: ~100 TFLOPs / 24.6 TB / 153.5 TB
- 192 PCI-e 16x / 12 PCI-e 8x

How does this affect OFA?
- Blue Waters can connect externally via PCIe devices (e.g., InfiniBand) as needed
- The Blue Waters interconnect:
  - is RDMA based
  - is not InfiniBand (or iWARP or RoCEE)
  - has hardware support for Global Shared Memory
- The pendulum is swinging back to proprietary interconnects (at least at IBM)
- Is there a path to OFA compatibility?
  - How can/should OFA accept/support new/different RDMA interconnects?
  - How can/should IBM work with OFA to embrace new interconnect technologies?

Exascale Evolution
- Technical evolution is not always in a straight line
- Different technologies evolve at different times and rates (e.g., Blue Waters is not a direct descendant of RoadRunner/Cell, but rather of POWER/Federation/SP)
- Reaching exascale will require the consolidation and continued evolution of multiple technologies