AMS 290B Advanced Topics in the Numerical Solution of PDEs Topic 2008: An Introduction to Parallel Computation and Large CFD Codes Nic Brummell

Presentation transcript:

AMS 290B Advanced Topics in the Numerical Solution of PDEs Topic 2008: An Introduction to Parallel Computation and Large CFD Codes Nic Brummell (831) Applied Math & Statistics (AMS) University of California, Santa Cruz

Maybe we want to model something complicated …

Global simulations: Differential rotation/Dynamo. Brun, Brown, Browning, Clune, Elliot, Miesch, Toomre (University of Colorado). Largest global simulations in the world: spherical harmonic degree l ~ O(1000). Problem: resolves from largest scale down => diffuses smallest scale ~ supergranules

Local simulations: Convection zone and tachocline

Local simulations: Penetrative convection movie Brummell, Clune, Toomre

Local simulations: Magnetic transport? Tobias, Brummell, Clune, Toomre

Magnetic buoyancy?: 1D shear (Grad student: Geoff Vasil). Success? But this was all very laminar. How about a much more turbulent case? Success? Much harder to produce clean rising tubes. The magnetic field feeds back strongly on the shear that manufactures the field.

3D simulation (Cattaneo, 1999): side view and top view

3D simulation (Cattaneo, 1999). Panel labels: temperature fluctuations near the surface; top; B; middle

Cost! One simulation costs 60,000 supercomputing hours = 2500 days ≈ 83 months ≈ 7 years! How come we’ve done so many? Are we that old? NO! (don’t be cheeky) PARALLEL COMPUTERS => 60,000 supercomputing hours can be done in 1 wall clock hour on 60,000 processors (or 120 wall clock hours = 5 days on 500 processors)
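The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `wall_clock_hours` is hypothetical, and the calculation assumes ideal (perfectly linear) speedup, which real parallel codes rarely achieve because of communication and load imbalance.

```python
def wall_clock_hours(cpu_hours, processors):
    """Ideal wall-clock time, ASSUMING perfect linear speedup (no overheads)."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return cpu_hours / processors


CPU_HOURS = 60_000  # cost of one simulation, taken from the slide

# Processor counts used on the slide: serial, 500, and 60,000.
for p in (1, 500, 60_000):
    hours = wall_clock_hours(CPU_HOURS, p)
    print(f"{p:>6} processors: {hours:>8.1f} wall-clock hours = {hours / 24:.2f} days")
```

Under this idealised assumption the serial run takes 2500 days, while 500 processors bring it down to 5 days, matching the numbers quoted above.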

Computing power steadily increases!
Moore’s Law (Gordon E. Moore, Intel co-founder, “Electronics Magazine”, 1965): “Number of transistors that can be placed cheaply on an integrated circuit board will double every two years”
~ computing power doubles every two years = exponential rate of increase!
Can link almost any computational capability to Moore’s Law:
o Processing speed
o Memory capacity
o Resolution of digital cameras!

Computing power steadily increases! [Figure: number of transistors on Intel processors (286, Pentium, Pentium II, Pentium III, Itanium 2, Itanium 2 + cache) versus year, with trend lines for doubling every 2 years and doubling every 18 months.]
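To see how much the choice of doubling period matters, here is a minimal sketch of the exponential growth implied by Moore's Law, N(t) = N0 * 2^(t/T), where T is the doubling period. The function name `growth_factor` and the sample time spans are illustrative choices, not taken from the slides.

```python
def growth_factor(years, doubling_period_years):
    """Multiplicative growth after `years` if the quantity doubles every
    `doubling_period_years` years: 2 ** (years / doubling_period)."""
    if doubling_period_years <= 0:
        raise ValueError("doubling period must be positive")
    return 2.0 ** (years / doubling_period_years)


# Compare the two trend lines: doubling every 2 years vs every 18 months.
for years in (6, 12, 30):
    slow = growth_factor(years, 2.0)
    fast = growth_factor(years, 1.5)
    print(f"after {years:>2} years: x{slow:,.0f} (2-year doubling), "
          f"x{fast:,.0f} (18-month doubling)")
```

The gap between the two curves itself grows exponentially, which is why the two trend lines diverge so visibly on a logarithmic plot over a few decades.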

Computing power steadily increases! However …
1. Exponentially improving hardware does not mean exponentially improving software to go with it! Wirth’s Law: “Software gets slower faster than hardware gets faster”. Software must get larger and more complex = slower!
2. Moore’s Law can’t go on forever (even Moore acknowledged this). Technological singularity?? Eventually transistors will become the size of an atom. Krauss & Starkman: 600 years.
3. BUT will new technology come along and replace integrated circuits? (R. Kurzweil: “The Law of Accelerating Returns”)


Specification            Cray-1A              IBM POWER5+                    Approx. ratio
Year installed
Architecture             Vector processor     Dual Cluster of scalar CPUs
Number of CPUs           1                    ~5,000                         5,000:1
Clock speed              12.5 nsec (80 MHz)   0.53 nsec (1.9 GHz)            24:1
Peak perf per CPU        160 MFLOPS           7.6 GFLOPS                     48:1
Peak perf per system     160 MFLOPS           ~38 TFLOPS                     250,000:1
Sustained performance    ~50 MFLOPS           ~4 TFLOPS                      80,000:1
Memory                   8 Mbytes             ~10 Tbytes                     1,000,000:1
Disk space               2.5 Gbytes           ~100 Tbytes                    40,000:1
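As a rough check on the "Approx Ratio" column, the ratios can be recomputed directly from the two machine columns. This is only a sketch: the dictionary keys and helper names are hypothetical, the approximate (~) values are treated as exact, and decimal prefixes (1 Mbyte = 10^6 bytes) are assumed.

```python
MEGA, GIGA, TERA = 1e6, 1e9, 1e12  # ASSUMPTION: decimal prefixes

# Values copied from the comparison table; "~" entries are treated as exact.
cray_1a = {
    "cpus": 1,
    "clock_hz": 80 * MEGA,
    "peak_per_cpu_flops": 160 * MEGA,
    "peak_system_flops": 160 * MEGA,
    "sustained_flops": 50 * MEGA,
    "memory_bytes": 8 * MEGA,
    "disk_bytes": 2.5 * GIGA,
}
power5_plus = {
    "cpus": 5_000,
    "clock_hz": 1.9 * GIGA,
    "peak_per_cpu_flops": 7.6 * GIGA,
    "peak_system_flops": 38 * TERA,
    "sustained_flops": 4 * TERA,
    "memory_bytes": 10 * TERA,
    "disk_bytes": 100 * TERA,
}


def ratios(new, old):
    """Ratio new/old for every specification present in `old`."""
    return {key: new[key] / old[key] for key in old}


def sustained_fraction(machine):
    """Fraction of system peak performance that is actually sustained."""
    return machine["sustained_flops"] / machine["peak_system_flops"]


for key, value in ratios(power5_plus, cray_1a).items():
    print(f"{key:<20} {value:>12,.2f}:1")

print(f"Cray-1A sustains {sustained_fraction(cray_1a):.0%} of peak")
print(f"POWER5+ sustains {sustained_fraction(power5_plus):.0%} of peak")
```

The recomputed values (for example 23.75:1 for clock speed and 237,500:1 for system peak) agree with the table's rounded figures. The sustained fraction is a useful side result: the per-system peak grew far faster than the fraction of it that codes actually achieve.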


Technology over the years: 1942 ENIAC kops

Technology over the years: 1947 Whirlwind 3

Technology over the years: 1951 Univac 250 kops

Technology over the years: IBM 1950s IBM 650 IBM 701 IBM 704

Technology over the years: 1964 IBM 360

Technology over the years: 1965 CDC Mflops

Technology over the years: 1976 Cray Mflops

Technology over the years: 1978 VAX

Technology over the years: 1982 Cray X-MP 941 Mflops

Technology over the years: 1985 TM CM-1 1 Gflops

Technology over the years: 1990 NEC SX2/3 23 Gflops

Technology over the years: 1993 Cray T3D. Also in 1993: TM CM-5 ~ 65 Gflops, Intel Paragon ~ 143 Gflops

Technology over the years: 2002 NEC ESS. Fastest in the world until 2004. Holistic simulations of global earth climate and oceans down to 10 km resolution. 5120 processors, 35 Tflops

Technology over the years: 2007 IBM Blue Gene/L. Fastest machine in the world! ~213,000 CPUs; 596 Tflops peak, 478 Tflops sustained

Technology over the years: 2007 Cray XT5

What is the point of this course? Learn how to harness all this computing power! In particular, learn how to program MASSIVELY PARALLEL MACHINES. Elusive mixture of theory and practice … Highly interdisciplinary! These days need:
o Engineering skills: architecture, hardware, network knowledge
o Mathematical skills: applied math methods, numerical analysis
o Computer science skills: algorithm design, software engineering, compilers, operating systems, ftp, …
o Scientific skills: design of problem, setup of models/equations, analysis of results
o Artistic skills: visualisation, movies, …

The syllabus
PART A: CONCEPTS
o Intro to Parallel Computing
    Parallel machine models
    Parallel programming models
o Designing Parallel algorithms
    Partitioning, communication, agglomeration, mapping
    Performance analysis
PART B: TOOLS
o MPI
o OpenMP
o Debugging, Performance, Analysis (IDL, VAPOR)
PART C: CASE STUDIES
o Spectral Methods (HPS)
o Finite-Volume
PART D: SPECIAL TOPICS IN TURBULENCE
o DNS vs LES/SGS
o AMR

Expectations
Course is BRAND NEW = an EXPERIMENT! Materials are “in preparation” (sorry, guinea pigs!)
Expectations of the student:
o Learn something useful!
o There will be exercises and tasks
o But very seldom “right” and “wrong” answers
o Often I won’t know the answer!
o The expectation is only that the student tries to make use of what they’ve learned.

Baseline TEST!