
1 Introduction to High Performance Cluster Computing. Courseware Module H.1.a, August 2008
Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

2 What is HPC
- HPC = High Performance Computing (includes supercomputing)
- HPCC = High Performance Cluster Computing (note: these are NOT High Availability clusters)
- HPTC = High Performance Technical Computing
- The ultimate aim of HPC users is to max out the CPUs!

3 Agenda
- Parallel Computing Concepts
- Clusters
- Cluster Usage

4 Concurrency and Parallel Computing
- A central concept in computer science is concurrency.
- Concurrency: computing in which multiple tasks are active at the same time.
- There are many ways to use concurrency:
  - Concurrency is key to all modern operating systems as a way to hide latencies.
  - Concurrency can be used together with redundancy to provide high availability.
  - Parallel computing uses concurrency to decrease program runtimes (see the sketch below).
- HPC systems are based on parallel computing.
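
As a small illustration of the last point, here is a minimal Python sketch that splits one computation across several worker processes so that multiple tasks are active at the same time and the total runtime drops. The problem, chunk sizes and worker count are arbitrary example values chosen for the sketch.

```python
# A minimal illustration of using concurrency to reduce runtime:
# the same work is split across several worker processes.
from multiprocessing import Pool

def partial_sum(bounds):
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

if __name__ == "__main__":
    n = 10_000_000
    chunks = [(i, min(i + 2_500_000, n)) for i in range(0, n, 2_500_000)]
    with Pool(processes=4) as pool:          # four tasks active at the same time
        total = sum(pool.map(partial_sum, chunks))
    print(total)
```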

5 Hardware for Parallel Computing
- Parallel computers are classified in terms of streams of data and streams of instructions:
  - MIMD computers: multiple streams of instructions acting on multiple streams of data.
  - SIMD computers: a single stream of instructions acting on multiple streams of data.
- Parallel hardware comes in many forms:
  - On chip: instruction-level parallelism (e.g. IPF)
  - Multi-core: multiple execution cores inside a single CPU
  - Multiprocessor: multiple processors inside a single computer
  - Multi-computer: networks of computers working together

6 Hardware for Parallel Computing (taxonomy diagram)
- Parallel computers
  - Single Instruction Multiple Data (SIMD)*
  - Multiple Instruction Multiple Data (MIMD)
    - Shared address space: Symmetric Multiprocessor (SMP), Non-uniform Memory Architecture (NUMA)
    - Disjoint address space: cluster, Massively Parallel Processor (MPP), distributed computing

7 HPC Platform Generations
- In the 1980's, it was a vector SMP: custom components throughout.
- In the 1990's, it was a massively parallel computer: commodity off-the-shelf CPUs, everything else custom.
- ... but today, it is a cluster: COTS components everywhere.

8 What is an HPC Cluster
- A cluster is a type of parallel or distributed processing system consisting of a collection of interconnected stand-alone computers that work cooperatively as a single, integrated computing resource.
- A typical cluster uses:
  - Commodity off-the-shelf parts
  - Low-latency communication protocols

9 What is HPCC? (cluster diagram)
- Components shown: cluster management tools, master node, file server / gateway, compute nodes, interconnect, LAN/WAN

10 A Sample Cluster Design (diagram)

11 Cluster Architecture View (layered stack diagram)
- Application: real applications; parallel benchmarks (Perf, Ring, HINT, NAS, ...)
- Middleware: MPI, PVM
- OS: Linux, other OSes
- Protocol: TCP/IP, VIA, shmem
- Interconnect: Ethernet, Myrinet, InfiniBand, Quadrics
- Hardware: desktop, workstation, server (1P/2P, 4U+, proprietary)

12 Cluster Hardware: The Node
- A node is a single element within the cluster.
- Compute node
  - Just computes, little else
  - Private IP address, no user access
- Master / head / front-end node
  - User login
  - Job scheduler
  - Public IP address, connects to the external network
- Management / administrator node
  - Systems/cluster management functions
  - Secure administrator address
- I/O node
  - Access to data
  - Generally internal to the cluster or to the data centre

13 Interconnect
Interconnects are compared by typical latency (µs) and typical bandwidth (MB/s):
- 100 Mbps Ethernet
- 1 Gbit/s Ethernet
- 10 Gb/s Ethernet
- SCI*
- Myricom Myrinet*
- InfiniBand*
- Quadrics QsNet*
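
A simple way to see why both columns matter is a first-order cost model: transfer time ≈ latency + message size / bandwidth. The sketch below applies that model; the two link profiles and every number in it are assumed example values, not figures from the slide.

```python
# A simple first-order model of message transfer time, to show why both
# latency and bandwidth matter when comparing interconnects.
# The numbers below are assumed example values, not figures from the slide.

def transfer_time_us(size_bytes, latency_us, bandwidth_mb_s):
    """Time to move one message: startup latency plus size / bandwidth."""
    return latency_us + (size_bytes / (bandwidth_mb_s * 1e6)) * 1e6

for name, lat, bw in [("slow LAN-class link", 70.0, 10.0),
                      ("low-latency fabric", 5.0, 900.0)]:
    for size in (64, 1_000_000):                      # small vs. large message
        t = transfer_time_us(size, lat, bw)
        print(f"{name:20s} {size:>9,d} B -> {t:10.1f} us")
```

Small messages are dominated by latency, large messages by bandwidth, which is why latency-sensitive parallel codes care about more than the headline bandwidth figure.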

14 Agenda
- Parallel Computing Concepts
- Clusters
- Cluster Usage

15 Cluster Usage
- Performance Measurements
- Usage Model
- Application Classification
- Application Behaviour

16 The Mysterious FLOPS
- 1 GFLOPS = 1 billion floating-point operations per second.
- Theoretical vs. real GFLOPS for a Xeon processor:
  - One-core theoretical peak = 4 x clock speed (double precision).
  - Xeons have 128-bit SSE registers, which allow the processor to carry out 2 double-precision floating-point adds and 2 multiplies per clock cycle.
  - 2 computational cores per processor, 2 processors per node (4 cores per node).
- Sustained performance (Rmax) = ~35-80% of theoretical peak (interconnect dependent).
- You'll NEVER hit peak! (A worked example of the arithmetic follows below.)
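
The peak-GFLOPS arithmetic above can be made concrete with a short calculation. This is a minimal sketch: the 3.0 GHz clock speed is an assumed example value (the slide does not state one); the 4 FLOPs per cycle, 2 cores per processor and 2 processors per node are taken from the slide.

```python
# Worked example of the peak-GFLOPS arithmetic from the slide.
# The 3.0 GHz clock is an assumed example value; the slide does not state one.

FLOPS_PER_CYCLE = 4      # 2 DP adds + 2 DP multiplies via 128-bit SSE
CLOCK_GHZ = 3.0          # assumed example clock speed
CORES_PER_CPU = 2
CPUS_PER_NODE = 2

peak_per_core = FLOPS_PER_CYCLE * CLOCK_GHZ                   # GFLOPS
peak_per_node = peak_per_core * CORES_PER_CPU * CPUS_PER_NODE

print(f"Theoretical peak per core: {peak_per_core:.1f} GFLOPS")
print(f"Theoretical peak per node: {peak_per_node:.1f} GFLOPS")

# Sustained (Rmax) is typically ~35-80% of peak, depending on the interconnect.
for frac in (0.35, 0.80):
    print(f"Sustained at {frac:.0%}: {peak_per_node * frac:.1f} GFLOPS")
```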

17 Other Measures of CPU Performance
- SPEC (www.spec.org)
  - SPEC CPU2000/2006 speed: single-core performance indicator
  - SPEC CPU2000/2006 rate: node performance indicator
  - SPECfp: floating-point performance
  - SPECint: integer performance
- Many other performance metrics may be required:
  - STREAM: memory bandwidth
  - HPL: High Performance Linpack
  - NPB: NASA suite of performance tests
  - Pallas Parallel Benchmark: another suite
  - IOzone: file-system throughput

18 Technology Advancements in 5 Years
Example comparison (columns: codename, release date, clock speed in GHz, number of cores, peak FLOPs per CPU cycle, peak GFLOPS per CPU, Linpack on 256 processors):
- Foster: released September 2001*
- Woodcrest: released June 2006**
* From the November 2001 Top500 supercomputer list (cluster of Dell Precision 530)
** Intel internal cluster built in 2006

19 Usage Model
- Many serial jobs (capacity):
  - Load balancing more important
  - Batch usage
  - Examples: electronic design, Monte Carlo, design optimisation, parallel search
- One big parallel job (capability):
  - Interconnect more important
  - Appliance usage
  - Examples: meteorology, seismic analysis, fluid dynamics, molecular chemistry
- Normal mixed usage:
  - Job scheduling very important
  - Many users; mixed-size parallel/serial jobs
  - Ability to partition and allocate jobs to nodes for best performance

20 Application and Usage Model
- HPC clusters run parallel applications, and applications in parallel!
- A parallel application is one single application that takes advantage of multiple computing platforms.
- Fine-grained application
  - Uses many systems to run one application
  - Shares data heavily across systems
  - Example: PDVR3D (eigenvalues and eigenstates of a matrix)
- Coarse-grained application
  - Uses many systems to run one application
  - Infrequent data sharing among systems
  - Example: Casino (Monte Carlo stochastic methods)
- Embarrassingly parallel application
  - An instance of the entire application runs on each node
  - Little or no data sharing among compute nodes
  - Example: BLAST (pattern matching)
- A shared-memory machine will run all sorts of applications.

21 Types of Applications
- Forward modelling
- Inversion
- Signal processing
- Searching/comparing

22 Forward Modelling
- Solving linear equations
- Grid based
- Parallelization by domain decomposition: split and distribute the data (see the sketch below)
- Finite element / finite difference
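
As a rough illustration of domain decomposition, the sketch below splits a 1-D grid into sub-domains with halo cells and applies a finite-difference update to each piece independently. It is a serial toy model with made-up values; a production code would distribute the sub-domains across nodes (for example with MPI) and exchange halo cells between time steps.

```python
# A conceptual sketch of domain decomposition for a 1-D finite-difference
# update: the grid is split into sub-domains, each extended by one halo cell
# so it can be updated independently. This serial version only shows the
# data layout; real cluster codes perform the same split across nodes.

def fd_update(u, dt=0.1):
    """One explicit finite-difference (diffusion-like) step on interior points."""
    return [u[i] + dt * (u[i - 1] - 2 * u[i] + u[i + 1]) for i in range(1, len(u) - 1)]

def decompose(grid, parts):
    """Split the grid into `parts` sub-domains, each with one halo cell per side."""
    n = len(grid) - 2                      # interior points (ends are boundaries)
    chunk = n // parts
    subdomains = []
    for p in range(parts):
        lo = 1 + p * chunk
        hi = 1 + (p + 1) * chunk if p < parts - 1 else len(grid) - 1
        subdomains.append(grid[lo - 1:hi + 1])   # include halo cells
    return subdomains

grid = [0.0] + [1.0] * 8 + [0.0]
updated = []
for sub in decompose(grid, parts=2):       # each part could run on its own node
    updated.extend(fd_update(sub))
print(updated)
```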

23 Inversion
- From measurements (F) compute models (M) representing properties (d) of the measured object(s).
- Deterministic:
  - Matrix inversions
  - Conjugate gradient
- Stochastic:
  - Monte Carlo, Markov chain
  - Genetic algorithms
- Generally large amounts of shared memory
- Parallelism through multiple runs with different models (a toy sketch follows below)
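
A toy version of the stochastic route: draw many random candidate models, score each against the measurements, and keep the best. The data, the one-parameter model and the misfit function below are all made-up examples; the point is only that each trial is independent, which is what makes "multiple runs with different models" parallelise well.

```python
# A toy sketch of the stochastic (Monte Carlo) approach to inversion:
# draw random candidate models, score each against the measurements,
# keep the best. Data and model here are made up for illustration.
import random

measurements = [2.1, 3.9, 6.2, 7.8]        # assumed example data, roughly y = 2x
xs = [1, 2, 3, 4]

def misfit(slope):
    """Sum of squared differences between predicted and measured values."""
    return sum((slope * x - y) ** 2 for x, y in zip(xs, measurements))

random.seed(0)
best = min((random.uniform(0.0, 5.0) for _ in range(100_000)), key=misfit)
print(f"best-fit slope: {best:.3f}, misfit: {misfit(best):.4f}")
```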

24 Signal Processing / Quantum Mechanics
- Convolution model (stencil); see the sketch below
- Matrix computations (eigenvalues, ...)
- Conjugate gradient methods
- Normally not very demanding on latency and bandwidth
- Some algorithms are embarrassingly parallel
- Examples: seismic migration/processing, medical imaging
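
The convolution (stencil) idea fits in a few lines: each output sample is a weighted sum over a small window of the input, so different windows can be computed independently. The signal and the three-point smoothing kernel below are arbitrary example values.

```python
# A minimal 1-D convolution (stencil) sketch of the kind of kernel that
# dominates signal-processing workloads. Each output point depends only on
# a small window of inputs, so independent windows can run in parallel.

def convolve(signal, kernel):
    """Valid-mode 1-D convolution: slide the (reversed) kernel across the signal."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[k - 1 - j] for j in range(k))
            for i in range(len(signal) - k + 1)]

signal = [0, 1, 4, 9, 16, 25, 36]
smoothing = [1 / 3, 1 / 3, 1 / 3]           # three-point moving average
print(convolve(signal, smoothing))
```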

25 Signal Processing Example (diagram)

26 Searching / Comparing
- Integer operations are more dominant than floating point
- I/O intensive
- Pattern matching
- Embarrassingly parallel; very suitable for grid computing (see the sketch below)
- Example applications: encryption/decryption, message interception, bioinformatics, data mining
- Example codes: BLAST, HMMER
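
The sketch below shows the embarrassingly parallel pattern in miniature: a sequence database is split into chunks and each chunk is scanned for a pattern independently, with the counts merged at the end. The toy database, pattern and simple substring count are assumptions for illustration; this is not how BLAST or HMMER actually score matches.

```python
# A sketch of why searching/comparing is embarrassingly parallel: the
# database is split into chunks and each chunk is scanned independently,
# with results merged at the end. Illustration only, not BLAST itself.
from concurrent.futures import ProcessPoolExecutor

def count_matches(args):
    pattern, chunk = args
    return sum(record.count(pattern) for record in chunk)

if __name__ == "__main__":
    database = ["GATTACA", "ACGTACGTGATT", "TTGATTACAGG", "CCCCGGGG"] * 1000
    pattern = "GATTA"
    chunks = [database[i::4] for i in range(4)]      # one chunk per worker
    with ProcessPoolExecutor(max_workers=4) as pool:
        total = sum(pool.map(count_matches, [(pattern, c) for c in chunks]))
    print(f"{pattern!r} found {total} times")
```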

27 Application Classes
- FEA (Finite Element Analysis)
  - The simulation of hard physical materials, e.g. metal, plastic
  - Crash testing, product design, suitability for purpose
  - Examples: MSC Nastran, Ansys, LS-Dyna, Abaqus, ESI PAM-Crash, Radioss
- CFD (Computational Fluid Dynamics)
  - The simulation of soft physical materials, gases and fluids
  - Engine design, airflow, oil reservoir modelling
  - Examples: Fluent, Star-CD, CFX
- Geophysical sciences
  - Seismic imaging: taking echo traces and building a picture of the subsurface geology
  - Reservoir simulation: CFD specific to oil asset management
  - Examples: Omega, Landmark VIP and ProMAX, GeoQuest Eclipse

28 Application Classes (continued)
- Life sciences
  - Understanding the living world: genome matching, protein folding, drug design, bioinformatics, organic chemistry
  - Examples: BLAST, Gaussian, others
- High energy physics
  - Understanding the atomic and sub-atomic world
  - Software from Fermilab or CERN, or home-grown
- Financial modelling
  - Meeting internal and external financial targets, particularly regarding investment positions
  - VaR (Value at Risk): assessing the impact of economic and political factors on the bank's investment portfolio
  - Trader risk analysis: what is the risk on a trader's position, or on a group of traders' positions?

