Parallel Computing and Parallel Computers


Parallel Computing and Parallel Computers ITCS 4/5145 Parallel Programming UNC-Charlotte, B. Wilkinson, 2009.

Parallel Computing
Using more than one computer, or a computer with more than one processor, to solve a problem.
Motives:
- Usually faster computation. The idea is very simple: n computers operating simultaneously can achieve the result faster, although it will not be n times faster, for various reasons.
- Other motives include fault tolerance and a larger amount of available memory.

Demand for Computational Speed
There is continual demand for greater computational speed from a computer system than is currently possible. Areas requiring great computational speed include numerical modeling and simulation of scientific and engineering problems. Computations need to be completed within a "reasonable" time period.

Grand Challenge Problems
Problems that cannot be solved in a reasonable amount of time with today's computers. Obviously, an execution time of 10 years is always unreasonable. Examples: modeling large DNA structures, global weather forecasting, and modeling the motion of astronomical bodies.

Weather Forecasting
The atmosphere is modeled by dividing it into 3-dimensional cells. The calculation for each cell (temperature, pressure, humidity, etc.) is repeated many times to model the passage of time.

Global Weather Forecasting Example
Suppose the whole global atmosphere is divided into cells of size 1 mile × 1 mile × 1 mile to a height of 10 miles (10 cells high) - about 5 × 10^8 cells. Suppose each calculation requires 200 floating-point operations; then 10^11 floating-point operations are necessary in one time step. To forecast the weather over 7 days using 1-minute intervals, a computer operating at 1 Gflops (10^9 floating-point operations/s) takes 10^6 seconds, or over 10 days. To perform the calculation in 5 minutes requires a computer operating at 3.4 Tflops (3.4 × 10^12 floating-point operations/s) - about 34,000 times faster.
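The figures above can be checked with a back-of-envelope calculation (a sketch; the variable names are illustrative, not from the slides):

```python
# Back-of-envelope check of the forecasting numbers above.
cells = 5e8                    # ~5 x 10^8 one-mile cells, 10 cells high
flops_per_cell = 200           # floating-point operations per cell update
ops_per_step = cells * flops_per_cell        # 10^11 flops per time step
steps = 7 * 24 * 60                          # 7 days at 1-minute intervals
total_ops = ops_per_step * steps             # ~10^15 flops in total

seconds_at_1gflops = total_ops / 1e9         # ~10^6 s, i.e. over 10 days
flops_for_5_minutes = total_ops / (5 * 60)   # ~3.4 x 10^12 flop/s needed

print(f"{seconds_at_1gflops:.2e} s, {flops_for_5_minutes:.2e} flop/s")
```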

Modeling Motion of Astronomical Bodies Each body attracted to each other body by gravitational forces. Movement of each body predicted by calculating total force on each body.

Modeling Motion of Astronomical Bodies
With N bodies, there are N − 1 forces to calculate for each body, or approximately N^2 calculations in total. There is an O(N log2 N) algorithm, which we will cover, for an efficient approximate solution. After determining the new positions of the bodies, the calculations are repeated.
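The direct N^2 method can be sketched as follows (a minimal 2-D illustration; the function and the sample bodies are hypothetical, not from the slides):

```python
import math

G = 6.674e-11  # gravitational constant

def total_forces(bodies):
    """Direct O(N^2) summation: for each body, sum the gravitational
    pull of every other body. Each body is (mass, x, y)."""
    forces = []
    for i, (mi, xi, yi) in enumerate(bodies):
        fx = fy = 0.0
        for j, (mj, xj, yj) in enumerate(bodies):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            r = math.hypot(dx, dy)
            f = G * mi * mj / r**2
            fx += f * dx / r   # resolve the force along each axis
            fy += f * dy / r
        forces.append((fx, fy))
    return forces

# Two equal bodies 1 unit apart attract with equal and opposite force.
f = total_forces([(1e3, 0.0, 0.0), (1e3, 1.0, 0.0)])
```

The doubly nested loop is exactly where the N^2 cost comes from; tree methods such as Barnes–Hut replace the inner loop with an approximation to reach O(N log2 N).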

A galaxy might have, say, 10^11 stars. Even if each calculation is done in 1 ms (an extremely optimistic figure), it takes 10^9 years for one iteration using the N^2 algorithm, or almost a year for one iteration using the N log2 N algorithm, assuming the calculations take the same time (which may not be true).

Astrophysical N-body simulation by Scott Linssen (undergraduate UNC-Charlotte student).

Parallel programming - programming parallel computers - has been around for more than 50 years.

Gill writes in 1958: “... There is therefore nothing new in the idea of parallel programming, but its application to computers. The author cannot believe that there will be any insuperable difficulty in extending it to computers. It is not to be expected that the necessary programming techniques will be worked out overnight. Much experimenting remains to be done. After all, the techniques that are commonly used in programming today were only won at the cost of considerable toil several years ago. In fact the advent of parallel programming may do something to revive the pioneering spirit in programming which seems at the present to be degenerating into a rather dull and routine occupation ...” Gill, S. (1958), “Parallel Programming,” The Computer Journal, vol. 1, April, pp. 2-10.

Potential for parallel computers/parallel programming

Speedup Factor

S(p) = ts / tp = (execution time using one processor, best sequential algorithm) / (execution time using a multiprocessor with p processors)

where ts is the execution time on a single processor and tp is the execution time on a multiprocessor. S(p) gives the increase in speed obtained by using the multiprocessor. Use the best sequential algorithm for the single-processor system; the underlying algorithm for the parallel implementation might be (and usually is) different.

The speedup factor can also be cast in terms of computational steps (extending time complexity to parallel computations):

S(p) = (number of computational steps using one processor) / (number of parallel computational steps with p processors)
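The definitions above can be sketched as a small helper (the function names are illustrative, not from the slides; efficiency, S(p)/p, is a common companion metric):

```python
def speedup(t_seq, t_par):
    """S(p) = ts / tp, using the best sequential time as the baseline."""
    return t_seq / t_par

def efficiency(t_seq, t_par, p):
    """Fraction of ideal linear speedup achieved: S(p) / p."""
    return speedup(t_seq, t_par) / p

# e.g. a 100 s sequential run that takes 30 s on 4 processors:
s = speedup(100.0, 30.0)        # ~3.33
e = efficiency(100.0, 30.0, 4)  # ~0.83
```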

Maximum Speedup
The maximum speedup is usually p with p processors (linear speedup). It is possible to get superlinear speedup (greater than p), but usually only for a specific reason, such as extra memory in the multiprocessor system or a nondeterministic algorithm.

Maximum Speedup: Amdahl's Law

[Figure: a computation with serial fraction f. (a) On one processor, the serial section takes f·ts and the parallelizable sections take (1 − f)·ts. (b) On p processors, the serial section still takes f·ts, while the parallelizable part takes (1 − f)·ts/p.]

The speedup factor is then given by:

S(p) = ts / (f·ts + (1 − f)·ts/p) = p / (1 + (p − 1)·f)

This equation is known as Amdahl's law.

[Figure: speedup S(p) plotted against the number of processors p (4 to 20), for serial fractions f = 20%, 10%, 5%, and 0%.]

Even with an infinite number of processors, the maximum speedup is limited to 1/f. Example: with only 5% of the computation being serial, the maximum speedup is 20, irrespective of the number of processors.
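Amdahl's law and its 1/f limit can be seen numerically (a sketch; the function name is illustrative):

```python
def amdahl_speedup(f, p):
    """Amdahl's law: S(p) = p / (1 + (p - 1) * f),
    where f is the serial fraction of the computation."""
    return p / (1 + (p - 1) * f)

# With f = 5% serial, speedup approaches 1/f = 20 but never reaches it.
for p in (4, 16, 256, 10**6):
    print(p, round(amdahl_speedup(0.05, p), 2))
```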

Superlinear Speedup Example - Searching

(a) Searching each sub-space sequentially. [Figure: the search space is divided into p sub-spaces, each taking ts/p to search; the solution is found a time Δt into sub-space x. Which sub-space holds the solution is indeterminate in advance.]

(b) Searching each sub-space in parallel. [Figure: all p sub-spaces are searched simultaneously; the solution is found after time Δt.]

The speed-up is then given by:

S(p) = (x · (ts/p) + Δt) / Δt

The worst case for the sequential search is when the solution is found in the last sub-space. Then the parallel version offers the greatest benefit:

S(p) = (((p − 1)/p) · ts + Δt) / Δt → ∞ as Δt tends to zero.

The least advantage for the parallel version is when the solution is found in the first sub-space search of the sequential search:

S(p) = Δt / Δt = 1

The actual speed-up depends upon which sub-space holds the solution, but it could be extremely large.
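The two extremes can be modeled with a small helper (a sketch; `search_speedup` and its parameters are hypothetical, with k the number of complete sub-space searches the sequential version performs before finding the solution):

```python
def search_speedup(p, ts, dt, k):
    """Speed-up when the solution lies in sub-space k (0-based) of p.
    Sequentially, k sub-spaces of ts/p each are searched in full, plus a
    further dt into the next one; in parallel, only dt elapses."""
    return (k * ts / p + dt) / dt

# Solution in the first sub-space: no advantage over sequential search.
s_first = search_speedup(8, 100.0, 0.001, 0)   # = 1.0
# Solution in the last sub-space: speed-up grows without bound as dt -> 0.
s_last = search_speedup(8, 100.0, 0.001, 7)    # far greater than p
```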

The next question to answer is: how does one construct a computer system with multiple processors to achieve this speed-up?