Great Ideas of CS with Java

Part 1: WWW & Computer programming in the language Java
- Ch 1: The World Wide Web
- Ch 2: Watch out: Here comes Java
- Ch 3: Numerical computation & Function
- Ch 4: Subroutines & Databases
- Ch 5: Graphics
- Ch 6: Simulation
- Ch 7: Software engineering

Part 2: Understanding what a computer is and how it works
- Ch 8: Machine architecture
- Ch 9: Language translation
- Ch 10: Virtual Environment for Computing
- Ch 11: Security, Privacy, and Wishful thinking
- Ch 12: Computer Communication

Part 3: Advanced topics
- Ch 13: Program Execution Time
- Ch 14: Parallel Computation
- Ch 15: Noncomputability
- Ch 16: Artificial intelligence
Ch14. Parallel Computation
Copyright © 2002 SNU OOPSLA Lab.
Textbook: Table of Contents
- Motivation behind Parallel Processing
- Limitations on Processor Speed
- Forms of Parallelism
- Parallel Computation
- Communicating Processes
- Parallel Computation on a Saturated Machine
- Variations on Architecture
Demand for Computational Speed
- There is continual demand for greater computational speed than computer systems currently provide.
- Areas requiring great computational speed include numerical modeling and the simulation of scientific and engineering problems.
- Computations must be completed within a "reasonable" time period.
Grand Challenge Problems
- A grand challenge problem is one that cannot be solved in a reasonable amount of time with today's computers. (Obviously, an execution time of 10 years is always unreasonable.)
- Examples:
  - Modeling large DNA structures
  - Global weather forecasting
  - Modeling the motion of astronomical bodies
DNA Research
- Goal: mapping out the complete blueprint of the human body
- 1 chromosome: thousands of genes
- The human body: 60-100 trillion cells
- 1 cell: 2 genomes (46 chromosomes)
- Genes: sequences of the bases A, C, G, T
- An enormous amount of data
Global Weather Forecasting
- Suppose the whole global atmosphere is divided into cells of size 1 mile × 1 mile × 1 mile, to a height of 10 miles (10 cells high): about 5 × 10^8 cells.
- Suppose each calculation requires 200 floating point operations. Then in one time step, 10^11 floating point operations are necessary.
- To forecast the weather over 7 days using 1-minute intervals, a computer operating at 1 Gflops (10^9 floating point operations/sec) takes 10^6 seconds, or over 10 days.
- Units: Megaflops = 10^6 floating point operations/sec; Gigaflops = 10^3 Megaflops; Teraflops = 10^3 Gigaflops.
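As a quick check of this arithmetic, here is a minimal Java sketch; the figures are the ones quoted above, and the class and variable names are illustrative:

public class WeatherEstimate {
    public static void main(String[] args) {
        double cells = 5e8;              // ~5 x 10^8 cells, 1 cubic mile each
        double flopsPerCell = 200;       // floating point operations per cell per step
        double flopsPerStep = cells * flopsPerCell;      // 10^11 flops per time step
        double steps = 7 * 24 * 60;      // 7 days at 1-minute intervals = 10,080 steps
        double machineRate = 1e9;        // a 1 Gflops machine
        double seconds = flopsPerStep * steps / machineRate;
        // prints roughly 1.0e+06 seconds, i.e. over 10 days
        System.out.printf("%.1e s = %.1f days%n", seconds, seconds / 86400);
    }
}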
Modeling Motion of Astronomical Bodies
- Each body is attracted to every other body by gravitational forces, and the movement of each body is predicted by calculating the total force on it.
- With N bodies, there are N-1 forces to calculate for each body, or approximately N^2 calculations. (N log2 N for an efficient approximate algorithm.)
- After determining the new positions of the bodies, the calculations are repeated.
- A galaxy might have, say, 10^11 stars. Even if each calculation were done in 1 ms (an extremely optimistic figure), one iteration would take 10^9 years using the N^2 algorithm, and almost a year using an efficient N log2 N approximate algorithm.
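The N^2 force calculation at the heart of such a simulation is a simple double loop. Below is a minimal 2-D sketch in Java; the names and the simplified setup are illustrative, not the textbook's code:

public class NBodyStep {
    static final double G = 6.674e-11;   // gravitational constant

    // One O(N^2) iteration: accumulate the gravitational force on each body
    // from every other body (2-D, for brevity).
    static void accumulateForces(double[] m, double[] x, double[] y,
                                 double[] fx, double[] fy) {
        int n = m.length;
        for (int i = 0; i < n; i++) {
            fx[i] = 0; fy[i] = 0;
            for (int j = 0; j < n; j++) {          // N-1 forces per body
                if (i == j) continue;
                double dx = x[j] - x[i], dy = y[j] - y[i];
                double r2 = dx * dx + dy * dy;
                double f = G * m[i] * m[j] / r2;   // magnitude of the attraction
                double r = Math.sqrt(r2);
                fx[i] += f * dx / r;               // resolve into components
                fy[i] += f * dy / r;
            }
        }
    }

    public static void main(String[] args) {
        double[] m = {5.97e24, 7.35e22};           // e.g. Earth and Moon
        double[] x = {0, 3.84e8}, y = {0, 0};
        double[] fx = new double[2], fy = new double[2];
        accumulateForces(m, x, y, fx, fy);
        System.out.printf("Force on body 0: %.3e N%n", Math.hypot(fx[0], fy[0]));
    }
}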
Astrophysical N-body simulation
Parallel Computing
- Using more than one computer, or a computer with more than one processor, to solve a problem.
- Motive: usually faster computation. The very simple idea is that n computers operating simultaneously can achieve the result n times faster; in practice it will not be n times faster, for various reasons.
- Other motives include fault tolerance, a larger amount of available memory, ...
Long History, but ...
Parallel computers (computers with more than one processor) and their programming (parallel programming) have been around for more than 40 years.

"... There is therefore nothing new in the idea of parallel programming, but its application to computers. The author cannot believe that there will be any insuperable difficulty in extending it to computers. It is not to be expected that the necessary programming techniques will be worked out overnight. Much experimenting remains to be done. After all, the techniques that are commonly used in programming today were only won at the cost of considerable toil several years ago. In fact the advent of parallel programming may do something to revive the pioneering spirit in programming which seems at the present to be degenerating into a rather dull and routine occupation ..."

Gill, S. (1958), "Parallel Programming," The Computer Journal, vol. 1, pp. 2-10.
Textbook: Table of Contents
- Motivation behind Parallel Processing
- Limitations on Processor Speed
- Forms of Parallelism
- Parallel Computation
- Communicating Processes
- Parallel Computation on a Saturated Machine
- Variations on Architecture
Limitations on Processor Speed
- Electricity cannot travel faster than the speed of light: light covers about 1 foot in 1 nanosecond.
- Making components smaller runs into problems of heat dissipation and memory size.

More Limitations on Processor Speed
- Manufacturing problems with small sizes: when the feature size is smaller than the wavelength of light, UV or X-ray lithography is needed.
- Voltages must be lowered.
- Ultimately, parallelism is the only hope.
Textbook: Table of Contents
- Motivation behind Parallel Processing
- Limitations on Processor Speed
- Forms of Parallelism
- Parallel Computation
- Communicating Processes
- Parallel Computation on a Saturated Machine
- Variations on Architecture
Forms of Parallelism
- Pipeline: an assembly line for instructions
- Multiprocessors
- Networks of processors
- The Internet (networked computers)
Textbook: Table of Contents
- Motivation behind Parallel Processing
- Limitations on Processor Speed
- Forms of Parallelism
- Parallel Computation
- Communicating Processes
- Parallel Computation on a Saturated Machine
- Variations on Architecture
Parallel Computation (1/8)
- Parallel computation requires a parallel computer.
- Example: 100 machines placed in a row (Figure 14.1).
Parallel Computation (2/8)
- Problem: finding the names of all individuals with a given height and weight.
- Assume the number of individuals n is 100 or fewer, so one processor can be allocated to each.
- ParallelPersonSearch.java:
  - A copy of this code is loaded into each processor.
  - Each processor has its own copy of targetheight and targetweight.
  - Each processor holds the height, weight, and name of a particular person.
SNU OOPSLA Lab. 19 Public class ParallelPersonSearch { double targetweight, targetheight, weight, height; String name; public void init() { // put code here to enter data into weight, height, and // name for a single individual // processor 0 prints the message “ Give the target height ” processor 0 output.setText( “ Give the target height. ” ); // all processors receive the target height targetheight = all input.getDouble(); // processor 0 prints the message “ Give the target weight ” processor 0 output.setText( “ Give the target weight. ” ); // all processors receive the target weight targetweight = all input.getDouble(); // each processor compares the target values with the values it stores // for one person, if they matching, it outputs that person ’ s name if((targetheight == height) && (targetweight == weight)) this processor output.setText(name); } ParallelPersonSearch.java
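Standard Java has no "processor 0 / all / this processor" constructs; those are the textbook's pseudo-notation for the parallel machine. A rough single-machine simulation of the same idea, with one thread standing in for each processor (all names here are illustrative), might look like this:

import java.util.List;

public class SimulatedPersonSearch {
    record Person(String name, double height, double weight) {}   // Java 16+

    // Simulate "one processor per person": each thread checks its own record.
    public static void search(List<Person> people,
                              double targetHeight, double targetWeight)
            throws InterruptedException {
        Thread[] procs = new Thread[people.size()];
        for (int p = 0; p < procs.length; p++) {
            Person person = people.get(p);           // this "processor's" record
            procs[p] = new Thread(() -> {
                if (person.height() == targetHeight && person.weight() == targetWeight)
                    System.out.println(person.name());   // stands in for output.setText
            });
            procs[p].start();
        }
        for (Thread t : procs) t.join();             // wait for all "processors"
    }
}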
Parallel Computation (3/8)
- This parallel computation is much faster than the sequential case of Ch. 13:
  - T_sequential = 5.1 × 10^-4 × n
  - T_parallel = 5.1 × 10^-4 × 1
- However, these results pertain to a computation where there is a separate processor for each person.
Parallel Computation (4/8)
Figure 14.2:
- T_sequential = 5.1 × 10^-4 × n
- T_parallel = 5.1 × 10^-4 × 1
Parallel Computation (5/8)
- Another example: the Traveling Salesperson Problem.
- Each processor computes a different ordering of the cities and the length of the related path.
- Then the different paths computed by all processors are compared, and the best one is selected.
- TSP.java (below)
SNU OOPSLA Lab. 23 Public class TSP { int [] permarray; double [][] distance; int n, i, homecity, currentcity; double sum; public void init() { // Input n and array of all n cities and their distance. Initialize arrays, etc. // procnum contains the number of the current processor and // findperm function decides which permutation the current processor is to explore permarray = findperm(procnum); sum = 0; i = 0; homecity = permarray[0]; currentcity = homecity; while(i < n-1) { sum = sum + distance[currentcity][permarray[i+1]]; currentcity = permarray[i+1]; i = i + 1; } sum = sum + distance[currentcity][homecity]; // Send sum to a comparator processor that works with other processors // to decide the minimum sum and its associated permutation } TSP.java
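The slide leaves findperm unspecified. One standard way to give each processor its own permutation is to "unrank" the processor number in the factorial number system; the sketch below is a hypothetical implementation, not the textbook's (long arithmetic limits it to n ≤ 20):

import java.util.ArrayList;
import java.util.List;

public class FindPerm {
    // Map a processor number k (0 .. n!-1) to the k-th permutation of the
    // cities {0..n-1}, so each processor explores a distinct tour.
    static int[] findperm(long k, int n) {
        List<Integer> cities = new ArrayList<>();
        for (int c = 0; c < n; c++) cities.add(c);
        long[] fact = new long[n];
        fact[0] = 1;
        for (int i = 1; i < n; i++) fact[i] = fact[i - 1] * i;
        int[] perm = new int[n];
        for (int i = 0; i < n; i++) {
            int idx = (int) (k / fact[n - 1 - i]);   // which remaining city comes next
            k %= fact[n - 1 - i];
            perm[i] = cities.remove(idx);
        }
        return perm;
    }

    public static void main(String[] args) {
        int[] p = findperm(5, 3);                    // last of the 3! = 6 permutations
        System.out.println(java.util.Arrays.toString(p));   // [2, 1, 0]
    }
}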
Parallel Computation (6/8)
- Approximate execution time for calculating one path: T = 5 × 10^-3 × n (n: the number of cities).
- If we have enough processors that every possible path is computed on one of them, T_parallel = 5 × 10^-3 × n.
- On a single-processor machine, T_sequential = 4.6 × 10^-3 × n!
Parallel Computation (7/8)
Figure 14.3:
- T_parallel = 5 × 10^-3 × n
- T_sequential = 4.6 × 10^-3 × n!
Parallel Computation (8/8)
- When parallel computation is used for 22 cities, T_parallel = 5 × 10^-3 × 22 = 0.11 second.
- A sequential machine would not finish the calculation before the sun burns out.
- However, the parallel machine requires (n-1)!/2 processors: in this example, approximately 10^19 processors.
Textbook: Table of Contents
- Motivation behind Parallel Processing
- Limitations on Processor Speed
- Forms of Parallelism
- Parallel Computation
- Communicating Processes
- Parallel Computation on a Saturated Machine
- Variations on Architecture
Communicating Processes (1/8)
- In the problems of the previous section, a set of processors was divided up in a simple way.
- A more difficult problem: sorting integers distributed across our 100-processor machine.
- We need communication between the processes and the ability to pass the integers up and down the line.
- Assume there are n integers, located in processors 0 through n-1, where n is 100 or less.
- Each processor has num, containing one of the numbers in the list; n; an index i; and its own processor number procnum.
Communicating Processes (2/8)
Figure 14.4
Communicating Processes (3/8)
- Each processor except the first examines the number in the processor to its left.
- If the other number is larger than its own, it exchanges them.
- This operation is repeated n times.
- The numbers should end up sorted, with the lowest in processor 0 and the largest in processor n-1.
- Here is the program: ParallelSort.java (below)
SNU OOPSLA Lab. 31 public class ParallelSort { int i, num, n, procnum; public void init() { // put code here to read num and n if((procnum > 0) && (procnum <= n-1)) { i = 1; while(i <= n) { if(num(left neighbor) > num(this processor)) { exchange num(left neighbor) and num(this processor); } i = i + 1; } ParallelSort.java
32
Communicating Processes (4/8)
Example: n = 3, and the first three processors contain 6, 5, and 4.

  Procnum:  00  01  10
  Num:       6   5   4

Both exchanges happen at once:
- Processor 01 will put 5 into processor 00 and 6 into itself.
- Processor 10 will put 4 into processor 01 and 5 into itself.

  Procnum:  00  01  10
  Num:       5   ?   5

Error! Processor 01 has both 6 and 4 loaded into its num location.
Solution: a set of flags called semaphores.
Communicating Processes (5/8)
Semaphores:
- Put a flag with each num location; the flag contains a value of 0 or 1.
- Rules:
  - Each processor is able to access and change its own num only if its flag is 1.
  - If its flag is 0, it is not allowed to affect its own num, because that location is controlled by its right neighbor.
  - For a processor to exchange its num value with its left neighbor's, its flag must be 1 and its left neighbor's flag must be 0.
Communicating Processes (6/8)
Semaphores: for processor i to do an exchange:

  Procnum:    i-1       i
  Num:          5       4
  Flag:         0       1
  Activity: waiting    on

- After a processor completes a cycle, it changes its own flag and its left neighbor's to release them to other processes.
- Two exceptions:
  - The flag of processor 0 should always be 0, since that processor never needs to do exchanges with its left neighbor.
  - The flag of the last processor should always be 1, because that processor has no right neighbor to take control.
Communicating Processes (7/8)

  Procnum:    00      01       10
  Num:         6       5        4
  Flag:        0       1        1
  Activity:  off      on      waiting

  Num:         5       6        4
  Flag:        0       0        1
  Activity:  off    waiting    on

  Num:         5       4        6
  Flag:        0       1        1
  Activity:  off      on      waiting

  Num:         4       5        6
  Flag:        0       0        1
  Activity:  off    waiting    on

Here is the modified parallel sorting program: ParallelSortWithFlags.java (below)
SNU OOPSLA Lab. 36 public class ParallelSortWithFlags { int i, num, n, flag, procnum; public void init() { // put code here to read num and n // we assume procnum holds the processor number if(procnum is even) flag = 0; else flag = 1; if(procnum == n-1) flag = 1; if((procnum > 0) && (procnum <= n-1)) { i = 1; while(i <= n) { wait until ((flag(this processor) == 1) && (flag(left neighbor) == 0)); if(num(left neighbor) > num(this processor)) { exchange num(left neighbor) and num(this processor); } if(procnum > 1) change flag in processor on left; if(procnum < n-1) change flag in this processor; i = i + 1; } ParallelSortWithFlags.java
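With the flags initialized this way (odd processors active, even processors waiting), the exchanges alternate between the pairs (0,1), (2,3), ... and the pairs (1,2), (3,4), ...; this is odd-even transposition sort. A plain sequential Java simulation of those rounds (illustrative, not the textbook's code):

import java.util.Arrays;

public class OddEvenSortSim {
    // Sequentially simulate the alternating exchange phases that the flag
    // protocol produces: pairs (0,1),(2,3),... in even rounds and
    // (1,2),(3,4),... in odd rounds. After n rounds the array is sorted.
    static void sort(int[] num) {
        int n = num.length;
        for (int round = 0; round < n; round++) {
            for (int p = (round % 2 == 0) ? 1 : 2; p < n; p += 2) {
                if (num[p - 1] > num[p]) {           // left neighbor larger: exchange
                    int t = num[p - 1]; num[p - 1] = num[p]; num[p] = t;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] num = {6, 5, 4};                       // the trace from the slide above
        sort(num);
        System.out.println(Arrays.toString(num));    // [4, 5, 6]
    }
}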
Communicating Processes (8/8)
- Execution time:
  - T_parallel = C × n (C: a constant)
  - T_sequential = C' × n × log2 n (C': a constant)
- Since log2 n is not a large number, T_parallel is not a lot faster than T_sequential.
- Whenever interprocessor communication is involved:
  - The code can become very complex.
  - Improvements in execution time may not be as large as one would hope.
Textbook: Table of Contents
- Motivation behind Parallel Processing
- Limitations on Processor Speed
- Forms of Parallelism
- Parallel Computation
- Communicating Processes
- Parallel Computation on a Saturated Machine
- Variations on Architecture
Parallel Computation on a Saturated Machine (1/4)
- The previous sections assumed that n is small enough:
  - For the data retrieval problem, there would be a processor for every individual.
  - For the Traveling Salesperson Problem, there would be a processor for every ordering of the cities.
- However, n is usually large:
  - We will not have as many processors as could be used effectively.
  - It will be necessary to revise the organization of the code.
- This is the case when the processors are saturated. There are two major results:
  - The programming becomes more complicated.
  - Some of the improvement in execution time is lost.
Parallel Computation on a Saturated Machine (2/4)
- Example: the data retrieval problem.
- There may be thousands of individuals, with their records distributed across 100 processors.
- We put 1 percent of the total population on each processor, and each processor searches its own 1 percent of the whole.
- The result of the 100 separate computations is a search of the complete population.
- Here is the modified program: SaturatedParallelPersonSearch.java (below)
SNU OOPSLA Lab. 41 public class SaturatedParallelPersonSearch { double targetweight, targetheight, weight[], height[]; String name []; int m, i; public void init() { // put code here to find the number m of individuals to be stored in this processor // and then read the data for those individuals // processor 0 prints the message “ Give the target height ” processor 0 output.setText( “ Give the target height. ” ); // all processors receive the target height targetheight = all input.getDouble(); // processor 0 prints the message “ Give the target weight ” processor 0 output.setText( “ Give the target weight. ” ); // all processors receive the target weight targetweight = all input.getDouble(); i = 1; while(i <= m) { if((targetheight == height[i]) && (targetweight == weight[i])) this processor output.setText(name[i]); i = i +1 } SaturatedParallelPersonSearch.java
42
Parallel Computation on a Saturated Machine (3/4)
- Execution time of this program:
  - Between 1 and 100 individuals: T = 5.1 × 10^-4 × 1 (the same as handling one individual on a sequential machine)
  - Between 101 and 200 individuals: T = 5.1 × 10^-4 × 2
  - Between 201 and 300 individuals: T = 5.1 × 10^-4 × 3
  - ...
- Parallel computation is still much faster than sequential computation.
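The pattern generalizes: with p processors, each one scans ceil(n/p) records, so the run time grows in steps. A small sketch, assuming the 100-processor machine and the 5.1 × 10^-4 s per-record figure from Ch. 13:

public class SaturatedTime {
    // Estimated search time with n records spread over p processors:
    // each processor scans ceil(n/p) records at ~5.1e-4 s per record.
    static double parallelSeconds(int n, int p) {
        int perProcessor = (n + p - 1) / p;          // ceil(n / p)
        return 5.1e-4 * perProcessor;
    }

    public static void main(String[] args) {
        System.out.println(parallelSeconds(150, 100));   // 2 records each: ~1.02e-3 s
        System.out.println(parallelSeconds(250, 100));   // 3 records each: ~1.53e-3 s
    }
}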
Parallel Computation on a Saturated Machine (4/4)
Figure 14.5:
- T_sequential = 5.1 × 10^-4 × n
- T_parallel = 5.1 × 10^-4 × 1
Textbook: Table of Contents
- Motivation behind Parallel Processing
- Limitations on Processor Speed
- Forms of Parallelism
- Parallel Computation
- Communicating Processes
- Parallel Computation on a Saturated Machine
- Variations on Architecture
Variations on Architecture (1)
Figure 14.6
Variations on Architecture (2)
- One simple measure of performance: the number of transfers required for information to reach the most distant point in a network.
- For example, 16 processors in a ring connection require 8 movements along communication lines.
- The following table gives the distance between the farthest processors for each of the four configurations.
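The table itself did not survive in this transcript, but the worst-case distances are easy to compute. The sketch below assumes the four configurations of Figure 14.6 are a line, a ring, a 2-D mesh, and a hypercube (a common set; the figure is not reproduced here):

public class NetworkDiameter {
    // Worst-case number of transfers between the two farthest processors,
    // assuming the four configurations are line, ring, 2-D mesh, hypercube.
    static int line(int n)      { return n - 1; }
    static int ring(int n)      { return n / 2; }
    static int mesh(int side)   { return 2 * (side - 1); }   // side x side grid
    static int hypercube(int d) { return d; }                // 2^d processors

    public static void main(String[] args) {
        System.out.println(line(16));        // 15
        System.out.println(ring(16));        // 8, as on the slide
        System.out.println(mesh(4));         // 6, for a 4x4 mesh of 16 processors
        System.out.println(hypercube(4));    // 4, for a 16-processor hypercube
    }
}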
Variations on Architecture (3)
- MIMD (Multiple-Instruction, Multiple-Data) machine:
  - Each processor has its own program to manipulate its own data.
  - Each processor can have a completely different piece of code.
- SIMD (Single-Instruction, Multiple-Data) machine:
  - One program controls all processors in the array.
  - That program broadcasts its commands to the complete network, and they all march in lockstep.
Textbook: Table of Contents
- Motivation behind Parallel Processing
- Limitations on Processor Speed
- Forms of Parallelism
- Parallel Computation
- Communicating Processes
- Parallel Computation on a Saturated Machine
- Variations on Architecture
- Summary
Summary
- The need for parallel processing is everywhere.
- The concept of parallel processing is fancy! But...
  - For intractable problems, even parallel processing does not help much.
  - Parallel programming is difficult.
- Research and development on parallel processing has been active, is still active, and will remain active!
Ch14: Parallel Computation
Text Review Time