Published by Pedro Penniston. Modified over 3 years ago.

1
Performance Measurement
• Assignment?
• Timing

    #include <stddef.h>
    #include <sys/time.h>

    /* Return the current wall-clock time in seconds. */
    double When()
    {
        struct timeval tp;
        gettimeofday(&tp, NULL);
        return ((double)tp.tv_sec + (double)tp.tv_usec * 1e-6);
    }
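A minimal sketch of how a routine like When() is typically used: read the clock before and after the region of interest and subtract. The workload `busy_work` and the wrapper `time_busy_work` are hypothetical names introduced only for illustration; they are not part of the original slides.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Wall-clock time in seconds, following the When() routine on the slide. */
double When(void)
{
    struct timeval tp;
    gettimeofday(&tp, NULL);
    return (double)tp.tv_sec + (double)tp.tv_usec * 1e-6;
}

/* Hypothetical workload (an assumption): sums the integers 0 .. n-1. */
double busy_work(long n)
{
    volatile double sum = 0.0;   /* volatile keeps the loop from being optimized away */
    for (long i = 0; i < n; i++)
        sum += (double)i;
    return sum;
}

/* Time a single call of busy_work(n) and return the elapsed seconds. */
double time_busy_work(long n)
{
    assert(n >= 0);
    double start = When();
    busy_work(n);
    double stop = When();
    return stop - start;
}
```

Calling `time_busy_work(1000000)` returns the elapsed wall-clock time for one run. Note that gettimeofday() reports time-of-day, so a clock adjustment during the run can distort a measurement; POSIX marks it obsolescent in favor of clock_gettime().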

2
• Paper Schedule
  – 22 Students
  – 6 Days
  – Look at the schedule and email me your preference. Quickly.

3
A Quantitative Basis for Design
• Parallel programming is an optimization problem.
• Must take into account several factors:
  – execution time
  – scalability
  – efficiency

4
A Quantitative Basis for Design
• Parallel programming is an optimization problem.
• Must take into account several factors:
• Also must take into account the costs:
  – memory requirements
  – implementation costs
  – maintenance costs, etc.

5
A Quantitative Basis for Design
• Parallel programming is an optimization problem.
• Must take into account several factors:
• Also must take into account the costs:
• Mathematical performance models are used to assess these costs and predict performance.

6
Defining Performance
• How do you define parallel performance?
• What do you define it in terms of?
• Consider
  – Distributed databases
  – Image processing pipeline
  – Nuclear weapons testbed

7
Amdahl's Law
• Every algorithm has a sequential component.
• Sequential component limits speedup
  s = Sequential Component
  Maximum Speedup = 1/s
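The bound 1/s is the limit of the usual form of Amdahl's Law, speedup = 1 / (s + (1 - s)/p), as the processor count p grows. The sketch below evaluates both; the function names `amdahl_speedup` and `amdahl_limit` are illustrative and do not come from the slides.

```c
#include <assert.h>

/* Amdahl's Law: speedup on p processors when a fraction s of the
   work is sequential and the remaining (1 - s) parallelizes perfectly. */
double amdahl_speedup(double s, int p)
{
    assert(s >= 0.0 && s <= 1.0);
    assert(p >= 1);
    return 1.0 / (s + (1.0 - s) / (double)p);
}

/* Upper bound on speedup as p grows without limit: 1/s. */
double amdahl_limit(double s)
{
    assert(s > 0.0 && s <= 1.0);
    return 1.0 / s;
}
```

For example, with s = 0.1 the speedup can never exceed 10, no matter how many processors are added.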

8
Amdahl's Law
[Figure: Speedup plotted against the sequential component s]

9
What's wrong?
• Works fine for a given algorithm.
  – But what if we change the algorithm?
• We may change algorithms to increase parallelism and thus eventually increase performance.
  – May introduce inefficiency

10
Metrics for Performance
• Speedup
• Efficiency
• Scalability
• Others …

11
Speedup
  S = Speed_P / Speed_1
• What is Speed?
• What algorithm for Speed_1?
• What is the work performed?
• How much work?

12
Two kinds of Speedup
• Relative
  – Uses parallel algorithm on 1 processor
  – Most common
• Absolute
  – Uses best known serial algorithm
  – Eliminates overheads in calculation.

13
Speedup
• Algorithm A
  – Serial execution time is 10 sec.
  – Parallel execution time is 2 sec.
• Algorithm B
  – Serial execution time is 2 sec.
  – Parallel execution time is 1 sec.
• What if I told you A = B?
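The numbers on this slide can be checked with a one-line helper. This is only a sketch; the function name `speedup` is an illustrative choice.

```c
#include <assert.h>

/* Speedup: serial execution time divided by parallel execution time. */
double speedup(double t_serial, double t_parallel)
{
    assert(t_serial > 0.0 && t_parallel > 0.0);
    return t_serial / t_parallel;
}
```

`speedup(10.0, 2.0)` gives 5 for A and `speedup(2.0, 1.0)` gives 2 for B, yet B finishes in half the time. One reading of "A = B" (an interpretation, not stated on the slide) is that both solve the same problem, so the best serial time is 2 sec. and A's absolute speedup is `speedup(2.0, 2.0)`, which is 1.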

15
Efficiency
  E = S / p
• The fraction of time a processor spends doing useful work

16
Cost (Processor-Time Product)
  C = p * T_p
  E = T_s / C
  p = # processors
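A small sketch of the cost and efficiency definitions, assuming T_s is the serial execution time and T_p the execution time on p processors. The function names and the sample values in the note are hypothetical.

```c
#include <assert.h>

/* Cost (processor-time product): C = p * T_p. */
double cost(int p, double t_parallel)
{
    assert(p >= 1 && t_parallel > 0.0);
    return (double)p * t_parallel;
}

/* Efficiency: E = T_s / C, which equals S / p with S = T_s / T_p. */
double efficiency(double t_serial, int p, double t_parallel)
{
    assert(t_serial > 0.0);
    return t_serial / cost(p, t_parallel);
}
```

With a hypothetical serial time of 10 sec. and a parallel time of 4 sec. on 5 processors, the cost is 20 processor-seconds and the efficiency is 0.5: half of the paid-for processor time did useful work.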

17
Performance Measurement
• Algorithm X achieved speedup of 10.8 on 12 processors.
  – What is wrong?
• A single point of reference is not enough!
• What about asymptotic analysis?

18
Performance Measurement
• There is not a perfect way to measure and report performance.
• Wall clock time seems to be the best.
• But how much work do you do?
• Best Bet:
  – Develop a model that fits experimental results.

19
Parallel Programming Steps
• Develop algorithm
• Develop a model to predict performance
• If the performance looks OK, then code
• Check actual performance against model
• Report the performance
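The "check actual performance against model" step can be sketched as a relative-error comparison. Both the error measure and the acceptance tolerance are assumptions made here for illustration; the slides do not prescribe a specific test.

```c
#include <assert.h>

/* Relative error of a predicted time with respect to a measured time. */
double relative_error(double predicted, double measured)
{
    assert(measured > 0.0);
    double diff = predicted - measured;
    if (diff < 0.0)
        diff = -diff;
    return diff / measured;
}

/* Returns 1 if the model is within the given tolerance of the
   measurement, 0 otherwise. The tolerance is a hypothetical parameter. */
int model_is_acceptable(double predicted, double measured, double tolerance)
{
    assert(tolerance >= 0.0);
    return relative_error(predicted, measured) <= tolerance;
}
```

For instance, a prediction of 11 sec. against a measured 10 sec. is a 10% error; whether that is acceptable depends on the tolerance chosen for the study.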

20
Performance Evaluation
• Identify the data
• Design the experiments to obtain the data
• Report data

21
Performance Evaluation
• Identify the data
  – Execution time
  – Be sure to examine a range of data points
• Design the experiments to obtain the data
• Report data

22
Performance Evaluation
• Identify the data
• Design the experiments to obtain the data
  – Make sure the experiment measures what you intend to measure.
  – Remember: Execution time is max time taken.
  – Repeat your experiments many times
  – Validate data by designing a model
• Report data
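Two of the points above lend themselves to a short sketch: the execution time of a run is the maximum over all processes, and repeated runs need to be summarized. The choice of mean and minimum as summaries is an assumption made here; the slides only say to repeat the experiments. All function names are illustrative.

```c
#include <assert.h>

/* Execution time of one parallel run: the slowest process determines it. */
double run_time(const double *per_process, int p)
{
    assert(per_process != 0 && p >= 1);
    double max = per_process[0];
    for (int i = 1; i < p; i++)
        if (per_process[i] > max)
            max = per_process[i];
    return max;
}

/* Mean run time over n repetitions. */
double mean_time(const double *runs, int n)
{
    assert(runs != 0 && n >= 1);
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += runs[i];
    return sum / (double)n;
}

/* Fastest run over n repetitions (least disturbed by outside load). */
double min_time(const double *runs, int n)
{
    assert(runs != 0 && n >= 1);
    double min = runs[0];
    for (int i = 1; i < n; i++)
        if (runs[i] < min)
            min = runs[i];
    return min;
}
```

Reporting both values, along with the number of repetitions, lets a reader judge how noisy the measurements were.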

23
Performance Evaluation
• Identify the data
• Design the experiments to obtain the data
• Report data
  – Report all information that affects execution
  – Results should be separate from Conclusions
  – Present the data in an easily understandable format.

24
Finite Difference Example
• Finite Difference Code
• 512 x 512 x 5 Elements
• 16 IBM RS6000 workstations
• Connected via Ethernet

25
Finite Difference Model
• Execution Time
  – ExTime = (Tcomp + Tcomm)/P
• Communication Time
  – Tcomm = 2*lat + 4*bw*n*z
• Computation Time
  – Estimate using some sample runs
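The model above translates directly into code. This sketch assumes that lat is the per-message latency in seconds, that bw is the transfer time per element (so a larger value means a slower link), and that n and z are grid dimensions; the slides do not define these symbols, so treat the interpretation and the function names as assumptions.

```c
#include <assert.h>

/* Communication time from the slide: Tcomm = 2*lat + 4*bw*n*z. */
double t_comm(double lat, double bw, int n, int z)
{
    assert(lat >= 0.0 && bw >= 0.0 && n >= 0 && z >= 0);
    return 2.0 * lat + 4.0 * bw * (double)n * (double)z;
}

/* Execution time from the slide: ExTime = (Tcomp + Tcomm) / P. */
double ex_time(double t_comp, double t_comm_value, int p)
{
    assert(t_comp >= 0.0 && t_comm_value >= 0.0 && p >= 1);
    return (t_comp + t_comm_value) / (double)p;
}
```

Tcomp is passed in as a number because the slide estimates it from sample runs rather than from a formula. Evaluating `ex_time` over a range of P values gives the predicted curve to compare against measurements.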

26
Estimated Performance

27
Finite Difference Example

28
What was wrong?
• Ethernet
• Change the computation of Tcomm
  – Reduce the bandwidth
  – Tcomm = 2*lat + 4*bw*n*z*P/2
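A sketch of the revised communication term, under the same assumed meanings as before (lat is latency, bw is transfer time per element, n and z are grid dimensions). The function names are illustrative, and the execution-time wrapper assumes the earlier formula ExTime = (Tcomp + Tcomm)/P still applies.

```c
#include <assert.h>

/* Revised communication time: Tcomm = 2*lat + 4*bw*n*z*P/2.
   The bandwidth term now grows with the number of processors P. */
double t_comm_shared(double lat, double bw, int n, int z, int p)
{
    assert(lat >= 0.0 && bw >= 0.0 && n >= 0 && z >= 0 && p >= 1);
    return 2.0 * lat
         + 4.0 * bw * (double)n * (double)z * (double)p / 2.0;
}

/* Execution time with the revised term: (Tcomp + Tcomm) / P. */
double ex_time_shared(double t_comp, double lat, double bw,
                      int n, int z, int p)
{
    assert(t_comp >= 0.0);
    return (t_comp + t_comm_shared(lat, bw, n, z, p)) / (double)p;
}
```

With P = 2 the revised term equals the original one. For larger P, dividing by P cancels the P in the bandwidth term, so that part of the communication cost stops shrinking as processors are added.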

29
Finite Difference Example
