
1 INTEL CONFIDENTIAL Predicting Parallel Performance Introduction to Parallel Programming – Part 10

2 Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners.
Review & Objectives
Previously: Design and implementation of a task decomposition solution
At the end of this part you should be able to:
– Define speedup and efficiency
– Use Amdahl’s Law to predict maximum speedup

3 Speedup
Speedup is the ratio of sequential execution time to parallel execution time.
For example, if the sequential program executes in 6 seconds and the parallel program executes in 2 seconds, the speedup is 3X.
[Figure: a typical speedup curve. Axes: Cores, Speedup.]

4 Efficiency
A measure of core utilization: speedup divided by the number of cores.
Example: a program achieves a speedup of 3 on 4 cores, so its efficiency is 3 / 4 = 75%.
[Figure: a typical efficiency curve. Axes: Cores, Efficiency.]
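The two definitions above can be written as a pair of small functions. This is a minimal sketch: the function names are illustrative, and pairing the 6-second and 2-second timings with 4 cores is an assumption made here to combine the two slide examples.

```python
def speedup(sequential_time, parallel_time):
    """Ratio of sequential execution time to parallel execution time."""
    return sequential_time / parallel_time


def efficiency(sequential_time, parallel_time, cores):
    """Speedup divided by the number of cores used."""
    return speedup(sequential_time, parallel_time) / cores


# Assumed scenario: 6 s sequential, 2 s parallel, measured on 4 cores.
print(speedup(6, 2))        # 3.0
print(efficiency(6, 2, 4))  # 0.75
```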

5 Speedup Example
Painting a picket fence:
– 30 minutes of preparation (serial)
– One minute to paint a single picket
– 30 minutes of cleanup (serial)
Thus, painting 300 pickets takes 360 minutes (serial time).

6 Computing Speedup
Number of painters   Time (minutes)         Speedup
1                    30 + 300 + 30 = 360    1.0X
2                    30 + 150 + 30 = 210    1.7X
10                   30 + 30 + 30 = 90      4.0X
100                  30 + 3 + 30 = 63       5.7X
Infinite             30 + 0 + 30 = 60       6.0X

7 Efficiency Example
Number of painters   Time (minutes)         Speedup   Efficiency
1                    360                    1.0X      100%
2                    30 + 150 + 30 = 210    1.7X      85%
10                   30 + 30 + 30 = 90      4.0X      40%
100                  30 + 3 + 30 = 63       5.7X      5.7%
Infinite             30 + 0 + 30 = 60       6.0X      very low
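The picket fence figures can be reproduced with a few lines of code. This sketch assumes the painting work divides perfectly among the painters (fractional pickets allowed); the constant and function names are illustrative, not from the slides.

```python
PREP_MIN = 30        # serial preparation
CLEANUP_MIN = 30     # serial cleanup
PICKETS = 300
MIN_PER_PICKET = 1


def fence_time(painters):
    """Total minutes: only the painting step is shared among painters."""
    return PREP_MIN + PICKETS * MIN_PER_PICKET / painters + CLEANUP_MIN


serial = fence_time(1)
for painters in (1, 2, 10, 100):
    t = fence_time(painters)
    s = serial / t
    print(f"{painters:>4} painters: {t:6.1f} min, "
          f"speedup {s:.1f}X, efficiency {s / painters:.1%}")
```

The efficiency printed here is computed from the unrounded speedup, so it can differ in the last digit from a table that rounds speedup first.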

8 Idea Behind Amdahl’s Law
[Figure: execution time versus number of cores (1 to 5). Each bar has a sequential part s and a parallel part that shrinks as cores are added: 1-s, (1-s)/2, (1-s)/3, (1-s)/4, (1-s)/5.]
s: portion of computation that will be performed sequentially
1-s: portion of computation that will be executed in parallel

9 Derivation of Amdahl’s Law
Speedup is the ratio of execution time on 1 core to execution time on p cores.
Execution time on 1 core is s + (1-s).
Execution time on p cores is at least s + (1-s)/p.
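Dividing the two execution times gives the bound itself. The worked form below uses only the quantities named above (s for the sequential fraction, p for the number of cores); the limit for large p is added here as a standard consequence.

```latex
\text{Speedup} \le \frac{s + (1 - s)}{s + (1 - s)/p} = \frac{1}{s + (1 - s)/p},
\qquad
\lim_{p \to \infty} \frac{1}{s + (1 - s)/p} = \frac{1}{s}
```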

10 Amdahl’s Law Is Too Optimistic
Amdahl’s Law ignores parallel processing overhead.
Examples of this overhead include time spent creating and terminating threads.
Parallel processing overhead is usually an increasing function of the number of cores (threads).

11 Graph with Parallel Overhead Added
[Figure: execution time versus number of cores, with parallel overhead added to each bar. Parallel overhead increases with the number of cores.]
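To see why growing overhead matters, the sketch below adds an overhead term to the Amdahl execution time. The linear overhead model (a fixed cost per extra core) and the values s = 0.1 and 0.02 are assumptions chosen for illustration; real overhead depends on the program and platform.

```python
def amdahl_time(s, p):
    """Normalized execution time on p cores with sequential fraction s."""
    return s + (1 - s) / p


def time_with_overhead(s, p, cost_per_extra_core):
    """Amdahl time plus a hypothetical overhead that grows linearly with p."""
    return amdahl_time(s, p) + cost_per_extra_core * (p - 1)


s, c = 0.1, 0.02  # assumed sequential fraction and per-core overhead
best = min(range(1, 33), key=lambda p: time_with_overhead(s, p, c))
print(best)                            # core count with the lowest time
print(time_with_overhead(s, best, c))  # time at that core count
print(time_with_overhead(s, 32, c))    # adding more cores is slower
```

Under this model the execution time stops improving at a modest core count and then rises, which Amdahl’s Law alone would not predict.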

12 Other Optimistic Assumptions
Amdahl’s Law assumes that the computation divides evenly among the cores.
In reality, the amount of work often does not divide evenly among the cores.
Core waiting time is another form of overhead.
[Figure: timeline from task started to task completed, showing working time and waiting time for each core.]
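A small model makes the cost of imbalance concrete: the parallel phase ends only when the most heavily loaded core finishes, and every other core waits. The task durations and function names below are made up for illustration.

```python
def parallel_time(task_times_per_core):
    """The parallel phase lasts as long as the busiest core."""
    return max(sum(tasks) for tasks in task_times_per_core)


def waiting_times(task_times_per_core):
    """Idle time of each core while it waits for the busiest one."""
    finish = parallel_time(task_times_per_core)
    return [finish - sum(tasks) for tasks in task_times_per_core]


# Both assignments contain 18 units of work spread over 3 cores.
balanced = [[3, 3], [3, 3], [3, 3]]
imbalanced = [[8], [5], [3, 2]]

for name, plan in (("balanced", balanced), ("imbalanced", imbalanced)):
    t = parallel_time(plan)
    print(name, "time:", t, "speedup:", 18 / t, "waiting:", waiting_times(plan))
```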

13 Graph with Workload Imbalance Added
[Figure: execution time versus number of cores, showing time lost due to workload imbalance.]

14 Illustration of the Amdahl Effect
[Figure: speedup versus number of cores for problem sizes n = 1,000, n = 10,000, and n = 100,000, plotted against the linear speedup line.]

15 Using Amdahl’s Law
Program executes in 5 seconds.
Profile reveals 80% of time spent in function alpha, which we can execute in parallel.
What would be the maximum speedup on 2 cores?
Speedup ≤ 1 / (0.2 + (1 - 0.2)/2) = 1 / 0.6 ≈ 1.67
New execution time ≥ 5 sec / 1.67 = 3 seconds
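The same calculation can be scripted so that other core counts are easy to try. This is a sketch using the standard form of Amdahl’s Law; the function name is illustrative.

```python
def amdahl_speedup(serial_fraction, cores):
    """Upper bound on speedup for a given sequential fraction and core count."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


total_time = 5.0       # seconds, from the profile
serial_fraction = 0.2  # 80% of the time is in the parallelizable function

bound = amdahl_speedup(serial_fraction, 2)
print(f"max speedup on 2 cores: {bound:.2f}")            # 1.67
print(f"best possible time: {total_time / bound:.2f} s")  # 3.00 s

# With unlimited cores the bound approaches 1 / serial_fraction = 5.
print(f"max speedup on 1000 cores: {amdahl_speedup(serial_fraction, 1000):.2f}")
```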

16 Superlinear Speedup
According to our general speedup formula, the maximum speedup a program can achieve on p cores is p.
Superlinear speedup is the situation where speedup is greater than the number of cores used.
It means the computational rate of the cores is faster when the parallel program is executing.
Superlinear speedup is usually caused by a higher cache hit rate in the parallel program.

17 References
Michael J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill (2004).


19 More General Speedup Formula
$$\psi(n,p) \le \frac{\sigma(n) + \varphi(n)}{\sigma(n) + \varphi(n)/p + \kappa(n,p)}$$
$\psi(n,p)$: speedup for problem of size n on p cores
$\sigma(n)$: time spent in sequential portion of code for problem of size n
$\varphi(n)$: time spent in parallelizable portion of code for problem of size n
$\kappa(n,p)$: parallel overhead

20 Amdahl’s Law: Maximum Speedup
$$\psi(n,p) \le \frac{\sigma(n) + \varphi(n)}{\sigma(n) + \varphi(n)/p + \kappa(n,p)}$$
The parallel overhead term $\kappa(n,p)$ is set to 0.
The term $\varphi(n)/p$ assumes parallel work divides perfectly among available cores.

21 The Amdahl Effect
$$\psi(n,p) \le \frac{\sigma(n) + \varphi(n)}{\sigma(n) + \varphi(n)/p + \kappa(n,p)}$$
As $n \to \infty$, the $\varphi(n)$ and $\varphi(n)/p$ terms dominate.
Speedup is an increasing function of problem size.
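The effect can be illustrated numerically. The sketch below assumes the general bound has the form (sequential + parallelizable) / (sequential + parallelizable / p + overhead); the three cost functions are hypothetical choices made only so that the parallelizable work grows faster with n than the other terms.

```python
import math


def sigma(n):
    """Hypothetical sequential time: linear in n."""
    return n


def phi(n):
    """Hypothetical parallelizable time: quadratic in n."""
    return n * n / 100


def kappa(n, p):
    """Hypothetical parallel overhead: grows with n and with log2(p)."""
    return n * math.log2(p) / 10


def speedup_bound(n, p):
    """General speedup bound under the assumed cost functions."""
    return (sigma(n) + phi(n)) / (sigma(n) + phi(n) / p + kappa(n, p))


p = 16
for n in (1_000, 10_000, 100_000):
    print(f"n = {n:>7}: speedup bound on {p} cores = {speedup_bound(n, p):.2f}")
```

With these assumed functions the bound climbs toward p as n grows, matching the statement that speedup is an increasing function of problem size.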

