
1 Software Tuning Fundamentals, February 23, 2004

2 Why Tune? The same code >> different performance
  SELF     16.676s
  RELEASE   5.445s
  OPT:4     5.457s
  IMSL     10.996s
  CXML      3.328s
  ATLAS     0.762s
  MKL50     0.848s
  MKL51     0.738s
Loop order i, j, k:
  for(i=0;i<NUM;i++) {
    for(j=0;j<NUM;j++) {
      for(k=0;k<NUM;k++) {
        c[i][j] = c[i][j] + a[i][k] * b[k][j];
      }
    }
  }
Loop order i, k, j:
  for(i=0;i<NUM;i++) {
    for(k=0;k<NUM;k++) {
      for(j=0;j<NUM;j++) {
        c[i][j] = c[i][j] + a[i][k] * b[k][j];
      }
    }
  }
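
The two loop nests on this slide can be written as complete functions, as in the sketch below. The function names and the value of NUM are assumptions for illustration; the slide gives neither. C stores arrays row by row, so in the i, k, j order the inner loop walks c[i][j] and b[k][j] through contiguous memory, while in the i, j, k order it steps through b one full row at a time. That is a common reason for the two orders to run at different speeds, although the slide does not say which timing belongs to which loop.

```c
#define NUM 64  /* illustrative size; the slide does not give NUM */

/* i-j-k order: the inner loop reads b[k][j] with a stride of one row. */
void matmul_ijk(double a[NUM][NUM], double b[NUM][NUM], double c[NUM][NUM])
{
    for (int i = 0; i < NUM; i++) {
        for (int j = 0; j < NUM; j++) {
            for (int k = 0; k < NUM; k++) {
                c[i][j] = c[i][j] + a[i][k] * b[k][j];
            }
        }
    }
}

/* i-k-j order: the inner loop reads b[k][j] and updates c[i][j]
   at consecutive addresses. */
void matmul_ikj(double a[NUM][NUM], double b[NUM][NUM], double c[NUM][NUM])
{
    for (int i = 0; i < NUM; i++) {
        for (int k = 0; k < NUM; k++) {
            for (int j = 0; j < NUM; j++) {
                c[i][j] = c[i][j] + a[i][k] * b[k][j];
            }
        }
    }
}
```

Both functions accumulate into c, so the caller is expected to zero c first. Each c[i][j] receives its terms in the same k order in both versions, so the results should agree.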

3

4

5 Objectives
- Identify the main tasks of performance tuning
- Define some important performance tuning terms
- Use Intel tools to help

6 Agenda

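7 The Tuning Cycle
- Gather performance data (start here)
- Analyze the data and draw conclusions
- Decide on changes to resolve the problem
- Modify the code to implement the optimization
- Test the results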

8 When (Why) to Start
- User Requirement?
- Software Vendor Requirement?
- Put Performance Requirement into the Requirements Document
- Performance should be considered at every stage of the product life cycle (Requirements Gathering, Design, and Testing)
- Exception: Do "code tuning" after the simple/readable non-optimized version of the application exists.

9 Effort vs. Results

10 When to Stop
- Architecture is at Maximum Efficiency?
  - Be sure you know what this is: Calculate Theoretical Maximum
- Performance Requirement is satisfied
  - Incrementally do Wide Mesh Optimizations until done
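
One way to "calculate theoretical maximum" for a compute-bound kernel is sketched below. It assumes the usual model of peak rate = cores x clock frequency x floating-point operations per cycle per core; the function names are hypothetical, and the hardware figures must come from the processor's documentation, not from this deck.

```c
/* Peak floating-point rate in FLOP/s under the simple model
   cores * clock frequency (Hz) * floating-point ops per cycle per core. */
double peak_flops(int cores, double clock_hz, double flops_per_cycle)
{
    return cores * clock_hz * flops_per_cycle;
}

/* Floating-point operations in an n x n matrix multiply:
   one multiply and one add for each of the n*n*n inner iterations. */
double matmul_flops(int n)
{
    return 2.0 * n * n * n;
}

/* Fraction of the theoretical peak actually achieved (0.0 to 1.0). */
double efficiency(double achieved_flops, double peak)
{
    return achieved_flops / peak;
}
```

For example, a hypothetical single core at 2 GHz retiring 2 floating-point operations per cycle would peak at 4e9 FLOP/s; a kernel measured at 1e9 FLOP/s on it would be running at 25% of that maximum.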

11 Tuning Principles
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." (Donald Knuth)
Quality Code is:
- Portable
- Readable
- Maintainable
- Reliable
Intelligently Sacrifice Quality for Performance

12 Agenda

13 Gathering Performance Data
- Timer
  - Use to get wall clock time
  - Accuracy, Low Overhead
- Use Intel® VTune™ Performance Analyzer
  - Profiler: Gather Information about Code Usage
  - Performance Monitor: Gather Information about System Resource Usage
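
A minimal wall-clock timer along the lines of the "Timer" bullet might look like the sketch below. It assumes a C11 library that provides timespec_get; the helper names (wall_seconds, time_call, best_of, sample_work) are hypothetical and do not come from the deck. Taking the best of several runs is one common way to reduce noise from other activity on the machine.

```c
#include <time.h>

/* Current wall-clock time in seconds, or -1.0 if the clock is unavailable. */
double wall_seconds(void)
{
    struct timespec ts;
    if (timespec_get(&ts, TIME_UTC) != TIME_UTC) {
        return -1.0;
    }
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Elapsed wall-clock seconds for one call of work(). */
double time_call(void (*work)(void))
{
    double start = wall_seconds();
    work();
    return wall_seconds() - start;
}

/* Smallest elapsed time over `repeats` calls of work(). */
double best_of(void (*work)(void), int repeats)
{
    double best = time_call(work);
    for (int r = 1; r < repeats; r++) {
        double t = time_call(work);
        if (t < best) {
            best = t;
        }
    }
    return best;
}

/* Stand-in workload so the timer has something to measure. */
void sample_work(void)
{
    volatile double sum = 0.0;
    for (int i = 0; i < 1000000; i++) {
        sum += i * 0.5;
    }
}
```

A call such as best_of(sample_work, 5) returns the fastest of five runs in seconds. The timer itself costs two clock reads per measurement, so it is best wrapped around regions that run much longer than the clock's resolution.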

14 Workload
- A good workload should have these characteristics:
  - measurable
  - reproducible
  - static
  - representative

15 Analyzing the Data and Drawing Conclusions
- Baseline Current Performance
- Examine Hot Spots
- Identify Bottlenecks
- Calculate Potential Maximum Performance

16 Examine Hot Spots
- The Pareto Principle, a.k.a. the 80/20 Rule
  - Concentrate on the vital few vs. the trivial many
- Hot Spot: the part of an application or system that accounts for most of the computation
  - Generally consists of a Loop
- For Applications that don't have hot spots, examine:
  - Memory Layout
  - Exceptions
  - Effective Compiler Usage
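
The "vital few" idea can be applied to profiler output as in the following sketch: given per-function times, sort them and count how many functions are needed to cover a chosen share of the total. The ProfileEntry type and the function names are hypothetical; a real profiler such as VTune reports this ranking directly.

```c
#include <stdlib.h>

/* One row of (hypothetical) profile data: a function and its time. */
typedef struct {
    const char *name;
    double seconds;
} ProfileEntry;

/* qsort comparator: larger times first. */
static int by_time_desc(const void *x, const void *y)
{
    double a = ((const ProfileEntry *)x)->seconds;
    double b = ((const ProfileEntry *)y)->seconds;
    return (a < b) - (a > b);
}

/* Sorts entries by time (descending) and returns how many leading
   entries are needed to cover at least `fraction` of the total time. */
size_t hot_spot_count(ProfileEntry *entries, size_t n, double fraction)
{
    double total = 0.0;
    double running = 0.0;
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        total += entries[i].seconds;
    }
    qsort(entries, n, sizeof entries[0], by_time_desc);
    while (count < n && running < fraction * total) {
        running += entries[count].seconds;
        count++;
    }
    return count;
}
```

If one entry alone accounts for 90% of the time, hot_spot_count(entries, n, 0.8) returns 1, which marks that function as the place to start tuning.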

17 Additional Topics
- Big O
- Utilization, Efficiency, Throughput, Latency
- Bottlenecks
  - I/O, Memory, CPU
- MIPS/FLOPS/CPI
- Concurrency, Parallelism
- Scalability
- Loads/Stores per Calculation
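
The slide only names these metrics. Their standard definitions, stated here as background and not taken from the deck, are:

```latex
\[
\text{CPI} = \frac{\text{clock cycles}}{\text{instructions executed}}
\qquad
\text{MIPS} = \frac{\text{instructions executed}}{\text{execution time (s)} \times 10^{6}}
\qquad
\text{FLOPS} = \frac{\text{floating-point operations}}{\text{execution time (s)}}
\]
\[
\text{Loads/stores per calculation} =
\frac{\text{memory loads} + \text{memory stores}}{\text{floating-point operations}}
\]
```

As a rough illustration, counting source-level accesses only, the statement c[i][j] = c[i][j] + a[i][k] * b[k][j] performs three loads and one store for two floating-point operations, a ratio of 2. A compiler that keeps c[i][j] in a register across the inner loop would lower that figure.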

18 Agenda

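19 Levels of Optimization Design
- Problem definition
- System architecture
- Algorithms and data structures
- Code tuning
- System software
- System hardware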

20 Code Tuning
Ordered from hardest to develop and maintain (top) to easiest to develop, port and maintain (bottom):
- Assembly instruction level
- Intrinsics
- C++ vector class library
- Multithreading
- Loop transformations
- Compiler and compiler options
- Performance libraries

21 Code Tuning
- If Parallel Processing
- Break Algorithm up across Clusters (Distributed Memory)
- Single Node Optimization
- Break Algorithm up across Processors (SMP)

22 Modifying the Code to Implement Optimizations
- Use Intel® Libraries
- Use Various Compiler Switches
- Find out whether the compiler or hardware already does the enhancement automatically before implementing it yourself
- Modify Source (e.g. Loop Transformations, SWP, SIMD, OpenMP, Intrinsics, Assembly)
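
As one illustration of the "Modify Source" bullet, the sketch below combines a loop transformation (the i, k, j order) with an OpenMP directive. The function name and the flat row-major array layout are assumptions for this example. The pragma only takes effect when the compiler's OpenMP option is enabled; otherwise it is ignored and the loop runs serially with the same result.

```c
/* c += a * b for n x n matrices stored row-major in flat arrays.
   Each outer iteration writes a different row of c, so the rows
   can be distributed across threads without conflicts. */
void matmul_omp(const double *a, const double *b, double *c, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < n; k++) {
            for (int j = 0; j < n; j++) {
                c[i * n + j] += a[i * n + k] * b[k * n + j];
            }
        }
    }
}
```

The caller zeroes c before the call. Whether the parallel version is actually faster depends on n and on the machine, which is why the change should be measured against the baseline.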

23 Test!
- Make sure the application still runs correctly (Regression Testing)
- Make sure the enhancement actually increases performance
- Calculate Speed-up
- Decide if you're done optimizing

24 Speed-Up: The Two Basic Formulas
Speed-Up = Baseline Time / Optimized Time
Speed-Up = Optimized Throughput / Baseline Throughput
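
The two formulas can be written as small helpers, sketched below with hypothetical function names. Both give a value greater than 1 when the optimized version is faster: time shrinks, so the baseline goes in the numerator, while throughput grows, so the optimized figure goes in the numerator.

```c
/* Speed-up from elapsed times: baseline time / optimized time. */
double speedup_from_time(double baseline_time, double optimized_time)
{
    return baseline_time / optimized_time;
}

/* Speed-up from throughputs: optimized throughput / baseline throughput. */
double speedup_from_throughput(double baseline_tp, double optimized_tp)
{
    return optimized_tp / baseline_tp;
}
```

Taking the first and last timings listed on slide 2 (16.676s and 0.738s) as baseline and optimized, speedup_from_time(16.676, 0.738) comes to about 22.6.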

25 Summary
- Optimization Tasks
  - Gather Performance Data
  - Analyze Data & Identify Issues
  - Generate Alternatives to Resolve Issue
  - Implement Enhancements
  - Test Results
- Use Intel® Software Development Tools for every step in the process

