No Free Lunch, No Hidden Cost. X. Sharon Hu, Dept. of Computer Science and Engineering, University of Notre Dame. The Salishan Conference on High-Speed Computing.


1 No Free Lunch, No Hidden Cost
How Can Co-Design Help?
X. Sharon Hu, Department of Computer Science and Engineering, University of Notre Dame
The Salishan Conference on High-Speed Computing
CSE Dept., (XHU)

2 Theme: Exposing Hidden Execution Costs
• Cost of execution: performance and power
  • Computation
  • Communication
  • Data motion
  • Synchronization …
• How can we strike a balance between the extremes?
  • Hide as much as possible?
  • Explicitly manage "all" costs?
• My "position":
  • Expose widely and choose wisely
  • Focus on power

3 Why Take This Position?
• Expose widely
  • Gives a better understanding of each component's contribution
  • Allows application-specific tradeoffs
  • Provides opportunities for powerful co-design tools
• Choose wisely
  • Requires sophisticated co-design tools
  • Explores more algorithm/software options

4 But Easier Said Than Done!
• Heterogeneity
  • Compute nodes: (multi-core) CPU, GP-GPU, FPGA, …
  • Memory components: on-chip, on-board, disks, …
  • Communication infrastructure: bus, NoC, networks, …
• Parallelism ("non-determinism")
  • Data access: movement, coherence, …
  • Resource contention
  • Synchronization

5 Outline
• Why expose widely?
• How to benefit from exposing widely?
• How to choose wisely?
• Going forward

6 Why Expose Widely? (1)
• Different programs have different power distributions
[Figure: GPU power distribution (NVIDIA GTX 280); legend: Memory, ConstSM, ConstCache, TextCache, GPU Cores. Source: Hong and Kim, ISCA 2010]

7 Why Expose Widely? (2)
• Data movement impacts different algorithms differently
[Figure: energy consumption of three sorting algorithms (Pentium 4 + GeForce 570)]

8 Why Expose Widely? (3)
• Application dependent
[Figure: performance degradation due to memory bus contention. Source: Masaaki Kondo et al., SIGARCH 2007]

9 Outline
• Why expose widely?
• How to benefit from exposing widely?
• How to choose wisely?
• Going forward

10 How to Benefit from "Exposing Widely"?
• Co-design is the key
• Expose all factors impacting the "execution model"
  • Computation: processing resource
  • Data motion: memory components and hierarchy
  • Communication: bus and network
  • Resource contention, synchronization, …
• Some examples
  • Software macromodeling
  • Hardware module-based modeling
  • Optimize through power management
• Keep in mind Amdahl's law
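
The Amdahl's law reminder on this slide can be made concrete with a few lines of Python. This is a minimal sketch, not part of the talk: the function name `amdahl_speedup` and the 30% figure are illustrative assumptions. If a component accounts for a fraction `fraction` of total time (or energy) and is improved by `factor`, the overall improvement is bounded as shown.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall improvement when `fraction` of the total is improved by `factor`.

    Amdahl's law: 1 / ((1 - fraction) + fraction / factor).
    The same arithmetic applies to energy if `fraction` is a component's
    share of total energy and `factor` is how much that share is reduced.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / factor)


if __name__ == "__main__":
    # Hypothetical case: a component that is 30% of the total budget.
    for factor in (2.0, 10.0, 1e6):
        print(factor, round(amdahl_speedup(0.3, factor), 3))
```

Even an arbitrarily large improvement to a 30% component cannot push the overall gain past 1/0.7, which is why knowing each component's share matters before optimizing it.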

11 Macromodeling: Algorithm Complexity Based
• Relate the power/energy of a program to its complexity
• Example: $E = C_1 S + C_2 S^2 + C_3 S^3$ (Tan et al., DAC'01), where $S$ is the size of the array for a sorting algorithm
• Example: $E_{comm} = C_0 + C_1 S$ (Loghi et al., ACM TECS'07), where $S$ is the size of exchanged messages
• More sophisticated models to account for both computing and communication
• How to handle resource contention?
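
The two macromodels on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the cited authors' tooling: the function names are hypothetical, the coefficients would come from characterization measurements, and the data in the usage section is synthetic. The linear communication model is fitted with ordinary least squares in closed form.

```python
def sort_energy(size, c1, c2, c3):
    """Cubic macromodel E = C1*S + C2*S^2 + C3*S^3 for array size S."""
    return c1 * size + c2 * size**2 + c3 * size**3


def fit_comm_model(sizes, energies):
    """Least-squares fit of E_comm = C0 + C1*S; returns (c0, c1)."""
    n = len(sizes)
    if n < 2 or n != len(energies):
        raise ValueError("need at least two (size, energy) pairs")
    mean_s = sum(sizes) / n
    mean_e = sum(energies) / n
    var_s = sum((s - mean_s) ** 2 for s in sizes)
    if var_s == 0:
        raise ValueError("message sizes must not all be equal")
    cov = sum((s - mean_s) * (e - mean_e) for s, e in zip(sizes, energies))
    c1 = cov / var_s
    c0 = mean_e - c1 * mean_s
    return c0, c1


if __name__ == "__main__":
    # Synthetic, made-up measurements (message size, energy); not real data.
    sizes = [64, 128, 256, 512]
    energies = [5.0 + 0.02 * s for s in sizes]
    print(fit_comm_model(sizes, energies))
    print(sort_energy(10, 1.0, 0.5, 0.0))
```

Once the coefficients are fitted, the model estimates energy from the input size alone, without rerunning the program on the target.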

12 Power Modeling of Bus Contention
• Penolazzi, Sander and Hemani, DATE'11
• Characterization step
  • $C\%_{N,1}$: percentage cycle difference between the N-processor case and the 1-processor case
  • Can be done by IP providers on chosen benchmarks
• Prediction step
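
A rough Python sketch of the two steps follows. The slide gives no formulas, so both are assumptions: the percentage is normalized by the 1-processor cycle count, and prediction simply scales a new 1-processor cycle count by the characterized percentage. The function names are hypothetical, and the cited paper's actual method may differ.

```python
def contention_percentage(cycles_n, cycles_1):
    """Characterization: percentage cycle difference, N-processor vs. 1-processor.

    Assumption: the difference is normalized by the 1-processor cycle count.
    """
    if cycles_1 <= 0:
        raise ValueError("cycles_1 must be positive")
    return 100.0 * (cycles_n - cycles_1) / cycles_1


def predict_cycles(cycles_1, c_percent):
    """Prediction (assumed form): scale a 1-processor cycle count by C%."""
    return cycles_1 * (1.0 + c_percent / 100.0)


if __name__ == "__main__":
    # Made-up benchmark numbers for illustration only.
    c = contention_percentage(cycles_n=1200, cycles_1=1000)
    print(c, predict_cycles(5000, c))
```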

13 Hierarchical Module-Based Power Modeling
• Accumulate energy/power of modules
• CPU+GPU example
• Access rate: software dependent
• Data movement contributes to memory power
• Resource contention modifies access rate
Adapted from Isci and Martonosi, MICRO'03
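
The accumulation idea can be sketched as below. This is a simplified illustration rather than the model from the cited work: it assumes each module's power is an idle term plus a maximum dynamic power scaled by a software-dependent access rate. The class name, field names, and all wattages are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    max_dynamic_power_w: float  # dynamic power at full utilization (assumed)
    idle_power_w: float         # power drawn regardless of activity (assumed)
    access_rate: float          # 0..1, depends on the software being run

    def power(self) -> float:
        """Module power = idle + access_rate * max dynamic power."""
        return self.idle_power_w + self.access_rate * self.max_dynamic_power_w


def total_power(modules) -> float:
    """Accumulate power over all modules of the system."""
    return sum(m.power() for m in modules)


if __name__ == "__main__":
    # Made-up CPU+GPU system; the numbers are illustrative only.
    system = [
        Module("cpu", 10.0, 2.0, 0.5),
        Module("gpu", 40.0, 5.0, 0.25),
        Module("memory", 8.0, 1.0, 0.5),
    ]
    print(total_power(system))
```

In this form, data movement shows up as a higher `access_rate` on the memory module, and resource contention would be modeled by adjusting the access rates of the modules involved.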

14 Outline
• Why expose widely?
• How to benefit from exposing widely?
• How to choose wisely?
• Going forward

15 Managing Bus Contention to Reduce Energy
• M. Kondo, H. Sasaki and H. Nakamura, 2006
• Counter for memory requests
• Register for PU identification
• Thresholds for selecting which PU uses which $V_{dd}$ value
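
The counter-and-threshold mechanism can be illustrated with a short Python sketch. The slide does not give the selection policy, so the policy here is an assumption: a PU that issues more memory requests in an interval is treated as more memory-bound and gets a lower supply voltage. The threshold values, voltage levels, and function names are all hypothetical.

```python
from bisect import bisect_right


def select_vdd(mem_requests, thresholds, vdd_levels):
    """Pick a Vdd level from a per-PU memory-request counter.

    `thresholds` is ascending; `vdd_levels` has one more entry than
    `thresholds`. A counter below thresholds[0] maps to vdd_levels[0],
    a counter at or above thresholds[-1] maps to vdd_levels[-1].
    """
    if len(vdd_levels) != len(thresholds) + 1:
        raise ValueError("need len(thresholds) + 1 voltage levels")
    return vdd_levels[bisect_right(thresholds, mem_requests)]


def assign_vdd(counters, thresholds, vdd_levels):
    """Map each PU id to a Vdd level based on its request counter."""
    return {pu: select_vdd(count, thresholds, vdd_levels)
            for pu, count in counters.items()}


if __name__ == "__main__":
    # Illustrative thresholds and voltages only.
    print(assign_vdd({0: 20, 1: 300, 2: 700}, [100, 500], [1.2, 1.0, 0.8]))
```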

16 Application Mapping to Reduce Energy (1)
• Application mapping for heterogeneous systems
[Figure: jobs J1 to J4, each annotated with ([minR_i, maxR_i], D_i); processing elements PE1 to PE4; Memory]
R. Racu, R. Ernst, A. Hamann, B. Mochocki and X. Hu, "Methods for power optimization in distributed embedded systems with real-time requirements," CASES'06.

17 Application Mapping to Reduce Energy (2)
• Optimization:
  • Minimize power/energy dissipation
  • Satisfy timing properties (e.g., average path latency, average lateness, etc.)
  • …
• Search space:
  • Scheduling parameters, traffic shaping, …
  • Task-level DVFS, i.e., task speed assignment
  • Resource-level DVFS, i.e., resource speed assignment
  • …
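
A toy version of task-level DVFS as a search problem is sketched below. It is a deliberately small illustration, not the method from the cited paper: tasks run back to back on one resource, the only timing property is a single deadline, execution time is `cycles / speed`, and energy is modeled as `cycles * speed**2` (a common approximation when voltage scales with frequency). All names and numbers are assumptions.

```python
from itertools import product


def assign_speeds(task_cycles, speeds, deadline):
    """Exhaustively search per-task speed assignments.

    Returns (energy, speeds_per_task) for the lowest-energy assignment
    that meets the deadline, or None if no assignment is feasible.
    """
    best = None
    for choice in product(speeds, repeat=len(task_cycles)):
        # Total time of running the tasks sequentially at the chosen speeds.
        total_time = sum(c / s for c, s in zip(task_cycles, choice))
        if total_time > deadline:
            continue
        # Assumed energy model: cycles * speed^2 per task.
        energy = sum(c * s**2 for c, s in zip(task_cycles, choice))
        if best is None or energy < best[0]:
            best = (energy, choice)
    return best


if __name__ == "__main__":
    # Two tasks, two normalized speed levels, made-up deadline.
    print(assign_speeds([100, 200], (0.5, 1.0), deadline=400))
```

Exhaustive search grows exponentially with the number of tasks, which is one reason heuristic searches such as genetic algorithms are used for realistic design spaces.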

18 Application Mapping (3): Sensitivity Analysis
R. Racu, R. Ernst, A. Hamann, B. Mochocki and X. Hu, "Methods for power optimization in distributed embedded systems with real-time requirements," CASES'06.

19 Application Mapping (4): GA-Based Approach
[Figure labels: Power Analyzer; 2'. Scheduling Trace; 3'. Power Dissipation]
• Power model needed

20 A Sample Result

21 Outline
• Why expose widely?
• How to benefit from exposing widely?
• How to choose wisely?
• Going forward

22 Going Forward: Systematic Co-design Effort
• Expose more
  • More hardware counters / registers
  • More efficient/accurate high-level power models
  • Better models for resource contention and synchronization
• Choose better
  • Handling parallelism
    • Algorithm, OS, hardware
    • Resource contention
    • Synchronization
  • Handling non-determinism
    • Worst-case bounds
    • Statistical analysis
    • Interval-based techniques

23 ES Design vs. HPCS Design
• Differences (maybe)
  • Application-specific workloads vs. domain-specific workloads
  • Constraints, objectives, desirables?
    • Latency, throughput, energy, cost, reliability, fault tolerance, IP protection/privacy, ToM, …
  • Other issues: homogeneous vs. heterogeneous, levels of complexity, user expertise, …
• Similarities
  • Ever-increasing hardware capability: multi-core, multi-thread, complex communication fabrics, memory hierarchy, …
  • Productivity gap
  • Common concerns: latency, throughput, energy, cost, reliability, fault tolerance, …

24 Leverage Co-Design for HPC
• Systematic performance estimation
  • Formal methods: scenario-based, statistical analysis
  • Hybrid approaches: analytical + simulation
  • Seamless migration from one abstraction level to the next
• Efficient design space exploration
  • Efficient search techniques
  • Multiple-level abstraction models
  • Multiple-attribute optimization
• Others: memory and communication analysis and design

25 Thank you!

