Michael J. Voss and Rudolf Eigenmann PPoPP, ‘01 (Presented by Kanad Sinha)


1 Michael J. Voss and Rudolf Eigenmann PPoPP, ‘01 (Presented by Kanad Sinha)

2
- Motivation
- General choices for adaptive optimization
- ADAPT
  - The architecture
  - The language
  - An example
- Results

3 There’s only so much optimization that can be performed at compile time.
- The compiler has to generate code for generic system models, so it makes compile-time assumptions that may be sensitive to input that is unknown until runtime.
- Convergence of technologies makes it difficult to generate a common binary that exploits individual system characteristics.

4 Possible solution? “Use of adaptive and dynamic optimization paradigms, where optimization is performed at runtime when complete system and input knowledge is available.”

5
- Choose from statically generated code variants
    (+) Easy
    (-) May not result in max possible optimization
    (-) Can result in code explosion
- Parameterization
    (+) Single copy of source
    (-) May still not result in max possible optimization
- Dynamic compilation
    (+) Complete input and system knowledge, so max optimization is possible
    (-) Considerable runtime overhead
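The first two choices can be contrasted with a minimal sketch. Everything here is hypothetical illustration (the function names, the `threshold` value, and the summation kernel are assumptions, not part of ADAPT): the first half selects among variants that were all generated ahead of time, the second keeps a single copy of the code whose behavior is steered by a runtime parameter.

```python
# Hypothetical illustration only; none of these names come from ADAPT.

def sum_plain(xs):
    """Variant A: straightforward loop."""
    total = 0
    for x in xs:
        total += x
    return total


def sum_unrolled2(xs):
    """Variant B: the same loop, manually unrolled by a factor of 2."""
    total = 0
    n = len(xs) - len(xs) % 2
    for i in range(0, n, 2):
        total += xs[i] + xs[i + 1]
    for x in xs[n:]:  # leftover element when the length is odd
        total += x
    return total


# Choice 1: statically generated variants, selected at runtime.
# Every variant must exist in the binary, hence the risk of code explosion.
VARIANTS = {"short": sum_plain, "long": sum_unrolled2}


def pick_variant(n, threshold=8):
    """Select a pre-built variant from the input size (threshold is assumed)."""
    return VARIANTS["long" if n >= threshold else "short"]


# Choice 2: parameterization. One copy of the code, tuned by a runtime value.
def sum_blocked(xs, block):
    """Sum in chunks of `block` elements; `block` is chosen at runtime."""
    total = 0
    for start in range(0, len(xs), block):
        total += sum(xs[start:start + block])
    return total
```

Dynamic compilation, the third choice, would instead generate a new variant at runtime, which is what makes the overhead considerable.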

6
- Automated De-Coupled Adaptive Program Optimization
- Generic framework, which leverages existing tools
- Uses a domain-specific language, AL, by which adaptive techniques can be specified …

7
- Supports dynamic compilation and parameterization
- Enables optimizations through “runtime sampling”
- Facilitates an iterative modification and search approach

8 Three functions of a dynamic/adaptive optimization system
- Evaluate the effectiveness of a particular optimization for the current input and system information
- Apply the optimization if profitable
- Re-evaluate applied optimizations and tune according to current runtime conditions
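The three functions can be sketched as one small loop. This is a sketch under stated assumptions, not ADAPT's implementation: `tune`, `measure`, and the optimization names are hypothetical, and `measure` stands in for whatever runtime sampling reports (lower cost is better).

```python
def tune(current, candidates, measure, rounds=1):
    """Return the optimization to keep applied after `rounds` passes.

    `measure(name)` is an assumed callback that returns the observed
    cost of running with optimization `name` under current conditions.
    """
    for _ in range(rounds):
        best_cost = measure(current)      # 3. re-evaluate what is applied
        for name in candidates:
            cost = measure(name)          # 1. evaluate a candidate
            if cost < best_cost:          # 2. apply only if profitable
                current, best_cost = name, cost
    return current
```

Calling `tune` again later with the previously chosen optimization as `current` lets a choice that has stopped paying off be replaced when runtime conditions change.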

9

10 Runtime system consists of:
- Modified version of application
- Remote optimizer
  - has source code
  - description of target machine
  - stand-alone tools & compilers
- Local optimizer
  - agent of remote optimizer on system
  - detects hot spots
  - tracks multiple interval contexts (here, loop bounds)
  - runs in separate thread
Optimization and execution truly asynchronous

11
- The local optimizer (LO) invokes the remote optimizer (RO) when a hot spot is detected
- RO tunes the interval using available tools, according to user-specified heuristics
- RPC returns
- If new code is available, it is dynamically linked into the application as the new best or experimental version, depending on RO’s message
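A minimal sketch of this interaction, assuming a simple invocation counter as the hot-spot detector. The names `LocalOptimizer`, `remote_optimize`, and `HOT_THRESHOLD` are hypothetical, and the remote optimizer is replaced by a local stand-in function rather than a real RPC.

```python
import queue
import threading

HOT_THRESHOLD = 3  # assumed: executions before an interval counts as hot


def remote_optimize(interval, context):
    """Stand-in for the remote optimizer; returns (kind, code) or None."""
    return ("experimental", f"{interval}_specialized_for_{context}")


class LocalOptimizer:
    """Counts interval executions and requests tuning from a worker thread."""

    def __init__(self):
        self.counts = {}
        self.requests = queue.Queue()
        self.versions = {}  # (interval, context) -> (kind, code)
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def record(self, interval, context):
        """Called by the application; it only enqueues, so it never waits."""
        key = (interval, context)
        self.counts[key] = self.counts.get(key, 0) + 1
        if self.counts[key] == HOT_THRESHOLD:  # hot spot detected
            self.requests.put(key)

    def _run(self):
        while True:
            key = self.requests.get()
            result = remote_optimize(*key)   # the "RPC" to the RO
            if result is not None:           # new code is available
                self.versions[key] = result  # stands in for dynamic linking
            self.requests.task_done()
```

Because `record` only enqueues a request, the application thread continues while the worker waits for the optimizer's answer.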

12

13
- Candidate code sections have 2 control flow paths:
  - through the best known version
  - through the experimental version
  Each of these can be replaced dynamically
- Flag indicates which version to execute
- Monitor experimental versions of each context
  - collected data used as feedback
  - if better, swap with best known version
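The two-path scheme can be sketched as follows. This is an illustration under simplifying assumptions, not ADAPT's generated code: the `Interval` class and its method names are hypothetical, and the swap decision is made from a single timing sample, where a real system would collect more data.

```python
import time


class Interval:
    """A candidate code section with a best known and an experimental path."""

    def __init__(self, baseline, clock=time.perf_counter):
        self.best = baseline
        self.experimental = None
        self.use_experimental = False  # flag: which version to execute
        self.best_time = None
        self.clock = clock

    def install_experimental(self, fn):
        """Replace the experimental path and request that it be tried."""
        self.experimental = fn
        self.use_experimental = True

    def run(self, *args):
        trying = self.use_experimental and self.experimental is not None
        fn = self.experimental if trying else self.best
        start = self.clock()
        result = fn(*args)
        elapsed = self.clock() - start
        if trying:
            self._feedback(elapsed)
        else:
            self.best_time = elapsed
        return result

    def _feedback(self, elapsed):
        """Swap in the experimental version only if it was measured faster."""
        if self.best_time is not None and elapsed < self.best_time:
            self.best, self.best_time = self.experimental, elapsed
        self.experimental = None
        self.use_experimental = False
```

The `clock` parameter exists only so the timing source can be replaced, for example by a deterministic counter in a test.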

14 Optimization process outside critical path/decoupled from execution

15 ADAPT Language (AL) *
Features:
- Uses an LL(1) grammar => simple parser
- Domain-specific language with C-style format
- Defines reserved words that at runtime contain useful input data and system information
* “A full description of ADAPT language is beyond the scope of this paper”, and by extension, this presentation.

16

17
- Initialize some variables
- Constraints
- Interface to tool to be used
- This block defines the heuristic

18
Statement | Description
constraint (compile-time constraint) | Supplies a compile-time constraint
apply_spec (condition, type, syntax[, params]) | A description of a tool or flag
collect (event list) execute; | Initiates the monitoring of an experimental code version
mark_as_best | Specifies that the code variant that would be generated under the current runtime conditions is a new best known version
end_phase | Denotes the end of an optimization phase

19 Test machines: 6-core Sun ULTRA Enterprise 4000, single-core Pentium II Linux workstation
Experiment | Description | Result
Useless Copying | Run a dynamically compiled version of the code without applying any optimization | Less than ~5% overhead; some cases show a speed-up!
Specialization | Loop bounds replaced by constants holding their runtime values | Average improvement: E4000: 13.6%, Pentium: 2.2%
Flag Selection | Experiment with various combinations of compiler flags | Average improvement: E4000: 35%, Pentium: 9.2%; identified some non-intuitive choices
Loop Unrolling | Loop unrolled by factors that evenly divide the number of iterations of the innermost loop, to a maximum factor of 10 | Average improvement: E4000: 18%, Pentium: 5%
Loop Tiling | Loops deemed appropriate tiled for 1/2, 1/4, ..., 1/16 of L2 cache size | Average improvement: E4000: 13.5%, Pentium: 9.8%
Parallelization | Loops deemed appropriate by Polaris parallelized | Average improvement: E4000: 51.8%
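The search spaces described for the unrolling and tiling experiments can be enumerated with a short sketch. The function names are hypothetical, and two details are assumptions: a factor of 1 (no unrolling) is left out, and tile sizes are expressed in bytes.

```python
def unroll_factors(trip_count, max_factor=10):
    """Unroll factors that evenly divide the innermost loop's iteration count.

    Assumption: factor 1 (no unrolling) is not listed as a candidate.
    """
    return [f for f in range(2, max_factor + 1) if trip_count % f == 0]


def tile_sizes(l2_bytes, smallest_fraction=16):
    """Candidate tile sizes: 1/2, 1/4, ..., 1/smallest_fraction of the L2 size."""
    sizes = []
    denom = 2
    while denom <= smallest_fraction:
        sizes.append(l2_bytes // denom)
        denom *= 2
    return sizes
```

For a loop with 12 iterations the candidates are 2, 3, 4, and 6, while a loop with 13 iterations has none, which is why the runtime value of the loop bound matters for this experiment.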

20
- There is an advantage in doing runtime optimization
- It can be applied to general-purpose programs as well
- For full-blown runtime optimization, the optimization process needs to move outside the critical path

21 if (questions(“?!”) == 1) delay(); THANK_YOU(“Have a great weekend!”);

