Published by Nathalie Loveday. Modified over 3 years ago.

1
Load-Reuse Analysis: Design and Evaluation
Rastislav Bodik, Rajiv Gupta, Mary Lou Soffa
PLDI'99
Presented by Sue Ann Hong, 4/11/2006

2
Load-reuse Example: Register Promotion
This paper: find as many reuses as possible
1. Load-reuse analysis: identify loads & stores to the same address on a path
   a1 = a4 on path p1? a2 = a4 on path p2?
2. Alias analysis: make sure the loaded value isn’t changed
   a0 = a4?
3. Program transformation, e.g. partial redundancy elimination
   hoist ‘load a4’ to path p3
[Figure: control-flow graph with paths p1, p2, p3 and nodes ‘load a1’, ‘store a2’, ‘store a0’, ‘load a4’]
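The three steps above can be illustrated with a small Python sketch. It is not taken from the paper: it assumes the figure's layout is "load a1 on p1, store a2 on p2, nothing on p3, then store a0, then load a4", that a1, a2 and a4 are the same address, and that a0 does not alias it. The `Memory` class and the function names are hypothetical helpers that only count how many loads reach memory.

```python
class Memory:
    """Toy memory that counts how many loads actually touch it."""

    def __init__(self, cells):
        self.cells = dict(cells)
        self.loads = 0

    def load(self, addr):
        self.loads += 1
        return self.cells[addr]

    def store(self, addr, value):
        self.cells[addr] = value


def original(mem, path, a, a0):
    """Assumed shape of the slide's example; a stands for a1 = a2 = a4."""
    if path == "p1":
        mem.load(a)        # load a1
    elif path == "p2":
        mem.store(a, 7)    # store a2
    # path p3: no access to a
    mem.store(a0, 99)      # store a0, assumed not to alias a
    return mem.load(a)     # load a4


def promoted(mem, path, a, a0):
    """Same program after promoting the value at a into a 'register' r."""
    if path == "p1":
        r = mem.load(a)    # reuse the value loaded by 'load a1'
    elif path == "p2":
        r = 7
        mem.store(a, r)    # reuse the value written by 'store a2'
    else:
        r = mem.load(a)    # 'load a4' hoisted onto path p3
    mem.store(a0, 99)
    return r               # the original 'load a4' is gone
```

Under these assumptions the promoted version performs one load fewer on p1 and p2 and the same number on p3. If a0 could alias a4, the transformation would not be safe, which is why step 2 is needed.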

3
Related Work
Lexical load-reuse analysis
  Only loads with identical names
Value numbering
  x = 5; t1 = x; t2 = x;
  Only copy assignments
for (i=0; i < N-2; i++) { A[i+2] = A[i] }
Remember the hash tables…
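To see why the loop on this slide is interesting, note that the value loaded from A[i] in iteration i (for i ≥ 2) is the value stored to A[i+2] two iterations earlier, even though the two expressions have different names. The sketch below is a hand-written illustration, not the paper's transformation: it keeps the last two values in temporaries so that no loads are issued inside the loop. The function names and the load counter are hypothetical.

```python
def copy_with_loads(values):
    """Direct translation of the loop: one load of A[i] per iteration."""
    A = list(values)
    loads = 0
    for i in range(len(A) - 2):
        v = A[i]           # load A[i]
        loads += 1
        A[i + 2] = v       # store A[i+2]
    return A, loads


def copy_with_reuse(values):
    """Same loop, with A[i] and A[i+1] carried in temporaries t0 and t1."""
    A = list(values)
    if len(A) <= 2:
        return A, 0        # the loop body never runs
    t0, t1 = A[0], A[1]    # the only two loads, done before the loop
    loads = 2
    for i in range(len(A) - 2):
        A[i + 2] = t0      # t0 holds the current value of A[i]
        t0, t1 = t1, t0    # next A[i] is old A[i+1]; next A[i+1] is the value just stored
    return A, loads
```

For a six-element array the first version performs four loads and the second performs two, with identical results.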

4
Paper Does This
Its load-reuse algorithm
The ideal run-time reuse finder (“ground truth”)
“Profile-based Estimator”
Compare: how many reuses they find, on SPEC95, of course…

5
Evaluating the Algorithm: Comparing to Ideal Reuse Analysis
Ideal Reuse Analysis (dynamic = run-time)
– Generally undecidable, so use simulation: (simple) remember the access history for each memory instruction and find a prior load or store
– Want a tight upper bound
  Ignore possible (input-dependent, sporadic) reuses as noise
  while ( c = read() ) { … = hashtbl[ hash(c) ]; }
– Still, how input-independent is the simulation?
Identified reuse level (SPEC95)
– See p67. Tall bars… Something like 55% of overall loads are reuses.
  old history = expensive, tends to be a little bit of ≤ 18%
So reuse analysis is probably worth it.
Note: they do show empirically that # of accesses in history > 1 doesn’t matter too much.
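The simulation idea can be sketched in a few lines of Python. This is a simplification and not the paper's simulator: instead of a per-instruction history it records the time of the most recent access to each address, and a dynamic load counts as a reuse if that address was loaded or stored earlier. The trace format and the `max_distance` parameter (a crude stand-in for a bounded history) are assumptions made for illustration.

```python
def count_reuses(trace, max_distance=None):
    """Count dynamic loads whose address was accessed earlier in the trace.

    trace: iterable of (kind, addr) pairs, kind being "load" or "store".
    max_distance: if given, a prior access only counts when it happened at
    most this many trace entries ago (hypothetical bounded history).
    Returns (number_of_loads, number_of_reuses).
    """
    last_access = {}   # addr -> index of the most recent load or store
    loads = 0
    reuses = 0
    for time, (kind, addr) in enumerate(trace):
        if kind == "load":
            loads += 1
            prev = last_access.get(addr)
            if prev is not None and (
                max_distance is None or time - prev <= max_distance
            ):
                reuses += 1
        last_access[addr] = time   # both loads and stores make the value available
    return loads, reuses


if __name__ == "__main__":
    trace = [("store", 100), ("load", 100), ("load", 200),
             ("load", 200), ("load", 300)]
    print(count_reuses(trace))   # prints (4, 2)
```

A store counts as a source of reuse because the stored value could be kept in a register and forwarded to the later load.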

6
Load Reuse Analysis
A must-alias analysis
Value Name Graph (data-flow analysis)
  An address value flows between two address expressions if they access the same address (they’re equivalent).
3 steps for 3 goods
1. Symbolic interpretation: find equivalences after algebraic simplification; create synthetic names
2. Symbolic value numbering: use the synthetic names and backward flow from temps to find equivalences due to assignment to temps
3. Data-flow analysis: connect the equivalences from the previous steps along specific paths
Remember the hash tables…
Example (synthetic name on the right):
  store(2x+12);      ‘2x+12’
  y = 2x + 8;        -
  z = load (y+4);    ‘2x+12’
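The example at the end of this slide can be reproduced with a small canonicalization sketch. It is not the paper's Value Name Graph construction: it simply substitutes the definitions of temporaries into an affine address expression and simplifies, which is enough to show that store(2x+12) and load(y+4) receive the same name when y = 2x + 8. The dictionary encoding of expressions and the names `CONST` and `normalize` are assumptions for illustration.

```python
CONST = "#const"   # hypothetical key holding the constant term of an expression


def normalize(expr, defs):
    """Return a canonical name for an affine address expression.

    expr: dict mapping a variable name (or CONST) to an integer coefficient,
          e.g. {"x": 2, CONST: 12} stands for 2x + 12.
    defs: dict mapping each temporary to the affine expression assigned to it.
    """
    total = {}
    for var, coeff in expr.items():
        if var in defs:
            # Replace the temporary by its (recursively normalized) definition.
            for inner_var, inner_coeff in normalize(defs[var], defs):
                total[inner_var] = total.get(inner_var, 0) + coeff * inner_coeff
        else:
            total[var] = total.get(var, 0) + coeff
    # Drop zero terms and sort so that equal expressions get equal names.
    return tuple(sorted((v, c) for v, c in total.items() if c != 0))


defs = {"y": {"x": 2, CONST: 8}}                    # y = 2x + 8
store_name = normalize({"x": 2, CONST: 12}, defs)   # store(2x+12)
load_name = normalize({"y": 1, CONST: 4}, defs)     # load(y+4)
```

Both names come out as the same tuple, corresponding to ‘2x+12’, so the load can reuse the stored value provided nothing in between redefines x or overwrites that address.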

7
Profile-based Estimators
Intuition
– Reuse analysis: which path contains what reuses? f(p_i) ∈ Z
– Ideal analysis: how many reuses overall? n
– n = Σ_i [ f(p_i) × (number of times path p_i is executed) ]
Crazy: 5 different estimators, with lower and upper bounds to compensate for edge-profiling errors
Estimator; use profiling
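The sum in the intuition above is easy to write down directly. The sketch below only covers that basic weighted sum, not the five estimators or their bounds from the paper; the function name and the path counts in the usage lines are invented for illustration.

```python
def estimate_reuses(reuses_per_path, path_counts):
    """n = sum over paths of f(p_i) * (times p_i was executed).

    reuses_per_path: dict path -> number of reuses the analysis finds on it, f(p_i).
    path_counts: dict path -> execution count from a profile (assumed exact here).
    """
    return sum(
        f * path_counts.get(path, 0) for path, f in reuses_per_path.items()
    )


# Made-up numbers: one reuse each on p1 and p2, none on p3.
f = {"p1": 1, "p2": 1, "p3": 0}
counts = {"p1": 60, "p2": 30, "p3": 10}
n = estimate_reuses(f, counts)   # 1*60 + 1*30 + 0*10 = 90
```

With an edge profile the per-path counts are not known exactly, which is presumably why the slide mentions lower and upper bounds.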

8
Experiments
Figure 8 on p75. How do you interpret that thing??
How possible aliasing could make reuses useless.
Ideal found ~55% of loads have reuse
Their analysis found ~80% of those.
Other than that, the paper doesn’t really have conclusions.
What happened after this paper (1999)? blah Ask the next dude.

9
Discussions from class
Bodik’s notion of defining and comparing to ideal performance is different from the usual approach of giving overall optimization performance. In fact, he’s famous for not giving numbers for run-time optimization.
Is this orthogonal to cache optimization? Yes. The paper doesn’t address cache/locality-related issues.
I probably shouldn’t have laughed at the author for saying “Such an amount of registers [>34] will be soon available in general-purpose processors.”
Peter’s PowerBook was able to display my presentation, in contrast to my Sony.
