Using Loop Perforation to Dynamically Adapt Application Behavior to Meet Real-Time Deadlines

1 Using Loop Perforation to Dynamically Adapt Application Behavior to Meet Real-Time Deadlines
Henry Hoffmann, Sasa Misailovic, Stelios Sidiroglou, Anant Agarwal, and Martin Rinard
CSAIL, Massachusetts Institute of Technology, Cambridge, MA 02139

2 Outline
Introduction/Motivation
  – Problem
  – Solution: Loop Perforation
Loop Perforation
  – Finding Loops to Perforate
  – Controlling Perforation Dynamically
Experiments
  – Using Perforation to Adapt to Faults
Conclusion

3 Problem
Program is too slow
Misses real-time deadlines

4 Solution: Loop Perforation
Loop Perforation:
  – Do not execute all iterations
  – Skip some instead
Steps:
  – Profile the program
  – Find the loops that take the most time
  – Perforate those loops
Original loop:
    for (i = 0; i < n; i++) { … }
Perforated loop:
    for (i = 0; i < n; i += 2) { … }
A Perforated Program:
  – Consumes fewer computational resources
  – Runs faster, consumes less energy, or both
  – Can meet its real-time deadlines!
Perforate: to make a hole through an object or structure
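As a concrete illustration of the loop transformation on this slide, here is a minimal sketch that perforates a simple averaging loop. The function name `perforated_mean` and the `stride` parameter are hypothetical and do not come from the talk; a stride of 1 corresponds to the original loop and a stride of 2 to the perforated loop shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: mean of data[0..n-1], visiting every `stride`-th
 * element. stride == 1 runs all iterations (the original loop);
 * stride == 2 skips every other iteration (the perforated loop). */
double perforated_mean(const double *data, size_t n, size_t stride)
{
    assert(n > 0 && stride > 0);
    double sum = 0.0;
    size_t visited = 0;
    for (size_t i = 0; i < n; i += stride) {
        sum += data[i];
        visited++;
    }
    /* Divide by the number of iterations actually executed. */
    return sum / (double)visited;
}
```

With the input 1, 2, …, 8, the full loop returns 4.5 and the stride-2 loop returns 4.0: half the work for a result that is close but not identical, which is the trade-off the slide describes.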

5 Loop Perforation (cont’d)
[Diagram labels: Maintain Acceptable Quality of Service, Increase Speed, Don’t Perforate, Perforate, ?]
Q: Won’t perforation change the result?
A: Yes, so we target applications that have a range of acceptable outputs

6 Static vs. Dynamic Perforation
Static loop perforation
  – Speeds up an application for some QoS loss
  – Allows applications to be repurposed, e.g., a broadcast video encoder can be transitioned to video conferencing
Dynamic loop perforation
  – Allows full QoS unless something bad happens
  – When something bad happens, the system adapts to maintain speed
Determine which loops to perforate using profiling
Our implemented system supports both static and dynamic perforation; this talk focuses on dynamic perforation

7 Outline
Introduction/Motivation
  – Problem
  – Solution: Loop Perforation
Loop Perforation
  – Finding Loops to Perforate
  – Controlling Perforation Dynamically
Experiments
  – Using Perforation to Adapt to Faults
Conclusion

8 A Perforating Compiler
Responsibility of user (provided as input to the perforating compiler):
  – C/C++ program
  – Representative inputs
  – QoS metric and bound (QoS bound: the maximum acceptable loss of QoS)
Perforating compiler:
  – Stages: profile program, find costly loops, perforate, analyze QoS
  – Maximizes speedup for the QoS bound
  – Discards loops which cause: slowdown, unacceptable QoS loss, dynamic errors in Valgrind
Result: set of perforatable loops
  – Speed up the application given the QoS bound
  – Perforation may be dynamic
This process is discussed in detail in: Misailovic, Sidiroglou, Hoffmann, Rinard. Quality of Service Profiling. To appear, ICSE 2010.
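The discard rules on this slide can be sketched as a simple filter over per-loop profiling results. This is only an illustration under stated assumptions: the `loop_profile` struct, its fields, and `select_perforatable` are hypothetical names, and the sketch judges each loop independently, whereas the actual search for the best speedup under the QoS bound is described in the cited ICSE paper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of what profiling one perforated loop produced. */
typedef struct {
    int    loop_id;
    double speedup;     /* original time / perforated time */
    double qos_loss;    /* fraction of QoS lost, e.g. 0.05 for 5% */
    int    had_errors;  /* nonzero if a memory checker reported errors */
} loop_profile;

/* Copy into `out` the loops that survive the three discard rules:
 * no slowdown, QoS loss within the bound, no dynamic errors.
 * `out` must have room for n entries. Returns the number kept. */
size_t select_perforatable(const loop_profile *in, size_t n,
                           double qos_bound, loop_profile *out)
{
    assert(qos_bound >= 0.0);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].speedup <= 1.0)       continue; /* causes slowdown */
        if (in[i].qos_loss > qos_bound) continue; /* unacceptable QoS loss */
        if (in[i].had_errors)           continue; /* dynamic errors */
        out[kept++] = in[i];
    }
    return kept;
}
```

A caller would fill the array from training-input runs and pass the user's QoS bound; only the surviving loops would then be candidates for static or dynamic perforation.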

9 Use PARSEC Benchmarks to Test Approach
PARSEC benchmarks* represent emerging workloads
We pick seven benchmark applications for which we can define a QoS metric
  – x264 (H.264 video encoding)
  – bodytrack (human movement tracking)
  – swaptions (financial analysis)
  – ferret (content-based similarity search)
  – canneal (engineering: circuit place & route)
  – blackscholes (financial analysis)
  – streamcluster (online approx. of k-means)
We augment the benchmark suite with additional data sets and divide the inputs into
  – Training (about 25% of inputs)
  – Production (remaining 75% of inputs)
*http://parsec.cs.princeton.edu/

10 Performance/QoS Tradeoffs for PARSEC Benchmarks

11 Dynamically Controlling Perforation
Application registers a heartbeat using the Application Heartbeats API*
Runtime monitors the heartbeat
Heartbeat too slow?
  – Increase perforation to trade QoS for increased performance
Heartbeat too fast?
  – Decrease perforation to reclaim QoS
[Diagram labels: Application (Loop 1, Loop 2, Loop i), Heartbeat API, Perforation Selection, Runtime Monitor]
*Hoffmann, Eastep, Santambrogio, Miller, Agarwal. Application Heartbeats for Software Performance and Health. PPoPP 2010
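The feedback rule on this slide can be sketched as a small step controller. The names below (`perf_state`, `adjust_perforation`) are hypothetical and are not part of the Application Heartbeats API; the sketch assumes the runtime already has a measured heart rate and a target window, and that perforation is expressed as an integer level where 0 means no perforation.

```c
#include <assert.h>

/* Hypothetical controller state: current and maximum perforation level. */
typedef struct {
    int level;      /* 0 = run every iteration (full QoS) */
    int max_level;  /* most aggressive perforation allowed */
} perf_state;

/* One control step: compare the measured heart rate with the target
 * window and move the perforation level by at most one.
 * Returns the new level. */
int adjust_perforation(perf_state *s, double heart_rate,
                       double target_min, double target_max)
{
    assert(s != 0 && target_min <= target_max && s->max_level >= 0);
    if (heart_rate < target_min && s->level < s->max_level) {
        s->level++;   /* too slow: trade QoS for performance */
    } else if (heart_rate > target_max && s->level > 0) {
        s->level--;   /* too fast: reclaim QoS */
    }
    return s->level;
}
```

For a 30 beats-per-second baseline, the evaluation goal of 0.95x to 1.1x stated later in the talk would correspond to a window of 28.5 to 33. Moving one level per step is a design choice made here for simplicity; the talk does not say how the real runtime picks its step size.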

12 Outline
Introduction/Motivation
  – Problem
  – Solution: Loop Perforation
Loop Perforation
  – Finding Loops to Perforate
  – Controlling Perforation Dynamically
Experiments
  – Using Perforation to Adapt to Faults
Conclusion

13 Evaluation Methodology
Two applications (from the PARSEC benchmark suite):
  – x264 (media application that performs H.264 video encoding)
  – bodytrack (computer vision application that tracks a body through a scene)
Two changing environments:
  – Core Failure: during execution, 3 of 8 cores fail
  – Frequency Scaling: during execution, clock frequency rises and falls
For each app and scenario:
  – Goal: keep performance within 0.95x to 1.1x that of the system with no failures
  – Measure:
      baseline performance (no failure)
      performance with failure and no perforation
      performance with failure and dynamic perforation

14 x264 Core Loss Experiment
Lose 3 of 8 cores

15 bodytrack Core Loss Experiment
Lose 3 of 8 cores

16 bodytrack Results (Core Failure)
Maintains track on head, chest, and legs despite loss of 37.5% of compute
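The 37.5% figure is 3 lost cores out of 8. The sketch below works through that arithmetic and estimates how much speedup perforation would have to supply to hold performance steady. It rests on an assumption that is not made in the talk: that throughput scales linearly with the remaining capacity (core count or clock frequency). Both function names are hypothetical.

```c
#include <assert.h>

/* Fraction of compute capacity lost, e.g. going from 8 cores to 5. */
double fraction_lost(double capacity_before, double capacity_after)
{
    assert(capacity_before > 0.0 && capacity_after >= 0.0);
    return 1.0 - capacity_after / capacity_before;
}

/* Speedup needed to restore the original throughput, ASSUMING
 * throughput is proportional to capacity (an idealization). */
double required_speedup(double capacity_before, double capacity_after)
{
    assert(capacity_before > 0.0 && capacity_after > 0.0);
    return capacity_before / capacity_after;
}
```

Under that idealization, losing 3 of 8 cores calls for about a 1.6x speedup, and a frequency drop from 2.53 GHz to 1.6 GHz calls for about 1.58x. Real applications rarely scale perfectly, so these are rough targets rather than measured values.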

17 x264 Frequency Scaling Experiment
Frequency Drops (2.53 GHz → 1.6 GHz)
Frequency Rises (1.6 GHz → 2.53 GHz)

18 bodytrack Frequency Scaling Experiment
Frequency Drops (2.53 GHz → 1.6 GHz)
Frequency Rises (1.6 GHz → 2.53 GHz)

19 bodytrack Results (Frequency Scaling)
Perforation allows app to maintain track while frequency is low. When frequency rises again, high-quality track is reestablished.

20 Conclusion
Presented loop perforation
  – Speed up programs by making performance/QoS tradeoffs
  – Showed as much as 2x speedup for 5% degradation in QoS
Presented dynamic loop perforation
  – Allows the system to detect performance loss and respond by perforating loops
  – Maintains performance in a changing environment
  – Can respond to any environmental change that affects performance
More detail on dynamic perforation available in: Hoffmann, Misailovic, Sidiroglou, Agarwal, Rinard. Using Code Perforation to Improve Performance, Reduce Energy Consumption, and Respond to Failures. MIT-CSAIL-TR-2009-042. August, 2009.

21 Backup

22 Perforatable Loops in PARSEC Benchmarks
[Chart; axis label: Number of loops]

23 x264, Training

24 x264, Production

25 x264 Encoder
[Diagram labels: Uncompressed Video Frame Sequence, Compressed Video Stream]

26 Motion Estimation
[Diagram labels: Reference Frame, Current Frame, ?]
All Perforated Loops Are In Motion Estimation Computation

27 x264 Loop Nest
Sum of Hadamard transformed differences loop nest (computes match metric between cur and ref blocks)
    short temp[4][4];
    for (i = 0; i < h; i += 4) {
        for (j = 0; j < w; j += 4) {
            element_wise_subtract(temp, cur, ref, cs, rs);
            hadamard_transform(temp, 4);
            value += sum_abs_matrix(temp, 4);
        }
        cur += 4*cs; ref += 4*rs;
    }
    return value;

28 Perforated x264 Loop Nest
Sum of Hadamard transformed differences loop nest (computes match metric between cur and ref blocks)
    short temp[4][4];
    for (i = 0; i < h; i += 8) {
        for (j = 0; j < w; j += 8) {
            element_wise_subtract(temp, cur, ref, cs, rs);
            hadamard_transform(temp, 4);
            value += sum_abs_matrix(temp, 4);
        }
        cur += 4*cs; ref += 4*rs;
    }
    return value;
Perforation Effect
  – New block match metric
  – Uses block with best match (as measured by metric)
  – New metric works fine
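To make the effect of the larger step concrete, here is a self-contained sketch of a sum of absolute Hadamard-transformed differences over 4x4 blocks, with the block step as a parameter. It is not the x264 source: the names `hadamard4x4` and `satd_blocks` are hypothetical, blocks are addressed directly by (i, j) rather than by advancing pointers as on the slide, and no normalization is applied to the result.

```c
#include <assert.h>
#include <stdlib.h>

/* In-place 4x4 Hadamard transform: butterflies on rows, then columns. */
static void hadamard4x4(int d[4][4])
{
    for (int r = 0; r < 4; r++) {
        int a = d[r][0] + d[r][1], b = d[r][0] - d[r][1];
        int c = d[r][2] + d[r][3], e = d[r][2] - d[r][3];
        d[r][0] = a + c; d[r][1] = b + e;
        d[r][2] = a - c; d[r][3] = b - e;
    }
    for (int k = 0; k < 4; k++) {
        int a = d[0][k] + d[1][k], b = d[0][k] - d[1][k];
        int c = d[2][k] + d[3][k], e = d[2][k] - d[3][k];
        d[0][k] = a + c; d[1][k] = b + e;
        d[2][k] = a - c; d[3][k] = b - e;
    }
}

/* Match metric between cur and ref over a w x h region.
 * cs and rs are the row strides of cur and ref.
 * step == 4 visits every 4x4 block; step == 8 is the perforated
 * variant that visits every other block in each dimension. */
int satd_blocks(const unsigned char *cur, int cs,
                const unsigned char *ref, int rs,
                int w, int h, int step)
{
    assert(step >= 4 && step % 4 == 0);
    int value = 0;
    for (int i = 0; i + 4 <= h; i += step) {
        for (int j = 0; j + 4 <= w; j += step) {
            int d[4][4];
            for (int r = 0; r < 4; r++)
                for (int c = 0; c < 4; c++)
                    d[r][c] = cur[(i + r) * cs + j + c]
                            - ref[(i + r) * rs + j + c];
            hadamard4x4(d);
            for (int r = 0; r < 4; r++)
                for (int c = 0; c < 4; c++)
                    value += abs(d[r][c]);
        }
    }
    return value;
}
```

On an 8x8 region the perforated call does a quarter of the block work. The two variants return different absolute values, but motion estimation only compares candidates against each other using the same metric, which is consistent with the slide's observation that the new metric works fine.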

29 Why Not Just Skip Motion Estimation?
Runs 6.8 times faster
But encoded video is 3.55 times bigger!

30 bodytrack Training

31 bodytrack Production

32 bodytrack
Particle method
Annealing layers
Dispersed particles
Compute with particles

33 bodytrack
Next annealing layer
Particle dispersion affected by previous layer
Continue until done with annealing layers

34 bodytrack Loop
    for (i = 0; i < layers; i++) {
        disperse particles for layer
        do particle computation
    }

35 Perforated bodytrack Loop
    for (i = 0; i < layers; i += 2) {
        disperse particles for layer
        do particle computation
    }
Perforation Effect
  – Perform fewer annealing layers
  – Perform less work, finish faster

36 Other Perforated Loops in bodytrack
Concepts
  – bodytrack maintains a probabilistic model of where body parts are in the previous frame
  – Reads image data from 4 cameras
  – Performs image processing to get information about where it thinks the body is in the current frame
  – Computes a probabilistic model for the current frame
Many perforated loops in error calculations
  – Between the probabilistic model from the previous frame
  – And image data from the current frame
  – Used to obtain the probabilistic model for the current frame

37 Perforated Image Quality
Panning camera

