Presentation is loading. Please wait.

Presentation is loading. Please wait.

Fakultät für informatik informatik 12 technische universität dortmund Worst-Case Execution Time Analysis - Session 19 - Heiko Falk TU Dortmund Informatik.

Similar presentations


Presentation on theme: "Fakultät für informatik informatik 12 technische universität dortmund Worst-Case Execution Time Analysis - Session 19 - Heiko Falk TU Dortmund Informatik."— Presentation transcript:

1 fakultät für informatik informatik 12 technische universität dortmund Worst-Case Execution Time Analysis - Session 19 - Heiko Falk TU Dortmund Informatik 12 Germany Slides use Microsoft cliparts. All Microsoft restrictions apply.

2 - 2 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Schedule of the course TimeMondayTuesdayWednesdayThursdayFriday 09:30- 11:00 1: Orientation, introduction 2: Models of computation + specs 5: Models of computation + specs 9: Mapping of applications to platforms 13: Memory aware compilation 17: Memory aware compilation 11:00 Brief break 11:15- 12:30 6: Lab*: Ptolemy 10: Lab*: Scheduling 14: Lab*: Mem. opt. 18: Lab*: Mem. opt. 12:30Lunch 14:00- 15:20 3: Models of computation + specs 7: Mapping of applications to platforms 11: High-level optimizations* 15: Memory aware compilation 19: WCET & compilers* 15:20Break 15:40- 17:00 4: Lab*: Kahn process networks 8: Mapping of applications to platforms 12: High-level optimizations* 16: Memory aware compilation 20: Wrap-up * Dr. Heiko Falk

3 - 3 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Outline Worst-Case Execution Time  Definition  Static WCET Analysis  WCET-aware Compilation and Optimization Function Specialization / Procedure Cloning  Introduction and Code Examples  Function Specialization and WCET EST  Results References & Summary

4 - 4 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Worst-Case Execution Time Definition of the Worst-Case Execution Time (WCET):  Upper bound for the execution time of a program  irrespective of the program’s input data. Problem: Determination of a program’s actual WCET is not computable! (WCET computation can be reduced to the Halting Problem)

5 - 5 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Estimation of Worst-Case Execution Times Solution: Estimation of upper bounds for the actual (unknown) WCET Requirements on WCET estimates:  Safeness: WCET  WCET EST !  Tightness: WCET EST – WCET  minimal

6 - 6 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Workflow of the static WCET-Analyzer aiT Input: Binary executable program P to be analyzed. exec2crl: Disassembler, translates P into aiT’s own LIR CRL2. Value Analysis: Determines potential contents of processor registers for any point in P’s execution time. Note: P is never executed, emulated or simulated by aiT! Only static analyses are applied.

7 - 7 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Workflow of the static WCET-Analyzer aiT Loop Bound Analysis: Tries to determine lower and upper iteration bounds for each loop of P. Cache Analysis: Uses formal cache model, classifies each memory access within P as guaranteed cache hit or as cache miss.

8 - 8 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Workflow of the static WCET-Analyzer aiT Pipeline Analysis: Includes accurate model of processor pipeline. Depending on initial pipeline states, possible cache states etc., possible final pipeline states are determined per basic block. Result of pipeline analysis is the WCET EST of each basic block.

9 - 9 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Workflow of the static WCET-Analyzer aiT Path Analysis: Models all possible execution paths of P under consideration of the WCET EST of all basic blocks and computes the overall longest execution path leading to the global WCET EST of P. Output of aiT is e.g. the length of this longest path, i.e. the WCET EST of P.

10 - 10 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Complexity of Static WCET-Analysis Problem: WCET analysis intractable for computers! If WCET were computable, one could solve the Halting Problem by checking in O(1) if WCET <  holds. Reason: It is not computable how long P stays in a loop. Automatic loop bound analysis only feasible for simple classes of loops. (Similar problems arise for recursive functions.)

11 - 11 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund User Annotations for WCET Analysis Way out: aiT user must provide information about e.g. minimal / maximal loop iterations and recursion depths. Annotations file: Contains such user annotations (“Flow Facts”) and is – amongst program P to be analyzed – one more input to aiT. [AbsInt Angewandte Informatik GmbH, http://www.absint.com ]

12 - 12 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund ICD-C Parser ANSI C ICD-C IR Code Selector LLIR Register Allocator LLIR  CRL2 aiT WCET Analysis CRL2 + WCET EST CRL2  LLIR WCET- Opt. ASM Analyses, Optimi- zations A WCET-Aware C-Compiler (WCC) Distinctive Feature: Tight integration of WCET analyzer aiT into WCC’s backend.

13 - 13 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Tight Coupling WCC  aiT CRL2: Low-level intermediate representation (LIR) of aiT. Structure of CRL2: Control Flow Graph (CFG) consisting of routines, basic blocks, instructions and operations. WCET analysis applied to low-level machine code  coupling WCC  aiT done in compiler backend. Converter LLIR2CRL: translates WCC’s LIR into CRL2, executes aiT in the background. After successful WCET analysis: relevant WCET data extracted from CRL2, imported into WCC.

14 - 14 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Imported WCET Data & Flow Facts in WCC Most important WCET data imported into WCC:  WCET EST of the entire program  WCET EST of each single basic block  Worst-Case execution frequencies of each CFG edge Flow Fact Annotation within WCC:  Annotation of e.g. loop iteration bounds directly in C source code: _Pragma( “loopbound min 10 max 10” );  Since compiler optimizations may restructure loops and thus their annotated bounds, WCC automatically keeps Flow Facts consistent during all applied optimizations.

15 - 15 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Problems during WCET EST Minimization The Worst-Case Execution Path (WCEP):  WCET of a program P = length of longest execution path of P (WCEP)  To minimize P’s WCET EST, optimizations must exclusively focus on those parts of P lying on the WCEP.  Optimization of parts not lying on the WCEP don’t reduce WCET EST at all!  Optimization strategies for WCET EST Minimization must have detailed knowledge about the WCEP. WCET Analyzer aiT provides this detailed information about the WCEP, but…

16 - 16 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Switching WCEPs (1) main() { if ( … ) a(); else d(); return; } a() { … b(); … } b() { … c(); … } d() { … c(); … } main a b c d 10 Cyc. 50 Cyc. 80 Cyc. 65 Cyc. 120 Cyc. WCET EST of a for a in Main Mem. Example: Scratchpad allocation of functions for WCET EST minimization Above: Function Call Graph

17 - 17 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Switching WCEPs (2) main() { if ( … ) a(); else d(); return; } a() { … b(); … } b() { … c(); … } d() { … c(); … } main a b c d 10 Cyc. 50 Cyc. 80 Cyc. 65 Cyc. 120 Cyc. WCET EST = 205 Cyc. Initial WCEP: main, a, b, c Length of WCEP = WCET EST : 205 SPM-Allocation of b reduces WCET EST of b down to 40

18 - 18 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Switching WCEPs (3) main() { if ( … ) a(); else d(); return; } a() { … b(); … } b() { … c(); … } d() { … c(); … } main a b c d 10 Cyc. 50 Cyc. 40 Cyc. 65 Cyc. 120 Cyc. Initial WCEP: main, a, b, c Length of WCEP = WCET EST : 205 SPM-Allocation of b reduces WCET EST of b down to 40

19 - 19 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Switching WCEPs (4) main() { if ( … ) a(); else d(); return; } a() { … b(); … } b() { … c(); … } d() { … c(); … } main a b c d 10 Cyc. 50 Cyc. 40 Cyc. 65 Cyc. 120 Cyc. WCET EST = 195 Cyc. New WCEP: main, d, c New WCET EST : 195  WCEP has switched as side effect of optimization!

20 - 20 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Consequences for Compiler Optimizations Optimizations aiming at WCET EST minimization…  …always have to consider that the WCEP may switch after each and every decision taken by the optimization.  …should not take their decision where to optimize on the basis of local information. Instead, they should always consider the global effects of a decision. (SPM allocation of b in previous example leads to a local reduction of b ’s WCET EST by 40 cycles. However, a global reduction of only 10 cycles was achieved!) Challenge: To design completely new optimization strategies matching the above requirements and always considering the global CFG with its current WCEP.

21 - 21 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Function Specialization Also known as “Procedure Cloning”:  Well-known optimization technique (1993)  Goals: To enable further standard optimizations and to reduce overhead for parameter passing during function calls  Approach: For worthwhile functions, specialized copies (“Clones”) are created having less parameters than their original versions.

22 - 22 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Typical Function Calls int f( int *x, int n, int p ) { for (i=0; i<n; i++) { x[i] = p * x[i]; if ( i == 10 ) {... } } return x[n]; } int main() {... f( y, 5, 2 )...... f( z, 5, 2 )... return f( a, 5, 2 ); } Observations:  f is called several times.  f has three parameters.  Constants 5 and 2 are passed several times for parameters n and p.

23 - 23 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Further Observations int f( int *x, int n, int p ) { for (i=0; i<n; i++) { x[i] = p * x[i]; if ( i == 10 ) {... } } return x[n]; } int main() {... f( y, 5, 2 )...... f( z, 5, 2 )... return f( a, 5, 2 ); } f is so-called general purpose function: Via parameter n, an array x of arbitrary size can be processed. Control flow within f depends on function parameter. Within main, f is used for special purpose: Processing of arrays of size n = 5.

24 - 24 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund int f( int *x, int n, int p ) { for (i=0; i<n; i++) { x[i] = p * x[i]; if ( i == 10 ) {... } } return x[n]; } int main() {... f_5_2( y )...... f_5_2( z )... return f_5_2( a ); } int f_5_2( int *x ) { for (i=0; i<5; i++) { x[i] = 2 * x[i]; if ( i == 10 ) {... } } return x[5]; } Specialization of General Purpose Routines

25 - 25 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Expected Effects of Function Specialization Reduction of the Average-Case Execution Time (ACET):  Enabling of further standard optimizations in cloned function: Constant propagation and constant folding Strength reduction Control flow simplification  Less code to be executed in order to pass arguments to cloned function. Increase of Code Size:  Potentially, many new highly specialized clones are generated.

26 - 26 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Why Function Specialization and WCET EST ? Motivation:  In Embedded Software: General Purpose functions heavily used in special purpose contexts.  In particular, loop bounds are very often controlled by function parameters.  Recall: Loop bounds are extremely critical for a precise WCET analysis.  Function Specialization enables the extremely precise annotation of loop bounds! [P. Lokuciejewski, H. Falk et al., Influence of Procedure Cloning on WCET Prediction, Salzburg, CODES+ISSS 2007]

27 - 27 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Loop Bound Annotations & Cloning int f( int *x, int n, int p ) { // loopbound min 0, max n for (i=0; i<n; i++ ) { x[i] = p * x[i]; if ( i == 10 ) {... } } return x[n]; } Original Code: Loop annotations extremely imprecise since annotation must cover all calls of f in a safe way. If f is called somewhere with n = 2000 as maximal value, the annotation must always consider a maximum of 2000 iterations. For calls like e.g. f(a, 5, 2), 2000 iterations are considered as well  Massive WCET overestimation.

28 - 28 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Loop Bound Annotations & Cloning int f_5_2( int *x ) { // loopbound min 5, max 5 for (i=0; i<5; i++) { x[i] = 2 * x[i]; if ( i == 10 ) {... } } return x[5]; } Specialized Code: Loop annotations extremely precise since annotation covers all calls of f_5_2 exactly. For calls like e.g. f_5_2(a), exactly 5 iterations are considered  Exact WCET estimation. Original function f or other additional clones of f fully independent of annotations within f_5_2  no interdependencies during WCET analysis for all these different clones of f.

29 - 29 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Workflow of Function Specialization Attention when applying Function Specialization:  Code size increases should be kept clearly in mind during specialization!  Cloning of each function which is called at least twice with the same constant parameter usually unacceptable. Parameterized Application of Function Specialization:  Size G: Never specialize a function f having a size larger than G.  Fraction K of same constant parameters: clone f only iff at least K% of all calls of f use the same constant parameters.

30 - 30 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Workflow of Function Specialization Function Specialization Const. Folding & Propagation Strength Reduction If-Statement Optimization G = 2000 ICD-C Expressions K = 50% Considered Processors: Infineon TriCore TC1796 ARM 7 TDMI (ARM & THUMB instruction sets)

31 - 31 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Relative WCET EST after Function Specialization WCET EST improvements between 13% and up to 95%!

32 - 32 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Relative ACETs after Function Specialization Only marginal ACET improvements of max. 3%.  Influence of overhead for parameter passing and effect of successive optimizations on ACET obviously marginal.

33 - 33 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Discussion of Results Very astonishing: How can one optimization impact two apparently similar objectives like WCET and ACET so differently? Reasons:  Code before specialization only poorly annotatable  aiT provides very imprecise WCET estimates  Code after specialization very precisely annotatable  aiT provides highly precise WCET estimates Function Specialization obviously effective in increasing the tightness of WCET EST (cf. Slide 4). The actually unknown (since intractable) WCET should be reduced by only a similar order of magnitude as the ACET.

34 - 34 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Relative Code Sizes after Specialization Code size increases between 2% up to 325%!

35 - 35 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund References Function Specialization:  D. Bacon, S. Graham, O. Sharp, Compiler Transformations for High-Performance Computing, ACM Computing Surveys 26 (4), 1994.  P. Lokuciejewski, H. Falk, M. Schwarzer et al., Influence of Procedure Cloning on WCET Prediction, CODES+ISSS, Salzburg 2007.

36 - 36 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Summary WCET  Upper bound of a program’s execution time  Impossible to compute a program’s actual WCET  Static WCET analysis using aiT Function Specialization / Procedure Cloning  Specialization of general purpose functions  Enabling of standard compiler optimizations in specialized functions, simplification of code required for parameter passing  Small impact on ACET, huge impact on WCET EST

37 - 37 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund 12th International Workshop on Software & Compilers for Embedded Systems  April 23-24, 2009 Acropolis, Nice, France  Co-located with DATE 2009  Paper Submission: around mid November 2008 Website & CfP: http://www.scopesconf.org Mailing List: http://www.scopesconf.org/list

38 - 38 - technische universität dortmund fakultät für informatik  h. falk, informatik 12, 2008 TU Dortmund Coffee/tea break (if on schedule) Q&A?


Download ppt "Fakultät für informatik informatik 12 technische universität dortmund Worst-Case Execution Time Analysis - Session 19 - Heiko Falk TU Dortmund Informatik."

Similar presentations


Ads by Google