Download presentation

Presentation is loading. Please wait.

Published bySidney Cleve Modified over 2 years ago

1
Whole-Program Linear-Constant Analysis with Applications to Link-Time Optimization Ludo Van Put – Dominique Chanet – Koen De Bosschere Ghent University http://diablo.elis.ugent.be

2
Ludo Van Put - Ghent University - SCOPES 2007 2 Link-time optimization compiler linker executable program object code libraries link-time optimizer modified program

3
Ludo Van Put - Ghent University - SCOPES 2007 3 Link-time optimization executable program modified program link-time binary rewriting Interesting for embedded systems, e.g. ARM [De Bus et al., 2004]: Reduce code size (~15%) Improve performance (~10%) Reduce energy consumption (~8%) Can we go even further?

4
Ludo Van Put - Ghent University - SCOPES 2007 4 Outline motivation theory –linear-constant equations & analysis practice –data structure & operations experience –ARM optimizations conclusion

5
Ludo Van Put - Ghent University - SCOPES 2007 5 We could do better but… memory & stack: black box low level code: no compile time information need more (complex) analyses to further exploit whole- program overview

6
Ludo Van Put - Ghent University - SCOPES 2007 6 Propagate linear expressions machine instructions introduce simple relations between registers: e.g. ADD r0,r4,#4 → r0 - r4 + 4 = 0 constant information we can propagate through the graph and use it not a new idea [Karr, 1976], but a different context: huge graph, simple instructions, fixed number of registers, explicit memory accesses & addresses →less general relations and less complex computations

7
Ludo Van Put - Ghent University - SCOPES 2007 7 Dataflow analysis program statement data set A data set A’ Domain: sets of linear-constant equations, partially ordered under set inclusion of their ‘closure’, {r0 - r1+ a = 0} {r0 - r1 + a = 0, r1 - r2 + b = 0} + operator ‘intersection of closures’ : semi-lattice Transfer function: transforms the data set, reflects program semantics Program instructions invalidate and create linear- constant equations.

8
Ludo Van Put - Ghent University - SCOPES 2007 8 Linear-constant equations Here: y = x + c, y and x registers, c integer constant Combining linear-constant equations: x 1 – y 1 + c 1 = 0 + x 2 – y 2 + c 2 = 0 then x 1 = y 2 or y 1 = x 2 Closure of set: all combinations Restriction: underdetermined or exactly determined sets –by construction in straight-line code –by intersection of closure at merge points

9
Ludo Van Put - Ghent University - SCOPES 2007 9 Data set representation assign registers unique number: r 0 …r n-1 add virtual zero-valued register, r : – MOV r0, #8 → r 0 - r - 8 = 0 –constant propagation normalize set of equations (r x – r y + c = 0) r 0 - r - 8 = 0 r 1 - r - 4 = 0 r 3 – r 2 – 12 = 0 r 0 – r 1 - 4 = 0 r 1 - r - 4 = 0 r 2 – r 3 + 12 = 0 ≠≠ x < y

10
Ludo Van Put - Ghent University - SCOPES 2007 10 What about addresses? Graph representation: addresses are meaningless, symbolic references instead: ADDR r2, $reference Many addressproducers at link-time: optimization opportunity Extend linear-constant equations: r x – r y + c – addr x + addr y = 0 Combination requirement: addr x1 = addr y2 or addr y1 = addr x2 r 0 - r - 8 = 0 r 1 - r - 4 – ref 1 = 0 r 3 – r 2 – 12 = 0 r 0 – r 1 - 4 + ref 1 = 0 r 1 - r - 4 – ref 1 = 0 r 2 – r 3 + 12 = 0

11
Ludo Van Put - Ghent University - SCOPES 2007 11 Compact, fast data structure at most n equations: fixed-size array 0: r 0 – r 1 - 4 + ref 1 = 0 1: r 1 - r - 4 – ref 1 = 0 2: r 2 – r 3 + 12 = 0 3: Redundant! Reuse it for speedup 0: r 0 r 1 -4 ref 1 1: r 0 r -4 ref 1 2: r 2 r 3 +12 3: r 2

12
Ludo Van Put - Ghent University - SCOPES 2007 12 Operations lookup: index, O(1) remove register: index, combination, O(1) insert equation –r x = r y : change constant c, O(1) –r x ≠ r y : combine until normalized, O(n) meet operation: compute closures, intersect closures and normalize resulting set, each O(n 2 )

13
Ludo Van Put - Ghent University - SCOPES 2007 13 Example applications copy elimination, remove redundant code stack analysis & dead spill code elimination

14
Ludo Van Put - Ghent University - SCOPES 2007 14 Remove redundant code

15
Ludo Van Put - Ghent University - SCOPES 2007 15 Dead spill code ARM spill code: multiple-load & multiple-stores, e.g. STMDB sp!,{r4,r5,r6,r7,ra} saves 5 registers on the stack What if (one of) these registers is dead? Can we remove it from the list? Arguments are read using explicit offsets argument spilled registers local registers sp argument spilled registers local registers sp

16
Ludo Van Put - Ghent University - SCOPES 2007 16 Stack analysis at start of a procedure: {r sp - r - ref sp = 0} propagate through procedure, assuming: –function calls restore r sp –spilled registers & local variables cannot be accessed by callee mark instructions that use r sp + offset remove spill code & adjust offsets where needed proc entry LDR r2,[sp,#32] proc exit … … r sp - r - ref sp = 0 ?

17
Ludo Van Put - Ghent University - SCOPES 2007 17 Compaction & power savings ARM ADS 1.1 toolchain ‘–Os’, benchmarks from De Bus et al., 2004 without lcawith lca & optimizations % of original

18
Ludo Van Put - Ghent University - SCOPES 2007 18 Conclusion fast and practical link-time analysis enabler for new analyses and optimizations further reduction in code size & energy consumption program understanding, stack analysis

19
Ludo Van Put - Ghent University - SCOPES 2007 19 http://diablo.elis.ugent.be

20
Ludo Van Put - Ghent University - SCOPES 2007 20 Compaction & power savings D: Diablo without linear-constant analysis D+CA: Diablo with linear-constant analysis Toolchain: ARM ADS1.1

21
Ludo Van Put - Ghent University - SCOPES 2007 21 x86 argument forwarding x86 has few general purpose registers many constants x86 passes arguments via the stack: limitation for constant propagation stack analysis detects argument definitions & uses push %ebp mov %esp,%ebp mov 0x8(%ebp),%eax … push $0x4 call …

22
Ludo Van Put - Ghent University - SCOPES 2007 22 Results

Similar presentations

OK

Improving code generation. Better code generation requires greater context Over expressions: optimal ordering of subtrees Over basic blocks: Common subexpression.

Improving code generation. Better code generation requires greater context Over expressions: optimal ordering of subtrees Over basic blocks: Common subexpression.

© 2018 SlidePlayer.com Inc.

All rights reserved.

Ads by Google

Ppt on forward rate agreement Ppt on topic health and medicine Ppt on types of web browser Ppt on google driverless car Ppt on information technology in education Ppt on total parenteral nutrition administration Agriculture templates free download ppt on pollution Ppt on solar energy and its applications Ppt on hydrogen fuel cell Ppt on file tracking system