
1 CS308 Compiler Theory (lec00-outline)

2 Syntax-Directed Translation
Grammar symbols are associated with attributes to carry information about the programming-language constructs they represent. The values of these attributes are computed by the semantic rules associated with the production rules. Evaluating these semantic rules may generate intermediate code, put information into the symbol table, perform type checking, issue error messages, or perform other activities; in fact, they may perform almost any activity. An attribute may hold almost anything: a string, a number, a memory location, a complex record.

3 Syntax-Directed Definitions and Translation Schemes
When we associate semantic rules with productions, we use two notations: syntax-directed definitions and translation schemes. Syntax-directed definitions give high-level specifications for translations and hide many implementation details, such as the order of evaluation of semantic actions: we associate a production rule with a set of semantic actions, but we do not say when they will be evaluated. Translation schemes indicate the order of evaluation of the semantic actions associated with a production rule; in other words, translation schemes expose some implementation details.

4 Syntax-Directed Definitions
A syntax-directed definition is a generalization of a context-free grammar in which: Each grammar symbol is associated with a set of attributes, partitioned into two subsets called the synthesized and inherited attributes of that symbol. Each production rule is associated with a set of semantic rules. Semantic rules set up dependencies between attributes, which can be represented by a dependency graph; this dependency graph determines the evaluation order of the semantic rules. Evaluating a semantic rule defines the value of an attribute, but a semantic rule may also have side effects, such as printing a value.

5 Syntax-Directed Definition -- Example
Production        Semantic Rule
L → E return      print(E.val)
E → E1 + T        E.val = E1.val + T.val
E → T             E.val = T.val
T → T1 * F        T.val = T1.val * F.val
T → F             T.val = F.val
F → ( E )         F.val = E.val
F → digit         F.val = digit.lexval

Symbols E, T, and F are associated with a synthesized attribute val. The token digit has a synthesized attribute lexval (assumed to be supplied by the lexical analyzer).
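As a concrete illustration (not from the slides), here is a minimal Python sketch that evaluates this definition with a recursive-descent parser; the tokenizer and the Parser class are illustrative assumptions, and the left-recursive productions are unrolled into loops:

    # Sketch: evaluating the SDD above during recursive-descent parsing.
    def tokenize(s):
        return [c for c in s if not c.isspace()]

    class Parser:
        def __init__(self, tokens):
            self.toks, self.pos = tokens, 0
        def peek(self):
            return self.toks[self.pos] if self.pos < len(self.toks) else None
        def eat(self, t):
            assert self.peek() == t, "expected " + t
            self.pos += 1
        def F(self):                      # F -> ( E ) | digit
            if self.peek() == '(':
                self.eat('('); v = self.E(); self.eat(')')
                return v                  # F.val = E.val
            v = int(self.toks[self.pos])  # F.val = digit.lexval
            self.pos += 1
            return v
        def T(self):                      # T -> T1 * F | F (left recursion unrolled)
            v = self.F()
            while self.peek() == '*':
                self.eat('*'); v = v * self.F()   # T.val = T1.val * F.val
            return v
        def E(self):                      # E -> E1 + T | T
            v = self.T()
            while self.peek() == '+':
                self.eat('+'); v = v + self.T()   # E.val = E1.val + T.val
            return v

    print(Parser(tokenize("3 + 4 * (2 + 1)")).E())   # L -> E return: print(E.val) => 15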

6 Translation Schemes In a syntax-directed definition, we say nothing about when the semantic rules are evaluated (when should the semantic rules associated with a production be evaluated?). A translation scheme is a context-free grammar in which attributes are associated with the grammar symbols, and semantic actions, enclosed between braces {}, are inserted within the right sides of productions. Ex: A → { ... } X { ... } Y { ... } (the braces hold semantic actions)

7 A Translation Scheme Example
A simple translation scheme that converts infix expressions to the corresponding postfix expressions:

E → T R
R → + T { print("+") } R1
R → ε
T → id { print(id.name) }

a+b+c → ab+c+ (infix expression → postfix expression)
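A minimal Python sketch of this scheme (not from the slides): each nonterminal becomes a procedure, and the print actions run at the positions where they appear in the productions; the tokenizer and single-character identifiers are simplifying assumptions:

    # Sketch of the infix-to-postfix translation scheme above.
    def postfix(s):
        toks, out = [c for c in s if not c.isspace()], []
        pos = 0

        def T():                      # T -> id { print(id.name) }
            nonlocal pos
            out.append(toks[pos]); pos += 1

        def R():                      # R -> + T { print("+") } R1 | epsilon
            nonlocal pos
            while pos < len(toks) and toks[pos] == '+':
                pos += 1; T(); out.append('+')

        def E():                      # E -> T R
            T(); R()

        E()
        return ''.join(out)

    print(postfix("a+b+c"))   # prints ab+c+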

8 Type Checking A compiler has to perform semantic checks in addition to syntactic checks. Semantic checks: static (done during compilation) and dynamic (done at run time). Type checking is one of these static checking operations, but we may not do all type checking at compile time; some systems also use dynamic type checking. A type system is a collection of rules for assigning type expressions to the parts of a program; a type checker implements a type system. A sound type system eliminates run-time checking for type errors. A programming language is strongly typed if every program its compiler accepts will execute without type errors. In practice, some type-checking operations are done at run time, so most programming languages are not strongly typed. Ex: int x[100]; ... x[i] : most compilers cannot guarantee that i will be between 0 and 99.

9 Intermediate Code Generation
Intermediate codes are machine-independent codes, but they are close to machine instructions. The given program in a source language is converted into an equivalent program in an intermediate language by the intermediate code generator. Many different intermediate languages are possible, and the compiler designer chooses one: syntax trees can be used as an intermediate language; postfix notation can be used as an intermediate language; three-address code (quadruples) can be used as an intermediate language. We will use quadruples to discuss intermediate code generation; quadruples are close to machine instructions, but they are not actual machine instructions. Some programming languages have well-defined intermediate languages: Java has the Java Virtual Machine; Prolog has the Warren Abstract Machine. In fact, there are byte-code emulators to execute instructions in these intermediate languages.

10 Three-Address Code (Quadruples)
A quadruple is: x := y op z, where x, y, and z are names, constants, or compiler-generated temporaries, and op is any operator. We may also use the following notation for quadruples (a better notation because it looks like a machine-code instruction): op y,z,x, meaning apply operator op to y and z, and store the result in x. We use the term "three-address code" because each statement usually contains three addresses (two for the operands, one for the result).
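A rough Python sketch (illustrative, not from the slides) that emits quadruples in the op y,z,x notation for an expression tree; the tuple-based tree encoding and the newtemp counter are assumptions:

    # Sketch: emitting quadruples for an expression tree.
    count = 0
    def newtemp():
        global count
        count += 1
        return "t" + str(count)

    def gen(node):
        """node is a name/constant string, or a tuple (op, left, right)."""
        if isinstance(node, str):
            return node
        op, left, right = node
        y, z = gen(left), gen(right)
        x = newtemp()
        print(op + " " + y + "," + z + "," + x)   # one quadruple per interior node
        return x

    # a + b * c  =>  mul b,c,t1 ; add a,t1,t2
    gen(('add', 'a', ('mul', 'b', 'c')))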

11 Arrays Elements of an array can be accessed quickly if they are stored in a block of consecutive locations. For a one-dimensional array A: baseA is the address of the first location of the array, width is the width of each array element, and low is the index of the first array element. The location of A[i] is baseA + (i-low)*width.

12 Arrays (cont.)
baseA + (i-low)*width can be rewritten as i*width + (baseA - low*width). The first term must be computed at run time; the second term can be computed at compile time. So the location of A[i] can be computed at run time by evaluating the formula i*width + c, where c = baseA - low*width is evaluated at compile time. The intermediate code generator should produce the code to evaluate this formula i*width + c (one multiplication and one addition).
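A small Python sketch of the compile-time/run-time split (the concrete values baseA = 1000, low = 1, width = 4 are made up for illustration):

    # Sketch: the compile-time / run-time split for A[i].
    def compile_time_part(baseA, low, width):
        return baseA - low * width          # c, computed once at compile time

    def run_time_part(i, width, c):
        return i * width + c                # one multiplication and one addition

    c = compile_time_part(baseA=1000, low=1, width=4)
    print(run_time_part(i=5, width=4, c=c))  # address of A[5]: 1000 + (5-1)*4 = 1016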

13 Two-Dimensional Arrays (cont.)
The location of A[i1,i2] is baseA + ((i1-low1)*n2 + i2-low2)*width, where baseA is the location of the array A, low1 is the index of the first row, low2 is the index of the first column, n2 is the number of elements in each row, and width is the width of each array element. Again, this formula can be rewritten as ((i1*n2)+i2)*width + (baseA - ((low1*n2)+low2)*width); the first term must be computed at run time, and the second term can be computed at compile time.

14 Multi-Dimensional Arrays
In general, the location of A[i1,i2,...,ik] is ((...((i1*n2)+i2)...)*nk+ik)*width + (baseA - ((...((low1*n2)+low2)...)*nk+lowk)*width). So the intermediate code generator should produce code to evaluate the formula ((...((i1*n2)+i2)...)*nk+ik)*width + c, where c is the compile-time constant above. To evaluate the ((...((i1*n2)+i2)...)*nk+ik portion of this formula, we can use the recurrence: e1 = i1; em = em-1 * nm + im.
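A Python sketch of this recurrence (the example array shape, lower bounds, and numbers are illustrative):

    # Sketch of e1 = i1, em = e(m-1)*nm + im, followed by the final "*width + c" step.
    def element_offset(idx, dims, lows, width, baseA):
        e = idx[0]
        for m in range(1, len(idx)):
            e = e * dims[m] + idx[m]        # em = e(m-1) * nm + im
        c_index = lows[0]                   # same recurrence over the lower bounds
        for m in range(1, len(lows)):
            c_index = c_index * dims[m] + lows[m]
        c = baseA - c_index * width         # the compile-time constant
        return e * width + c

    # A[2,3] in a 10x20 array with low1=low2=0, width 4, base 1000:
    print(element_offset([2, 3], [10, 20], [0, 0], 4, 1000))  # 1000 + (2*20+3)*4 = 1172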

15 Translation Scheme for Arrays
S → L := E   { if (L.offset is null)   /* L is a simple id */
                  emit('mov' E.place ',' L.place)
               else
                  emit('mov' E.place ',' L.place '[' L.offset ']') }
E → E1 + E2  { E.place = newtemp(); emit('add' E1.place ',' E2.place ',' E.place) }
E → ( E1 )   { E.place = E1.place }
E → L        { if (L.offset is null)
                  E.place = L.place
               else { E.place = newtemp(); emit('mov' L.place '[' L.offset ']' ',' E.place) } }

16 Translation of Flow-of-Control Statements
S → if (E) S1 | if (E) S1 else S2 | while (E) S1 | S1 S2
S.next: the label attached to the first three-address instruction to be executed after the code for S

17 Code for Flow-of-Control Statements

(a) if-then:
         E.code          (contains jumps to E.true and E.false)
E.true:  S1.code
E.false: ...             (here E.false = S.next)

(b) if-then-else:
         E.code          (contains jumps to E.true and E.false)
E.true:  S1.code
         goto S.next
E.false: S2.code
S.next:  ...

(c) while-do:
S.begin: E.code          (contains jumps to E.true and E.false)
E.true:  S1.code
         goto S.begin
E.false: ...

18 Using Fall-Through
E → E1 relop E2
{ test := E1 relop E2
  s := if E.true != fall and E.false != fall then
           gen('if' test 'goto' E.true) || gen('goto' E.false)
       else if E.true != fall then gen('if' test 'goto' E.true)
       else if E.false != fall then gen('if' !test 'goto' E.false)
       else ''
  E.code := E1.code || E2.code || s }

19 Backpatching Backpatching allows generation of intermediate code in one pass. (The problem with the previous translation scheme is that it uses inherited attributes such as S.next, which are not convenient to implement in bottom-up parsers.) Idea: the labels in the three-address code are filled in once we know the target quadruple numbers. Attributes: E.truelist (the list of true exits) and E.falselist (the list of false exits).

20 S → if E then M S1
    { backpatch(E.truelist, M.quad);
      S.nextlist := merge(E.falselist, S1.nextlist) }
S → if E then M1 S1 N else M2 S2
    { backpatch(E.truelist, M1.quad);
      backpatch(E.falselist, M2.quad);
      S.nextlist := merge(S1.nextlist, N.nextlist, S2.nextlist) }
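A minimal Python sketch of the backpatching machinery (not from the slides; the quadruple layout and helper names are illustrative, with makelist, merge, and backpatch following the usual definitions):

    # Sketch: quadruples are kept in a list; jump targets are filled in later.
    quads = []                  # each quad: [op, arg1, arg2, target-or-None]

    def emit(op, a1='', a2='', target=None):
        quads.append([op, a1, a2, target])
        return len(quads) - 1   # index of the emitted quad

    def makelist(i):
        return [i]              # a fresh list containing only quad i

    def merge(*lists):
        return [i for l in lists for i in l]

    def backpatch(lst, target):
        for i in lst:           # fill the open jumps in lst with target
            quads[i][3] = target

    # E -> id relop id: one conditional and one unconditional jump, both open
    t = emit('if a<b goto'); f = emit('goto')
    truelist, falselist = makelist(t), makelist(f)

    M_quad = len(quads)         # M marks the next quad number
    emit('x=1')                 # S1

    backpatch(truelist, M_quad) # S -> if E then M S1
    for q in quads: print(q)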

21 Run-Time Environments
How do we allocate space for the generated target code and the data objects of our source programs? The locations of data objects that can be determined at compile time are allocated statically, but the locations of some data objects must be allocated at run time. The allocation and deallocation of data objects is managed by the run-time support package, which is loaded together with the generated target code. The structure of the run-time support package depends on the semantics of the programming language (especially the semantics of procedures in that language). Each execution of a procedure is called an activation of that procedure.

22 Procedure Activations
An execution of a procedure starts at the beginning of the procedure body; when the procedure completes, it returns control to the point immediately after the place where the procedure was called. Each execution of a procedure is called an activation of it. The lifetime of an activation of a procedure is the sequence of steps between the first and last steps in the execution of that procedure, including the other procedures called by it. If a and b are procedure activations, then their lifetimes are either non-overlapping or nested. If a procedure is recursive, a new activation can begin before an earlier activation of the same procedure has ended.

23 Activation Tree (cont.)
[Figure: an activation tree with main at the root and activations of p, q, and s as descendants.]

24 Run-Time Storage Organization
Memory locations for code are determined at compile time. Locations of static data can also be determined at compile time. Data objects allocated at run time (activation records) are kept on the stack; other dynamically allocated data objects (for example, the malloc area in C) are kept on the heap. [Memory layout: Code | Static Data | Stack | Heap]

25 Activation Records The information needed by a single execution of a procedure is managed using a contiguous block of storage called an activation record. An activation record is allocated when a procedure is entered and deallocated when that procedure exits. The size of each field can be determined at compile time (although the actual location of the activation record is determined at run time), except that if the procedure has a local variable whose size depends on a parameter, that size is determined at run time.

26 Activation Records (cont.)
An activation record has the following fields: return value, actual parameters, optional control link, optional access link, saved machine status, local data, temporaries. The return value of the called procedure is returned in the first field to the calling procedure; in practice, we may use a machine register for the return value. The field for actual parameters is used by the calling procedure to supply parameters to the called procedure. The optional control link points to the activation record of the caller. The optional access link is used to refer to nonlocal data held in other activation records. The field for saved machine status holds information about the state of the machine before the procedure was called. The field for local data holds data local to an execution of the procedure. Temporary values are stored in the field for temporaries.

27 Access to Nonlocal Names
The scope rules of a language determine the treatment of references to nonlocal names. Scope rules: Lexical scope (static scope) determines the declaration that applies to a name by examining the program text alone at compile time; the most-closely-nested rule is used (Pascal, C, ...). Dynamic scope determines the declaration that applies to a name at run time (Lisp, APL, ...).

28 Access Links
program main;
  var a:int;
  procedure p;
    var d:int;
    begin a:=1; end;
  procedure q(i:int);
    var b:int;
    procedure s;
      var c:int;
      begin p; end;
    begin if (i<>0) then q(i-1) else s; end;
begin q(1); end;

[Figure: the run-time stack after the calls q(1), q(0), s, p, with the access link of each activation record shown.]

29 Displays An array of pointers to activation records can be used to access activation records; this array is called a display. There is one array entry per nesting level: D[1] points to the current activation record at level 1, D[2] to the one at level 2, D[3] to the one at level 3, and so on.

30 Accessing Nonlocal Variables using Display
program main;
  var a:int;
  procedure p;
    var b:int;
    begin q; end;
  procedure q();
    var c:int;
    begin c:=a+b; end;
begin p; end;

With the display D[1] pointing to main (holding a), D[2] to p (holding b), and D[3] to q (holding c), the code generated for c := a+b is:
addrC := offsetC(D[3])
addrB := offsetB(D[2])
addrA := offsetA(D[1])
ADD addrA,addrB,addrC

31 Issues in the Design of a Code Generator
General tasks in almost all code generators: instruction selection, register allocation, and register assignment. The details also depend on the specifics of the intermediate representation, the target language, and the run-time system. The most important criterion for a code generator is that it produce correct code. Given the premium on correctness, designing a code generator so that it can be easily implemented, tested, and maintained is an important design goal.

32 Instruction Selection
The nature of the instruction set of the target machine has a strong effect on the difficulty of instruction selection. The uniformity and completeness of the instruction set are important factors; instruction speeds and machine idioms are also important. If we do not care about the efficiency of the target program, instruction selection is straightforward:

x = y + z  =>  LD R0, y
               ADD R0, R0, z
               ST x, R0

But a naive statement-by-statement translation produces redundant instructions:

a = b + c  =>  LD R0, b
d = a + e      ADD R0, R0, c
               ST a, R0
               LD R0, a      (redundant: a is already in R0)
               ADD R0, R0, e
               ST d, R0

33 Register Allocation A key problem in code generation is deciding which values to hold in which registers; efficient utilization of registers is particularly important. The use of registers is often subdivided into two subproblems: register allocation, during which we select the set of variables that will reside in registers at each point in the program, and register assignment, during which we pick the specific register in which each such variable will reside. Finding an optimal assignment of registers to variables is difficult, even for a single-register machine; mathematically, the problem is NP-complete.

34 A Simple Target Machine Model
Our target computer models a three-address machine with load and store operations, computation operations, unconditional jumps, and conditional jumps. The underlying computer is a byte-addressable machine with n general-purpose registers. Assume the following kinds of instructions are available:
• Load operations: the instruction LD dst, addr loads the value in location addr into location dst; it denotes the assignment dst = addr. The most common form is LD r, x, which loads the value in location x into register r. An instruction of the form LD r1, r2 is a register-to-register copy, in which the contents of register r2 are copied into register r1.
• Store operations: the instruction ST x, r stores the value in register r into the location x; it denotes the assignment x = r.
• Computation operations of the form OP dst, src1, src2, where OP is an operator like ADD or SUB, and dst, src1, and src2 are locations, not necessarily distinct. The effect is to apply the operation represented by OP to the values in locations src1 and src2 and place the result in location dst. For example, SUB r1, r2, r3 computes r1 = r2 - r3. Any value formerly stored in r1 is lost, but if r1 is r2 or r3, the old value is read first. Unary operators take only one operand and have no src2.
• Unconditional jumps: the instruction BR L causes control to branch to the machine instruction with label L (BR stands for branch).
• Conditional jumps of the form Bcond r, L, where r is a register, L is a label, and cond stands for any of the common tests on values in register r. For example, BLTZ r, L causes a jump to label L if the value in register r is less than zero, and allows control to pass to the next machine instruction otherwise.

35 Basic Blocks and Flow Graphs
We introduce a graph representation of intermediate code that is helpful for discussing code generation: we partition the intermediate code into basic blocks, and the basic blocks become the nodes of a flow graph, whose edges indicate which blocks can follow which other blocks. Basic blocks are maximal sequences of consecutive three-address instructions with the properties that (a) the flow of control can enter the basic block only through the first instruction in the block, that is, there are no jumps into the middle of the block, and (b) control leaves the block without halting or branching, except possibly at the last instruction in the block.

36 Optimization of Basic Blocks
Local optimization works within each basic block; global optimization looks at how information flows among the basic blocks of a program. This chapter focuses on local optimization.

37 DAG Representation of Basic Blocks
To construct a DAG for a basic block: 1. There is a node in the DAG for each of the initial values of the variables appearing in the basic block. 2. There is a node N associated with each statement s within the block; the children of N are the nodes corresponding to statements that are the last definitions, prior to s, of the operands used by s. 3. Node N is labeled by the operator applied at s, and attached to N is the list of variables for which it is the last definition within the block. 4. Certain nodes are designated output nodes: the nodes whose variables are live on exit from the block, that is, whose values may be used later in another block of the flow graph. Calculating these "live variables" is a matter for global flow analysis, discussed in Section 9.2.5.
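A rough Python sketch of steps 1-3 using value numbering (not from the slides; the statement encoding and helper names are illustrative assumptions):

    # Sketch: constructing a basic-block DAG, reusing nodes for common subexpressions.
    nodes = {}      # (op, left_id, right_id) -> node id, for reuse
    labels = {}     # node id -> list of variables whose last definition it is
    current = {}    # variable -> node id of its last definition
    table = []      # node id -> (op, left, right) or ('leaf', name)

    def leaf(name):
        if name not in current:
            table.append(('leaf', name))          # initial value of the variable
            current[name] = len(table) - 1
        return current[name]

    def assign(dst, op, a, b):
        key = (op, leaf(a) if isinstance(a, str) else a,
                   leaf(b) if isinstance(b, str) else b)
        if key not in nodes:                      # a reused key = common subexpression
            table.append(key)
            nodes[key] = len(table) - 1
        n = nodes[key]
        labels.setdefault(n, []).append(dst)
        current[dst] = n

    assign('a', '+', 'b', 'c')
    assign('d', '+', 'b', 'c')   # maps to the same node as a
    print(labels)                # {2: ['a', 'd']}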

38 Finding Local Common Subexpressions
What if b and d are live on exit? If both b and d are live on exit, then a fourth statement must be used to copy the value from one to the other. When we look for common subexpressions, we are really looking for expressions that are guaranteed to compute the same value, no matter how that value is computed. Thus the DAG method will miss the fact that the expressions computed by the 1st and 4th statements in the sequence
a = b + c
b = b - d
c = c + d
e = b + c
are the same, namely the original b + c. Algebraic identities applied to the DAG, as discussed in Section 8.5.4, may expose the equivalence.

39 Dead Code Elimination Delete from a DAG any root (node with no ancestors) that has no live variables attached. Repeated application of this transformation removes all nodes from the DAG that correspond to dead code. Example: assume a and b are live but c and e are not. The root for e can be deleted, and then the root for c can be deleted.

40 The Use of Algebraic Identities
Eliminating computations: identities such as x+0 and x*1. Reduction in strength: replacing an expensive operation by a cheaper one. Constant folding: 2*3.14 = 6.28 is evaluated at compile time. Other algebraic transformations: commutativity, x*y = y*x; relational rewriting, x>y is equivalent to x-y>0; reuse of earlier computations, e.g. given a = b+c, the statement e = c+d+b can be rewritten as e = a+d.

41 Representation of Array References
x = a[i]
a[j] = y
(the node for a[i] is killed)
Because we cannot be sure whether i equals j, and when they are equal the assignment a[j] = y changes the value previously read as a[i], the first and third statements of the example (both reading a[i]) are not equivalent and may not share a DAG node.

42 Reassembling Basic Blocks From DAGs
[Figure: two code sequences reassembled from the same DAG, one for the case where b is not live on exit and one for the case where b is live on exit.]

43 Register and Address Descriptors
Descriptors are needed to make load and store decisions. Register descriptor: for each available register, keeps track of the variable names whose current value is in that register; initially, all register descriptors are empty. Address descriptor: for each program variable, keeps track of the location(s) where the current value of that variable can be found; it is stored in the symbol-table entry for that variable name. Our code-generation algorithm considers each three-address instruction in turn and decides what loads are necessary to get the needed operands into registers. After generating the loads, it generates the operation itself; then, if the result needs to be stored into a memory location, it also generates that store.

44 The Code-Generation Algorithm
Function getReg(I) selects registers for each memory location associated with the three-address instruction I. Machine instructions for operations: for a three-address instruction such as x = y + z, do the following. 1. Use getReg(x = y + z) to select registers for x, y, and z; call these Rx, Ry, and Rz. 2. If y is not in Ry (according to the register descriptor for Ry), then issue an instruction LD Ry, y', where y' is one of the memory locations for y (according to the address descriptor for y). 3. Similarly, if z is not in Rz, issue an instruction LD Rz, z', where z' is a location for z. 4. Issue the instruction ADD Rx, Ry, Rz. Function getReg has access to the register and address descriptors for all the variables of the basic block, and may also have access to useful data-flow information such as the variables that are live on exit from the block. In a three-address instruction such as x = y + z, we treat + as a generic operator and ADD as the equivalent machine instruction; we do not, therefore, take advantage of the commutativity of +.
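A simplified Python sketch of steps 1-4 for x = y + z (not from the slides); here getReg is a stub that hands out fresh registers, so the descriptors only illustrate the bookkeeping, not a realistic selection policy:

    # Sketch of the code-generation loop for x = y + z.
    reg_desc = {}    # register -> set of variable names it currently holds
    addr_desc = {}   # variable -> set of locations (memory name and/or registers)

    free = ['R1', 'R2', 'R3']
    def getReg(var):                       # illustrative stand-in for getReg(I)
        return free.pop(0)

    def gen_add(x, y, z):
        Ry, Rz, Rx = getReg(y), getReg(z), getReg(x)
        for R, v in ((Ry, y), (Rz, z)):
            if v not in reg_desc.get(R, set()):   # steps 2/3: load only if absent
                print("LD " + R + ", " + v)
                reg_desc[R] = {v}
                addr_desc.setdefault(v, {v}).add(R)
        print("ADD " + Rx + ", " + Ry + ", " + Rz)   # step 4
        reg_desc[Rx] = {x}
        addr_desc[x] = {Rx}                # x lives only in Rx until it is stored

    gen_add('x', 'y', 'z')                 # LD R1, y / LD R2, z / ADD R3, R1, R2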


46 Peephole Optimization
The peephole is a small, sliding window on a program. Peephole optimization is done by examining a sliding window of target instructions and replacing instruction sequences within the peephole by a shorter or faster sequence whenever possible. Peephole optimization can also be applied directly after intermediate code generation to improve the intermediate representation.

47 Eliminating Unreachable Code
An unlabeled instruction immediately following an unconditional jump can never be reached and may be removed. This operation can be repeated to eliminate a sequence of instructions.

48 Flow-of-Control Optimizations
Unnecessary jumps can be eliminated in either the intermediate code or the target code by peephole optimizations. For example, a jump to a jump (goto L1, where L1: goto L2) can be replaced by goto L2; if there is only one jump to L1, the instruction at L1 can then be eliminated as unreachable.

49 Algebraic Simplification and Reduction in Strength
Algebraic identities can be used to eliminate three-address statements such as x = x+0 or x = x*1. Reduction-in-strength transformations can be applied to replace expensive operations: x² computed as power(x, 2) can be replaced by x*x; fixed-point multiplication or division by a power of two can be replaced by a shift; floating-point division by a constant can be approximated as multiplication by a constant.
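A small Python sketch of such a peephole pass over quadruples (not from the slides; the (op, src1, src2, dst) encoding is an illustrative assumption):

    # Sketch: applying algebraic identities and strength reduction to quadruples.
    def simplify(quad):
        op, a, b, dst = quad
        if op == '+' and b == 0:            # x = y + 0  =>  x = y
            return ('copy', a, None, dst)
        if op == '*' and b == 1:            # x = y * 1  =>  x = y
            return ('copy', a, None, dst)
        if op == '*' and b == 2:            # strength reduction: y * 2 => y + y
            return ('+', a, a, dst)
        if op == '*' and isinstance(a, (int, float)) and isinstance(b, (int, float)):
            return ('copy', a * b, None, dst)   # constant folding: 2 * 3.14 => 6.28
        return quad

    code = [('*', 2, 3.14, 't1'), ('+', 'x', 0, 't2'), ('*', 'y', 2, 't3')]
    print([simplify(q) for q in code])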

50 Use of Machine Idioms The target machine may have hardware instructions that implement certain specific operations efficiently; using these instructions can reduce execution time significantly. Example: some machines have auto-increment and auto-decrement addressing modes. These modes greatly improve the quality of code when pushing or popping a stack, as in parameter passing, and can also be used in code for statements like x = x + 1.

51 Register Allocation and Assignment
Efficient utilization of registers is vitally important in generating good code. This section presents strategies for deciding, at each point in a program, what values should reside in registers (register allocation) and in which register each value should reside (register assignment).

52 Usage Counts Keeping a variable x in a register for the duration of a loop L saves one unit of cost for each use of x, and saves two units if we can thereby avoid a store of x at the end of a block. An approximate formula for the benefit of allocating a register to x within loop L is

    sum over blocks B in L of ( use(x,B) + 2*live(x,B) )

where use(x,B) is the number of times x is used in B prior to any definition of x, and live(x,B) is 1 if x is live on exit from B and is assigned a value in B, and 0 otherwise. (x must be assigned a value in the block for the saving of 2 to apply: the store at the end of the block is then avoided.)

53 Code Optimization
Code optimization covers the elimination of unnecessary instructions and the replacement of one sequence of instructions by a faster sequence. Two levels: local optimization, and global optimizations based on data-flow analyses.

54 Causes of Redundancy
Redundant operations are, at the source level, a side effect of having written the program in a high-level language: each high-level data-structure access expands into a number of low-level arithmetic operations. Programmers are not aware of these low-level operations and cannot eliminate the redundancies themselves. By having the compiler eliminate the redundancies, programs are both efficient and easy to maintain.

55 Common Subexpressions
An expression is a common subexpression if it was previously computed and the values of its variables have not changed since the computation. Local: [figure omitted, showing a common subexpression within one basic block].

56 Common Subexpressions
Global: [figure omitted, showing a common subexpression across basic blocks]. Is there any other common subexpression?

57 Copy Propagation Copy statements, or copies, have the form u = v; after such a copy, subsequent uses of u can be replaced by v wherever it is safe to do so.

58 Dead-Code Elimination
Live variable: a variable is live at a point in a program if its value can be used subsequently; otherwise, it is dead at that point. Constant folding: deducing at compile time that the value of an expression is a constant and using the constant instead.


60 Code Motion An important transformation that decreases the amount of code in a loop. A loop-invariant computation is an expression that yields the same result independent of the number of times the loop is executed; code motion moves such a computation to just before its loop. Example: while (i <= limit-2) becomes t = limit-2; while (i <= t).

61 Induction Variables and Reduction in Strength
For an induction variable x, there is a positive or negative constant c such that each time x is assigned, its value increases by c. Induction variables can be computed with a single increment (addition or subtraction) per loop iteration. Strength reduction is the transformation of replacing an expensive operation, such as multiplication, by a cheaper one, such as addition. Induction variables both lead to strength reduction and allow computations to be eliminated.

62 Now we have (working inside-out): How should B1 be handled? B2 and B3 imply that the inner i and j can be deleted; how should B4 be handled? [Figure omitted.]

63 Data-Flow Analysis Techniques that derive information about the flow of data along program execution paths. Examples: one way to implement global common-subexpression elimination requires us to determine whether two identical expressions evaluate to the same value along every possible execution path of the program; and if the result of an assignment is not used along any subsequent execution path, then we can eliminate the assignment as dead code.

64 Reaching Definitions A definition d reaches a point p if there is a path from the point immediately following d to p such that d is not "killed" along that path; a definition of a variable x is killed if there is any other definition of x along the path. The analysis is conservative: if we do not know whether a statement s assigns a value to x, we must assume that it may assign to it. Reaching definitions are used, for example, to delete useless definitions within basic blocks and to optimize loop-invariant computations.

65 Live-Variable Analysis
In live-variable analysis we wish to know, for variable x and point p, whether the value of x at p could be used along some path in the flow graph starting at p. If so, we say x is live at p; otherwise, x is dead at p. Definitions: 1. defB: the set of variables defined in B prior to any use of that variable in B. 2. useB: the set of variables whose values may be used in B prior to any definition of the variable. Any variable in useB must be considered live on entrance to block B, while definitions of variables in defB are definitely dead at the beginning of B. This analysis is used to remove useless variable computations within a basic block: if a variable is dead on exit from the block, the computation of its value can be deleted.
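For reference, these are the standard backward data-flow equations that live-variable analysis solves (implied but not spelled out above); IN and OUT are the live sets at the beginning and end of a block:

    IN[B]  = useB ∪ (OUT[B] − defB)
    OUT[B] = union over successors S of B of IN[S],   with IN[EXIT] = ∅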

66 Available Expressions
An expression x + y is available at a point p if every path from the entry node to p evaluates x + y, and after the last such evaluation prior to reaching p there are no subsequent assignments to x or y. A block kills expression x + y if it assigns (or may assign) x or y and does not subsequently recompute x + y. A block generates expression x + y if it definitely evaluates x + y and does not subsequently define x or y. Available expressions are used for common-subexpression elimination.

67 Available Expressions
Let IN[B] be the set of expressions available before B, OUT[B] the same for the point following the end of B, e_genB the set of expressions generated by B, and e_killB the set of expressions killed in B. Then OUT[ENTRY] = ∅, and for all basic blocks B other than ENTRY:

    OUT[B] = e_genB ∪ (IN[B] − e_killB)
    IN[B]  = intersection over predecessors P of B of OUT[P]

Intersection is used because an expression is available at the beginning of a block only if it is available at the end of all of its predecessors.
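A Python sketch of the iterative solver for these equations (not from the slides; the example flow graph and gen/kill sets are made up):

    # Sketch: forward "must" data-flow solved by iteration to a fixed point.
    def available(blocks, preds, e_gen, e_kill, universe):
        IN = {b: set() for b in blocks}
        OUT = {b: set(universe) for b in blocks}       # optimistic start
        OUT['ENTRY'] = set()
        changed = True
        while changed:
            changed = False
            for b in blocks:
                if b == 'ENTRY':
                    continue
                IN[b] = set(universe)
                for p in preds[b]:                     # IN[B] = intersection of OUT[P]
                    IN[b] &= OUT[p]
                new = e_gen[b] | (IN[b] - e_kill[b])   # OUT[B] = e_gen U (IN - e_kill)
                if new != OUT[b]:
                    OUT[b], changed = new, True
        return IN, OUT

    blocks = ['ENTRY', 'B1', 'B2']
    preds = {'B1': ['ENTRY'], 'B2': ['B1']}
    e_gen = {'B1': {'x+y'}, 'B2': set()}
    e_kill = {'B1': set(), 'B2': {'x+y'}}
    IN, OUT = available(blocks, preds, e_gen, e_kill, universe={'x+y'})
    print(IN['B2'], OUT['B2'])   # {'x+y'} set()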

68 Partial-Redundancy Elimination
Partial-redundancy elimination minimizes the number of expression evaluations. Consider all possible execution sequences in a flow graph, and look at the number of times an expression such as x + y is evaluated. By moving the places where x + y is evaluated and keeping the result in a temporary variable when necessary, we can often reduce the number of evaluations of the expression along many execution paths.

69 The Sources of Redundancy

70 Anticipation of Expressions
An expression b + c is anticipated at point p if all paths leading from p eventually compute the value of b + c from the values of b and c that are available at that point. When we insert expressions, we must ensure that no extra operations are executed: copies of an expression may be placed only at program points where the expression is anticipated.

71 Loops in Flow Graphs Loops are important because programs spend most of their time executing them; optimizations that improve the performance of loops can therefore have a significant impact.

72 Dominators We say node d of a flow graph dominates node n, written d dom n, if every path from the entry node of the flow graph to n goes through d. Every node dominates itself. Exercise: which nodes are dominated by each node?

73 Finding Dominators
Let n0 be the entry node, and let P(n) denote the set of predecessors of node n. The dominator sets are the maximal solution of D(n0) = {n0} and, for n ≠ n0, D(n) = {n} ∪ intersection over p in P(n) of D(p), computed iteratively. Exercise: compute the dominator sets for the flow graph shown earlier.
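A Python sketch of this iterative computation (the three-node example graph is illustrative):

    # Sketch: D(n0) = {n0};  D(n) = {n} U intersection over p in P(n) of D(p).
    def dominators(nodes, preds, n0):
        D = {n: set(nodes) for n in nodes}   # start from the maximal sets
        D[n0] = {n0}
        changed = True
        while changed:
            changed = False
            for n in nodes:
                if n == n0:
                    continue
                new = {n} | set.intersection(*(D[p] for p in preds[n]))
                if new != D[n]:
                    D[n], changed = new, True
        return D

    # Example graph: 1 -> 2, 1 -> 3, 2 -> 3
    print(dominators([1, 2, 3], {2: [1], 3: [1, 2]}, 1))
    # {1: {1}, 2: {1, 2}, 3: {1, 3}}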

74 Edges in a Depth-First Spanning Tree
Advancing edges: edges going from a node m to a proper descendant of m in the tree. Retreating edges: edges going from a node m to an ancestor of m in the tree (possibly to m itself). Cross edges: edges m → n such that neither m nor n is an ancestor of the other in the DFST.

75 Back Edges and Reducibility
A back edge is an edge a → b whose head b dominates its tail a. For any flow graph, every back edge is retreating, but not every retreating edge is a back edge. A flow graph is said to be reducible if all its retreating edges in any depth-first spanning tree are also back edges; equivalently, a flow graph is reducible if and only if removing its back edges leaves an acyclic graph. Is the example graph reducible? No: node 1 dominates nodes 2 and 3, and there are two possible depth-first spanning trees, so either 3→2 or 2→3 is a retreating edge, but neither is a back edge. To find back edges: 1. compute the dominator sets, e.g. D(n1) = {n2, n3}; 2. if the graph has an edge n1 → n2 with n2 in D(n1), then n1 → n2 is a back edge.

76 Natural Loops A natural loop is defined by two essential properties.
1. It must have a single entry node, called the header; this entry node dominates all nodes in the loop, or it would not be the sole entry to the loop. 2. There must be a back edge that enters the loop header; otherwise it is not possible for the flow of control to return to the header directly from the "loop", i.e., there really is no loop.

77 Code Optimization (Supplement): Local Optimization and Loop Optimization

78 Example flow graph:
B1: (1) i:=m        (2) j:=n        (3) t1:=4*n      (4) v:=a[t1]
B2: (5) i:=i+1      (6) t2:=4*i     (7) t3:=a[t2]    (8) if t3<v goto (5)
B3: (9) j:=j-1      (10) t4:=4*j    (11) t5:=a[t4]   (12) if t5>v goto (9)
B4: (13) if i>=j goto (23)
B5: (14) t6:=4*i    (15) x:=a[t6]   (16) t7:=4*i     (17) t8:=4*j
    (18) t9:=a[t8]  (19) a[t7]:=t9  (20) t10:=4*j    (21) a[t10]:=x
    (22) goto (5)
B6: (23) t11:=4*i   (24) x:=a[t11]  (25) t12:=4*i    (26) t13:=4*n
    (27) t14:=a[t13] (28) a[t12]:=t14 (29) t15:=4*n  (30) a[t15]:=x

79 Local Optimization
Local optimization is optimization within a basic block. It includes the following methods: 1. Constant folding (combining known quantities) 2. Common-subexpression elimination 3. Useless-assignment elimination 4. Dead-code elimination

80 1. Constant Folding (Combining Known Quantities)
For a statement A := OP B or A := B OP C: if B and C are constants, the value of A can be computed at compile time and stored in a temporary T, and the statement is replaced by A := T. This is constant folding.

81 2. Common-Subexpression Elimination
For the statements A := B + C*D and U := V - C*D: if the values of C and D do not change between the two statements, the second statement can reuse the result T computed by the first: U := V - T.

82 3. Useless-Assignment Elimination For the statements A := B + C ... A := M + N: if A is not used between the two statements, the first statement can be deleted.

83 4. Dead-Code Elimination For a statement if B then S1 else S2: if B is a constant known at compile time, the branch that can never be executed is dead code and can be deleted.

84 Example:
(1) F:=1    (2) C:=F+E   (3) D:=F+3   (4) B:=A*A   (5) G:=B-D   (6) H:=E
(7) I:=H*G  (8) J:=D/4   (9) K:=J+C   (10) L:=H    (11) L:=I-J

Step 1: constant folding. From (1) F:=1 we obtain:
(1) F:=1    (2) C:=1+E   (3) D:=4     (4) B:=A*A   (5) G:=B-4   (6) H:=E
(7) I:=H*G  (8) J:=1     (9) K:=2+E   (10) L:=H    (11) L:=I-1

85 Step 2: common-subexpression (copy) elimination. Using (6) H:=E, rewrite (7) as I:=E*G:
(1) F:=1    (2) C:=1+E   (3) D:=4     (4) B:=A*A   (5) G:=B-4   (6) H:=E
(7) I:=E*G  (8) J:=1     (9) K:=2+E   (10) L:=H    (11) L:=I-1

Step 3: delete the useless assignment (10); other possibly useless assignments are (1), (2), (3), (6), (8). What remains:
(4) B:=A*A  (5) G:=B-4   (7) I:=E*G   (9) K:=2+E   (11) L:=I-1

86 Loop/Global Optimization Loop optimization is an important kind of global optimization. A loop is a code sequence that is executed repeatedly, and a program spends most of its running time in loops, so loop optimization is very important for improving execution efficiency. It includes the following methods: 1. Code motion 2. Strength reduction 3. Induction-variable elimination

87 1. Code Motion For x := op y or x := y op z: if y and z are both loop invariants (constants, or values whose definitions all lie outside the loop L), the computation is loop invariant; the optimization hoists it into a new node (a preheader) inserted before the loop entry.

88 2. Strength Reduction
Basic induction variable: i := i ± c (c a constant). Derived induction variable in the same family: j := c1*i ± c2 (c1, c2 constants). Each time around the loop, the basic induction variable i increases or decreases by c, so the derived induction variable j correspondingly increases or decreases by c1*c. The multiplication that computes j can therefore be replaced by an addition: j := j + c1*c (where c1*c is a constant).

89 3. Induction-Variable Elimination Rewrite the loop test in terms of a derived induction variable; if the basic induction variable has no other use, it can be deleted. For example, with j = 10*i + 5 and the test i > 10, change the test to j > 105 and delete the statements involving i.

90 Global Optimization Example
B1: (1) i:=1
B2: (2) if i>10 goto (16)
B3: (3) t1:=2*j      (4) t2:=10*i     (5) t3:=t2+t1
    (6) t4:=a0-11    (7) t5:=2*j      (8) t6:=10*i
    (9) t7:=t6+t5    (10) t8:=a0-11   (11) t9:=t8[t7]
    (12) t10:=t9+1   (13) t4[t3]:=t10 (14) i:=i+1
    (15) goto (2)
B4: (16) ...

91 1. Code Motion: statements (3), (6), (7), (10) are loop invariant and are hoisted out of the loop into B1:
B1: (1) i:=1         (3) t1:=2*j      (6) t4:=a0-11
    (7) t5:=2*j      (10) t8:=a0-11
B2: (2) if i>10 goto (16)
B3: (4) t2:=10*i     (5) t3:=t2+t1    (8) t6:=10*i
    (9) t7:=t6+t5    (11) t9:=t8[t7]  (12) t10:=t9+1
    (13) t4[t3]:=t10 (14) i:=i+1      (15) goto (2)
B4: (16) ...

92 2. Strength Reduction: applied to (4), (5), (8), (9); the multiplications move into a preheader B2' and are replaced inside the loop by increments (shown on the next slide):
B1:  (1) i:=1        (3) t1:=2*j      (6) t4:=a0-11
     (7) t5:=2*j     (10) t8:=a0-11
B2': (4) t2:=10*i    (5) t3:=t2+t1    (8) t6:=10*i    (9) t7:=t6+t5
B2:  (2) if i>10 goto (16)
B3:  (11) t9:=t8[t7] (12) t10:=t9+1   (13) t4[t3]:=t10
     (14) i:=i+1     (15) goto (2)
B4:  (16) ...

93 3. Induction-Variable Elimination: applied to (2) and (14); the loop now updates the reduced temporaries incrementally:
B1:  (1) i:=1        (3) t1:=2*j      (6) t4:=a0-11
B2': (4) t2:=10*i    (5) t3:=t2+t1    (8) t6:=10*i    (9) t7:=t6+t5
B2:  (2) if i>10 goto (16)
B3:  (4') t2:=t2+10  (5') t3:=t3+10   (8') t6:=t6+10  (9') t7:=t7+10
     (11) t9:=t8[t7] (12) t10:=t9+1   (13) t4[t3]:=t10
     (14) i:=i+1     (15) goto (2)
B4:  (16) ...

94 4. Other Optimizations: replace the test on i by a test on t3 (since t3 = 10*i + t1, i > 10 is equivalent to t3 > 100 + t1), then delete the useless computations:
B1:  (1) i:=1        (3) t1:=2*j      (6) t4:=a0-11   (7) t5:=2*j   (10) t8:=a0-11
     (4) t2:=10*i    (8) t6:=10*i     (5) t3:=t2+t1   (9) t7:=t6+t5
B2': (2') s:=100+t1
B2:  (2'') if t3>s goto (16)
B3:  (4') t2:=t2+10  (5') t3:=t3+10   (8') t6:=t6+10  (9') t7:=t7+10
     (11) t9:=t8[t7] (12) t10:=t9+1   (13) t4[t3]:=t10 (15) goto (2'')
B4:  (16) ...

95 5. After the Other Optimizations: the useless increments of t2 and t6 have been removed:
B1:  (1) i:=1        (3) t1:=2*j      (6) t4:=a0-11   (7) t5:=2*j   (10) t8:=a0-11
     (4) t2:=10*i    (8) t6:=10*i     (5) t3:=t2+t1   (9) t7:=t6+t5
B2': (2') s:=100+t1
B2:  (2'') if t3>s goto (16)
B3:  (5') t3:=t3+10  (9') t7:=t7+10   (11) t9:=t8[t7] (12) t10:=t9+1
     (13) t4[t3]:=t10 (15) goto (2'')
B4:  (16) ...

