Compiler Principles Fall 2015-2016 Compiler Principles Lecture 6: Intermediate Representation Roman Manevich Ben-Gurion University of the Negev.

1 Compiler Principles Fall 2015-2016 Compiler Principles Lecture 6: Intermediate Representation Roman Manevich Ben-Gurion University of the Negev

2 Tentative syllabus
– Front End: Scanning, Top-down Parsing (LL), Bottom-up Parsing (LR)
– Intermediate Representation: Lowering, Operational Semantics
– Optimizations: Dataflow Analysis, Loop Optimizations
– Code Generation: Register Allocation, Instruction Selection

3 Previously
Becoming parsing ninjas
– Going from text to an Abstract Syntax Tree
(Image by Admiral Ham [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons)

4 From scanning to parsing
Program text: 59 + (1257 * xPosition)
→ Lexical Analyzer → token stream: num + ( num * id )
→ Parser, using the grammar:
E → id
E → num
E → E + E
E → E * E
E → ( E )
→ valid input: Abstract Syntax Tree (nodes +, num, *, x); otherwise: syntax error

5 Agenda
The role of intermediate representations
Two example languages
– A high-level language
– An intermediate language
Lowering
Correctness
– Formal meaning of programs

6 Role of intermediate representation
Bridge between front-end and back-end
Allow implementing optimizations independent of source language and executable (target) language
[Figure: High-level Language (scheme) → Lexical Analysis → Syntax Analysis (Parsing) → AST, Symbol Table, etc. → Inter. Rep. (IR) → Code Generation → Executable Code]

7 Motivation for intermediate representation

8 Intermediate representation
A language that is between the source language and the target language
– Not specific to any source language or machine language
Goal 1: retargeting compiler components for different source languages/target machines
[Figure: source languages C++, Python, Java → IR → targets Pentium, Java bytecode, Sparc]

9 Intermediate representation
A language that is between the source language and the target language
– Not specific to any source language or machine language
Goal 1: retargeting compiler components for different source languages/target machines
Goal 2: machine-independent optimizer
– Narrow interface: small number of node types (instructions)
[Figure: source languages C++, Python, Java → Lowering → IR (optimize) → Code Gen. → targets Pentium, Java bytecode, Sparc]

10 Multiple IRs
Some optimizations require high-level structure
Others are more appropriate on low-level code
Solution: use multiple IR stages
[Figure: AST → HIR (optimize) → LIR (optimize) → targets Pentium, Java bytecode, Sparc]

11 Multiple IRs example
Elixir – a language for parallel graph algorithms
[Figure: Elixir compilation pipeline. Components shown: Elixir Program; Delta Inferencer, exchanging Query/Answer with Automated Reasoning (Boogie+Z3); Elixir Program + delta; HIR; HIR Lowering; LIR; IL Synthesizer, exchanging Planning Problem/Plan with Automated Planner; C++ backend; C++ code; Galois Library]
Mini-project on parallel graph algorithms

12 AST vs. LIR for imperative languages
AST
– Rich set of language constructs
– Rich type system
– Declarations: types (classes, interfaces), functions, variables
– Control flow statements: if-then-else, while-do, break-continue, switch, exceptions
– Data statements: assignments, array access, field access
– Expressions: variables, constants, arithmetic operators, logical operators, function calls
LIR
– An abstract machine language
– Very limited type system
– Only computation-related code
– Labels and conditional/unconditional jumps, no looping
– Data movements, generic memory access statements
– No sub-expressions, logical as numeric, temporaries, constants, function calls – explicit argument passing

13 Three-address code

14 Three-Address Code IR
A popular form of IR
High-level assembly where instructions have at most three operands
There exist other types of IR
– For example, IR based on acyclic graphs – more amenable to analysis and optimizations
(Chapter 8)

15 Base language: While

16 Syntax
A → n | x | A ArithOp A | ( A )
ArithOp → - | + | * | /
B → true | false | A = A | A ≤ A | ¬B | B ∧ B | ( B )
S → x := A | skip | S ; S | { S } | if B then S else S | while B S
n ∈ Num (numerals)
x ∈ Var (program variables)

17 Example program
while x < y {
  x := x + 1
}
y := x;

18 Intermediate language: IL

19 Syntax
V → n | x
R → V Op V
Op → - | + | * | / | = | ≤ | > | …
C → l: skip | l: x := R | l: Goto l' | l: IfZ x Goto l' | l: IfNZ x Goto l'
IR → C⁺
n ∈ Num (numerals)
l ∈ Num (labels)
x ∈ Temp ∪ Var (temporaries and variables)

20 Intermediate language programs
An intermediate program P has the form
1: c_1
…
n: c_n
We can view it as a map from labels to individual commands and write P(j) = c_j
Example:
1: t0 := 137
2: y := t0 + 3
3: IfZ x Goto 7
4: t1 := y
5: z := t1
6: Goto 9
7: t2 := y
8: x := t2
9: skip
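
The "map from labels to commands" view can be made concrete with a dictionary. The sketch below is only an illustration: the tuple encoding and the tag names ("assign", "ifz", "goto", "skip") are assumptions made for this example, not notation from the lecture.

```python
# Hypothetical encoding of the example IL program as a map P from labels to commands.
# Each command is a tuple whose first element names its kind (an illustrative choice).
P = {
    1: ("assign", "t0", 137),             # 1: t0 := 137
    2: ("assign", "y", ("t0", "+", 3)),   # 2: y := t0 + 3
    3: ("ifz", "x", 7),                   # 3: IfZ x Goto 7
    4: ("assign", "t1", "y"),             # 4: t1 := y
    5: ("assign", "z", "t1"),             # 5: z := t1
    6: ("goto", 9),                       # 6: Goto 9
    7: ("assign", "t2", "y"),             # 7: t2 := y
    8: ("assign", "x", "t2"),             # 8: x := t2
    9: ("skip",),                         # 9: skip
}


def jump_targets(program):
    """Collect every label that appears as the target of a Goto/IfZ/IfNZ."""
    targets = set()
    for cmd in program.values():
        if cmd[0] == "goto":
            targets.add(cmd[1])
        elif cmd[0] in ("ifz", "ifnz"):
            targets.add(cmd[2])
    return targets


print(P[3])                     # the command P(3)
print(sorted(jump_targets(P)))  # labels that some jump refers to
```

Writing P(j) then becomes an ordinary lookup `P[j]`, and simple well-formedness checks (every jump target is a label of the program) are a few lines of code.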

21 Lowering

22 TAC generation
At this stage in compilation, we have
– an AST
– annotated with scope information
– and annotated with type information
To generate TAC for the program, we do recursive tree traversal
– Generate TAC for any subexpressions and substatements
– Using the result, generate TAC for the overall expression (bottom-up manner)

23 TAC generation for expressions
Define a function cgen(expr) that generates TAC that computes an expression, stores it in a temporary variable, then hands back the name of that temporary
Define cgen directly for atomic expressions (constants, this, identifiers, etc.)
Define cgen recursively for compound expressions (binary operators, function calls, etc.)

24 Translation rules for expressions
cgen(n) = (l: t := n, t)   where l and t are fresh
cgen(x) = (l: t := x, t)   where l and t are fresh
If cgen(e1) = (P1, t1) and cgen(e2) = (P2, t2), then
cgen(e1 op e2) = (P1 · P2 · l: t := t1 op t2, t)   where l and t are fresh

25 cgen for basic expressions
Maintain a counter for temporaries in c, and a counter for labels in l
Initially: c = 0, l = 0
cgen(k) = { // k is a constant
  c = c + 1, l = l + 1
  Emit(l: tc := k)
  Return tc
}
cgen(id) = { // id is an identifier
  c = c + 1, l = l + 1
  Emit(l: tc := id)
  Return tc
}

26 Naive cgen for binary expressions
cgen(e1 op e2) = {
  Let A = cgen(e1)
  Let B = cgen(e2)
  c = c + 1, l = l + 1
  Emit( l: tc := A op B; )
  Return tc
}
The translation emits code to evaluate e1 before e2. Why is that?

27-37 Example: cgen for binary expressions
Start with c = 0, l = 0 and expand cgen( (a*b)-d ) step by step:
cgen( (a*b)-d ) = {
  Let A = { // cgen(a*b)
    Let A = { c = c + 1, l = l + 1, Emit(l: tc := a;), Return tc } // here A = t1
    Let B = { c = c + 1, l = l + 1, Emit(l: tc := b;), Return tc } // here B = t2
    c = c + 1, l = l + 1
    Emit(l: tc := A * B;)
    Return tc
  } // here A = t3
  Let B = { c = c + 1, l = l + 1, Emit(l: tc := d;), Return tc } // cgen(d), here B = t4
  c = c + 1, l = l + 1
  Emit(l: tc := A - B;)
  Return tc
}
Code emitted, one instruction per step:
1: t1 := a;
2: t2 := b;
3: t3 := t1*t2
4: t4 := d
5: t5 := t3-t4
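
The expansion of cgen( (a*b)-d ) can be reproduced mechanically. The following is a minimal sketch of the naive cgen with the two counters c and l. The class name `CodeGen` and the tuple encoding of expressions `(op, e1, e2)` are assumptions made for this example.

```python
class CodeGen:
    """Naive cgen for expressions; names and expression encoding are illustrative."""

    def __init__(self):
        self.c = 0       # counter for temporaries
        self.l = 0       # counter for labels
        self.code = []   # emitted instructions, in order

    def fresh(self):
        # c = c + 1, l = l + 1, and hand back the new temporary tc
        self.c += 1
        self.l += 1
        return f"t{self.c}"

    def emit(self, text):
        self.code.append(f"{self.l}: {text}")

    def cgen(self, e):
        # e is an int (constant), a str (identifier), or a tuple (op, e1, e2)
        if isinstance(e, (int, str)):
            t = self.fresh()
            self.emit(f"{t} := {e}")
            return t
        op, e1, e2 = e
        a = self.cgen(e1)    # code for e1 is emitted first
        b = self.cgen(e2)    # then code for e2
        t = self.fresh()
        self.emit(f"{t} := {a} {op} {b}")
        return t


gen = CodeGen()
result = gen.cgen(("-", ("*", "a", "b"), "d"))
print("\n".join(gen.code))
print("result is held in", result)
```

Running it prints the same five instructions as the worked example, with the final value in t5.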

38 cgen for statements
We can extend the cgen function to operate over statements as well
Unlike cgen for expressions, cgen for statements does not return the name of a temporary holding a value
– (Why?)

39 Syntax
A → n | x | A ArithOp A | ( A )
ArithOp → - | + | * | /
B → true | false | A = A | A ≤ A | ¬B | B ∧ B | ( B )
S → x := A | skip | S ; S | { S } | if B then S else S | while B S
n ∈ Num (numerals)
x ∈ Var (program variables)

40 Translation rules for statements
If cgen(e) = (P, t), then
cgen( x := e ) = P · l: x := t   where l is fresh
cgen( skip ) = l: skip   where l is fresh
If cgen(b) = (Pb, t), cgen(S1) = P1 and cgen(S2) = P2, then
cgen( if b then S1 else S2 ) =
  Pb
  IfZ t Goto l_false
  P1
  l_finish: Goto l_after
  l_false: skip
  P2
  l_after: skip
where l_finish, l_false, l_after are fresh

41 Translation rules for loops
If cgen(b) = (Pb, t) and cgen(S) = P, then
cgen( while b S ) =
  l_before: skip
  Pb
  IfZ t Goto l_after
  P
  l_loop: Goto l_before
  l_after: skip
where l_after, l_before, l_loop are fresh

42 Translation example
Source program:
y := 137+3;
if x=0
  z := y;
else
  x := y;
Generated code:
1: t1 := 137
2: t2 := 3
3: t3 := t1 + t2
4: y := t3
5: t4 := x
6: t5 := 0
7: t6 := t4=t5
8: IfZ t6 Goto 12
9: t7 := y
10: z := t7
11: Goto 14
12: t8 := y
13: x := t8
14: skip
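
A statement-level cgen can be sketched in the same spirit. One caveat: the rules for if and while use fresh symbolic labels, whereas the numbered listing in the example jumps straight to instruction numbers. The sketch below takes the second route: it numbers instructions sequentially and patches jump targets once they are known (backpatching), which is an assumption of this example and not something the slides prescribe. The class `StmtGen` and the tuple encoding of statements are likewise hypothetical.

```python
class StmtGen:
    """Illustrative TAC generator for While statements (names are hypothetical)."""

    def __init__(self):
        self.code = []    # instruction at index i carries label i + 1
        self.temps = 0    # counter for fresh temporaries

    def emit(self, text):
        self.code.append(text)
        return len(self.code) - 1      # index, so a placeholder can be patched later

    def next_label(self):
        return len(self.code) + 1      # label the next emitted instruction will get

    def expr(self, e):
        """Emit code for e; return the temporary holding its value."""
        if isinstance(e, (int, str)):
            src = str(e)
        else:
            op, e1, e2 = e
            a = self.expr(e1)
            b = self.expr(e2)
            src = f"{a} {op} {b}"
        self.temps += 1
        self.emit(f"t{self.temps} := {src}")
        return f"t{self.temps}"

    def stmt(self, s):
        kind = s[0]
        if kind == "skip":
            self.emit("skip")
        elif kind == "assign":                 # x := e
            _, x, e = s
            t = self.expr(e)
            self.emit(f"{x} := {t}")
        elif kind == "seq":                    # S1 ; S2 ; ...
            for sub in s[1:]:
                self.stmt(sub)
        elif kind == "if":                     # if b then S1 else S2
            _, b, s1, s2 = s
            t = self.expr(b)
            test = self.emit("")               # placeholder for IfZ
            self.stmt(s1)
            jump = self.emit("")               # placeholder for Goto
            self.code[test] = f"IfZ {t} Goto {self.next_label()}"
            self.stmt(s2)
            self.code[jump] = f"Goto {self.next_label()}"
            self.emit("skip")                  # plays the role of l_after
        elif kind == "while":                  # while b S
            _, b, body = s
            before = self.next_label()
            self.emit("skip")                  # plays the role of l_before
            t = self.expr(b)
            test = self.emit("")               # placeholder for IfZ
            self.stmt(body)
            self.emit(f"Goto {before}")
            self.code[test] = f"IfZ {t} Goto {self.next_label()}"
            self.emit("skip")                  # plays the role of l_after
        else:
            raise ValueError(f"unknown statement kind: {kind!r}")


def listing(gen):
    return [f"{i}: {ins}" for i, ins in enumerate(gen.code, start=1)]


# y := 137+3; if x=0 z := y; else x := y;
example = StmtGen()
example.stmt(("seq",
              ("assign", "y", ("+", 137, 3)),
              ("if", ("=", "x", 0), ("assign", "z", "y"), ("assign", "x", "y"))))
print("\n".join(listing(example)))

# while x < y { x := x + 1 }
loop = StmtGen()
loop.stmt(("while", ("<", "x", "y"), ("assign", "x", ("+", "x", 1))))
print("\n".join(listing(loop)))
```

For the if example this yields the 14-instruction listing shown above (up to spacing around `=`). Because the false branch is reached by jumping directly to its first instruction, the extra `l_false: skip` from the rule is not emitted.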

43 Correctness

44 Compiler correctness
Intuitively, a compiler translates programs in one language (usually high-level) to another language (usually lower-level) such that the source and translated programs are equivalent
Our goal is to formally define the meaning of this equivalence
But first, we must define the meaning of a programming language

45 Formal semantics

46 http://www.daimi.au.dk/~bra8130/Wiley_book/wiley.html

47 What is formal semantics?
“Formal semantics is concerned with rigorously specifying the meaning, or behavior, of programs, pieces of hardware, etc.”

48 What is formal semantics?
“This theory allows a program to be manipulated like a formula – that is to say, its properties can be calculated.”
(Gérard Huet & Philippe Flajolet, homage to Gilles Kahn)

49 Why formal semantics?
Implementation-independent definition of a programming language
Automatically generating interpreters (and some day maybe full-fledged compilers)
Optimization, verification, and debugging
– If you don’t know what it does, how do you know it’s correct/incorrect?
– How do you know whether a given optimization is correct?

50 Operational semantics
Elements of the semantics
– States/configurations: the (aggregate) values that a program computes during execution
– Transition rules: how the program advances from one configuration to another

51 Operational semantics of While

52 While syntax reminder
A → n | x | A ArithOp A | ( A )
ArithOp → - | + | * | /
B → true | false | A = A | A ≤ A | ¬B | B ∧ B | ( B )
S → x := A | skip | S ; S | { S } | if B then S else S | while B S
n ∈ Num (numerals)
x ∈ Var (program variables)

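53 Semantic categories
$\mathbf{Z}$: integers $\{0, 1, -1, 2, -2, \ldots\}$
$\mathbf{T}$: truth values $\{\mathit{ff}, \mathit{tt}\}$
$\mathbf{State} = \mathbf{Var} \to \mathbf{Z}$
Example state: $\sigma = [x \mapsto 5,\ y \mapsto 7,\ z \mapsto 0]$
Lookup: $\sigma(x) = 5$
Update: $\sigma[x \mapsto 6] = [x \mapsto 6,\ y \mapsto 7,\ z \mapsto 0]$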
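
A state is a finite map from variables to integers, so a dictionary is a natural stand-in. The helper names `lookup` and `update` below are assumptions for illustration. The one point worth noting is that update produces a new state and leaves the old one unchanged.

```python
# A state maps program variables to integers; here: [x -> 5, y -> 7, z -> 0].
sigma = {"x": 5, "y": 7, "z": 0}


def lookup(state, var):
    """sigma(x): the value of variable var in the state."""
    return state[var]


def update(state, var, value):
    """sigma[x -> v]: a new state that agrees with `state` except at var."""
    new_state = dict(state)   # copy, so the original state is not modified
    new_state[var] = value
    return new_state


print(lookup(sigma, "x"))
print(update(sigma, "x", 6))
print(sigma)                  # the original state still maps x to 5
```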

54 Semantics of expressions

55 Semantics of arithmetic expressions
Semantic function $[\![ A ]\!] : \mathbf{State} \to \mathbf{Z}$
Defined by induction on the syntax tree:
$[\![ n ]\!]\sigma = n$
$[\![ x ]\!]\sigma = \sigma(x)$
$[\![ a_1 + a_2 ]\!]\sigma = [\![ a_1 ]\!]\sigma + [\![ a_2 ]\!]\sigma$
$[\![ a_1 - a_2 ]\!]\sigma = [\![ a_1 ]\!]\sigma - [\![ a_2 ]\!]\sigma$
$[\![ a_1 * a_2 ]\!]\sigma = [\![ a_1 ]\!]\sigma \times [\![ a_2 ]\!]\sigma$
$[\![ (a_1) ]\!]\sigma = [\![ a_1 ]\!]\sigma$ (not needed)
$[\![ -a ]\!]\sigma = 0 - [\![ a ]\!]\sigma$
Compositional
Expressions in While are side-effect free
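
Because the definition is compositional, it translates almost line for line into a recursive function. This is a sketch under assumptions: expressions are encoded as ints, variable names, or tuples `(op, a1, a2)`, and the function name `aval` is made up for the example. Division is left out because the slide gives no equation for it.

```python
def aval(a, sigma):
    """Evaluate arithmetic expression a in state sigma (a dict from names to ints)."""
    if isinstance(a, int):           # [[n]]sigma = n
        return a
    if isinstance(a, str):           # [[x]]sigma = sigma(x)
        return sigma[a]
    op, a1, a2 = a                   # compound case: meaning is built from the parts
    v1 = aval(a1, sigma)
    v2 = aval(a2, sigma)
    if op == "+":
        return v1 + v2
    if op == "-":
        return v1 - v2
    if op == "*":
        return v1 * v2
    raise ValueError(f"no semantic equation given for operator {op!r}")


sigma = {"x": 5, "y": 7, "z": 0}
print(aval(("*", ("+", "x", 1), "y"), sigma))   # (5 + 1) * 7
```

The function never modifies `sigma`, which mirrors the remark that expressions in While are side-effect free.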

56 Arithmetic expression exercise
Suppose $\sigma(x) = 3$
Evaluate $[\![ x + 1 ]\!]\sigma$
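
One way to work the exercise is to unfold the equation for addition, then the equations for variables and numerals, reading the premise as $\sigma(x) = 3$:

```latex
\begin{align*}
[\![ x + 1 ]\!]\sigma &= [\![ x ]\!]\sigma + [\![ 1 ]\!]\sigma \\
                      &= \sigma(x) + 1 \\
                      &= 3 + 1 \\
                      &= 4
\end{align*}
```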

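57 Semantics of boolean expressions
Semantic function $[\![ B ]\!] : \mathbf{State} \to \mathbf{T}$
Defined by induction on the syntax tree:
$[\![ \mathtt{true} ]\!]\sigma = \mathit{tt}$
$[\![ \mathtt{false} ]\!]\sigma = \mathit{ff}$
$[\![ a_1 = a_2 ]\!]\sigma = \mathit{tt}$ if $[\![ a_1 ]\!]\sigma = [\![ a_2 ]\!]\sigma$, and $\mathit{ff}$ otherwise
$[\![ a_1 \le a_2 ]\!]\sigma = \mathit{tt}$ if $[\![ a_1 ]\!]\sigma \le [\![ a_2 ]\!]\sigma$, and $\mathit{ff}$ otherwise
$[\![ b_1 \land b_2 ]\!]\sigma = \mathit{tt}$ if $[\![ b_1 ]\!]\sigma = \mathit{tt}$ and $[\![ b_2 ]\!]\sigma = \mathit{tt}$, and $\mathit{ff}$ otherwise
$[\![ \lnot b ]\!]\sigma = \mathit{tt}$ if $[\![ b ]\!]\sigma = \mathit{ff}$, and $\mathit{ff}$ otherwise
Compositional
Expressions in While are side-effect free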
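
The boolean semantic function can be sketched the same way. The code assumes the usual reading of the operators (equality, less-than-or-equal, negation, conjunction), represents tt/ff by Python's `True`/`False`, and uses a hypothetical tuple encoding with the tags "=", "<=", "not" and "and".

```python
import operator

# Arithmetic operators with a semantic equation on the slides (division omitted).
ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def aval(a, sigma):
    """Arithmetic expressions: int, variable name, or (op, a1, a2)."""
    if isinstance(a, int):
        return a
    if isinstance(a, str):
        return sigma[a]
    op, a1, a2 = a
    return ARITH[op](aval(a1, sigma), aval(a2, sigma))


def bval(b, sigma):
    """Boolean expressions: True/False or a tagged tuple (encoding is illustrative)."""
    if isinstance(b, bool):          # [[true]]sigma = tt, [[false]]sigma = ff
        return b
    op = b[0]
    if op == "=":
        return aval(b[1], sigma) == aval(b[2], sigma)
    if op == "<=":
        return aval(b[1], sigma) <= aval(b[2], sigma)
    if op == "not":
        return not bval(b[1], sigma)
    if op == "and":
        return bval(b[1], sigma) and bval(b[2], sigma)
    raise ValueError(f"unknown boolean operator {op!r}")


sigma = {"x": 5, "y": 7}
# x <= y and not (x = y)
print(bval(("and", ("<=", "x", "y"), ("not", ("=", "x", "y"))), sigma))
```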

58 Natural operational semantics
Developed by Gilles Kahn [STACS 1987]
Configurations
– $\langle S, \sigma \rangle$: statement $S$ is about to execute on state $\sigma$
– $\sigma$: terminal (final) state
Transitions: $\langle S, \sigma \rangle \to \sigma'$
– Execution of $S$ from $\sigma$ will terminate with the result state $\sigma'$
– Ignores non-terminating computations

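59 Natural operational semantics
$\to$ is defined by rules of the form
$$\frac{\langle S_1, \sigma_1 \rangle \to \sigma_1', \ \ldots, \ \langle S_n, \sigma_n \rangle \to \sigma_n'}{\langle S, \sigma \rangle \to \sigma'} \quad \text{if } \ldots$$
The premises are written above the line, the conclusion below it, and the side condition after "if"
The meaning of compound statements is defined using the meaning of their immediate constituent statements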

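60 Natural semantics for While
$[\mathrm{ass}_{ns}]$ (axiom): $\langle x := a, \sigma \rangle \to \sigma[x \mapsto [\![ a ]\!]\sigma]$
$[\mathrm{skip}_{ns}]$ (axiom): $\langle \mathtt{skip}, \sigma \rangle \to \sigma$
$[\mathrm{comp}_{ns}]$: $\dfrac{\langle S_1, \sigma \rangle \to \sigma', \quad \langle S_2, \sigma' \rangle \to \sigma''}{\langle S_1 ; S_2, \sigma \rangle \to \sigma''}$
$[\mathrm{if}^{tt}_{ns}]$: $\dfrac{\langle S_1, \sigma \rangle \to \sigma'}{\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2, \sigma \rangle \to \sigma'}$ if $[\![ b ]\!]\sigma = \mathit{tt}$
$[\mathrm{if}^{ff}_{ns}]$: $\dfrac{\langle S_2, \sigma \rangle \to \sigma'}{\langle \mathtt{if}\ b\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2, \sigma \rangle \to \sigma'}$ if $[\![ b ]\!]\sigma = \mathit{ff}$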

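61 Natural semantics for While
$[\mathrm{while}^{tt}_{ns}]$: $\dfrac{\langle S, \sigma \rangle \to \sigma', \quad \langle \mathtt{while}\ b\ S, \sigma' \rangle \to \sigma''}{\langle \mathtt{while}\ b\ S, \sigma \rangle \to \sigma''}$ if $[\![ b ]\!]\sigma = \mathit{tt}$ (non-compositional)
$[\mathrm{while}^{ff}_{ns}]$: $\langle \mathtt{while}\ b\ S, \sigma \rangle \to \sigma$ if $[\![ b ]\!]\sigma = \mathit{ff}$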
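
Read operationally, the natural-semantics rules describe a recursive interpreter: each rule becomes one case, the premises become recursive calls, and the side condition picks the case. The sketch below follows that reading. The function names, the tuple encoding of programs, and the use of Python dictionaries for states are assumptions of this example. Like the semantics itself, it gives no result for non-terminating programs, and in practice long-running loops would hit Python's recursion limit.

```python
import operator

ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def aval(a, s):
    """Arithmetic expressions: int, variable name, or (op, a1, a2)."""
    if isinstance(a, int):
        return a
    if isinstance(a, str):
        return s[a]
    op, a1, a2 = a
    return ARITH[op](aval(a1, s), aval(a2, s))


def bval(b, s):
    """Boolean expressions: True/False or a tagged tuple."""
    if isinstance(b, bool):
        return b
    op = b[0]
    if op == "=":
        return aval(b[1], s) == aval(b[2], s)
    if op == "<=":
        return aval(b[1], s) <= aval(b[2], s)
    if op == "not":
        return not bval(b[1], s)
    if op == "and":
        return bval(b[1], s) and bval(b[2], s)
    raise ValueError(f"unknown boolean operator {op!r}")


def run(S, s):
    """Return s' such that <S, s> -> s' (if execution terminates)."""
    kind = S[0]
    if kind == "skip":                      # [skip]
        return s
    if kind == "assign":                    # [ass]: s[x -> [[a]]s]
        _, x, a = S
        new_state = dict(s)
        new_state[x] = aval(a, s)
        return new_state
    if kind == "seq":                       # [comp]: run S1, then S2 from the result
        _, S1, S2 = S
        return run(S2, run(S1, s))
    if kind == "if":                        # [if tt] / [if ff]
        _, b, S1, S2 = S
        return run(S1, s) if bval(b, s) else run(S2, s)
    if kind == "while":                     # [while tt] / [while ff]
        _, b, body = S
        if not bval(b, s):
            return s
        return run(S, run(body, s))         # the premise mentions the whole loop again
    raise ValueError(f"unknown statement kind {kind!r}")


# while x <= 2 { x := x + 1 }; y := x
program = ("seq",
           ("while", ("<=", "x", 2), ("assign", "x", ("+", "x", 1))),
           ("assign", "y", "x"))
start = {"x": 0, "y": 0}
final = run(program, start)
print(final)
```

The while case calls `run` on the same statement `S` it was given, which is exactly what makes the [while tt] rule non-compositional: the meaning of the loop is defined in terms of the loop itself, not only its constituents.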

62 Next lecture: Correctness of lowering

