CSE P 501 – Compilers
Intermediate Representations
Hal Perkins
Autumn 2011
© 2002-11 Hal Perkins & UW CSE

Agenda
Parser semantic actions
Intermediate representations
Abstract syntax trees (ASTs)
Linear representations & more

Compiler Structure (review)
Source (characters) → Scanner → tokens → Parser → IR → Middle (optimization) → IR (maybe different) → Code Gen → Assembly or binary code (Target)

What’s a Parser to Do?
Idea: at significant points in the parse, perform a semantic action
  Typically when a production is reduced (LR) or at a convenient point in the parse (LL)
Typical semantic actions:
  Build (and return) a representation of the parsed chunk of the input (compiler)
  Perform some sort of computation and return the result (interpreter)
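As a minimal, self-contained sketch of the first kind of action (all names here, MiniParser and its nested classes, are hypothetical, not the course's actual code), a recursive-descent (LL) parser can build an AST node each time it finishes a production:

class MiniParser {
    // Hypothetical AST node classes (illustrative only).
    abstract static class Expr {}
    static class Num extends Expr { final int value; Num(int v) { value = v; } }
    static class Var extends Expr { final String name; Var(String n) { name = n; } }
    static class BinOp extends Expr {
        final String op; final Expr left, right;
        BinOp(String op, Expr l, Expr r) { this.op = op; this.left = l; this.right = r; }
    }

    private final String src;
    private int pos = 0;
    MiniParser(String src) { this.src = src; }
    private char peek() { return pos < src.length() ? src.charAt(pos) : '$'; }

    // expr ::= term { (+|-) term } -- the semantic action builds a BinOp node
    Expr expr() {
        Expr e = term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);                   // consume the operator
            e = new BinOp(String.valueOf(op), e, term());  // semantic action: build a node
        }
        return e;
    }

    // term, simplified to one character: a digit or an identifier letter
    Expr term() {
        char c = src.charAt(pos++);
        return Character.isDigit(c) ? new Num(c - '0') : new Var(String.valueOf(c));
    }
}

new MiniParser("n+m").expr() returns a BinOp("+") over two Vars rather than computing a value; an interpreter would evaluate at the same points instead of building nodes.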

Intermediate Representations
In most compilers, the parser builds an intermediate representation of the program
The rest of the compiler transforms the IR to improve (“optimize”) it and eventually translates it to final code
  Often the initial IR is transformed into one or more different IRs along the way
Some general examples now; specific examples as we cover later topics

IR Design
Decisions affect the speed and efficiency of the rest of the compiler
Desirable properties:
  Easy to generate
  Easy to manipulate
  Expressive
  Appropriate level of abstraction
Different tradeoffs depending on compiler goals
Different tradeoffs in different parts of the same compiler
(Taken from Cooper’s slides)

IR Design Taxonomy
Structure
  Graphical (trees, DAGs, etc.)
  Linear (code for some abstract machine)
  Hybrids are common (e.g., control-flow graphs)
Abstraction level
  High-level, near to source language
  Low-level, closer to machine

Levels of Abstraction
Key design decision: how much detail to expose
  Affects possibility and profitability of various optimizations
Structural IRs are typically fairly high-level
Linear IRs are typically low-level
But these generalizations don’t always hold

Examples: Array Reference
Source: A[i,j]   or, in three-address form:   t1 ← A[i,j]
As a tree: a subscript node with children A, i, j
As low-level (ILOC-style) code:
  loadI 1     => r1
  sub   rj,r1 => r2
  loadI 10    => r3
  mult  r2,r3 => r4
  sub   ri,r1 => r5
  add   r4,r5 => r6
  loadI @A    => r7
  add   r7,r6 => r8
  load  r8    => r9
(Still borrowing from Cooper’s slides)

Structural IRs
Typically reflect source (or other higher-level) language structure
Tend to be large
Examples: syntax trees, DAGs
Generally used in early phases of compilers
(Based on Cooper’s slides)

Concrete Syntax Trees
The full grammar is needed to guide the parser, but it contains many extraneous details
  Chain productions
  Rules that control precedence and associativity
Typically the full syntax tree does not need to be used explicitly

Syntax Tree Example
Concrete syntax for x=2*(n+m);
  expr   ::= expr + term | expr - term | term
  term   ::= term * factor | term / factor | factor
  factor ::= int | id | ( expr )

Abstract Syntax Trees
Want only essential structural information
  Omit extraneous junk
Can be represented explicitly as a tree or in a linear form
  Example: LISP/Scheme S-expressions are essentially ASTs
Common output from the parser; used for static semantics (type checking, etc.) and high-level optimizations
Usually lowered for later compiler phases

AST Example
AST for x=2*(n+m); (tree diagram in the original slides)
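Since the diagram did not survive the transcript, a plausible rendering in the S-expression form mentioned on the previous slide is (= x (* 2 (+ n m))): assignment at the root, its left child the target x, its right child the multiply, whose right child is the add of n and m. All chain productions and parentheses from the concrete syntax are gone.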

Directed Acyclic Graphs
DAGs are often used to identify common subexpressions
Not necessarily a primary representation: the compiler might build a DAG, then translate back after some code improvement
Leaves = operands
Interior nodes = operators
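One common way to build such a DAG (a sketch under assumed names, not a method from these slides) is hash-consing: before creating an operator node, look it up in a table keyed by operator and children, and reuse the existing node on a hit:

import java.util.HashMap;
import java.util.Map;

// Hedged sketch: DAG construction by hash-consing expression nodes.
class DagBuilder {
    // Node is a record, so equals/hashCode are structural: two nodes with the
    // same op and (recursively) equal children compare equal.
    record Node(String op, Node left, Node right) {}   // leaf: op = name, children null

    private final Map<Node, Node> table = new HashMap<>();

    Node leaf(String name)                { return intern(new Node(name, null, null)); }
    Node binop(String op, Node l, Node r) { return intern(new Node(op, l, r)); }

    // Return the canonical node for this (op, left, right), creating it once.
    private Node intern(Node n) { return table.computeIfAbsent(n, k -> k); }
}

With this, binop("-", leaf("b"), leaf("c")) returns the identical object every time it is called, so b - c appears exactly once in the graph for the next slide's expression.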

Expression DAG Example
DAG for a + a * (b - c) + (b - c) * d (diagram in the original slides)

Linear IRs
Pseudo-code for some abstract machine
Level of abstraction varies
Simple, compact data structures
  Commonly used: arrays, linked structures
Examples: three-address code, stack machine code

Abstraction Levels in Linear IR
Linear IRs can also be close to the source language, very low-level, or somewhere in between
Example: linear IRs for the C array reference a[i][j+2] (from Muchnick, sec. 4.2)
High-level:
  t1 ← a[i,j+2]

IRs for a[i][j+2], cont.
Medium-level:
  t1 ← j + 2
  t2 ← i * 20
  t3 ← t1 + t2
  t4 ← 4 * t3
  t5 ← addr a
  t6 ← t5 + t4
  t7 ← *t6
Low-level:
  r1 ← [fp-4]
  r2 ← r1 + 2
  r3 ← [fp-8]
  r4 ← r3 * 20
  r5 ← r4 + r2
  r6 ← 4 * r5
  r7 ← fp - 216
  f1 ← [r7+r6]
Main difference: the medium-level version retains basic symbolic information about variables and computes addresses in terms of variables; the low-level version makes all memory references and calculations explicit, exposing every detail of the storage layout.

Abstraction Level Tradeoffs
High-level: good for source-level optimizations and semantic checking
Low-level: needed for good code generation and resource utilization in the back end; many optimizing compilers work at this level for their middle/back ends
Medium-level: fine for optimization and most other middle/back-end purposes

Three-Address Code
Usual form: x ← y (op) z
  One operator
  Maximum of three names
Example: x=2*(n+m); becomes
  t1 ← n + m
  t2 ← 2 * t1
  x ← t2
Invent as many new temporary names as needed
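A hedged sketch of how such code might be generated: a postorder walk over the hypothetical Expr AST from the parser sketch earlier, inventing a fresh temporary for each operator node. TacGen and its details are assumptions, not the course's code.

import java.util.ArrayList;
import java.util.List;

// Sketch: lower an expression AST (MiniParser.Expr from the earlier sketch)
// to three-address code, one fresh temporary per interior node.
class TacGen {
    private int tempCount = 0;
    final List<String> code = new ArrayList<>();
    private String newTemp() { return "t" + (++tempCount); }

    String gen(MiniParser.Expr e) {
        if (e instanceof MiniParser.Num n) return Integer.toString(n.value);
        if (e instanceof MiniParser.Var v) return v.name;
        MiniParser.BinOp b = (MiniParser.BinOp) e;
        String lhs = gen(b.left);          // operands first (postorder)
        String rhs = gen(b.right);
        String t = newTemp();              // invent a new temporary name
        code.add(t + " <- " + lhs + " " + b.op + " " + rhs);
        return t;
    }
}

Applied to the tree for 2*(n+m), this emits t1 <- n + m and then t2 <- 2 * t1, matching the slide; the final assignment to x would come from a statement-level walk.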

Three-Address Code Advantages
  Resembles code for actual machines
  Explicitly names intermediate results
  Compact
  Often easy to rearrange
Various representations: quadruples, triples, SSA
We will see much more of this…
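For a concrete picture of the first of those representations (the Quad record here is an assumption), the slide's example stored as quadruples, one (op, arg1, arg2, result) record per instruction:

// Sketch: the earlier three-address code stored as an array of quadruples.
class QuadExample {
    record Quad(String op, String arg1, String arg2, String result) {}

    static final Quad[] CODE = {
        new Quad("+", "n",  "m",  "t1"),  // t1 <- n + m
        new Quad("*", "2",  "t1", "t2"),  // t2 <- 2 * t1
        new Quad("=", "t2", null, "x"),   // x  <- t2
    };
}

Triples drop the result field and refer to earlier instructions by index; SSA additionally guarantees each name is assigned exactly once.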

Stack Machine Code
Originally used for stack-based computers (famous example: B5000, ca. 1961 et seq.)
Now used for Java (.class files), C# (MSIL), and others
Advantages:
  Very compact; mostly 0-address opcodes
  Easy to generate
  Simple to translate to machine code or interpret directly
  And a good starting point for generating optimized code

Stack Code Example
Hypothetical code for x=2*(n+m);
  pushaddr x
  pushconst 2
  pushval n
  pushval m
  add
  mult
  store
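To pin down what those opcodes mean, here is a hedged sketch of a direct interpreter for this hypothetical instruction set; the string encoding, and keeping addresses on their own small stack rather than the operand stack, are simplifying assumptions:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch: interpret the slide's hypothetical stack code directly.
class StackMachine {
    final Deque<Integer> stack = new ArrayDeque<>();  // operand stack
    final Deque<String> addrs = new ArrayDeque<>();   // simplification: addresses kept separately
    final Map<String, Integer> memory = new HashMap<>();

    void run(String[][] program) {   // each instruction: {opcode} or {opcode, operand}
        for (String[] ins : program) {
            switch (ins[0]) {
                case "pushaddr"  -> addrs.push(ins[1]);
                case "pushconst" -> stack.push(Integer.parseInt(ins[1]));
                case "pushval"   -> stack.push(memory.getOrDefault(ins[1], 0));
                case "add"       -> stack.push(stack.pop() + stack.pop());
                case "mult"      -> stack.push(stack.pop() * stack.pop());
                case "store"     -> memory.put(addrs.pop(), stack.pop());
                default          -> throw new IllegalStateException("bad opcode: " + ins[0]);
            }
        }
    }
}

Running the slide's seven instructions with memory preloaded to n=3, m=4 leaves x=14, i.e., 2*(3+4).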

Hybrid IRs
Combination of structural and linear
Level of abstraction varies
Most common example: the control-flow graph (CFG)
  Nodes: basic blocks
  Edge from B1 to B2 if execution can flow from B1 to B2

Basic Blocks
Fundamental unit in IRs
Definition: a basic block is a maximal sequence of instructions entered at the first instruction and exited at the last
  i.e., if the first instruction is executed, all of them will be (modulo exceptions)

Identifying Basic Blocks
Easy to do with a scan of the linear instruction stream
A basic block begins at each instruction that is:
  The beginning of a routine
  The target of a branch
  Immediately following a branch or return
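A hedged sketch of that scan (the Instr shape and opcode names are assumptions): mark each "leader" instruction per the three rules above, then cut the stream at every leader:

import java.util.ArrayList;
import java.util.List;

// Sketch: partition a linear instruction stream into basic blocks.
class BlockFinder {
    record Instr(String op, Integer branchTarget) {   // branchTarget: index, or null
        boolean endsBlock() { return op.equals("jump") || op.equals("cjump") || op.equals("return"); }
    }

    static List<List<Instr>> basicBlocks(List<Instr> code) {
        boolean[] leader = new boolean[code.size()];
        if (!code.isEmpty()) leader[0] = true;                // rule 1: start of routine
        for (int i = 0; i < code.size(); i++) {
            Instr ins = code.get(i);
            if (ins.branchTarget() != null)
                leader[ins.branchTarget()] = true;            // rule 2: branch target
            if (ins.endsBlock() && i + 1 < code.size())
                leader[i + 1] = true;                         // rule 3: after branch/return
        }
        List<List<Instr>> blocks = new ArrayList<>();
        List<Instr> current = null;
        for (int i = 0; i < code.size(); i++) {
            if (leader[i]) { current = new ArrayList<>(); blocks.add(current); }
            current.add(code.get(i));
        }
        return blocks;
    }
}

CFG edges then run from each block to the block led by its branch target and, for fall-through, to the block led by the instruction after its last one.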

What IR to Use?
Common choice: all(!)
AST or other structural representation built by the parser and used in early stages of the compiler
  Closer to source code
  Good for semantic analysis
  Facilitates some higher-level optimizations
Lower to a linear IR for later stages of the compiler
  Closer to machine code
  Exposes machine-related optimizations
  Use to build the control-flow graph

Coming Attractions
Representing ASTs
Working with ASTs
  Where do the algorithms go?
  Is it really object-oriented? (Does it matter?)
  Visitor pattern
Then: semantic analysis, type checking, and symbol tables