Chapter 1. Overview
J. H. Wang
Sep. 15, 2015



Outline
  History of Compilation
  What Compilers Do
  Interpreters
  Syntax and Semantics
  Organization of a Compiler
  Programming Language and Compiler Design
  Computer Architecture and Compiler Design
  Compiler Design Considerations
  Integrated Development Environments

Language Processors
  Translators
    – Transforming human-oriented programming languages into computer-oriented machine languages

History of Compilation
  Early compilers
    – 1950s: by Grace Hopper
    – Late 1950s: Fortran
  Broad applications
    – Typesetting: TeX, LaTeX
    – Portable document representation: PostScript
    – Symbolic and numeric problem solving: Mathematica
    – VLSI: Verilog, VHDL

What Compilers Do
  Compilers may be distinguished in two ways
    – By the kind of machine code they generate
    – By the format of the target code they generate

Machine Code Generated by Compilers
  Pure machine code
    – Only instructions from a particular instruction set
        Without dependence on any software (library, OS)
    – Rare; mostly used in system implementation languages
  Augmented machine code
    – Augmented with OS and runtime language support routines
        I/O, storage allocation, mathematical functions
        Data transfer, procedure call, and dynamic storage instructions
    – More common than pure machine code
  Virtual machine code
    – Only virtual instructions
    – Virtual machine
        Pascal P-code
        Java bytecodes
    – Portability, program size reduction
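To make "only virtual instructions" concrete, here is a minimal sketch of a stack-based virtual machine. The instruction names (load, push, mul, add, store) are invented for this illustration and are not Pascal P-code or Java bytecode.

```python
def run(program, memory):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack = []
    for op, arg in program:
        if op == "push":        # push a constant
            stack.append(arg)
        elif op == "load":      # push the value of a named variable
            stack.append(memory[arg])
        elif op == "store":     # pop the top of stack into a named variable
            memory[arg] = stack.pop()
        elif op in ("add", "mul"):
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "add" else left * right)
        else:
            raise ValueError(f"unknown virtual instruction {op!r}")
    return memory

# Hypothetical virtual code for: position = initial + rate * 60
PROGRAM = [
    ("load", "initial"),
    ("load", "rate"),
    ("push", 60),
    ("mul", None),
    ("add", None),
    ("store", "position"),
]
```

The same PROGRAM runs unchanged wherever run is available, which is the portability argument on the slide; the price is the interpretation overhead of the dispatch loop.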

Bootstrapping

Target Code Formats
  Assembly or other source formats
    – Easy to scrutinize
    – Useful for prototyping programming language designs and cross-compilation
  Relocatable binary
    – More efficient and more control over the translation process
    – External references, local instruction addresses, and data addresses are not yet bound
        A linkage step is required
  Absolute binary
    – Faster, but limited ability to interface with other code
    – Useful for exercises and prototyping
        Compilation costs far exceed execution costs

Interpreters

  Capabilities of interpreters
    – Programs can be easily modified as execution proceeds
        Interactive debugging
    – Dynamic object typing can be easily supported
        E.g. Lisp and Scheme
    – Significant degree of machine independence
  Drawbacks
    – Direct interpretation of source programs can involve significant overhead

Syntax and Semantics
  Syntax: structure
    – E.g. context-free grammars (CFGs)
        a=b+c is legal, but b+c=a is not
  Semantics: meaning
    – E.g. a=b+c is illegal if any of the variables are undeclared or if b or c is of type Boolean
    – Static semantics
    – Runtime semantics

Static Semantics
  A set of rules that specify which syntactically legal programs are actually valid
    – E.g.: identifier declaration, type compatibility of operators and operands, proper number of parameters in procedure calls
  Can be specified either formally or informally
    – E.g.: attribute grammars

An Example of Attribute Grammars
  Production rule:
    – E -> E + T
  Augmented production rule:
    – E_result -> E_v1 + T_v2
        if v1.type = numeric and v2.type = numeric
        then result.type <- numeric
        else call ERROR()
    – Verbose and tedious

Runtime Semantics
  To specify what a program computes
    – Can be specified informally
        E.g.: program states
          – a=1: the state component corresponding to a is changed to 1
    – Formal approaches
        Natural semantics: operational model
          – Given assertions before the evaluation of a construct, we can infer assertions that will hold after the construct’s evaluation
        Axiomatic semantics: relations or predicates that relate program variables
          – E.g.: var <- exp: a predicate involving var is true after statement execution iff the predicate obtained by replacing all occurrences of var by exp is true beforehand
          – Good for deriving proofs of program correctness, but difficult to use
        Denotational semantics: more mathematical in form
          – E.g.: E[T1+T2]m = E[T1]m + E[T2]m
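The two formal rules quoted on this slide can be written out with a worked instance each. The predicate a > 0, the statement a <- b + 1, and the memory m with m(a) = 2 are assumptions chosen for illustration only.

```latex
% Assignment axiom of axiomatic semantics:
% P[exp/var] denotes P with every occurrence of var replaced by exp.
\[
  \{\, P[\mathit{exp}/\mathit{var}] \,\}\;\; \mathit{var} \leftarrow \mathit{exp} \;\;\{\, P \,\}
\]
% Worked instance with P = (a > 0) and the statement a <- b + 1:
\[
  \{\, b + 1 > 0 \,\}\;\; a \leftarrow b + 1 \;\;\{\, a > 0 \,\}
\]
% Denotational rule from the slide, applied to a + 60 in a memory m with m(a) = 2:
\[
  E[\![a + 60]\!]\,m \;=\; E[\![a]\!]\,m + E[\![60]\!]\,m \;=\; 2 + 60 \;=\; 62
\]
```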

  Difficulty in semantics: imprecise language specification
    – E.g.: (in Java)
        // Version 1: no return statement is reached when b == 0
        public static int subr(int b) {
            if (b != 0) return b+100;
        }
        // Version 2: every path returns, but only because 10*b == 0 whenever b == 0
        public static int subr(int b) {
            if (b != 0) return b+100;
            else if (10*b==0) return 1;
        }
    – The problem of deciding whether a particular statement in a program is reachable is undecidable
  In practice, a trusted reference compiler can serve as a de facto language definition
    – E.g.: Lisp

Organization of a Compiler
  Analysis
  Synthesis

The Structure of a Compiler
  Tasks performed by compilers
    – Analysis of the source program
        Syntax analysis
        Semantic analysis
    – Synthesis of a target program that, when executed, will correctly perform the computations described by the source program
        Code generator
        Optimizer

The Scanner
  Reading the input text and grouping individual characters into tokens
    – Identifiers
    – Integers
    – Reserved words
    – Delimiters
  What the scanner does
    – It puts the program into a compact and uniform format
    – It eliminates unneeded information
    – It processes compiler control directives
    – It sometimes enters preliminary information into the symbol table
    – It optionally formats and lists the source program

Lexical Analysis (Scanning) [Aho, Lam, Sethi, Ullman]
  Grouping characters into lexemes
  Producing tokens
    – (token-name, attribute-value)
  E.g.
    – position = initial + rate * 60
    – <id,1> <=> <id,2> <+> <id,3> <*> <60>

  Regular expressions (Chap. 3)
    – An effective and powerful approach to describe tokens
    – As a specification for automatic generation of finite automata that recognize regular sets
        Scanner generator
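As a rough illustration of tokens specified by regular expressions, the sketch below scans the running example with Python's re module. The token classes (num, id, op, skip) and the function name scan are assumptions made for this example; a scanner generator would derive a finite automaton from a similar specification instead of trying patterns at run time.

```python
import re

# Hypothetical token classes for this sketch; each is a regular expression.
TOKEN_SPEC = [
    ("num", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+\-*/()]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def scan(text):
    """Group characters into (token-name, attribute-value) pairs."""
    symtab = {}   # lexeme -> symbol-table index, starting at 1
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "skip":
            continue                      # whitespace is unneeded information
        if kind == "id":
            index = symtab.setdefault(lexeme, len(symtab) + 1)
            tokens.append(("id", index))  # attribute points into the symbol table
        elif kind == "num":
            tokens.append(("num", int(lexeme)))
        else:
            tokens.append((lexeme, None)) # operators carry no attribute
    return tokens, symtab
```

Calling scan("position = initial + rate * 60") yields identifier tokens whose attributes are the symbol-table indices 1, 2 and 3, in the order the names first appear.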

The Parser
  Reading tokens and grouping them into phrases according to the syntax specification such as CFGs
    – Grammars (Chap. 2 & 4)
    – Parsing (Chap. 5 & 6)
    – Parser generator
  It usually builds an Abstract Syntax Tree (AST) as a concise representation of program structure
    – (Chap. 2 & 7)

Syntax Analysis (Parsing) [Aho, Lam, Sethi, Ullman]
  Creating a tree-like intermediate representation (e.g. syntax tree) that depicts the grammatical structure of the token stream
    – E.g. (children are indented under their parent)
        =
          <id,1>
          +
            <id,2>
            *
              <id,3>
              60
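A syntax tree like the one above can be built by a small recursive-descent parser. The sketch below assumes a toy grammar (assignment, +, *, parentheses) and represents tree nodes as nested tuples; the function name and node shapes are illustrative, not taken from either textbook.

```python
import re

def parse_assignment(text):
    """Recursive-descent parser for: id = expr, where expr uses +, * and ( )."""
    # Characters outside this token set are skipped silently in this sketch.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[=+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, found {tok!r}")
        pos += 1
        return tok

    def is_name(tok):
        return tok[0].isalpha() or tok[0] == "_"

    def factor():                 # factor -> ( expr ) | num | id
        tok = take()
        if tok == "(":
            node = expr()
            take(")")
            return node
        if tok.isdigit():
            return ("num", int(tok))
        if is_name(tok):
            return ("id", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():                   # term -> factor { * factor }
        node = factor()
        while peek() == "*":
            take("*")
            node = ("*", node, factor())
        return node

    def expr():                   # expr -> term { + term }
        node = term()
        while peek() == "+":
            take("+")
            node = ("+", node, term())
        return node

    target = take()
    if not is_name(target):
        raise SyntaxError("left-hand side must be an identifier")
    take("=")
    tree = ("=", ("id", target), expr())
    if peek() is not None:
        raise SyntaxError(f"trailing input at {peek()!r}")
    return tree
```

Because term is called from expr, * binds tighter than +, which gives the tree its shape. An input such as b + c = a is rejected with a SyntaxError, since the grammar requires = immediately after the first identifier.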

The Type Checker (Semantic Analysis)
  Checking the static semantics of each AST node
    – If the construct is semantically correct, the type checker decorates the AST node by adding type information to it
    – Otherwise, a suitable error message is issued

Semantic Analysis [Aho, Lam, Sethi, Ullman]
  Type checking
  Type conversions or coercions
  E.g. (children are indented under their parent)
        =
          <id,1>
          +
            <id,2>
            *
              <id,3>
              int2float
                60
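The coercion step can be sketched as a bottom-up walk that computes a type for each node and wraps an integer operand in int2float when it meets a float. The tuple-shaped nodes, the type names ("int", "float", "bool") and the dictionary used as a symbol table are assumptions for this example.

```python
def check(node, symtab):
    """Return (decorated_node, type), inserting int2float where int meets float."""
    kind = node[0]
    if kind == "num":
        return node, "int"
    if kind == "id":
        if node[1] not in symtab:
            raise TypeError(f"undeclared identifier {node[1]!r}")
        return node, symtab[node[1]]

    op, left, right = node
    left, ltype = check(left, symtab)
    right, rtype = check(right, symtab)
    if "bool" in (ltype, rtype):
        raise TypeError(f"operator {op!r} applied to a Boolean operand")

    if op == "=":
        if ltype == "float" and rtype == "int":
            right = ("int2float", right)      # widen the assigned value
        elif ltype != rtype:
            raise TypeError(f"cannot assign {rtype} to {ltype}")
        return (op, left, right), ltype

    if ltype != rtype:                        # mixed arithmetic: widen the int side
        if ltype == "int":
            left = ("int2float", left)
        else:
            right = ("int2float", right)
        return (op, left, right), "float"
    return (op, left, right), ltype
```

With position, initial and rate all declared float, the constant 60 is the only int in the tree, so it is the only node that gets wrapped.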

Translator (Program Synthesis)
  Translating AST nodes into Intermediate Representation (IR) code
    – E.g. while loops -> two subtrees: expression, body
  It’s largely dictated by the semantics of the source language
  In simple, nonoptimizing compilers, the translator may generate target code directly
  More elaborate compilers such as GCC may first generate a high-level IR and then translate it into a low-level IR

Intermediate Code Generation [Aho, Lam, Sethi, Ullman]
  Generating a low-level intermediate representation
    – It should be easy to produce
    – It should be easy to translate into the target machine
    – E.g. three-address code (in Chap. 6)
        t1 = int2float(60)
        t2 = id3 * t1
        t3 = id2 + t2
        id1 = t3
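One way to obtain three-address code like this is a post-order walk of the syntax tree that allocates a fresh temporary for every interior node. The sketch below assumes tuple-shaped tree nodes and emits instructions as strings; both choices are for illustration only.

```python
def gen_tac(tree):
    """Flatten an assignment tree into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def walk(node):
        kind = node[0]
        if kind == "id":
            return node[1]
        if kind == "num":
            return str(node[1])
        if kind == "int2float":
            operand = walk(node[1])           # evaluate the operand first
            temp = new_temp()
            code.append(f"{temp} = int2float({operand})")
            return temp
        op, left, right = node                # binary operator
        left_name = walk(left)
        right_name = walk(right)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, rhs = tree
    code.append(f"{target[1]} = {walk(rhs)}")
    return code

# The running example, with identifiers already replaced by id1..id3.
EXAMPLE = ("=", ("id", "id1"),
           ("+", ("id", "id2"),
            ("*", ("id", "id3"), ("int2float", ("num", 60)))))
```

For EXAMPLE the walk visits int2float(60) before the multiplication and the multiplication before the addition, so the temporaries come out as t1, t2, t3 in the same order as on the slide.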

Symbol Tables
  A mechanism that allows information to be associated with identifiers and shared among compiler phases
    – Identifier declaration
    – Identifier use
    – Type checking

Symbol Table Management [Aho, Lam, Sethi, Ullman]
  To record the variable names and collect information about various attributes of each name
    – Storage, type, scope
    – Number and types of arguments, method of argument passing, and the type returned

  Name      Type
  position  …
  initial   …
  rate      …
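A symbol table that tracks scope can be sketched as a stack of dictionaries, one per open scope. The class and method names below are hypothetical; production compilers typically use more elaborate structures.

```python
class SymbolTable:
    """Maps names to attribute records, with nested scopes."""

    def __init__(self):
        self.scopes = [{}]                  # innermost scope is last

    def open_scope(self):
        self.scopes.append({})

    def close_scope(self):
        self.scopes.pop()

    def declare(self, name, **attributes):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} is already declared in this scope")
        scope[name] = attributes            # e.g. type, storage

    def lookup(self, name):
        for scope in reversed(self.scopes): # innermost declaration wins
            if name in scope:
                return scope[name]
        raise KeyError(f"undeclared identifier {name!r}")
```

A declaration calls declare, a use calls lookup, and the type checker reads the returned record, which is how the three activities listed for symbol tables share one structure.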

The Optimizer
  Analyzing and transforming the IR code generated by the translator into functionally equivalent but improved code
    – Complex
    – Optimizations may be performed in stages
  Optimization can also be done after code generation
    – E.g. peephole optimization: a few instructions at a time
        Multiplications by 1
        Additions of 0
        Loading a value into a register when it’s already in another register
        Replacing a sequence of instructions by a single instruction with the same effect
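The first two peephole cases (multiplication by 1, addition of 0) can be sketched as a single pass over instructions. The (dest, op, arg1, arg2) tuple format and the "copy" opcode are assumptions of this example; a real peephole optimizer would work on target instructions and cover many more patterns.

```python
def peephole(code):
    """Replace x*1 and x+0 by plain copies; leave everything else untouched."""
    out = []
    for dest, op, a, b in code:
        if op == "*" and "1" in (a, b):
            other = b if a == "1" else a      # the operand that is not the literal 1
            out.append((dest, "copy", other, None))
        elif op == "+" and "0" in (a, b):
            other = b if a == "0" else a      # the operand that is not the literal 0
            out.append((dest, "copy", other, None))
        else:
            out.append((dest, op, a, b))
    return out
```

Each rule looks at one instruction in isolation, which is what keeps the technique cheap: no analysis of the rest of the program is needed.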

Code Optimization [Aho, Lam, Sethi, Ullman]
  Attempts to improve the intermediate code
    – Better: faster, shorter code, or code that consumes less power
    – E.g.
        t1 = id3 * 60.0
        id1 = id2 + t1
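The improvement shown here can be reproduced with two simple transformations: folding int2float of an integer literal at compile time, and removing a final copy by writing the result straight to its destination. The sketch assumes instructions are (dest, op, arg1, arg2) tuples, that temporaries are named t followed by digits, and that each temporary is used only once; none of these conventions comes from the slides.

```python
import re

def optimize(code):
    """Fold int2float on literals and merge a trailing copy into its producer."""
    consts = {}                               # temporary -> folded constant text
    out = []
    for dest, op, a, b in code:
        a = consts.get(a, a)                  # substitute known constants
        b = consts.get(b, b)
        if op == "int2float" and a.isdigit():
            consts[dest] = f"{float(a)}"      # converted at compile time, no code emitted
            continue
        if (op == "copy" and out and out[-1][0] == a
                and re.fullmatch(r"t\d+", a)):
            producer = out.pop()              # retarget the instruction that made the temp
            out.append((dest,) + producer[1:])
            continue
        out.append((dest, op, a, b))
    return out

# Unoptimized three-address code for the running example.
BEFORE = [
    ("t1", "int2float", "60", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]
```

Applied to BEFORE, the result has the same two instructions as the slide, except that the surviving temporary keeps its original name t2 because this sketch does not renumber temporaries.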

The Code Generator
  Mapping the IR code generated by the translator into target machine code
    – Machine-dependent, complex
        Register allocation
        Code scheduling
  Automatic construction of code generators has been actively studied
    – Matching a low-level IR to target-instruction templates
    – This makes it easy to retarget a compiler to a new target machine
        E.g. GCC

Code Generation [Aho, Lam, Sethi, Ullman]
  Mapping intermediate representation of the source program into the target language
    – Machine code: register/memory location assignments
    – E.g.
        LDF R2, id3
        MULF R2, R2, #60.0
        LDF R1, id2
        ADDF R1, R1, R2
        STF id1, R1
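A naive generator for this kind of target code can be sketched as a per-instruction template expansion with a trivial register assignment. The sketch assumes an unlimited supply of registers, (dest, op, arg1, arg2) input tuples, temporaries named t followed by digits, and a first operand that is always a variable or a temporary; real code generators must also handle spilling and instruction selection.

```python
import re

OPCODES = {"+": "ADDF", "*": "MULF"}

def gen_target(code):
    """Translate float three-address tuples into LDF/MULF/ADDF/STF text."""
    asm = []
    reg_of = {}                               # temporary -> register holding it
    next_reg = 0

    def is_temp(name):
        return re.fullmatch(r"t\d+", name) is not None

    def is_const(name):
        return re.fullmatch(r"\d+(\.\d+)?", name) is not None

    for dest, op, a, b in code:
        if is_temp(a):
            ra = reg_of[a]
        else:                                 # variable in memory: load it
            next_reg += 1
            ra = f"R{next_reg}"
            asm.append(f"LDF {ra}, {a}")
        if is_temp(b):
            rb = reg_of[b]
        elif is_const(b):
            rb = f"#{b}"                      # immediate operand
        else:
            next_reg += 1
            rb = f"R{next_reg}"
            asm.append(f"LDF {rb}, {b}")
        asm.append(f"{OPCODES[op]} {ra}, {ra}, {rb}")
        if is_temp(dest):
            reg_of[dest] = ra                 # result stays in a register
        else:
            asm.append(f"STF {dest}, {ra}")   # result goes back to memory
    return asm
```

For the optimized two-instruction example this produces the same five-instruction sequence as the slide, with the register numbers R1 and R2 swapped because registers are handed out in order of first use.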

Phases of a Compiler [Aho, Lam, Sethi, Ullman]
  character stream
    -> Lexical Analyzer -> token stream
    -> Syntax Analyzer -> syntax tree
    -> Semantic Analyzer -> syntax tree
    -> Intermediate Code Generator -> intermediate representation
    -> Machine-Independent Code Optimization (optional)
    -> Code Generator -> target machine code
    -> Machine-Dependent Code Optimization (optional)
  Symbol Table

Compiler Writing Tools
  Compiler generators (compiler compilers)
    – Scanner generator
    – Parser generator
    – Symbol table manager
    – Attribute grammar evaluator
    – Code-generation tools
  Much of the effort in crafting a compiler lies in writing and debugging the semantic phases
    – Usually hand-coded

Programming Language and Compiler Design
  Many compiler techniques arise from the need to cope with some programming language construct
  The state of the art in compiler design also strongly affects programming language design
  The advantages of a programming language that’s easy to compile:
    – Easier to learn, read, understand
    – Quality compilers are available on a wide variety of machines
    – Better code will be generated
    – Fewer compiler bugs
    – The compiler will be smaller, cheaper, faster, more reliable, and more widely used
    – Better diagnostic messages and program development tools

Computer Architecture and Compiler Design
  Compiler designers are responsible for making computing capability available to programmers
  Problems
    – Instruction sets for some popular architectures are highly nonuniform
    – High-level programming language operations are not always easy to support
    – Essential architectural features such as hardware caches and distributed processors and memory are difficult to present to programmers in an architecturally independent manner
    – Effective use of a large number of processors has always posed challenges to application developers and compiler writers
    – For some programming languages, runtime checks for data and program integrity are dropped in favor of gains in execution speed

Compiler Design Considerations
  Debugging (development) compilers
    – Detailing programmer errors
    – E.g. CodeCenter
    – It can often tolerate or repair minor errors (e.g. inserting a missing comma or parenthesis)
  Optimizing compilers (Chap. 13 & 14)
    – Producing efficient target code at the cost of increased compiler complexity and increased compilation times
    – Optimal code, even when theoretically possible, is often infeasible in practice
    – A variety of transformations might interfere with each other
  Retargetable compilers (Chap. 11 & 13)
    – Target architecture can be changed without its machine-independent components having to be rewritten
    – More difficult to write, but development costs can be shared

Integrated Development Environments
  To integrate the program development cycle into a single framework
    – Editing, compilation, testing, debugging
  Immediate feedback on syntax and semantic problems
  Focus on source program
  Providing easy access to information about the program
  Many of the techniques in batch compilation can be reformulated into incremental form to support IDEs
    – Parser, type checker, …
  In this book, we concentrate on the translation of C, C++, Java

End of Chapter 1 Any Questions or Comments?