1 COMP 3438 – Part II-Lecture 1: Overview of Compiler Design Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.


2 Overview of the Subject (COMP 3438)
Course Organization
Part I: Unix System Programming (Device Driver Development)
- Overview of Unix Sys. Prog.
- Process
- File System
- Overview of Device Driver Development
- Character Device Driver Development
- Introduction to Block Device Driver
Part II: Compiler Design
- Overview of Compiler Design (this lecture)
- Lexical Analysis (HW #3)
- Syntax Analysis (HW #4)

3 Outline
- Programming language: high-level vs. low-level
- What is a compiler?
- Phases of a compiler

4 Programming language – Machine Language
- Machine languages
  - Everything is a binary number: operations, data, addresses, …
  - e.g., in MIPS 2000: 0010 0100 1010 0110 0000 0000 0000 0100   # $t5 + 4 → $t6
  - Machines like it, BUT not us

5 Programming language – Assembly Language
- Assembly languages
  - Symbolic representation of machine language
  - e.g.,
    Machine code: 0010 0100 1010 0110 0000 0000 0000 0100   # $t5 + 4 → $t6
    Assembly code: add $t6, $t5, 4
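The correspondence between the binary word and its symbolic form can be checked by splitting the word into fields. The sketch below assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate); the helper name decode_itype is made up for illustration. Read this way, the word has opcode 9 (addiu), source register 5, destination register 6, and immediate 4, so the slide's $t5 and $t6 appear to be informal names for registers 5 and 6.

```python
# Decode the 32-bit word from the slide as a MIPS I-type instruction.
# Assumption: standard I-type layout (opcode 6 bits, rs 5, rt 5, immediate 16).
# decode_itype is a hypothetical helper, not part of any library.

def decode_itype(word: int) -> dict:
    return {
        "opcode": (word >> 26) & 0x3F,  # top 6 bits select the operation
        "rs": (word >> 21) & 0x1F,      # source register number
        "rt": (word >> 16) & 0x1F,      # destination register number
        "imm": word & 0xFFFF,           # 16-bit immediate (unsigned view)
    }

bits = "0010 0100 1010 0110 0000 0000 0000 0100"
word = int(bits.replace(" ", ""), 2)
fields = decode_itype(word)
print(hex(word), fields)
```

An assembler performs the reverse mapping: it packs the mnemonic and operands back into these bit fields.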

6 High-level Programming language
- High-level languages
  - Procedural (modular) programming
    - Group instructions into meaningful abstractions, e.g., data types, control structures, functions, etc.
    - C, Pascal, Perl
  - Object-oriented programming
    - Group "data" and "methods" into "objects"
    - Naturally represents the world around us
    - C++, Java, JavaScript
  - Logic programming: Prolog
  - Functional programming: ML

7 Why High-level Languages?
- Hide unnecessary details, giving a higher level of abstraction and increasing productivity
- Make programs more robust, e.g., the meaning of information is specified before its use, enabling substantial error checking at compile time
- Make programs more portable

8 Compilers are Translators
Source: C/C++, Fortran, Java, Perl, Matlab, natural language, commands
→ Translate →
Target: machine code, virtual machine code, transformed code (C, Java, …), lower-level commands, semantic components, …

9 Translation Mechanisms
- Compilation
  - Translate a source program in one language into an executable program in another language; results are produced when the new program is executed
  - Examples: C, C++, Fortran
- Interpretation
  - Read a source program and produce results while understanding that program
  - Examples: Basic
- Case study: Java
  - First, translate to Java bytecode (compilation)
  - Second, execute by interpretation (JVM) or compilation (JIT, Just-In-Time)
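The two-step scheme in the Java case study can be tried out in Python, whose reference implementation also compiles source to bytecode and then runs that bytecode on a virtual machine. The sketch below is only an analogy, not Java's actual toolchain; the expression and the values bound to initial and rate are made up for illustration.

```python
import dis

source = "initial + rate * 60"

# Step 1 (compilation): translate source text into a bytecode object.
code = compile(source, "<slide-example>", "eval")

# Step 2 (interpretation): the virtual machine executes the bytecode.
env = {"initial": 10.0, "rate": 2.0}   # assumed sample values
result = eval(code, {}, env)
print(result)

# Show the bytecode instructions produced by step 1.
dis.dis(code)
```

The bytecode object can be executed many times with different variable bindings without recompiling the source.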

10 Comparison of Compiler/Interpreter
Compiler
- Overview: Source Code → Compiler → Object Code; Data → Object Code → Results
- Advantages: fast program execution; fully exploits architecture features
- Disadvantages: pre-processing of program; complicated
Interpreter
- Overview: Source Code + Data → Interpreter → Results
- Advantages: easy to debug; flexible to modify; machine independent
- Disadvantages: execution overhead

11 What is a compiler?
- A compiler is a program that takes a program written in one language (called the source language) and translates it into an equivalent program in another language (called the target language).
- It also reports to its user the presence of errors in the source program.
Source program → Compiler → Target program (the compiler also emits error messages)

12 The Phases of a Compiler
Source program → Lexical Analyzer → Syntax Analyzer (Parser) → Semantic Analyzer → Intermediate Code Generator → Code Optimizer → Code Generator → Target program
Alongside the phases: Symbol-table Manager, Error Handler

13 Lexical Analysis
- Scan the source program and group sequences of characters into tokens.
- A token is the smallest element of a language: a group of characters (e.g., a series of alphabetic characters forms a keyword; a series of digits forms a number).
- The sub-module of the compiler that performs lexical analysis is called a lexical analyzer.
- Example: position := initial + rate * 60 (Pascal statement)

  Value      Token Type
  position   ID
  rate       ID
  :=         Operator
  *          Operator
  initial    ID
  60         NUM
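The grouping of characters into tokens can be sketched with regular expressions. This is a minimal illustration rather than the lexer used in the course: the token classes mirror the table above (ID, NUM, Operator), while the patterns and the names TOKEN_SPEC and tokenize are assumptions. It lists every token of the statement in source order, including the + operator.

```python
import re

# One (token type, pattern) pair per token class; SKIP matches whitespace.
TOKEN_SPEC = [
    ("NUM", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("Operator", r":=|[+\-*/]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (value, token type) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.group(), m.lastgroup))
        pos = m.end()
    return tokens

tokens = tokenize("position := initial + rate * 60")
for value, kind in tokens:
    print(f"{value:10} {kind}")
```

Tools such as Lex and flex generate this kind of scanner from a list of patterns instead of having it written by hand.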

14 Syntax Analysis
- Once the tokens are identified, syntax analysis groups sequences of tokens into language constructs.
  - e.g., identifiers, numbers, and operators can be grouped into expressions.
  - e.g., keywords, identifiers, expressions, and operators can be combined to form statements.
- The sub-module of the compiler that performs syntax analysis is called the parser (syntax analyzer).

15 Syntax Analysis – Syntax (Parse) Tree
- The result of syntax analysis is recorded in a hierarchical structure called a syntax tree: each node represents an operation and its children represent the arguments of the operation.
- Evaluation begins from the bottom and moves up.
- e.g., parse tree for position := initial + rate * 60

        =
       / \
    id1   +
         / \
      id2   *
           / \
        id3   NUM (60)
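One common way to build such a tree by hand is recursive descent, sketched below for the example statement only. The grammar (assignment, +, *, identifiers, and numbers), the nested-tuple tree encoding, and the name parse_assignment are assumptions made for illustration; the slides do not prescribe a parsing method here.

```python
import re

def parse_assignment(text):
    """Parse 'ID := expr' into nested tuples, giving * precedence over +."""
    tokens = re.findall(r":=|[+*]|\w+", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():                      # factor -> ID | NUM
        tok = advance()
        return ("NUM", int(tok)) if tok.isdigit() else ("ID", tok)

    def term():                        # term -> factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():                        # expr -> term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = ("ID", advance())
    if advance() != ":=":
        raise SyntaxError("expected ':='")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

tree = parse_assignment("position := initial + rate * 60")
print(tree)
```

Because term is called from inside expr, the multiplication ends up deeper in the tree than the addition, which is what makes evaluation proceed bottom-up.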

16 Semantic Analysis
- Determine the meaning using the syntax tree
  - Put semantic meaning into the syntax tree
  - Perform checks to ensure that components fit together meaningfully, e.g., type checking

        =
       / \
    id1   +
         / \
      id2   *
           / \
        id3   inttoreal
                 |
              NUM (60)
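The type check that introduces inttoreal can be sketched as a bottom-up walk over the tree. The nested-tuple encoding, the SYMBOLS table declaring all three variables as real, and the function name annotate are assumptions for illustration; only the inttoreal conversion itself comes from the slide.

```python
# Assumed declarations: all three identifiers are reals.
SYMBOLS = {"position": "real", "initial": "real", "rate": "real"}

def annotate(node):
    """Return (typed tree, type), wrapping int operands that meet a real."""
    kind = node[0]
    if kind == "ID":
        return node, SYMBOLS[node[1]]
    if kind == "NUM":
        return node, "int"
    op, left, right = node
    left, ltype = annotate(left)
    right, rtype = annotate(right)
    if ltype != rtype:
        if op == ":=" and ltype == "int":
            # Storing a real into an int would silently lose information.
            raise TypeError("cannot assign real to int")
        if ltype == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
    result_type = "real" if "real" in (ltype, rtype) else "int"
    return (op, left, right), result_type

tree = (":=", ("ID", "position"),
        ("+", ("ID", "initial"),
         ("*", ("ID", "rate"), ("NUM", 60))))
typed_tree, result_type = annotate(tree)
print(typed_tree, result_type)
```

Only the literal 60 is wrapped, because it is the only integer operand that meets a real one.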

17 Intermediate Code Generation
- Generate IR (Intermediate Representation) code
- Easier to generate machine code from IR code

Syntax tree:

        =
       / \
    id1   +
         / \
      id2   *
           / \
        id3   inttoreal
                 |
              NUM (60)

IR code:
  temp1 := inttoreal(60)
  temp2 := id3*temp1
  temp3 := id2+temp2
  id1 := temp3
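The IR listing can be reproduced by a post-order walk that gives each inner node a fresh temporary. This is a sketch under assumptions: the tree is hard-coded as nested tuples, and gen_ir and new_temp are hypothetical names; the temp-per-operation scheme is one common approach, not necessarily the one intended by the lecture.

```python
def gen_ir(tree):
    """Emit three-address code for an assignment tree, one temp per operation."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        kind = node[0]
        if kind == "ID":
            return node[1]
        if kind == "NUM":
            return str(node[1])
        if kind == "inttoreal":
            arg = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({arg})")
            return temp
        op, left, right = node
        left_name = walk(left)           # operands first (bottom-up)
        right_name = walk(right)
        temp = new_temp()
        code.append(f"{temp} := {left_name}{op}{right_name}")
        return temp

    _, target, expr = tree
    code.append(f"{target[1]} := {walk(expr)}")
    return code

tree = (":=", ("ID", "id1"),
        ("+", ("ID", "id2"),
         ("*", ("ID", "id3"), ("inttoreal", ("NUM", 60)))))
ir = gen_ir(tree)
print("\n".join(ir))
```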

18 Code Optimization
- Code optimization: modify the program representation so that the program can run faster, use less memory, power, …

IR code:
  temp1 := inttoreal(60)
  temp2 := id3*temp1
  temp3 := id2+temp2
  id1 := temp3

Optimized code:
  temp1 := id3*60.0
  id1 := id2+temp1
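The step from the IR code to the optimized code can be explained by two small rewrites: converting the constant 60 to 60.0 at compile time, and writing the last result directly into id1 instead of going through a temporary. The sketch below is an assumption about how this could be done; the tuple encoding (dest, op, arg1, arg2) and the name optimize are made up, and the copy rewrite is only safe here because the temporary is not used afterwards.

```python
def optimize(ir):
    """ir: list of (dest, op, arg1, arg2); op is 'inttoreal', 'copy', '+' or '*'."""
    consts = {}   # temp name -> folded constant text
    out = []
    for dest, op, a, b in ir:
        a = consts.get(a, a)
        b = consts.get(b, b)
        if op == "inttoreal" and a.isdigit():
            consts[dest] = str(float(a))      # fold: inttoreal(60) -> 60.0
            continue
        # Copy of the temp computed by the previous instruction:
        # retarget that instruction (assumes the temp is dead afterwards).
        if op == "copy" and out and out[-1][0] == a and a.startswith("temp"):
            _, prev_op, prev_a, prev_b = out.pop()
            out.append((dest, prev_op, prev_a, prev_b))
            continue
        out.append((dest, op, a, b))

    names = {}    # renumber surviving temporaries from temp1

    def rename(x):
        if x is not None and x.startswith("temp"):
            return names.setdefault(x, f"temp{len(names) + 1}")
        return x

    return [(rename(d), op, rename(a), rename(b)) for d, op, a, b in out]

ir = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
optimized = optimize(ir)
for dest, op, a, b in optimized:
    print(f"{dest} := {a}{op}{b}")
```

A production optimizer would establish that the temporary is dead through liveness analysis rather than assume it.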

19 Code Generation
- Generate the target program.

Optimized code:
  temp1 := id3*60.0
  id1 := id2+temp1

Machine code:
  MOVF id3, R2
  MULF #60.0, R2
  MOVF id2, R1
  ADDF R2, R1
  MOVF R1, id1
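The mapping from the optimized code to the machine code follows a simple pattern: load the first operand into a register, apply the operation with the second operand, and store the result unless it is a temporary. The sketch below reproduces the slide's listing under assumptions: the instruction names MOVF, MULF, and ADDF come from the slide, while gen_code, the tuple encoding, and the register order (R2 first, then R1, chosen only to match the slide) are made up, and registers are never reused.

```python
OPCODES = {"*": "MULF", "+": "ADDF"}

def gen_code(ir):
    """ir: list of (dest, op, arg1, arg2) with op in OPCODES."""
    free = ["R1", "R2"]   # popped from the end, so R2 is handed out first
    where = {}            # temp name -> register currently holding it
    asm = []

    def operand(x):
        if x in where:
            return where[x]          # temporary kept in a register
        if x[0].isdigit():
            return f"#{x}"           # literal constant
        return x                     # variable in memory

    for dest, op, a, b in ir:
        reg = free.pop()
        asm.append(f"MOVF {operand(a)}, {reg}")
        asm.append(f"{OPCODES[op]} {operand(b)}, {reg}")
        if dest.startswith("temp"):
            where[dest] = reg        # leave the temporary in its register
        else:
            asm.append(f"MOVF {reg}, {dest}")
    return asm

optimized = [
    ("temp1", "*", "id3", "60.0"),
    ("id1", "+", "id2", "temp1"),
]
asm = gen_code(optimized)
print("\n".join(asm))
```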

20 Symbol Table Management
- Collect and maintain information about IDs
  - Attributes:
    - Storage: where to store (Data, Heap, Stack, …)
    - Type: char, int, pointer, …
    - Scope: effective range
    - Number: value
- Information is added and used by all phases
- Debuggers use the symbol table
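A symbol table with these attributes is often organized as a stack of per-scope dictionaries. The sketch below is one possible layout, not the course's implementation; the class SymbolTable, its method names, and the sample declarations are all hypothetical.

```python
class SymbolTable:
    """Stack of scopes; each scope maps an identifier to its attributes."""

    def __init__(self):
        self.scopes = [{}]                 # index 0 is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_, storage):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name!r} already declared in this scope")
        scope[name] = {
            "type": type_,
            "storage": storage,
            "scope": len(self.scopes) - 1,  # nesting depth
        }

    def lookup(self, name):
        # Search from the innermost scope outward.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")

table = SymbolTable()
table.declare("rate", "real", "Data")
table.enter_scope()
table.declare("rate", "int", "Stack")      # shadows the outer declaration
inner = table.lookup("rate")
table.exit_scope()
outer = table.lookup("rate")
print(inner, outer)
```

Looking up from the innermost scope outward is what gives an inner declaration priority over an outer one with the same name.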

21 Front End and Back End
Source program → Lexical Analyzer → Syntax Analyzer (Parser) → Semantic Analyzer → Intermediate Code Generator → Code Optimizer → Code Generator → Target program
Alongside the phases: Symbol-table Manager, Error Handler
The phases are grouped into a Front End and a Back End.

22 Distinction between Phases and Passes
- Passes: the number of times the compiler goes through a program representation
  - 1-pass, 2-pass, multiple-pass compilation
  - As languages become more complex, more passes are needed
- Phases: conceptual stages
  - Not completely separate
  - The semantic phase may do things that the syntax phase should do

23 Compiler Tools

  Phases               Tools
  Lexical Analysis     Lex, flex
  Syntax Analysis      yacc, bison
  Semantic Analysis
  Intermediate Code
  Code Optimization
  Code Generation

