Chapter 9 Compilers and Language Translation

The Compilation Process
Phase I: Lexical analysis
Phase II: Parsing
Phase III: Semantics and code generation
Phase IV: Code optimization

Introduction
High-level languages are more difficult to “translate” than assembly languages.
Assembly language and machine language are related 1-to-1.
The relationship between a high-level language and machine language is 1-to-many.

Compiler
The piece of software that translates high-level programming language code into machine language code.
Two distinct goals of a compiler:
Correctness
Efficient and concise code. Example: 2x_0 + 2x_1 + … + 2x_50000
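One plausible reading of the efficiency example is that the sum can be computed with a single multiplication instead of one per term. The sketch below illustrates that reading in Python; the function names `naive` and `factored` and the sample values are assumptions made for illustration, not part of the slides.

```python
def naive(xs):
    # One multiplication per term: 2*x_0 + 2*x_1 + ... + 2*x_n
    return sum(2 * x for x in xs)


def factored(xs):
    # A single multiplication after summing: 2 * (x_0 + x_1 + ... + x_n)
    return 2 * sum(xs)


# x_0 .. x_50000, with integer values chosen arbitrarily for the demonstration
xs = list(range(50001))
same_result = naive(xs) == factored(xs)
```

Both functions return the same value, but the second performs one multiplication where the first performs 50001.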

The Compilation Process
[Figure: Scanner → Parser → Code Generator → Optimizer → Object file]

Lexical Analysis
The compiler examines the individual characters in the source program and groups them into syntactical units, called tokens, that will be analyzed in succeeding stages.
Analogous to grouping letters into words prior to analyzing text.

Parsing
During this stage the sequence of tokens formed by the scanner is checked to see whether it is syntactically correct according to the rules of the programming language.
Equivalent to checking whether the words in the text form grammatically correct sentences.

Semantic Analysis and Code Generation
If the high-level language statement is structurally correct, then the compiler analyzes its meaning and generates the proper sequence of machine language instructions to carry out these actions.

Code Optimization
The compiler takes the generated code and checks whether it can be made more efficient, either by making it run faster or by having it occupy less memory.

Phase I: Lexical Analysis
The scanner, or lexical analyzer, groups input characters into tokens.
Example: a = b delta;
The scanner discards nonessential characters, such as blanks and tabs, and then groups the remaining characters into high-level syntactic units such as symbols, numbers, and operators.

Token Classifications
Token type    Classification number
symbol        1
number        2
=             3
+             4
-             5
;             6
==            7
if            8
else          9
(             10
)             11
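The classification table above can be turned into a small scanner. The following is a minimal Python sketch, not the slides' own implementation: the function name `scan`, the regular expression, and the sample statement in the usage note are assumptions chosen for illustration.

```python
import re

# Classification numbers taken from the token table.
SYMBOL, NUMBER = 1, 2
FIXED = {"=": 3, "+": 4, "-": 5, ";": 6, "==": 7,
         "if": 8, "else": 9, "(": 10, ")": 11}

# "==" is listed first so it is not read as two separate "=" tokens.
TOKEN_RE = re.compile(r"==|[A-Za-z_]\w*|\d+|[=+\-;()]")


def scan(source):
    """Group characters into (text, classification number) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1          # discard blanks and tabs
            continue
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r}")
        text = match.group(0)
        if text in FIXED:
            kind = FIXED[text]            # operator, punctuation, or keyword
        elif text[0].isdigit():
            kind = NUMBER
        else:
            kind = SYMBOL
        tokens.append((text, kind))
        pos = match.end()
    return tokens
```

For instance, `scan("a = b + 319 - delta;")` yields the pairs `("a", 1)`, `("=", 3)`, `("b", 1)`, `("+", 4)`, `("319", 2)`, `("-", 5)`, `("delta", 1)`, `(";", 6)`.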

Phase II: Parsing
During the parsing phase, a compiler determines whether the tokens recognized by the scanner fit together in a grammatically meaningful way.
Analogous to the operation of “diagramming a sentence”.

Example
To prove that the sequence of words
The man bit the dog
is a correctly formed sentence.

Another Example The man bit the

Programming Language Example
Statement: a = b + c

Parse Tree
The structure shown in the previous example is called a parse tree.
It starts from the individual tokens a, =, b, +, c and shows how these tokens can be grouped together into predefined grammatical categories such as <variable> and <expression>, until the desired goal is reached (in this case, <assignment statement>).

Grammars, Languages and BNF
How does a parser know how to construct the parse tree?
The parser must be given a formal description of the syntax, the grammatical structure, of the language that it is going to analyze.
The most widely used notation for representing the syntax of programming languages is called BNF, an acronym for Backus-Naur form.

BNF
The syntax of a language is specified as a set of rules, also called productions.
The entire collection of rules is called a grammar.
BNF rule: left-hand side ::= “definition”

BNF Example
<assignment statement> ::= <variable> = <expression>
The rule says that the syntactical construct called <assignment statement> is defined as a <variable> followed by the token = followed by the syntactical construct called <expression>.

Terminals/Nonterminals
BNF uses two types of objects on the right-hand side of a production:
Terminals: actual tokens of the language, recognized and returned by a scanner.
Nonterminals: intermediate grammatical categories used to help explain and organize the language.

Goal Symbol
The goal symbol is the highest-level nonterminal.
When the goal symbol has been produced, the parser has finished building the tree, and the statements have been successfully parsed.
The collection of all statements that can be successfully parsed is called the language defined by a grammar.

Meta-symbols
Meta-symbol: a symbol used to describe the characteristics of another language.
BNF has five meta-symbols: <, >, ::=, |, and Λ.
| means OR. Ex: <digit> ::= 0|1|2|3|4|5|6|7|8|9
Λ is the null string. Ex: <sign> ::= +|-|Λ

Fundamental Rule of Parsing
If, by repeated applications of the rules of the grammar, a parser can convert the sequence of input tokens into the goal symbol, then that sequence of tokens is a syntactically valid statement of the language.

Example
A three-rule grammar:
1. <sentence> ::= <noun> <verb> .
2. <noun> ::= bees|dogs
3. <verb> ::= buzz|bite
Example 1: Dogs bite.
Example 2: Bees dogs.
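The fundamental rule of parsing can be tried out on this grammar. The sketch below is a minimal Python recognizer under one assumption: that rule 1 defines a sentence as a noun followed by a verb and a period. The category names (sentence, noun, verb) and the function name `is_sentence` are likewise assumptions made for illustration.

```python
NOUNS = {"bees", "dogs"}   # rule 2
VERBS = {"buzz", "bite"}   # rule 3


def is_sentence(text):
    """Return True if the words reduce to the goal symbol (a sentence)."""
    # Treat the period as its own token, and ignore capitalization.
    words = text.lower().replace(".", " . ").split()
    # Assumed rule 1: a sentence is a noun, then a verb, then a period.
    return (len(words) == 3
            and words[0] in NOUNS
            and words[1] in VERBS
            and words[2] == ".")
```

Under that assumption, `is_sentence("Dogs bite.")` is true, while `is_sentence("Bees dogs.")` is false because "dogs" cannot be reduced to a verb.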

Another Example
Grammar for a simplified assignment statement:
1. <assignment statement> ::= <variable> = <expression>
2. <expression> ::= <variable> | <variable> + <variable>
3. <variable> ::= x|y|z
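A parser for this small grammar can be sketched directly from its three rules. The Python below assumes that rule 1 is "variable = expression" and that rule 2 allows either a single variable or exactly two variables joined by +; the category names and the function names `parse_expression` and `parse_assignment` are assumptions made for illustration.

```python
VARIABLES = {"x", "y", "z"}   # rule 3


def parse_expression(tokens):
    """Rule 2 (assumed reading): a variable, or variable + variable."""
    if len(tokens) == 1 and tokens[0] in VARIABLES:
        return ("expression", tokens[0])
    if (len(tokens) == 3 and tokens[1] == "+"
            and tokens[0] in VARIABLES and tokens[2] in VARIABLES):
        return ("expression", tokens[0], "+", tokens[2])
    return None   # no rule applies


def parse_assignment(statement):
    """Rule 1 (assumed reading): variable = expression.

    Returns a parse tree as nested tuples, or None if the statement
    cannot be reduced to the goal symbol.
    """
    tokens = statement.split()
    if len(tokens) < 3 or tokens[0] not in VARIABLES or tokens[1] != "=":
        return None
    expression = parse_expression(tokens[2:])
    if expression is None:
        return None
    return ("assignment", tokens[0], "=", expression)
```

With these rules, `parse_assignment("x = y + z")` returns a tree, but `parse_assignment("x = x + y + z")` returns `None`: an expression with two plus signs cannot be built when rule 2 is not recursive.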

Generated Parse Tree

Wrong Path

How to parse?
Parsing is a complex sequence of applying rules, building grammatical constructs, and seeing whether things are moving toward the correct answer (the goal symbol). If not, “undo” the rule just applied and try another.
Look-ahead parsing algorithm: “looking down the road” a few tokens to see what would happen if a certain choice were made.

Example Not possible to build a parse tree with the grammar.

Major Challenge
Design a grammar that:
Includes every valid statement that we want to be in the language
Excludes every invalid statement that we do not want to be in the language

Assignment Statement (2nd try)
1. <assignment statement> ::= <variable> = <expression>
2. <expression> ::= <variable> | <expression> + <expression> (recursive definition)
3. <variable> ::= x|y|z

Resulting Parse Tree

Using Recursive Definition

Validity vs. Ambiguity
It is possible to construct two parse trees of x=x+y+z using the 2nd grammar, which gives two different meanings:
x=(x+y)+z
x=x+(y+z)
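The ambiguity can be checked by enumerating every parse tree for the right-hand side. This is a minimal Python sketch that assumes the recursive rule reads "an expression is a variable, or an expression + an expression"; the function name `expression_trees` and the tuple representation of trees are assumptions made for illustration.

```python
VARIABLES = {"x", "y", "z"}


def expression_trees(tokens):
    """Return every parse tree for tokens under the assumed recursive rule:
    an expression is a variable, or expression + expression."""
    trees = []
    if len(tokens) == 1 and tokens[0] in VARIABLES:
        trees.append(tokens[0])
    # Try every "+" as the top-level operator of the expression.
    for i, token in enumerate(tokens):
        if token == "+":
            for left in expression_trees(tokens[:i]):
                for right in expression_trees(tokens[i + 1:]):
                    trees.append((left, "+", right))
    return trees


trees = expression_trees("x + y + z".split())
```

Here `trees` holds two entries, `("x", "+", ("y", "+", "z"))` and `(("x", "+", "y"), "+", "z")`, matching the two groupings on the slide.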

If-else grammar

Parse Tree

Phase III: Semantics and Code Generation
1. <sentence> ::= <noun> <verb> .
2. <noun> ::= bees|dogs
3. <verb> ::= buzz|bite
Possible combinations:
Dogs bite.
Dogs buzz.
Bees bite.
Bees buzz.
Not all combinations make sense.

Semantics and Code Generation
A compiler examines the semantics of a programming language statement. It analyzes the meaning of the tokens and tries to understand the actions they perform.
If the statement is meaningless, it is semantically rejected. Otherwise it is translated into machine language.

Example
The statement
sum=a+b;
is syntactically correct. But what if the variables are defined as follows:
char a;
double b;
int sum;

Semantic Records
Each nonterminal symbol is associated with a semantic record, a data structure that stores information about a nonterminal, such as the actual name of the object and its data type.
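A semantic record can be sketched as a small data structure holding a name and a data type. The Python below applies it to the declarations char a; double b; int sum;. The class name `SemanticRecord`, the helper names, and in particular the rule that both operands of + must have the same type are simplifying assumptions made here; real languages often convert between types instead of rejecting the statement.

```python
from dataclasses import dataclass


@dataclass
class SemanticRecord:
    name: str        # actual name of the object
    data_type: str   # its declared data type


# Declarations from the example: char a; double b; int sum;
SYMBOL_TABLE = {"a": "char", "b": "double", "sum": "int"}


def record_for(name):
    """Build the semantic record for a declared variable."""
    return SemanticRecord(name, SYMBOL_TABLE[name])


def check_addition(left, right):
    """Return the record for left + right.

    Assumed rule: both operands must have the same data type,
    otherwise the statement is semantically rejected.
    """
    if left.data_type != right.data_type:
        raise TypeError(
            f"cannot add {left.data_type} {left.name} "
            f"to {right.data_type} {right.name}")
    return SemanticRecord(f"{left.name}+{right.name}", left.data_type)
```

Under this assumed rule, `check_addition(record_for("a"), record_for("b"))` raises `TypeError`, because a is a char and b is a double.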

Semantic Records (II)
Grows gradually.

Another Situation

Two-Stage Process
Semantic analysis: a pass over the parse tree to determine whether all branches of the tree are semantically valid.
Code generation: the compiler makes a 2nd pass over the parse tree to produce the translated code.
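The two passes can be sketched for a statement such as a = b + c. In the Python below, the tuple-based parse tree, the function names, the validity rule (every name must be declared), and the LOAD/ADD/STORE instruction names are all hypothetical choices made for illustration; they do not describe a particular machine or compiler.

```python
def analyze(tree, declared):
    """First pass: check that every name in the tree has been declared.

    This stands in for semantic analysis; the validity rule is an
    assumption chosen to keep the sketch short.
    """
    _, target, (_, left, right) = tree
    return all(name in declared for name in (target, left, right))


def generate(tree):
    """Second pass: emit hypothetical instructions for target = left + right."""
    _, target, (_, left, right) = tree
    return [f"LOAD {left}", f"ADD {right}", f"STORE {target}"]


def compile_statement(tree, declared):
    """Run both passes; code is generated only for a valid tree."""
    if not analyze(tree, declared):
        raise ValueError("statement is semantically invalid")
    return generate(tree)


# Parse tree for a = b + c, written as nested tuples.
tree = ("assign", "a", ("add", "b", "c"))
code = compile_statement(tree, declared={"a", "b", "c"})
```

Keeping the passes separate means no code is produced for a statement that fails the first pass.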

Example

Example (cont’d)

Code Optimization
To make the code more efficient:
Local optimization
Global optimization
This is different from programmer optimization with compiler tools such as:
Visual development environments
On-line debuggers
Reusable code libraries

Local Optimization
Look at a very small block of instructions and try to improve it.
Possible approaches:
Constant evaluation: x=1+1;
Strength reduction: x=x*2;
Eliminating unnecessary operations
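The first two approaches can be sketched on a tiny expression tree. In the Python below, expressions are tuples of the form (operator, left, right); that representation, the function name `optimize`, and the choice to rewrite multiplication by 2 as an addition are assumptions made for illustration.

```python
def optimize(expr):
    """Apply constant evaluation and strength reduction to an expression.

    An expression is an int constant, a variable name (str), or a
    tuple (op, left, right) where op is "+" or "*".
    """
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = optimize(left), optimize(right)
    # Constant evaluation: both operands are known at compile time.
    if op in ("+", "*") and isinstance(left, int) and isinstance(right, int):
        return left + right if op == "+" else left * right
    # Strength reduction: replace multiplication by 2 with an addition.
    if op == "*" and right == 2:
        return ("+", left, left)
    return (op, left, right)
```

For the two slide examples, `optimize(("+", 1, 1))` gives `2`, so x=1+1 becomes x=2, and `optimize(("*", "x", 2))` gives `("+", "x", "x")`, so x=x*2 becomes x=x+x.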

Global Optimization
Look at large segments of the program and decide how to improve performance.
A much harder problem.