Compilers and Language Translation

Slides:



Advertisements
Similar presentations
Programming Languages Third Edition Chapter 6 Syntax.
Advertisements

Software & Services Group, Developer Products Division Copyright© 2010, Intel Corporation. All rights reserved. *Other brands and names are the property.
報告者:會計四 簡思佳 The process of converting a program written in a high-level language into a machine-executable form. language implementation Ex.C++
1 Pass Compiler 1. 1.Introduction 1.1 Types of compilers 2.Stages of 1 Pass Compiler 2.1 Lexical analysis 2.2. syntactical analyzer 2.3. Code generation.
Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis,
CPSC Compiler Tutorial 9 Review of Compiler.
CS 330 Programming Languages 09 / 13 / 2007 Instructor: Michael Eckmann.
Chapter 9 Compilers and Language Translation. The Compilation Process Phase I: Lexical analysis Phase I: Lexical analysis Phase II: Parsing Phase II:
Chapter 3 Program translation1 Chapt. 3 Language Translation Syntax and Semantics Translation phases Formal translation models.
UMBC Introduction to Compilers CMSC 431 Shon Vick 01/28/02.
1.3 Executing Programs. How is Computer Code Transformed into an Executable? Interpreters Compilers Hybrid systems.
Compiled by Benjamin Muganzi 3.2 Functions and Purposes of Translators Computing 9691 Paper 3 1.
Invitation to Computer Science 5th Edition
The College of Saint Rose CIS 433 – Programming Languages David Goldschmidt, Ph.D. from Concepts of Programming Languages, 9th edition by Robert W. Sebesta,
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
ICS611 Introduction to Compilers Set 1. What is a Compiler? A compiler is software (a program) that translates a high-level programming language to machine.
Compiler1 Chapter V: Compiler Overview: r To study the design and operation of compiler for high-level programming languages. r Contents m Basic compiler.
INTRODUCTION TO COMPUTING CHAPTER NO. 06. Compilers and Language Translation Introduction The Compilation Process Phase 1 – Lexical Analysis Phase 2 –
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
Chapter 10: Compilers and Language Translation Invitation to Computer Science, Java Version, Third Edition.
CSC 338: Compiler design and implementation
Compiler course 1. Introduction. Outline Scope of the course Disciplines involved in it Abstract view for a compiler Front-end and back-end tasks Modules.
CS 326 Programming Languages, Concepts and Implementation Instructor: Mircea Nicolescu Lecture 2.
1 COMP 3438 – Part II-Lecture 1: Overview of Compiler Design Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Chapter 6 Programming Languages (2) Introduction to CS 1 st Semester, 2015 Sanghyun Park.
D. M. Akbar Hussain: Department of Software & Media Technology 1 Compiler is tool: which translate notations from one system to another, usually from source.
TextBook Concepts of Programming Languages, Robert W. Sebesta, (10th edition), Addison-Wesley Publishing Company CSCI18 - Concepts of Programming languages.
Lexical and Syntax Analysis
Review 1.Lexical Analysis 2.Syntax Analysis 3.Semantic Analysis 4.Code Generation 5.Code Optimization.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
CPS 506 Comparative Programming Languages Syntax Specification.
Chapter 1 Introduction. Chapter 1 - Introduction 2 The Goal of Chapter 1 Introduce different forms of language translators Give a high level overview.
Overview of Previous Lesson(s) Over View  A program must be translated into a form in which it can be executed by a computer.  The software systems.
1 Compiler Design (40-414)  Main Text Book: Compilers: Principles, Techniques & Tools, 2 nd ed., Aho, Lam, Sethi, and Ullman, 2007  Evaluation:  Midterm.
Chapter 1 Introduction Study Goals: Master: the phases of a compiler Understand: what is a compiler Know: interpreter,compiler structure.
D Goforth COSC Translating High Level Languages Note error in assignment 1: #4 - refer to Example grammar 3.4, p. 126.
The Functions and Purposes of Translators Syntax (& Semantic) Analysis.
Introduction to Compiling
Chapter 1 Introduction Major Data Structures in Compiler
Compiler Design Introduction 1. 2 Course Outline Introduction to Compiling Lexical Analysis Syntax Analysis –Context Free Grammars –Top-Down Parsing –Bottom-Up.
Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi.
1 Compiler & its Phases Krishan Kumar Asstt. Prof. (CSE) BPRCE, Gohana.
 Fall Chart 2  Translators and Compilers  Textbook o Programming Language Processors in Java, Authors: David A. Watts & Deryck F. Brown, 2000,
What is a compiler? –A program that reads a program written in one language (source language) and translates it into an equivalent program in another language.
Compiler Construction CPCS302 Dr. Manal Abdulaziz.
CSC 4181 Compiler Construction
Overview of Compilation Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida Programming Language Principles Lecture 2.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 1 Ahmed Ezzat.
Syntax Analysis Or Parsing. A.K.A. Syntax Analysis –Recognize sentences in a language. –Discover the structure of a document/program. –Construct (implicitly.
Objective of the course Understanding the fundamentals of the compilation technique Assist you in writing you own compiler (or any part of compiler)
CC410: System Programming Dr. Manal Helal – Fall 2014 – Lecture 12–Compilers.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Advanced Computer Systems
Compiler Design (40-414) Main Text Book:
Introduction to Compiler Construction
System Software Unit-1 (Language Processors) A TOY Compiler
A Simple Syntax-Directed Translator
Overview of Compilation The Compiler Front End
Overview of Compilation The Compiler Front End
PROGRAMMING LANGUAGES
-by Nisarg Vasavada (Compiled*)
Compiler Lecture 1 CS510.
Course supervisor: Lubna Siddiqui
Compilers B V Sai Aravind (11CS10008).
R.Rajkumar Asst.Professor CSE
CS 3304 Comparative Languages
Lecture 4: Lexical Analysis & Chomsky Hierarchy
High-Level Programming Language
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Chapter 10: Compilers and Language Translation
Presentation transcript:

Compilers and Language Translation Gordon College

What’s a compiler? All computers only understand machine language Therefore, high-level language instructions must be translated into machine language prior to execution This is a program 10000010010110100100101……

What’s a compiler? Compiler A piece of system software that translates high-level languages into machine language program.c while (c!='x') { if (c == 'a' || c == 'e' || c == 'i') printf("Congrats!"); else if (c!='x') printf("You Loser!"); } Congrats! prog Compiler 10000010010110100100101…… gcc -o prog program.c

Assembler (a kind of compiler) LOAD X Assembly (opcode table) (symbol table) 0101 0000 0000 1001 Machine Language One-to-one translation

Compiler (high-level language translator) a = b + c - d; 0101 00001110001 LOAD B 0111 00001110010 ADD C 0110 00001110011 SUBTRACT D 0100 00001110100 STORE A 0101 00001110001 0111 00001110010……. One-to-many translation

Goals of a compiler Code produced must be correct A = (B+C)-(D+E); Possible translation: LOAD B ADD C STORE B LOAD D ADD E STORE D SUBTRACT D STORE A Is this correct? No - STORE B and STORE D changes the values of variables B and D which is the high-level language does not intend

Goals of a compiler Code produced should be reasonably efficient and concise Compute the sum - 2x1+ 2x2+ 2x3+ 2x4+…. 2x50000 sum = 0.0 for(i=0;i<50000;i++) { sum = sum + (2.0 * x[i]); Optimizing compiler: sum = 0.0 for(i=0;i<50000;i++) { sum = sum + x[i]; sum = sum * 2.0; 49,999 less instructions

General Structure of a Compiler

The Compilation Process Phase I: Lexical analysis Compiler examines the individual characters in the source program and groups them into syntactical units called tokens Phase II: Parsing The sequence of tokens formed by the scanner is checked to see whether it is syntactically correct Scanner Source code Groups of tokens Parser correct Groups of tokens not correct

The Compilation Process Phase III: Semantic analysis and code generation The compiler analyzes the meaning of the high-level language statement and generates the machine language instructions to carry out these actions Code Generator Groups of tokens Machine language

The Compilation Process Phase IV: Code optimization The compiler takes the generated code and sees whether it can be made more efficient Code Optimizer Machine language Machine language

Overall Execution Sequence on a High-Level Language Program

The Compilation Process Source program Original high-level language program Object program Machine language translation of the source program

Phase I: Lexical Analysis Lexical analyzer The program that performs lexical analysis More commonly called a scanner Job of lexical analyzer Group input characters into tokens Tokens: Syntactical units that are treated as single, indivisible entities for the purposes of translation Classify tokens according to their type

Phase I: Lexical Analysis Program statement sum = sum + a[i]; Digital perspective: tab,s,u,m,blank,=,blank,s,u,m,blank,+,blank,a,[,i,],; Tokenized: sum,=,sum,+,a[i],;

Phase I: Lexical Analysis Typical Token Classifications TOKEN TYPE CLASSIFICATION NUMBER Symbol 1 Number 2 = 3 + 4 - 5 ; 6 == 7 If 8 Else 9 ( 10 ) 11 [ 12 ] 13 …

Phase I: Lexical Analysis Lexical Analysis Process 1. Discard blanks, tabs, etc. - look for beginning of token. 2. Put characters together 3. Repeat step 2 until end of token 4. Classify and save token 5. Repeat steps 1-4 until end of statement 6. Repeat steps 1-5 until end of source code sum 1 = 3 + 4 a 1 [ 12 i 1 ] 13 ; 6 Scanner sum=sum+a[i];

Phase I: Lexical Analysis Input to a scanner - A high-level language statement from the source program Scanner’s output - A list of all the tokens in that statement - The classification number of each token found sum 1 = 3 + 4 a 1 [ 12 i 1 ] 13 ; 6 Scanner sum=sum+a[i];

Phase II: Parsing Parsing phase A compiler determines whether the tokens recognized by the scanner are a syntactically legal statement Performed by a parser

Phase II: Parsing Output of a parser A parse tree, if such a tree exists An error message, if a parse tree cannot be constructed Successful construction of a parse tree is proof that the statement is correctly formed

Example High-level language statement: a = b + c

Grammars, Languages, and BNF Syntax The grammatical structure of the language The parser must be given the syntax of the language BNF (Backus-Naur Form) Most widely used notation for representing the syntax of a programming language literal_expression ::= integer_literal | float_literal | string | character

Grammars, Languages, and BNF In BNF The syntax of a language is specified as a set of rules (also called productions) A grammar The entire collection of rules for a language Structure of an individual BNF rule left-hand side ::= “definition”

Grammars, Languages, and BNF BNF rules use two types of objects on the right-hand side of a production Terminals The actual tokens of the language Never appear on the left-hand side of a BNF rule Nonterminals Intermediate grammatical categories used to help explain and organize the language Must appear on the left-hand side of one or more rules

Grammars, Languages, and BNF Goal symbol The highest-level nonterminal The nonterminal object that the parser is trying to produce as it builds the parse tree All nonterminals are written inside angle brackets Java BNF

BNF Example <postal-address> ::= <name-part> <street-address> <zip-part> <name-part> ::= <personal-part> <last-name> <opt-jr-part> <EOL> | <personal-part> <name-part> <personal-part> ::= <first-name> | <initial> "." <street-address> ::= <opt-apt-num> <house-num> <street-name> <EOL> <zip-part> ::= <town-name> "," <state-code> <ZIP-code> <EOL> <opt-jr-part> ::= "Sr." | "Jr." | <roman-numeral> | "" Identify the following: Goal symbol, terminals, nonterminals, a individual rule Is this a legal postal address? Steve Moses Sr. 215 Rose Ave. Everywhere, NC 43563

Parsing Concepts and Techniques Fundamental rule of parsing: By repeated applications of the rules of the grammar- If the parser can convert the sequence of input tokens into the goal symbol the sequence of tokens is a syntactically valid statement of the language else the sequence of tokens is not a syntactically valid statement of the language

Parsing Example Is the following http address legal: http://www.csm.astate.edu/~rossa/cs3543/bnf.html <httpaddress> ::= http:// <hostport> [ / <path> ] [ ? <search> ] <hostport> ::= <host> [ : <port> ] <host> ::= <hostname> | <hostnumber> <hostname> ::= <ialpha> [ . <hostname> ] <hostnumber> ::= <digits> . <digits> . <digits> . <digits> <port> ::= <digits> <path> ::= <void> | <xpalphas> [ / <path> ] <search> ::= <xalphas> [ + <search> ] <xalpha> ::= <alpha> | <digit> | <safe> | <extra> | <escape> <xalphas> ::= <xalpha> [ <xalphas> ] <xpalpha> ::= <xalpha> | + <xpalphas> ::= <xpalpha> [ <xpalpha> ] <ialpha> ::= <alpha> [ <xalphas> ] <alpha> ::= a | b | … | z | A | B | … | Z <digit> ::= 0 |1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 <safe> ::= $ | - | _ | @ | . | & | ~ <extra> ::= ! | * | " | ' | ( | ) | : | ; | , | <space> <escape> ::= % <hex> <hex> <hex> ::= <digit> | a | b | c | d | e | f | A | B | C | D | E | F <digits> ::= <digit> [ <digits> ] <void> ::=

Parsing Concepts and Techniques Look-ahead parsing algorithms - intelligent parsers One of the biggest problems in building a compiler is designing a grammar that: Includes every valid statement that we want to be in the language Excludes every invalid statement that we do not want to be in the language

Parsing Concepts and Techniques Another problem in constructing a compiler: Designing a grammar that is not ambiguous An ambiguous grammar allows the construction of two or more distinct parse trees for the same statement NOT GOOD - multiple interpretations

Phase III: Semantics and Code Generation Semantic analysis The compiler makes a first pass over the parse tree to determine whether all branches of the tree are semantically valid If they are valid the compiler can generate machine language instructions else there is a semantic error; machine language instructions are not generated

Phase III: Semantics and Code Generation Semantic analysis Syntactically correct, but semantically incorrect example: sum = a + b; int a; double sum; data type mismatch char b; Semantic records a integer sum double b char

Phase III: Semantics and Code Generation Semantic analysis Parse tree b char a integer <expression> + <expression> Semantic record Semantic record <expression> temp ? Semantic record

Phase III: Semantics and Code Generation Semantic analysis Parse tree b integer a integer <expression> + <expression> Semantic record Semantic record <expression> temp integer Semantic record

Phase III: Semantics and Code Generation Compiler makes a second pass over the parse tree to produce the translated code

Phase IV: Code Optimization Two types of optimization Local Global Local optimization The compiler looks at a very small block of instructions and tries to determine how it can improve the efficiency of this local code block Relatively easy; included as part of most compilers:

Phase IV: Code Optimization Examples of possible local optimizations Constant evaluation x = 1 + 1 ---> x = 2 Strength reduction x = x * 2 ---> x = x + x Eliminating unnecessary operations

Phase IV: Code Optimization Global optimization The compiler looks at large segments of the program to decide how to improve performance Much more difficult; usually omitted from all but the most sophisticated and expensive production-level “optimizing compilers” Optimization cannot make an inefficient algorithm efficient - “only makes an efficient algorithm more efficient”

Summary A compiler is a piece of system software that translates high-level languages into machine language Goals of a compiler: Correctness and the production of efficient and concise code Source program: High-level language program

Summary Object program: The machine language translation of the source program Phases of the compilation process Phase I: Lexical analysis Phase II: Parsing Phase III: Semantic analysis and code generation Phase IV: Code optimization