Published by Meredith May. Modified over 8 years ago.
1
Chapter 1: Introduction
Samuel
College of Computer Science & Technology, Harbin Engineering University
2
Compiler (Samuel2005@126.com)
Compilers
Compilers are computer programs that translate one language to another.
– A compiler is a very complex program, typically 10,000 to 1,000,000 lines of code.
Its input is a program written in its source language.
It produces an equivalent program written in its target language.
3
Translation Process
Analogy: translating the English sentence "This is a book." into Chinese.
Step 1: lexical analysis: This / is / a / book / .
Step 2: syntax analysis: This (subject) is (predicate) a (quantifier) book (object) . (end)
Step 3: semantic analysis: This (pronoun, 这) is (copula, 是) a (numeral, 一) book (noun, 书) . (period, 。)
Step 4: This is a book.
Step 5: 这是一书。
Step 6: 这是一本书。
4
Translation Process
[Diagram] Source Code -> (1) Scanner -> (2) Parser -> (3) Semantic Analyzer -> (4) Source Code Optimizer -> (5) Code Generator -> (6) Target Code Optimizer -> Target Code
Auxiliary components shown beside the phases: Literal Table, Symbol Table, Error Handler
5
Translation Process
[Diagram repeated, phase 1 highlighted] Source Code -> Scanner
6
The Scanner
Reads the source program (a stream of characters).
Performs lexical analysis: collects sequences of characters into meaningful units called tokens.
Example: a[index] = 4 + 2
  a        identifier
  [        left bracket
  index    identifier
  ]        right bracket
  =        assignment
  4        number
  +        plus sign
  2        number
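The token list on this slide can be reproduced with a few lines of Python. This is only a minimal sketch: the regular expressions and the names `TOKEN_SPEC` and `scan` are assumptions made for illustration, not part of the slides, and a real scanner would also handle keywords, comments, and many more symbols.

```python
import re

# Token classes named as on the slide; the patterns are illustrative assumptions.
TOKEN_SPEC = [
    ("identifier", r"[A-Za-z_]\w*"),
    ("number", r"\d+"),
    ("left bracket", r"\["),
    ("right bracket", r"\]"),
    ("assignment", r"="),
    ("plus sign", r"\+"),
]

def scan(source):
    """Return a list of (lexeme, token class) pairs for the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1  # white space separates tokens but is not a token itself
            continue
        for name, pattern in TOKEN_SPEC:
            match = re.compile(pattern).match(source, pos)
            if match:
                tokens.append((match.group(), name))
                pos = match.end()
                break
        else:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
    return tokens

for lexeme, kind in scan("a[index] = 4 + 2"):
    print(lexeme, kind)
```

Running the loop prints one lexeme and its token class per line, in the same order as the table on the slide.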
7
Translation Process
[Diagram repeated, phase 2 highlighted] Scanner -> Tokens -> Parser
8
The Parser
Receives the source in the form of tokens.
Performs syntax analysis: determines the structure of the program, much as grammatical analysis determines the structure of a sentence in natural language.
The result is represented as a parse tree or a syntax tree.
9
Parse Tree
a[index] = 4 + 2
  expression
    assign-expression
      expression
        subscript-expression
          expression
            identifier a
          [
          expression
            identifier index
          ]
      =
      expression
        additive-expression
          expression
            number 4
          +
          expression
            number 2
10
Abstract Syntax Tree
An abstract syntax tree is a condensation of the information contained in a parse tree.
a[index] = 4 + 2
  assign-expression
    subscript-expression
      identifier a
      identifier index
    additive-expression
      number 4
      number 2
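To show how a parser can build such a tree from tokens, here is a small recursive-descent sketch in Python. The grammar it accepts (an assignment whose right side is a sum of numbers, identifiers, and subscripted identifiers), the tuple representation of nodes, and the names `tokenize`, `Parser`, and `parse` are all assumptions chosen for this example. The tokenizer is deliberately crude and silently skips characters it does not recognize.

```python
import re

def tokenize(source):
    # Identifiers, numbers, and the single-character symbols used below.
    return re.findall(r"[A-Za-z_]\w*|\d+|[\[\]=+]", source)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, symbol):
        if self.peek() != symbol:
            raise SyntaxError(f"expected {symbol!r}, found {self.peek()!r}")
        self.pos += 1

    def primary(self):
        # primary -> number | identifier | identifier '[' additive ']'
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        if token.isdigit():
            return ("number", int(token))
        if token[0].isalpha() or token[0] == "_":
            node = ("identifier", token)
            if self.peek() == "[":
                self.expect("[")
                index = self.additive()
                self.expect("]")
                node = ("subscript-expression", node, index)
            return node
        raise SyntaxError(f"unexpected token {token!r}")

    def additive(self):
        # additive -> primary { '+' primary }
        node = self.primary()
        while self.peek() == "+":
            self.expect("+")
            node = ("additive-expression", node, self.primary())
        return node

    def assign(self):
        # assign -> primary '=' additive
        target = self.primary()
        self.expect("=")
        return ("assign-expression", target, self.additive())

def parse(source):
    parser = Parser(tokenize(source))
    tree = parser.assign()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

print(parse("a[index] = 4 + 2"))
```

The nested tuples mirror the abstract syntax tree on the slide: punctuation such as the brackets and the operators is consumed during parsing and leaves no node of its own.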
11
Translation Process
[Diagram repeated, phase 3 highlighted] Parser -> Syntax Tree -> Semantic Analyzer
12
The Semantic Analyzer
The semantics of a program are its "meaning".
The semantics of a program determine its runtime behavior.
Most programming languages have features (called static semantics) that can be determined prior to execution.
Typical static semantics features
– Declarations
– Type checking
The extra pieces of information computed by the semantic analyzer are called attributes.
– They are added to the tree as annotations, or "decorations".
15
Annotated Tree
a[index] = 4 + 2
  assign-expression
    subscript-expression : integer
      identifier a : array of integer
      identifier index : integer
    additive-expression : integer
      number 4 : integer
      number 2 : integer
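The annotation step can be sketched as a recursive walk that attaches a type attribute to every node and rejects ill-typed programs. The tuple-based tree, the contents of `SYMBOL_TABLE`, and the function name `annotate` are assumptions made for this example. The slide leaves the assign-expression node without a type, so giving it the type of its left side is also an assumption.

```python
# Syntax tree for a[index] = 4 + 2, written as nested tuples (hypothetical representation).
TREE = (
    "assign-expression",
    ("subscript-expression", ("identifier", "a"), ("identifier", "index")),
    ("additive-expression", ("number", 4), ("number", 2)),
)

# Assumed declarations; in a compiler these would come from the symbol table.
SYMBOL_TABLE = {"a": "array of integer", "index": "integer"}

def annotate(node, symbols):
    """Return a copy of the tree in which every node carries a 'type' attribute."""
    kind = node[0]
    if kind == "number":
        return {"kind": kind, "value": node[1], "type": "integer"}
    if kind == "identifier":
        if node[1] not in symbols:
            raise NameError(f"undeclared identifier {node[1]!r}")
        return {"kind": kind, "name": node[1], "type": symbols[node[1]]}

    # Every interior node in this small language has exactly two children.
    left, right = (annotate(child, symbols) for child in node[1:])
    if kind == "subscript-expression":
        if left["type"] != "array of integer" or right["type"] != "integer":
            raise TypeError("subscript needs an integer array and an integer index")
        result = "integer"
    elif kind == "additive-expression":
        if left["type"] != "integer" or right["type"] != "integer":
            raise TypeError("'+' needs two integers")
        result = "integer"
    elif kind == "assign-expression":
        if left["type"] != right["type"]:
            raise TypeError("both sides of '=' must have the same type")
        result = left["type"]  # assumption: the slide does not type this node
    else:
        raise ValueError(f"unknown node kind {kind!r}")
    return {"kind": kind, "children": [left, right], "type": result}

annotated = annotate(TREE, SYMBOL_TABLE)
print(annotated["children"][0]["type"])  # type of a[index]
```

Changing the declaration of `index` to something other than `integer` makes the walk raise a `TypeError`, which is the kind of error static type checking is meant to catch before execution.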
16
Translation Process
[Diagram repeated, phase 4 highlighted] Semantic Analyzer -> Annotated Tree -> Source Code Optimizer
17
The Source Code Optimizer
The earliest point at which optimization steps can be performed is just after semantic analysis.
Some optimization opportunities depend only on the source code.
Compilers vary widely in the kinds of optimization they perform and in where they place them.
The output of the source code optimizer is the intermediate representation (IR) or intermediate code.
18
Example
4 + 2 can be precomputed by the compiler.
– This optimization is known as constant folding.
– This optimization can be performed on the annotated syntax tree by collapsing the right-hand subtree to its constant value.
Before folding:
  assign-expression
    subscript-expression : integer
      identifier a : array of integer
      identifier index : integer
    additive-expression : integer
      number 4 : integer
      number 2 : integer
19
Example (continued)
Constant folding collapses the right-hand subtree for 4 + 2 to its constant value, 6.
After folding:
  assign-expression
    subscript-expression : integer
      identifier a : array of integer
      identifier index : integer
    number 6 : integer
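Constant folding on a tree can be sketched as a bottom-up rewrite: fold the children first, then replace any addition whose operands are both numbers by their sum. The tuple representation and the name `fold` are assumptions for this example, and the type annotations are left out to keep it short.

```python
# Syntax tree for a[index] = 4 + 2 as nested tuples (hypothetical representation).
TREE = (
    "assign-expression",
    ("subscript-expression", ("identifier", "a"), ("identifier", "index")),
    ("additive-expression", ("number", 4), ("number", 2)),
)

def fold(node):
    """Return a tree in which additions of two constants are precomputed."""
    kind = node[0]
    if kind in ("number", "identifier"):
        return node  # leaves are already as simple as they can be
    children = [fold(child) for child in node[1:]]
    if kind == "additive-expression" and all(c[0] == "number" for c in children):
        return ("number", children[0][1] + children[1][1])
    return (kind, *children)

print(fold(TREE))
```

Because the children are folded before their parent, nested constant sums such as (1 + 2) + 3 also collapse to a single number, while a sum that involves an identifier is left untouched.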
20
Translation Process
[Diagram repeated, phase 5 highlighted] Source Code Optimizer -> Intermediate code -> Code Generator
21
The Code Generator
The code generator takes the intermediate code or IR and generates code for the target machine.
– We will write target code in assembly language form, although most compilers generate object code directly.
The properties of the target machine become important.
– Use instructions of the target machine.
– Data representations: how many bytes or words integer and floating-point data types occupy in memory.
22
Example
Source: a[index] = 4 + 2, which constant folding reduces to a[index] = 6.
  MOV R0, index   ;; value of index -> R0
  MUL R0, 2       ;; double value in R0
  MOV R1, &a      ;; address of a -> R1
  ADD R1, R0      ;; add R0 to R1
  MOV *R1, 6      ;; constant 6 -> address in R1
Notes:
– &a is the address of a (the base address of the array).
– *R1 means indirect register addressing.
– We assume that the machine performs byte addressing and that integers occupy two bytes of memory.
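As a rough sketch of what a code generator does, the Python function below emits the five instructions above from the folded tree. It handles only the single pattern array[index] = constant. The function name `generate`, the `int_size` parameter, and the tuple tree are assumptions for illustration, and the instruction syntax is the slide's pseudo-assembly rather than any real machine's.

```python
# Folded tree for a[index] = 6 as nested tuples (hypothetical representation).
FOLDED = (
    "assign-expression",
    ("subscript-expression", ("identifier", "a"), ("identifier", "index")),
    ("number", 6),
)

def generate(tree, int_size=2):
    """Emit pseudo-assembly for  array[index] = constant  (nothing else is supported)."""
    kind, target, value = tree
    supported = (
        kind == "assign-expression"
        and target[0] == "subscript-expression"
        and target[1][0] == "identifier"
        and target[2][0] == "identifier"
        and value[0] == "number"
    )
    if not supported:
        raise NotImplementedError("this sketch only handles array[index] = constant")
    array, index = target[1][1], target[2][1]
    return [
        f"MOV R0, {index}",     # value of index -> R0
        f"MUL R0, {int_size}",  # scale by the size of an integer in bytes
        f"MOV R1, &{array}",    # base address of the array -> R1
        "ADD R1, R0",           # R1 now holds the address of array[index]
        f"MOV *R1, {value[1]}", # store the constant at that address
    ]

print("\n".join(generate(FOLDED)))
```

The `int_size` parameter is where a property of the target machine enters: with two-byte integers and byte addressing, element `index` lives at the base address plus 2 * index.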
23
Translation Process
[Diagram repeated, phase 6 highlighted] Code Generator -> Target Code -> Target Code Optimizer
24
The Target Code Optimizer
Improvements include
– Choosing addressing modes to improve performance.
– Replacing slow instructions by faster ones.
– Eliminating redundant or unnecessary operations.
Example: a[index] = 6
Before optimization:
  MOV R0, index   ;; value of index -> R0
  MUL R0, 2       ;; double value in R0
  MOV R1, &a      ;; address of a -> R1
  ADD R1, R0      ;; add R0 to R1
  MOV *R1, 6      ;; constant 6 -> address in R1
After optimization:
  MOV R0, index   ;; value of index -> R0
  SHL R0          ;; double the value in R0
  MOV &a[R0], 6   ;; constant 6 -> address a+R0
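The two improvements in this example can be sketched as peephole rules applied to the instruction list. The rules, the regular expressions, and the name `peephole` are assumptions for illustration. In particular, the second rule assumes that the address register is not used again after the store, which a real optimizer would have to verify.

```python
import re

def peephole(code):
    """Apply two illustrative rewrite rules to a list of pseudo-assembly instructions."""
    # Rule 1: multiplying by 2 becomes a shift (a slow instruction replaced by a faster one).
    code = [re.sub(r"^MUL (R\d+), 2$", r"SHL \1", ins) for ins in code]

    # Rule 2: "load base address, add offset, store indirectly" becomes one indexed store.
    # Assumes the address register is dead after the store.
    pattern = r"MOV (R\d+), &(\w+) ; ADD \1, (R\d+) ; MOV \*\1, (\w+)"
    result = []
    i = 0
    while i < len(code):
        window = " ; ".join(code[i:i + 3])
        match = re.fullmatch(pattern, window)
        if match:
            _, array, offset, value = match.groups()
            result.append(f"MOV &{array}[{offset}], {value}")
            i += 3
        else:
            result.append(code[i])
            i += 1
    return result

unoptimized = [
    "MOV R0, index",
    "MUL R0, 2",
    "MOV R1, &a",
    "ADD R1, R0",
    "MOV *R1, 6",
]
print("\n".join(peephole(unoptimized)))
```

Applied to the five-instruction sequence, the two rules yield the three-instruction sequence shown on the slide.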
25
Translation Process
[Complete diagram] Source Code -> Scanner -> Tokens -> Parser -> Syntax Tree -> Semantic Analyzer -> Annotated Tree -> Source Code Optimizer -> Intermediate code -> Code Generator -> Target Code -> Target Code Optimizer -> Target Code
Auxiliary components shown beside the phases: Literal Table, Symbol Table, Error Handler
26
Interpreters
An interpreter is a language translator like a compiler.
The difference: the source program is executed immediately, not after translation is complete.
In principle, a programming language can be either interpreted or compiled.
Traditionally interpreted languages: BASIC, LISP, Java (Java source is first compiled to bytecode, which a virtual machine then interprets or compiles further).
Traditionally compiled languages: FORTRAN, C, C++.
Interpreters share many operations with compilers.
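To illustrate the difference, the sketch below executes the syntax tree for a[index] = 4 + 2 directly instead of translating it to target code. The tuple tree, the dictionary used as the runtime environment, and the name `execute` are assumptions for this example.

```python
# Syntax tree for a[index] = 4 + 2 as nested tuples (hypothetical representation).
TREE = (
    "assign-expression",
    ("subscript-expression", ("identifier", "a"), ("identifier", "index")),
    ("additive-expression", ("number", 4), ("number", 2)),
)

def execute(node, env):
    """Evaluate a node immediately, reading and updating variables in env."""
    kind = node[0]
    if kind == "number":
        return node[1]
    if kind == "identifier":
        return env[node[1]]
    if kind == "additive-expression":
        return execute(node[1], env) + execute(node[2], env)
    if kind == "subscript-expression":
        return execute(node[1], env)[execute(node[2], env)]
    if kind == "assign-expression":
        target, value = node[1], execute(node[2], env)
        if target[0] == "subscript-expression":
            array = execute(target[1], env)
            array[execute(target[2], env)] = value
        elif target[0] == "identifier":
            env[target[1]] = value
        else:
            raise TypeError("cannot assign to this expression")
        return value
    raise ValueError(f"unknown node kind {kind!r}")

env = {"a": [0, 0, 0], "index": 1}
execute(TREE, env)
print(env["a"])
```

The scanner and parser stages are the same as in a compiler. Only the back end differs: the tree is walked and its effect happens at once, so no target code is ever produced.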
27
Assemblers
An assembler is a translator for the assembly language of a particular computer.
Assembly language is a symbolic form of the machine language and is easy to translate.
Sometimes a compiler generates assembly language as its target language; the assembler then finishes the translation into object code.
28
Linkers
A linker collects code that was separately compiled or assembled into different object files and combines it into a final executable file.
It also connects the program to the code for standard library functions and to resources supplied by the OS (memory allocators, I/O devices).
Linking was originally one of the principal activities of a compiler.
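The core of linking, combining object files and resolving references between them, can be sketched as follows. The object-file layout used here (a list of code words, a `defs` table of defined symbols, and a `refs` table of unresolved references) is a made-up simplification for this example and does not correspond to any real object format.

```python
def link(objects):
    """Concatenate object files and patch each symbol reference with a final address."""
    image = []    # the combined executable image
    symbols = {}  # symbol name -> address in the image
    bases = []    # start address of each object file in the image

    # Pass 1: place each object file and record where its symbols end up.
    for obj in objects:
        base = len(image)
        bases.append(base)
        for name, offset in obj["defs"].items():
            if name in symbols:
                raise ValueError(f"symbol {name!r} defined more than once")
            symbols[name] = base + offset
        image.extend(obj["code"])

    # Pass 2: fill in every reference now that all addresses are known.
    for base, obj in zip(bases, objects):
        for offset, name in obj["refs"].items():
            if name not in symbols:
                raise NameError(f"undefined symbol {name!r}")
            image[base + offset] = symbols[name]
    return image

main_obj = {"code": ["CALL", None, "HALT"], "defs": {"main": 0}, "refs": {1: "print"}}
lib_obj = {"code": ["OUT", "RET"], "defs": {"print": 0}, "refs": {}}
print(link([main_obj, lib_obj]))
```

Here `main_obj` calls a routine it does not define. The linker finds the definition of `print` in `lib_obj`, which stands in for a library, and writes its final address into the call.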
29
Loaders
In object code, the primary memory references are made relative to an undetermined starting location that can be anywhere in memory.
A loader resolves all relocatable addresses relative to a given starting address.
Usually, the loading process is part of the OS.
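Relocation can be illustrated with a few lines of Python. The representation is an assumption made for this sketch: the object code is a list of integer words, and a separate list records which words hold addresses relative to the start of the program.

```python
def load(code, relocations, base):
    """Return a copy of the code with every relocatable word adjusted to the base address."""
    image = list(code)
    for offset in relocations:
        image[offset] += base  # relative address -> absolute address
    return image

# Words 1 and 2 hold addresses relative to the program start; words 0 and 3 are plain data.
object_code = [10, 0, 4, 7]
print(load(object_code, relocations=[1, 2], base=1000))
```

Loading the same object code at a different base only changes the words listed in the relocation table, which is why the compiler does not need to know the starting address in advance.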
30
Preprocessors
A preprocessor is a separate program that is called by the compiler before the translation begins.
Preprocessors can
– Delete comments
– Include other files
– Perform macro substitutions
A macro is a shorthand description of a repeated sequence of text.
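Two of these tasks, deleting comments and substituting macros, can be sketched in Python. The input syntax is borrowed from C (`/* ... */` comments and parameterless `#define` lines), but this is a toy under stated assumptions: the function name `preprocess` is invented here, file inclusion is omitted, and text inside string literals is not protected as a real preprocessor would protect it.

```python
import re

def preprocess(source):
    """Strip /* ... */ comments and expand simple '#define NAME text' macros."""
    # Replace each comment by a single space so neighbouring tokens stay separated.
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)

    macros = {}
    output = []
    for line in source.splitlines():
        definition = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if definition:
            macros[definition.group(1)] = definition.group(2).strip()
            continue  # the definition itself produces no output line
        for name, body in macros.items():
            # \b keeps SIZE from matching inside a longer name such as SIZE_MAX.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, text=body: text, line)
        output.append(line)
    return "\n".join(output)

program = "#define SIZE 10\nint a[SIZE]; /* lookup table */\n"
print(preprocess(program))
```

The compiler proper then sees only the expanded text, with `SIZE` already replaced by `10` and the comment gone.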
31
Editors
Source programs are written using an editor that produces a standard file (ASCII).
More recently, compilers have been bundled with editors and other programs into an interactive development environment (IDE).
Such editors may be oriented towards the format of the programming language.
– The programmer may be informed of errors as the program is written.
The compiler can be called from within the editor.
32
Debuggers
A debugger is a program that can be used to find execution errors in a compiled program.
– It is often packaged with a compiler in an IDE.
The debugger keeps track of source code information such as line numbers and the names of variables and procedures.
It can halt execution at breakpoints and provide information on which functions have been called and on the current values of variables.
33
Homework
1.2 Given the C assignment
  a[i+1] = a[i] + 2
draw a parse tree and a syntax tree for the expression, using the similar example as a guide.