Yu-Chen Kuo: Chapter 1 Introduction to Compiling


1 Chapter 1 Introduction to Compiling (Yu-Chen Kuo)

2 1.1 Compilers

3 Source languages: Fortran, Pascal, C, etc.
Target languages: another programming language, or machine language
Compilers:
– Single-pass
– Multi-pass
– Load-and-go
– Debugging
– Optimizing

4 Analysis-Synthesis Model
Compilation: analysis and synthesis
Analysis:
– Break the source program into pieces
– Create an intermediate representation
– Hierarchical structure: syntax tree
    Node: an operation
    Children of a node: the arguments of the operation
Synthesis: construct the target program from the tree

5 Analysis-Synthesis Model

6 Context of a Compiler
Several other programs may be needed to create an executable (.exe) file
– Preprocessor: expands macros
– Assembler: translates assembly code into machine code
– Loader/link-editor: links in library routines

7 Context of a Compiler

8 1.2 Analysis of the Source Program
Three phases
1. Linear analysis: divide the source program into tokens
2. Hierarchical analysis: group the tokens hierarchically
3. Semantic analysis: ensure the components fit together meaningfully

9 Lexical Analysis
Linear analysis is also called lexical analysis or scanning
E.g., position := initial + rate * 60 is grouped into seven tokens:
1. Identifier position
2. Assignment symbol :=
3. Identifier initial
4. + sign
5. Identifier rate
6. * sign
7. Number 60
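The linear scan on this slide can be sketched in Python with regular expressions. This is only an illustration under assumptions: the token-class names (ID, NUMBER, ASSIGN, PLUS, TIMES) and the rule that an identifier is a letter followed by letters or digits are chosen here, not taken from the slides.

```python
import re

# Token classes for the slide's example; the class names are illustrative.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z][A-Za-z0-9]*"),
    ("ASSIGN", r":="),
    ("PLUS", r"\+"),
    ("TIMES", r"\*"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Scan left to right and return (kind, lexeme) pairs, dropping blanks."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # Lexical error: the remaining characters form no token.
            raise SyntaxError(f"cannot form a token at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("position := initial + rate * 60")
for kind, lexeme in tokens:
    print(kind, lexeme)
```

The scanner never looks back or nests: each token is recognized by moving forward over the characters once.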

10 Syntax Analysis
Hierarchical analysis is also called parsing or syntax analysis
– Group tokens into grammatical phrases
– Grammatical phrases are represented by a parse tree

11 Syntax Analysis

12 Syntax Analysis
Hierarchical structure is expressed by recursive rules
Recursive definition of an expression:
1. Any identifier is an expression
2. Any number is an expression
3. If expression1 and expression2 are expressions, then so are expression1 + expression2, expression1 * expression2, and (expression1)
By rule 1, initial and rate are expressions
By rule 2, 60 is an expression
By rule 3, rate*60 is an expression, and therefore initial+rate*60 is an expression
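The recursive rules can be turned into a small recursive-descent parser. The sketch below is one possible reading, not the slides' own algorithm: it assumes the usual convention that * binds tighter than +, which the three rules do not state, and it represents the syntax tree as nested tuples (operator, left, right). The function name parse_expression is made up for the example.

```python
import re


def parse_expression(text):
    """Parse identifiers, numbers, +, * and parentheses into a nested-tuple tree."""
    tokens = re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|[+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        # Rules 1 and 2: an identifier or a number is an expression.
        # Rule 3, parenthesized case: (expression) is an expression.
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return advance()

    def term():
        # Rule 3, product case: expression * expression.
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        # Rule 3, sum case: expression + expression.
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_expression("initial + rate * 60")
print(tree)
```

The calls from factor back into expr are where the recursion of rule 3 shows up; a purely linear scan has no counterpart for this.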

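13 Syntax Analysis
Recursive definition of a statement:
1. If identifier1 is an identifier and expression2 is an expression, then identifier1 := expression2 is a statement
2. If expression1 is an expression and statement2 is a statement, then while (expression1) do statement2 and if (expression1) then statement2 are statements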

14 Lexical vs. Syntax Analysis
The division between the two is somewhat arbitrary
One criterion: does the construct need recursion or not?
– Identifiers can be recognized by a linear scan that stops at the first character that is neither a letter nor a digit; no recursion is needed
    E.g., initial
– A linear scan is not powerful enough to analyze expressions or statements, which need a hierarchical (nested) structure
    E.g., ( ... ), begin ... end, statements

16 Semantic Analysis
Check for semantic errors
Gather type information for code generation
Use the hierarchical structure to identify operators and operands
Do type checking
– E.g., using a real number to index an array is an error
– Type conversion
– E.g., Fig. 1.5: 60 is converted by inttoreal(60) if initial is a real number
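A type-checking pass with conversion can be sketched as a walk over the syntax tree. The sketch assumes a nested-tuple tree (operator, left, right) and a table of declared types in which all three identifiers are real; both are illustrative choices, and only the inttoreal operator comes from the slide.

```python
# Declared types are assumed for illustration.
SYMBOL_TYPES = {"position": "real", "initial": "real", "rate": "real"}


def check(node):
    """Return (typed_tree, type), wrapping integer operands of real operations in inttoreal."""
    if isinstance(node, str):
        if node.isdigit():
            return node, "int"          # an integer constant such as 60
        if node not in SYMBOL_TYPES:
            raise TypeError(f"undeclared identifier {node!r}")
        return node, SYMBOL_TYPES[node]
    op, left, right = node
    left, left_type = check(left)
    right, right_type = check(right)
    if left_type == right_type:
        return (op, left, right), left_type
    # Mixed int/real operands: convert the integer side to real.
    if left_type == "int":
        left = ("inttoreal", left)
    else:
        right = ("inttoreal", right)
    return (op, left, right), "real"


typed_tree, result_type = check(("+", "initial", ("*", "rate", "60")))
print(typed_tree, result_type)
```

The operands are found from the tree structure, not from the token order, which is why this pass runs after parsing.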

17 Semantic Analysis

18 Analysis in Text Formatters
\hbox{ }
\hbox{\vbox{! 1} \vbox{@ 2}}

19 1.3 The Phases of A Compiler

20 1.3 The Phases of A Compiler
Phases
– First three phases: analysis portion
– Last three phases: synthesis portion
– Symbol-table management phase
– Error-handler phase

21 Symbol-table Management
Records the identifiers used in the source program
– An identifier is detected by lexical analysis and then stored in the symbol table
Collects the attributes of each identifier (these are not determined by lexical analysis)
– Storage allocation: memory address
– Type
– Scope (where it is valid: local or global)
– Arguments (in the case of procedure names)
    Number and types of arguments
    Method of passing each argument (e.g., by reference)
    Return type

22 Symbol-table Management
Semantic analysis uses the type information to check the type consistency of identifiers
Code generation uses the storage-allocation information to generate code with the proper relocatable addresses
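A symbol table of this kind can be sketched as a dictionary of records. Everything named here (Entry, SymbolTable, enter, set_attributes, the 4-byte spacing of addresses, the type real) is an assumption made for the example; the slides only list which attributes are kept.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    name: str
    type: Optional[str] = None      # filled in after lexical analysis
    address: Optional[int] = None   # storage allocation
    scope: str = "global"


class SymbolTable:
    def __init__(self):
        self.entries = {}

    def enter(self, name):
        """Called by the lexical analyzer: record the identifier once and return its entry."""
        return self.entries.setdefault(name, Entry(name))

    def set_attributes(self, name, type, address):
        """Called by later phases, once the type and storage location are known."""
        entry = self.entries[name]
        entry.type = type
        entry.address = address


table = SymbolTable()
for lexeme in ["position", "initial", "rate", "rate"]:
    table.enter(lexeme)             # the repeated "rate" reuses the same entry

# Assumed attributes: every identifier is real and occupies 4 bytes.
for offset, name in enumerate(["position", "initial", "rate"]):
    table.set_attributes(name, "real", 4 * offset)

print(table.entries["rate"])
```

Semantic analysis would read the type field of an entry, and code generation would read the address field.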

23 Error Detection and Reporting
Syntax and semantic analysis handle a large fraction of the errors
Lexical phase: the remaining characters do not form any token
Syntax phase: the tokens violate the structure rules
Semantic phase: the operation has no meaning
– E.g., adding an array name and a procedure name

24 Translation of A Statement

25 Translation of A Statement

26 The Analysis Phases
Lexical analysis
– Group characters into tokens
    Identifiers
    Keywords (if, while)
    Punctuation ('(', ')')
    Multi-character operators (':=')
– Enter the lexical value (lexeme) into the symbol table
    position, rate, initial
Syntax analysis
– Fig. 1.11(a), 1.11(b)

27 The Analysis Phases
Syntax analysis
Semantic analysis
– Type checking and conversion

28 Intermediate Code Generation
Represent the source program as code for an abstract machine
Should be easy to produce
Should be easy to translate into the target program
Three-address code (at most three operands per instruction)
– temp2 := id3 * temp1
– Every memory location can act like a register
    temp2 → BX
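Three-address code can be produced by a post-order walk of the syntax tree, creating one temporary per operation. The sketch below assumes a nested-tuple tree in which id1, id2 and id3 stand for position, initial and rate, and in which the constant 60 is already wrapped in an inttoreal conversion; the function name generate is made up for the example.

```python
def generate(tree, target):
    """Return three-address instructions that evaluate tree and assign the result to target."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        if isinstance(node, str):       # leaf: identifier or constant
            return node
        if node[0] == "inttoreal":      # unary conversion operator
            operand = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        op, left, right = node
        left_name = walk(left)
        right_name = walk(right)
        temp = new_temp()
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    result = walk(tree)
    code.append(f"{target} := {result}")
    return code


tree = ("+", "id2", ("*", "id3", ("inttoreal", "60")))
code = generate(tree, "id1")
print("\n".join(code))
```

Each emitted instruction has at most one operator besides the assignment, which is what keeps it within three operands.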

29 Code Optimization
Improve the intermediate code
Faster-running machine code
– temp1 := id3 * 60.0
  id1 := id2 + temp1
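How a four-instruction sequence might shrink to the two instructions on this slide can be sketched with two simple rewrites: doing inttoreal on a constant at compile time, and removing a final copy from a temporary. The tuple layout (dest, op, arg1, arg2), the op name copy, and the unoptimized input sequence are assumptions for the example, and the copy removal assumes the temporary is not used again later.

```python
def optimize(code):
    """Fold inttoreal on integer constants, then remove a copy from a just-computed temporary."""
    constants = {}   # temporary name -> folded real constant
    folded = []
    for dest, op, arg1, arg2 in code:
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttoreal" and arg1.isdigit():
            constants[dest] = arg1 + ".0"   # do the conversion at compile time
            continue
        folded.append((dest, op, arg1, arg2))

    result = []
    for dest, op, arg1, arg2 in folded:
        if op == "copy" and result and result[-1][0] == arg1 and arg1.startswith("temp"):
            # The previous instruction computed arg1 only to copy it: write dest directly.
            _, prev_op, prev_arg1, prev_arg2 = result.pop()
            result.append((dest, prev_op, prev_arg1, prev_arg2))
        else:
            result.append((dest, op, arg1, arg2))
    return result


unoptimized = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
optimized = optimize(unoptimized)
for dest, op, arg1, arg2 in optimized:
    print(f"{dest} := {arg1} {op} {arg2}")
```

The result has the same shape as the slide's two instructions; the surviving temporary is called temp2 here because the sketch does not renumber temporaries.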

30 Code Generation
Generate relocatable machine code or assembly code
– MOVF id3, R2
  MULF #60.0, R2
  MOVF id2, R1
  ADDF R2, R1
  MOVF R1, id1
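To check what the five instructions compute, they can be run on a toy interpreter. The sketch assumes that the first operand is the source and the second the destination, that # marks an immediate constant, and that names starting with R are registers; the starting values of id2 and id3 are made up.

```python
def run(program, memory):
    """Execute MOVF/MULF/ADDF lines; operand order is source, destination."""
    registers = {}

    def read(operand):
        if operand.startswith("#"):
            return float(operand[1:])       # immediate constant
        if operand.startswith("R"):
            return registers[operand]
        return memory[operand]

    def write(operand, value):
        if operand.startswith("R"):
            registers[operand] = value
        else:
            memory[operand] = value

    for line in program:
        opcode, operands = line.split(None, 1)
        source, dest = [part.strip() for part in operands.split(",")]
        if opcode == "MOVF":
            write(dest, read(source))
        elif opcode == "MULF":
            write(dest, read(dest) * read(source))
        elif opcode == "ADDF":
            write(dest, read(dest) + read(source))
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return memory


program = [
    "MOVF id3, R2",
    "MULF #60.0, R2",
    "MOVF id2, R1",
    "ADDF R2, R1",
    "MOVF R1, id1",
]
# Assumed starting values: initial (id2) = 10.0 and rate (id3) = 2.0.
memory = run(program, {"id1": 0.0, "id2": 10.0, "id3": 2.0})
print(memory["id1"])
```

With these values id1 ends up as id2 + id3 * 60.0, the value of the original assignment position := initial + rate * 60.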

31 1.4 Cousins of The Compiler
Preprocessors
Assemblers
Two-Pass Assembly
Loaders and Link-Editors

32 Preprocessors
Macro processing
File inclusion
– #include <global.h> is replaced by the contents of the file global.h
Rational preprocessors
Language extensions
– Statements beginning with ## form a query language embedded in C
– They are translated into procedure calls

33 Preprocessors
Example 1.2
– Macro definition: \define\JACM #1; #2; #3 {{\sl J. ACM} {\bf #1}: #2, pp. #3.}
– Macro use: \JACM 17;4;715-728
– Typeset result: J. ACM 17:4, pp. 715-728.
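The macro use in Example 1.2 is a template substitution: the three arguments separated by semicolons replace #1, #2 and #3 in the body. A rough Python sketch follows. It handles only this one macro, the function name expand_jacm is made up, and the font switch is written \sl on the assumption that it is TeX's slanted-font command.

```python
import re

# Body of the macro definition; the TeX font switches are left for the typesetter.
TEMPLATE = r"{{\sl J. ACM} {\bf #1}: #2, pp. #3.}"


def expand_jacm(call):
    """Expand one JACM macro use by substituting its three arguments into the template."""
    match = re.fullmatch(r"\\JACM\s+([^;]+);([^;]+);(\S+)", call.strip())
    if match is None:
        raise ValueError("not a \\JACM use with three ';'-separated arguments")
    args = match.groups()
    # Replace each formal parameter #n with the n-th actual argument.
    return re.sub(r"#([1-3])", lambda m: args[int(m.group(1)) - 1], TEMPLATE)


expanded = expand_jacm(r"\JACM 17;4;715-728")
print(expanded)
```

The preprocessor only rewrites text; turning \sl and \bf into slanted and bold type is left to the later typesetting step.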

34 Assembler
Produces relocatable machine code
– DW a #10
  DW b #20
  MOV a, R1     Load the content of address a into R1
  ADD #2, R1    Add the constant 2
  MOV R1, b     Store R1 into address b

35 Two-Pass Assembly
First pass
– Find all identifiers and their storage locations and store them in the symbol table
    Identifier  Address
    a           0
    b           4
Second pass
– Translate each operation code into its sequence of bits
– Produce relocatable machine code

36 Two-Pass Assembly
Example 1.3
Inst. Code   Register  Mem/Const.     Content       (R)
0001 (MOV)   01 (R1)   00 (Mem)       00000000 (a)  *
0011 (ADD)   01 (R1)   10 (Constant)  00000010
0010 (MOV)   01 (R1)   00 (Mem)       00000100 (b)  *
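The two passes can be sketched for the five-line program with data words a and b. The bit patterns follow Example 1.3; the labels LOAD and STORE for the two MOV encodings, the 4-byte word size, and the convention that names starting with R are registers are assumptions made for the example.

```python
OPCODES = {"LOAD": "0001", "STORE": "0010", "ADD": "0011"}
WORD_SIZE = 4  # bytes per data word, giving the addresses 0 and 4


def first_pass(lines):
    """Assign each DW-declared identifier the next free data address."""
    symbols = {}
    for line in lines:
        parts = line.split()
        if parts[0] == "DW":
            symbols[parts[1]] = len(symbols) * WORD_SIZE
    return symbols


def second_pass(lines, symbols):
    """Encode each instruction as 'opcode register tag content', with '*' on relocatable addresses."""
    encoded = []
    for line in lines:
        mnemonic, rest = line.split(None, 1)
        if mnemonic == "DW":
            continue
        source, dest = [part.strip() for part in rest.split(",")]
        if mnemonic == "MOV" and dest.startswith("R"):
            opcode, register, operand = OPCODES["LOAD"], dest, source
        elif mnemonic == "MOV":
            opcode, register, operand = OPCODES["STORE"], source, dest
        else:
            opcode, register, operand = OPCODES[mnemonic], dest, source
        reg_bits = format(int(register[1:]), "02b")
        if operand.startswith("#"):
            # Tag 10: the content field holds a constant, so no relocation bit.
            encoded.append(f"{opcode} {reg_bits} 10 {int(operand[1:]):08b}")
        else:
            # Tag 00: the content field holds a memory address that must be relocated.
            encoded.append(f"{opcode} {reg_bits} 00 {symbols[operand]:08b} *")
    return encoded


source_lines = ["DW a #10", "DW b #20", "MOV a, R1", "ADD #2, R1", "MOV R1, b"]
symbols = first_pass(source_lines)
machine_code = second_pass(source_lines, symbols)
print(symbols)
print("\n".join(machine_code))
```

The first pass is needed because an instruction may refer to an identifier before the assembler has seen where it is stored.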

37 Two-Pass Assembly
'*' denotes the relocation bit
– If the data is loaded starting at address 00001111
– a should be at location 00001111 + 00000000 = 00001111
– b should be at location 00001111 + 00000100 = 00010011
Inst. Code   Register  Mem/Const.     Content       (R)
0001 (MOV)   01 (R1)   00 (Mem)       00001111 (a)  *
0011 (ADD)   01 (R1)   10 (Constant)  00000010
0010 (MOV)   01 (R1)   00 (Mem)       00010011 (b)  *
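The relocation step is a binary addition applied only to the fields marked with '*'. A short sketch, assuming each instruction is written as the string "opcode register tag content" with an optional trailing '*', and using a made-up function name relocate:

```python
def relocate(instructions, load_address):
    """Add load_address to every content field flagged with the relocation bit '*'."""
    relocated = []
    for line in instructions:
        fields = line.split()
        if fields[-1] == "*":
            address = int(fields[3], 2) + load_address
            fields[3] = format(address, "08b")
        relocated.append(" ".join(fields))
    return relocated


program = [
    "0001 01 00 00000000 *",   # MOV a, R1
    "0011 01 10 00000010",     # ADD #2, R1 (constant, not relocated)
    "0010 01 00 00000100 *",   # MOV R1, b
]
loaded = relocate(program, 0b00001111)
print("\n".join(loaded))
```

In decimal, the load address is 15, so a moves from 0 to 15 (00001111) and b from 4 to 19 (00010011), while the constant 2 in the ADD instruction is left alone.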

38 Loaders and Link-Editors
Loader
– Takes relocatable machine code, alters the relocatable addresses, and places the result in memory
Link-editor
– Resolves external references
    Library files, routines provided by the system, any other program

