Introduction to Compilers. Related Area Programming languages Machine architecture Language theory Algorithms Data structures Operating systems Software.


1 Introduction to Compilers

2 Related Area Programming languages Machine architecture Language theory Algorithms Data structures Operating systems Software engineering

3 Compilers A compiler is a program that reads a program written in one language, the source language, and translates it into an equivalent program in another language, the target language. Early compilers date from the 1950s.

4 Machine Language, Assembly Language, High-Level Language Machine language is the native language of the computer on which the program is run (native code). It consists of bit strings that are interpreted by the mechanism inside the computer. Example on the IBM 370: binary 0001100000110101, hexadecimal 1835, meaning "copy the content of Register 5 into Register 3". In assembly language, which is translated by an assembler, the same instruction is written LR 3, 5.
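The bit string above can be taken apart mechanically. A minimal Python sketch, assuming the 16-bit register-to-register layout (an 8-bit operation code followed by two 4-bit register numbers); the function name decode_rr is made up for illustration:

```python
def decode_rr(word: int) -> tuple[int, int, int]:
    """Split a 16-bit register-to-register instruction into (opcode, r1, r2)."""
    opcode = (word >> 8) & 0xFF  # high 8 bits: operation code
    r1 = (word >> 4) & 0xF       # next 4 bits: first register
    r2 = word & 0xF              # low 4 bits: second register
    return opcode, r1, r2


word = int("0001100000110101", 2)
print(hex(word))        # 0x1835
print(decode_rr(word))  # (24, 3, 5): opcode 0x18, registers 3 and 5
```

The decoded fields match the assembly form LR 3, 5: operation code 18 (hexadecimal), then register 3, then register 5.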

5 Machine Language, Assembly Language, High-Level Language Example
High-level language: X := Y + Z;
Assembly language:
L 3, Y ; Load the working register with Y
A 3, Z ; Add Z
ST 3, X ; Store the result in X

6 Terminology Source language: Java, C, C++. Object language: machine language. Object code: object file, object module. Target machine: the computer on which the program is to be run.

7 Terminology Cross compiler: a compiler that generates code for a machine that is different from the machine on which the compiler runs. Example: a compiler that runs on an IBM PC but compiles to the machine language of a special-purpose embedded system.

8 Compilers and Interpreters Compiler: translates the high-level program to the target program. Interpreter: executes the program.

9 The Environment of the Compiler

10 Example COMP myprog ; Compiles the program LINK myprog ; Links the program RUN myprog ; Runs the program

11 Phases of a Compiler Five (six) phases of compilation: lexical analysis, syntactic analysis, (semantic analysis), intermediate code generation, optimization, object code generation.

12 Phases of a Compiler Language Processing System

13 Phases of a Compiler

14 Two Parts of Compilation Analysis: breaks up the source program and creates an intermediate representation. Synthesis: constructs the desired target program from the intermediate representation.

15 Analysis Lexical analysis (linear analysis, scanning). Syntax analysis (parsing, hierarchical analysis). Semantic analysis. Intermediate code generation. Advantages of dividing analysis: simple design, compiler efficiency, compiler portability.

16 Analysis of the Source Program Lexical analysis (linear analysis): the stream of characters making up the source program is read from left to right and grouped into tokens. Syntax analysis (hierarchical analysis): characters or tokens are grouped hierarchically into nested collections with collective meaning. Semantic analysis: certain checks are performed to ensure that the components of a program fit together meaningfully.

17 Lexical Analysis Linear analysis, scanning. Reads the stream of characters in the source program from left to right and groups them into tokens. Tokens are sequences of characters having a collective meaning.

18 Example: Lexical Analysis position := initial + rate * 60 id1 := id2 + id3 * 60
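The grouping into tokens shown above can be sketched in Python. This is a minimal scanner under assumptions of this sketch, not a definitive design: the token classes (id, num, op), the regular expression, and the helper names tokenize and render are all invented for illustration.

```python
import re

# One alternative per token class; the group names are this sketch's own.
TOKEN_RE = re.compile(r"(?P<id>[A-Za-z_]\w*)|(?P<num>\d+)|(?P<op>:=|[-+*/()])")


def tokenize(source):
    """Scan left to right and return (tokens, symbol_table)."""
    tokens, symbols, pos = [], {}, 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(source, pos)
        if match is None:
            # Lexical error: the characters do not form any token.
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        if kind == "id":
            # The first occurrence of a name gets the next free index: id1, id2, ...
            index = symbols.setdefault(text, len(symbols) + 1)
            tokens.append(("id", index))
        elif kind == "num":
            tokens.append(("num", int(text)))
        else:
            tokens.append(("op", text))
        pos = match.end()
    return tokens, symbols


def render(tokens):
    """Print identifiers as id<n> and everything else verbatim."""
    return " ".join(f"id{value}" if kind == "id" else str(value) for kind, value in tokens)


tokens, symbols = tokenize("position := initial + rate * 60")
print(render(tokens))  # id1 := id2 + id3 * 60
print(symbols)         # {'position': 1, 'initial': 2, 'rate': 3}
```

The dictionary returned alongside the tokens plays the role of a very small symbol table: each identifier is entered once, and later occurrences reuse the same index.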

19 Syntax Analysis Parsing, hierarchical analysis. Groups the tokens of the source program into grammatical phrases, represented by a parse tree, that are used by the compiler to synthesize output.

20 Example: Syntax Analysis
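One way to picture hierarchical grouping is a small recursive-descent parser. The sketch below is written in Python and assumes a toy grammar chosen for this illustration (an assignment whose right-hand side uses only +, * and parentheses, with * binding tighter than +); the tree is a nested tuple, and all function names are hypothetical.

```python
def parse_assignment(tokens):
    """Parse `id := expr` and return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(predicate, what):
        nonlocal pos
        tok = peek()
        if tok is None or not predicate(tok):
            # Syntax error: the token stream violates the grammar.
            raise SyntaxError(f"expected {what}, found {tok!r}")
        pos += 1
        return tok

    def factor():  # factor -> id | number | ( expr )
        nonlocal pos
        if peek() == "(":
            pos += 1
            node = expr()
            expect(lambda t: t == ")", "')'")
            return node
        return expect(lambda t: t.isalnum(), "identifier or number")

    def term():  # term -> factor { * factor }
        nonlocal pos
        node = factor()
        while peek() == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expr():  # expr -> term { + term }
        nonlocal pos
        node = term()
        while peek() == "+":
            pos += 1
            node = ("+", node, term())
        return node

    target = expect(lambda t: t.isidentifier(), "identifier")
    expect(lambda t: t == ":=", "':='")
    tree = (":=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment("id1 := id2 + id3 * 60".split())
print(tree)  # (':=', 'id1', ('+', 'id2', ('*', 'id3', '60')))
```

The multiplication ends up nested inside the addition, which is the hierarchical structure that a flat token stream does not show.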

21 Semantic Analysis Checks the source program for semantic errors and gathers type or semantic information for the subsequent code generation phase

22 Example: Semantic Analysis
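A common semantic check is type checking with implicit conversion. The Python sketch below assumes, purely for illustration, that the three identifiers are declared real while the literal 60 is an integer, so the checker inserts an explicit conversion node. The tree shape, the type names and the inttoreal node are assumptions of this sketch.

```python
def annotate(node, types):
    """Return (typed_tree, type), inserting 'inttoreal' where int meets real."""
    if isinstance(node, str):
        if node.isdigit():
            return node, "int"
        if node not in types:
            raise NameError(f"undeclared identifier {node!r}")
        return node, types[node]
    op, left, right = node
    left, ltype = annotate(left, types)
    right, rtype = annotate(right, types)
    if ltype != rtype:
        if op == ":=" and ltype == "int":
            # The target of an assignment cannot be converted.
            raise TypeError("cannot assign a real value to an int variable")
        if ltype == "int":
            left, ltype = ("inttoreal", left), "real"
        else:
            right = ("inttoreal", right)
    return (op, left, right), ltype


# Hypothetical inputs: a syntax tree and the declared types of its identifiers.
tree = (":=", "id1", ("+", "id2", ("*", "id3", "60")))
types = {"id1": "real", "id2": "real", "id3": "real"}

typed_tree, result_type = annotate(tree, types)
print(typed_tree)
# (':=', 'id1', ('+', 'id2', ('*', 'id3', ('inttoreal', '60'))))
```

The tree keeps its syntactic structure; only the integer operand is wrapped, which is the information the later code generation phases need.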

23 Error Handler When a phase of compilation encounters an error, it must somehow deal with that error. Error in the lexical phase: the characters in the input do not form any token of the language. Error in the syntax phase: the token stream violates the structure rules (syntax) of the language. Error in the semantic phase: constructs have the right syntactic structure, but no meaning for the operation involved.

24 S/W Tools Performing Analysis Structure editors: a sequence of commands => a source program. Pretty printers: indentation, fonts. Static checkers: a program => discover bugs without running it. Interpreters: performing operations.

25 Performing Analysis Text formatters typeset text Silicon compilers circuit design Query interpreters DB

26 Intermediate Code Generation Explicit intermediate representation: a program for an abstract machine. Two properties of intermediate code: easy to produce, and easy to translate into the target program. Intermediate forms: three-address form (quadruples, triples), two-address form.

27 Three-Address Code Has at most three operands Each three-address instruction has at most one operator in addition to the assignment The compiler must generate a temporary name to hold the value computed by each instruction May have fewer than 3 operands

28 Example: Three-Address Code
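Generating three-address code amounts to walking the tree bottom-up and inventing a temporary for each intermediate result. A minimal Python sketch, assuming a nested-tuple tree with an inttoreal conversion node (both are conventions of this sketch, as are the names gen_tac and temp1, temp2, ...):

```python
def gen_tac(tree):
    """Flatten an assignment tree into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        if isinstance(node, str):        # identifier or constant: no code needed
            return node
        if node[0] == "inttoreal":       # unary conversion
            arg = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({arg})")
            return temp
        op, left, right = node           # binary operator
        left_name = walk(left)
        right_name = walk(right)
        temp = new_temp()
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    _, target, value = tree
    code.append(f"{target} := {walk(value)}")
    return code


tree = (":=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", "60"))))
for line in gen_tac(tree):
    print(line)
# temp1 := inttoreal(60)
# temp2 := id3 * temp1
# temp3 := id2 + temp2
# id1 := temp3
```

Each emitted instruction has at most one operator besides the assignment, and the last one is a plain copy with fewer than three operands.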

29 Example: Intermediate Code

30 Synthesis Part The synthesis part constructs the desired target program from the intermediate representation.

31 Code Optimization Improves the intermediate code so that faster-running machine code results. A compiler that does this is called an optimizing compiler.

32 Example: Code Optimization
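Two simple improvements can be sketched in Python: performing an integer-to-real conversion of a constant at compile time, and removing a copy whose source is a temporary computed by the previous instruction. The tuple format (dest, op, arg1, arg2), the op names "inttoreal" and "copy", and the assumption that each temporary is used exactly once are all choices made for this sketch.

```python
def optimize(code):
    """Apply constant conversion and copy elimination to three-address code."""
    # Pass 1: evaluate inttoreal(<constant>) now and substitute the result.
    constants = {}
    folded = []
    for dest, op, arg1, arg2 in code:
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if op == "inttoreal" and arg1.isdigit():
            constants[dest] = f"{arg1}.0"
            continue                      # the instruction disappears
        folded.append((dest, op, arg1, arg2))

    # Pass 2: merge "t := x op y" followed by "v := t" into "v := x op y".
    result = []
    for dest, op, arg1, arg2 in folded:
        if op == "copy" and result and result[-1][0] == arg1 and arg1.startswith("temp"):
            _, prev_op, prev_arg1, prev_arg2 = result.pop()
            result.append((dest, prev_op, prev_arg1, prev_arg2))
        else:
            result.append((dest, op, arg1, arg2))
    return result


code = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
for instruction in optimize(code):
    print(instruction)
# ('temp2', '*', 'id3', '60.0')
# ('id1', '+', 'id2', 'temp2')
```

Four instructions shrink to two. The sketch does not renumber the surviving temporary, and the copy elimination is only safe under the single-use assumption stated above.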

33 Code Generation Generates the target code: relocatable machine code or assembly code.

34 Example: Code Generation
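A deliberately naive code generator translates each three-address instruction into a load, an operation and a store. The Python sketch below targets a made-up one-register machine; the mnemonics LOAD, MUL, ADD and STORE, the # prefix for constants, and the tuple format are all hypothetical and do not describe any real instruction set.

```python
MNEMONIC = {"+": "ADD", "*": "MUL"}  # hypothetical target mnemonics


def operand(name):
    """Constants are written with a leading '#', memory names as they are."""
    return f"#{name}" if name[0].isdigit() else name


def gen_target(tac, reg="R1"):
    """Translate (dest, op, left, right) tuples into assembly-like text."""
    asm = []
    for dest, op, left, right in tac:
        asm.append(f"LOAD {reg}, {operand(left)}")
        asm.append(f"{MNEMONIC[op]} {reg}, {operand(right)}")
        asm.append(f"STORE {reg}, {dest}")
    return asm


tac = [("temp1", "*", "id3", "60.0"), ("id1", "+", "id2", "temp1")]
for line in gen_target(tac):
    print(line)
# LOAD R1, id3
# MUL R1, #60.0
# STORE R1, temp1
# LOAD R1, id2
# ADD R1, temp1
# STORE R1, id1
```

A real code generator would also decide which values stay in registers, so that temp1 need not be stored to memory and loaded again.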

35 System Support There is a certain amount of supporting code to be supplied to the compilation. Symbol table management Error handling

36 System Support Symbol table handler The central repository of information about the names or identifiers in the program Error handling Implements the compiler’s response to errors in the code it is compiling. Diagnostics Where the error was found and what kind of error it was
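A symbol table and its diagnostics can be sketched together in Python. The class below is a minimal illustration under assumptions of this sketch: one flat scope, and entries that record only a type and the line of declaration. The method names declare and lookup are invented here.

```python
class SymbolTable:
    """Central store of information about the names in a program."""

    def __init__(self):
        self._entries = {}

    def declare(self, name, type_, line):
        if name in self._entries:
            first = self._entries[name]["line"]
            # Diagnostic: where the error was found and what kind it is.
            raise NameError(f"line {line}: '{name}' already declared on line {first}")
        self._entries[name] = {"type": type_, "line": line}

    def lookup(self, name, line):
        try:
            return self._entries[name]
        except KeyError:
            raise NameError(f"line {line}: undeclared identifier '{name}'") from None


table = SymbolTable()
table.declare("rate", "real", line=1)
print(table.lookup("rate", line=3))  # {'type': 'real', 'line': 1}

try:
    table.lookup("ratio", line=4)
except NameError as error:
    print(error)  # line 4: undeclared identifier 'ratio'
```

Every phase can consult the same table: the scanner enters names, the semantic phase reads types, and the code generator could add storage addresses to each entry.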

37 Passes, Front End, Back End The compiler makes one or more passes through the program. A pass consists of reading a version of the program from a file and writing a new version of it to an output file. A pass normally comprises more than one phase, but the number of passes, and the phases they cover, varies.

38 Passes, Front End, Back End Front End: dependent on the source language, with little or no concern for the target machine. Lexical analysis, syntactic analysis, (semantic analysis), intermediate code generation. Back End: machine-dependent. Code optimization, target code generation.

39 Writing a Compiler The first compiler was written in assembly language; there was no other alternative. High-level language compilers Cross compiler Useful tools – “compiler compilers” Lex Yacc

40 Retargetable Compilers In many cases, a compiler writer will want to adapt a compiler for use with a new target. A compiler that can be modified in this way is said to be retargetable. Cross compiler. Alternative approaches: distinction between front end and back end; compiler for an imaginary machine (virtual machine).

41 Cousins of the Compiler Preprocessors Assemblers Loaders and Link-Editors

42 Preprocessor A preprocessor is a simple translator that is applied to the source program before it is submitted to the compiler. Before the program is compiled it is passed through the preprocessor, which replaces all occurrences of the pre-defined expression with the defined sequence of instructions. Example

43 Functions of Preprocessors Macro processing. File inclusion. Language extensions: DB query languages embedded in high-level languages.

44 Example: Preprocessors The C Programming Language
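Macro processing in the style of C's #define can be imitated in a few lines of Python. This sketch handles only parameterless macros, does not rescan replaced text, and ignores string literals and comments; it illustrates the idea and makes no claim about how a real C preprocessor is implemented.

```python
import re

DEFINE_RE = re.compile(r"\s*#define\s+(\w+)\s+(.*)")


def preprocess(source):
    """Expand parameterless #define macros and drop the directive lines."""
    macros = {}
    output = []

    def substitute(match):
        word = match.group(0)
        return macros.get(word, word)   # replace only defined names

    for line in source.splitlines():
        directive = DEFINE_RE.match(line)
        if directive:
            macros[directive.group(1)] = directive.group(2).strip()
            continue
        output.append(re.sub(r"\b\w+\b", substitute, line))
    return "\n".join(output)


program = "#define MAX 100\nint table[MAX];\nint maximum = MAX;"
print(preprocess(program))
# int table[100];
# int maximum = 100;
```

Because whole words are matched, the name maximum is left alone even though it begins with the same letters as MAX.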

45 Assemblers Two-pass assembly Loaders and Link-Editors

46 Assemblers Assembly code a mnemonic version of machine code, in which names are used instead of binary codes for operations, and names are also given to memory addresses

47 Two-Pass Assembly (1) In the first pass, all the identifiers that denote storage locations are found and stored in a symbol table. Identifiers are assigned storage locations as they are encountered for the first time. Example: b := a + 2

48 Two-Pass Assembly (2) In the second pass, the assembler scans the input again. It translates each operation code into the sequence of bits representing that operation in machine language. It translates each identifier representing a location into the address given for that identifier in the symbol table. The output of the second pass is usually relocatable machine code.
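Both passes can be sketched in Python for the statement b := a + 2. The instruction format, the mnemonics, the opcode numbers and the 4-byte word size are assumptions of this sketch; each output tuple ends with a flag saying whether its address is relocatable.

```python
OPCODES = {"LOAD": 1, "STORE": 2, "ADD": 3}  # hypothetical operation codes
WORD = 4                                     # assumed bytes per storage location


def first_pass(program):
    """Assign a storage location to each identifier on first encounter."""
    symtab = {}
    for _mnemonic, _reg, operand in program:
        if operand.isidentifier() and operand not in symtab:
            symtab[operand] = len(symtab) * WORD
    return symtab


def second_pass(program, symtab):
    """Translate mnemonics to opcodes and identifiers to addresses."""
    output = []
    for mnemonic, reg, operand in program:
        if operand in symtab:
            output.append((OPCODES[mnemonic], reg, symtab[operand], True))
        else:  # immediate constant such as "#2": not relocatable
            output.append((OPCODES[mnemonic], reg, int(operand.lstrip("#")), False))
    return output


# b := a + 2 in the sketch's assembly notation
program = [("LOAD", 1, "a"), ("ADD", 1, "#2"), ("STORE", 1, "b")]

symtab = first_pass(program)
print(symtab)                        # {'a': 0, 'b': 4}
print(second_pass(program, symtab))
# [(1, 1, 0, True), (3, 1, 2, False), (2, 1, 4, True)]
```

The first pass never emits code and the second pass never invents addresses, which is why a forward reference to a name defined later in the input causes no difficulty.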

49 Example: Relocatable Addresses Altering the relocatable address to absolute or unrelocatable machine code: Suppose that the address space containing the data is to be loaded starting at location “L =00001111” L must be added to the address of the instruction
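The adjustment can be sketched in Python. The tuple format (opcode, register, address, relocatable flag) and the sample addresses 0 and 4 are assumptions of this sketch; only L = 00001111 comes from the slide.

```python
def relocate(code, load_address):
    """Add the load address to every address marked as relocatable."""
    return [
        (opcode, reg, address + load_address if relocatable else address)
        for opcode, reg, address, relocatable in code
    ]


L = int("00001111", 2)                 # load address from the slide: 15

relocatable_code = [
    (1, 1, 0, True),    # refers to a data location at relative address 0
    (3, 1, 2, False),   # immediate constant 2: must not be changed
    (2, 1, 4, True),    # refers to a data location at relative address 4
]

for opcode, reg, address in relocate(relocatable_code, L):
    print(opcode, reg, format(address, "08b"))
# 1 1 00001111
# 3 1 00000010
# 2 1 00010011
```

Only the flagged addresses move; the constant 2 is data inside the instruction, so adding L to it would change the meaning of the program.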

50 Loaders and Link-Editors Loader Performs the two functions of loading and link-editing Loading Consists of taking relocatable machine code, altering the relocatable addresses, and placing the altered instructions and data in memory at the proper locations

51 Link-Editors A link-editor allows us to make a single program from several files of relocatable machine code. These files may have been the result of several different compilations, and one or more may be library files. External references: the code of one file refers to a location in another file.
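Resolving external references can be sketched in Python. The module description used here (a size, the names a file defines with their offsets, and the places where it refers to outside names) is an invented simplification and does not correspond to any real object-file format.

```python
def link(modules):
    """Lay modules out one after another and resolve external references.

    Returns (symbols, patches): the final address of every defined name,
    and for each referencing location the address to be written there.
    """
    symbols, bases, base = {}, [], 0
    for module in modules:               # step 1: place modules, collect definitions
        bases.append(base)
        for name, offset in module["defs"].items():
            if name in symbols:
                raise ValueError(f"duplicate definition of {name!r}")
            symbols[name] = base + offset
        base += module["size"]

    patches = {}
    for module_base, module in zip(bases, modules):   # step 2: resolve references
        for offset, name in module["refs"]:
            if name not in symbols:
                raise NameError(f"unresolved external reference {name!r}")
            patches[module_base + offset] = symbols[name]
    return symbols, patches


main_file = {"size": 12, "defs": {"main": 0}, "refs": [(4, "sqrt")]}
library_file = {"size": 8, "defs": {"sqrt": 0}, "refs": []}

symbols, patches = link([main_file, library_file])
print(symbols)  # {'main': 0, 'sqrt': 12}
print(patches)  # {4: 12}
```

The reference at offset 4 of the first file ends up pointing at address 12, where the library routine was placed after the first file's 12 bytes.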

52 Summary A quick overall picture of what a compiler does, what goes into it, and how it is organized

