
System Programming and administration


1 System Programming and administration
Lecture 4. Compiler: Overview of the compilation process

2 Outline
Compiler
Functions of a compiler
Compilation process
Phases of compilation
Incremental compiler

3 Compiler A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code).
Source code → Compiler → Target code (errors are reported along the way)

4 Why? The most common reason for wanting to transform source code is to create an executable program.

5 Basic functions
Scanning: Scan the character stream of a program, analyze it according to rules called lexical rules, and recognize each token. Also called lexical analysis. The part of a compiler performing this function is called the scanner.
Parsing: Pass through the sequence of tokens, parse them according to rules called grammars, and recognize each statement. Also called syntactic analysis. The part of a compiler performing this function is called the parser.
(Object) code generation: Each statement has a meaning (semantics); for each parsed statement, generate code according to that meaning. This relies on semantic analysis. The part of a compiler performing this function is called the code generator.

6 The Analysis-Synthesis Model of Compilation
There are two parts to compilation:
Analysis determines the operations implied by the source program, which are recorded in a tree structure.
Synthesis takes the tree structure and translates the operations therein into the target program.

7 In graph form

8 Compiler parts
A compiler consists of three main parts: the frontend, the middle-end, and the backend.

9 Frontend
The frontend checks whether the program is correctly written in terms of the programming language's syntax and semantics: legal and illegal programs are recognized, errors are reported, if any, in a useful way, and type checking is performed. The frontend generates IR (intermediate representation) for the middle-end.
It is split into two parts:
Scanner: responsible for converting the character stream to a token stream; also strips out whitespace and comments.
Parser: reads the token stream and generates IR.

10 Middle-end The middle-end is where the optimizations for performance take place. Typical transformations for optimization are:
removal of useless or unreachable code,
discovering and propagating constant values,
relocation of a computation to a less frequently executed place.
The middle-end generates IR for the backend that follows. Most optimization efforts are focused on this part.
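Two of the optimizations named above, constant propagation/folding and removal of useless code, can be sketched over a toy tuple-based IR. Everything here (the instruction shapes, the variable names) is illustrative, not any particular compiler's IR:

```python
# Minimal sketch of two classic middle-end passes over a hypothetical IR
# where each instruction is a tuple: ("const", dest, value),
# ("add", dest, src1, src2), or ("print", src).

def fold_constants(ir):
    """Propagate known constant values and fold constant additions."""
    consts = {}          # variable name -> known constant value
    out = []
    for instr in ir:
        if instr[0] == "const":
            _, dest, value = instr
            consts[dest] = value
            out.append(instr)
        elif instr[0] == "add":
            _, dest, a, b = instr
            if a in consts and b in consts:
                consts[dest] = consts[a] + consts[b]
                out.append(("const", dest, consts[dest]))   # folded
            else:
                consts.pop(dest, None)                      # dest no longer known
                out.append(instr)
        else:
            out.append(instr)
    return out

def remove_dead_code(ir):
    """Drop instructions whose result is never used (single backward pass)."""
    live = set()
    out = []
    for instr in reversed(ir):
        if instr[0] == "print":
            live.add(instr[1])
            out.append(instr)
        elif instr[1] in live:          # this dest is needed later: keep it
            live.discard(instr[1])
            live.update(x for x in instr[2:] if isinstance(x, str))
            out.append(instr)
        # otherwise the instruction is dead: skip it
    return list(reversed(out))

ir = [("const", "a", 2), ("const", "b", 3),
      ("add", "c", "a", "b"),           # foldable: c = 5
      ("add", "unused", "a", "b"),      # dead: never printed
      ("print", "c")]
optimized = remove_dead_code(fold_constants(ir))
```

After both passes only `("const", "c", 5)` and `("print", "c")` survive, illustrating why the middle-end runs several passes in sequence: folding first exposes more dead code for the second pass to remove.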

11 Backend The backend is responsible for translating the IR into the target assembly code. Target instructions are chosen for each IR instruction, and registers are allocated for variables. The backend also exploits the hardware, figuring out how to keep parallel functional units busy, fill delay slots, and so on.

12 Analysis consists of 3 phases
Linear analysis
Hierarchical analysis
Semantic analysis

13 Cross compiler A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler runs. Cross-compiler tools are used to generate executables for embedded systems or for multiple platforms. A cross compiler is used to compile for a platform on which compiling is not feasible, such as microcontrollers that do not support an operating system.

14 Incremental compiler The term incremental compiler may refer to two different types of compiler:
Imperative programming
Interactive programming

15 Incremental compiler In imperative programming and software development, an incremental compiler is one that, when invoked, takes only the changes to a known set of source files and updates any corresponding output files (in the compiler's target language, often bytecode) that may already exist from previous compilations. By effectively building upon previously compiled output files, the incremental compiler avoids wasteful recompilation of entire source files in which most of the code remains unchanged. For most incremental compilers, compiling a program with small changes to its source code is near instantaneous. It can be said that an incremental compiler reduces the granularity of a language's traditional compilation units while maintaining the language's semantics, such that the compiler can append and replace smaller parts.
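The core decision an imperative-style incremental build makes, rebuild only what changed, can be sketched with a make-style timestamp check. The file names and the `needs_rebuild` helper are hypothetical, and a real incremental compiler also tracks dependencies between files, not just timestamps:

```python
# Sketch: decide per file whether recompilation is needed, by comparing the
# modification time of a source file with that of its compiled output.
import os
import tempfile

def needs_rebuild(source, output):
    """True if the output file is missing or older than its source."""
    if not os.path.exists(output):
        return True
    return os.path.getmtime(source) > os.path.getmtime(output)

# Demo with fabricated timestamps (os.utime) instead of real compiles.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "main.c")
obj = os.path.join(workdir, "main.o")
open(src, "w").close()
missing = needs_rebuild(src, obj)          # True: no output exists yet
open(obj, "w").close()
os.utime(src, (1000, 1000))                # pretend the source is old
os.utime(obj, (2000, 2000))                # ... and the output is newer
up_to_date = not needs_rebuild(src, obj)   # True: nothing to recompile
os.utime(src, (3000, 3000))                # the source is edited again
stale = needs_rebuild(src, obj)            # True: recompile just this file
```

Applied across a whole project, this check is what makes small edits "near instantaneous": only the files for which it returns True are fed back through the compiler.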

16 Incremental compiler In the interactive programming paradigm, and particularly in Prolog-related literature, an incremental compiler refers to a compiler that is actually part of the runtime system of the source language. The compiler can be invoked at runtime on some source code or data structure managed by the program; this produces a new compiled program fragment that is immediately available for use by the runtime system. This scheme allows for a degree of self-modifying code and requires metaprogramming language features. The ability to add and remove code while running is known as hot swapping.
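Python's runtime happens to include its own compiler (`compile`/`exec`), which makes it easy to sketch this interactive flavour: compile a fragment at runtime, then hot-swap it by rebinding the same name. The function name `greet` is purely illustrative:

```python
# The "compiler" here is Python's built-in compile()/exec(), invoked at
# runtime on a source string, exactly in the spirit of an incremental
# compiler that is part of the language's runtime system.
namespace = {}

exec(compile("def greet(): return 'v1'", "<runtime>", "exec"), namespace)
first = namespace["greet"]()     # the freshly compiled fragment is usable now

# Hot swap: compile a new fragment and rebind the same name while "running".
exec(compile("def greet(): return 'v2'", "<runtime>", "exec"), namespace)
second = namespace["greet"]()    # callers now see the replacement
```

Systems such as Prolog tops-level or Erlang's code loading apply the same idea with stronger guarantees; this sketch only shows the mechanism of compiling and rebinding at runtime.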

17 Phases of a Compiler
Source Program → (1) Lexical Analyzer → (2) Syntax Analyzer → (3) Semantic Analyzer → (4) Intermediate Code Generator → (5) Code Optimizer → (6) Code Generator → Target Program
The Symbol-table Manager and the Error Handler interact with all of the phases.

18 Phases of Compilation

19 Phases of the Compilation Process
Lexical analysis (scanning): the source text is broken into tokens.
Syntactic analysis (parsing): tokens are combined to form syntactic structures, typically represented by a parse tree. The parser may be replaced by a syntax-directed editor, which directly generates a parse tree as a product of editing.
Semantic analysis: intermediate code is generated for each syntactic structure. Type checking is performed in this phase. Complicated features such as generic declarations and operator overloading (as in Ada and C++) are also processed.
Machine-independent optimization: intermediate code is optimized to improve efficiency.
Code generation: intermediate code is translated to relocatable object code for the target machine.
Machine-dependent optimization: the machine code is optimized.

20 Lexical Analysis Function
The easiest analysis: scan the program to be compiled and identify the tokens that make up the source statements; tokens are the basic building blocks.
Position := initial + rate * 60 ;
All of these are tokens. Blanks, line breaks, etc. are scanned out.

21 Block schematic

22 Scanner Example
Input text:
// this statement does very little
if (x >= y) y = 42;
Token stream (note that tokens are atomic items, not character strings):
IF LPAREN ID(x) GEQ ID(y) RPAREN ID(y) BECOMES INT(42) SCOLON
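A regex-based scanner producing exactly this token stream can be sketched as follows. The token names follow the slide; the implementation details (a single master regex, with `>=` ordered before `=` so the longer operator wins) are one common approach among several:

```python
# Minimal scanner sketch: classify the input with named regex groups and
# strip out whitespace and comments, as the frontend slide describes.
import re

TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*"),      # stripped, like whitespace
    ("WS",      r"\s+"),
    ("IF",      r"\bif\b"),
    ("GEQ",     r">="),            # must precede BECOMES so ">=" wins
    ("BECOMES", r"="),
    ("LPAREN",  r"\("),
    ("RPAREN",  r"\)"),
    ("SCOLON",  r";"),
    ("INT",     r"\d+"),
    ("ID",      r"[A-Za-z_]\w*"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def scan(text):
    tokens = []
    for m in MASTER.finditer(text):
        kind = m.lastgroup
        if kind in ("WS", "COMMENT"):
            continue                      # the scanner strips these out
        if kind in ("ID", "INT"):
            tokens.append(f"{kind}({m.group()})")   # token type + specifier
        else:
            tokens.append(kind)
    return tokens

source = "// this statement does very little\nif (x >= y) y = 42;"
print(scan(source))
```

Running it on the slide's input yields the token stream shown above, with the comment and all blanks discarded.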

23 Parser Example
Input token stream:
IF LPAREN ID(x) GEQ ID(y) RPAREN ID(y) BECOMES INT(42) SCOLON
Abstract syntax tree:
ifStmt
├── >= (ID(x), ID(y))
└── assign (ID(y), INT(42))
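A minimal parser for just this statement shape can be sketched as follows; it consumes the scanner's token stream and builds the tree as nested tuples. The node layout is illustrative, and a real parser would of course handle arbitrary statements rather than this one fixed form:

```python
# Sketch: turn the flat token stream into the ifStmt tree from the slide.
tokens = ["IF", "LPAREN", "ID(x)", "GEQ", "ID(y)", "RPAREN",
          "ID(y)", "BECOMES", "INT(42)", "SCOLON"]

def value(tok):
    """Split a token like 'ID(x)' into its type and specifier: ('ID', 'x')."""
    kind, _, rest = tok.partition("(")
    return (kind, rest.rstrip(")"))

def parse(toks):
    """Parse the fixed shape: IF LPAREN ID GEQ ID RPAREN ID BECOMES INT SCOLON."""
    assert toks[0] == "IF" and toks[1] == "LPAREN" and toks[3] == "GEQ"
    cond = (">=", value(toks[2]), value(toks[4]))
    assert toks[5] == "RPAREN" and toks[7] == "BECOMES" and toks[9] == "SCOLON"
    body = ("assign", value(toks[6]), value(toks[8]))
    return ("ifStmt", cond, body)

ast = parse(tokens)
```

Note how the parentheses, the `;`, and the keyword `if` disappear: they only guide parsing, while the tree keeps the operations and operands the later phases need.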

24 Examples of Tokens A token is a sequence of characters to be treated as a single unit. Tokens can be keywords, operators, identifiers, integers, floating-point numbers, character strings, etc. Each token is usually represented by some fixed-length code, such as an integer, rather than as a variable-length character string: a token type plus a token specifier (value).
Examples of tokens:
– Reserved words (e.g. begin, end, struct, if)
– Keywords (integer, true, etc.)
– Operators (+, &&, ++, etc.)
– Identifiers (variable names, procedure names, parameter names)
– Literal constants (numeric, string, character constants, etc.)
– Punctuation marks (:, comma, etc.)

25 Syntactic Analysis The source statements written by programmers are recognized as language constructs described by the grammar, building the parse tree for the statements being translated. There are bottom-up and top-down techniques.
Bottom-up: build the leaves of the tree first, matching the statements, then combine them into higher-level nodes until the root is reached.
Top-down: begin from the root, i.e., the grammar rule specifying the goal of the analysis, and construct the tree so that the leaves match the statements being analyzed.

26 Code generation Generate object code, in the form of machine code directly or assembly language.
A basic technique: associate each rule (or each alternative of a rule) of the grammar with a routine that translates the construct into object code according to its meaning/semantics. Such a routine is called a semantic routine or code-generation routine.
Possibly generate an intermediate form so that optimization can be done to obtain more efficient code.
Data structures needed:
A list (a queue, first-in first-out), together with a LISTCOUNT variable
A stack (first-in last-out)
S(token): the specifier of a token, i.e., a pointer into the symbol table or the integer value
LOCCTR: location counter, indicating the next available address

27 Example
Example program:
read A
read B
sum := A + B
write sum
Explain the notation

28 Lexical Analysis
Tokens:
id      = letter ( letter | digit ) *    [ except "read" and "write" ]
literal = digit digit *
":=", "+", "-", "*", "/", "(", ")"
$$$     [ end of file ]

29 Syntax Analysis
Grammar in EBNF:
<pgm>       -> <stmt list> $$$
<stmt list> -> <stmt list> <stmt> | ε
<stmt>      -> id := <expr> | read id | write <expr>
<expr>      -> <term> | <expr> <add op> <term>
<term>      -> <factor> | <term> <mult op> <factor>
<factor>    -> ( <expr> ) | id | literal
<add op>    -> + | -
<mult op>   -> * | /
The layering of <expr>, <term>, and <factor> encodes operator precedence: multiplication and division bind tighter than addition and subtraction.
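The expression part of this grammar can be sketched as a recursive-descent parser. The left-recursive rules (`<expr> -> <expr> <add op> <term>`) are rewritten as loops, the standard transformation for top-down parsing; the tuple-shaped tree nodes and helper names are illustrative:

```python
# Recursive-descent sketch for <expr>, <term>, <factor> of the grammar above.
import re

def tokenize(src):
    """Crude lexer for the mini language's expressions, with $$$ as end marker."""
    return re.findall(r":=|[-+*/()]|\d+|[A-Za-z]\w*", src) + ["$$$"]

class Parser:
    def __init__(self, tokens):
        self.toks, self.pos = tokens, 0

    def peek(self):
        return self.toks[self.pos]

    def advance(self):
        tok = self.toks[self.pos]
        self.pos += 1
        return tok

    def expr(self):        # <expr> -> <term> { <add op> <term> }
        node = self.term()
        while self.peek() in ("+", "-"):
            node = (self.advance(), node, self.term())
        return node

    def term(self):        # <term> -> <factor> { <mult op> <factor> }
        node = self.factor()
        while self.peek() in ("*", "/"):
            node = (self.advance(), node, self.factor())
        return node

    def factor(self):      # <factor> -> ( <expr> ) | id | literal
        if self.peek() == "(":
            self.advance()
            node = self.expr()
            assert self.advance() == ")"
            return node
        return self.advance()

tree = Parser(tokenize("A + B * 2")).expr()
```

Because `term` sits below `expr` in the call chain, `A + B * 2` parses as `("+", "A", ("*", "B", "2"))`: the precedence falls out of the rule nesting with no extra machinery.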

30 Code Generation
Intermediate code:
read
pop A
read
pop B
push A
push B
add
pop sum
push sum
write
push 2
div
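The meaning of this stack-oriented intermediate code can be made concrete with a tiny interpreter, run here on the `sum := A + B; write sum` fragment. `read` takes values from a supplied list instead of the keyboard, and the tuple encoding of instructions is an illustrative choice:

```python
# Stack-machine sketch: pop/push move values between an operand stack and a
# variable store; read consumes inputs, write collects outputs.
def run(code, inputs):
    stack, memory, output = [], {}, []
    inputs = list(inputs)
    for instr in code:
        op = instr[0]
        if op == "read":
            stack.append(inputs.pop(0))       # next input value onto the stack
        elif op == "push":
            arg = instr[1]
            stack.append(memory[arg] if isinstance(arg, str) else arg)
        elif op == "pop":
            memory[instr[1]] = stack.pop()    # store top of stack in a variable
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "div":
            b, a = stack.pop(), stack.pop()
            stack.append(a // b)
        elif op == "write":
            output.append(stack.pop())
    return output

code = [("read",), ("pop", "A"), ("read",), ("pop", "B"),
        ("push", "A"), ("push", "B"), ("add",),
        ("pop", "sum"), ("push", "sum"), ("write",)]
print(run(code, [3, 4]))
```

With inputs 3 and 4 this writes 7, which is exactly what the target code on the next slides computes with registers and memory instead of a stack.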

31 Code Generation
Target code:
        .data
A:      .long 0
B:      .long 0
sum:    .long 0
        .text
main:   jsr  read
        movl d0,d1
        movl d1,A
        jsr  read
        movl d1,B
        movl A,d1

32 Code Generation
        movl B,d2
        addl d1,d2
        movl d1,sum
        movl sum,d1
        movl d1,d0
        jsr  write
        movl #2,d2
        divsl d1,d2

33 Any questions?
What are the phases of compilation? Explain them.
What is an incremental compiler?

