Chapter 1. Overview J. H. Wang Sep.15, 2015. Outline History of Compilation What Compilers Do Interpreters Syntax and Semantics Organization of a Compiler.


1 Chapter 1. Overview J. H. Wang Sep.15, 2015

2 Outline
History of Compilation
What Compilers Do
Interpreters
Syntax and Semantics
Organization of a Compiler
Programming Language and Compiler Design
Computer Architecture and Compiler Design
Compiler Design Considerations
Integrated Development Environments

3 Language Processors Translators –Transforming human-oriented programming languages into computer-oriented machine languages

4 History of Compilation
Early compilers
  –1950s: by Grace Hopper
  –Late 1950s: Fortran
Broad applications
  –Typesetting: TeX, LaTeX
  –Portable document representation: PostScript
  –Symbolic and numeric problem solving: Mathematica
  –VLSI: Verilog, VHDL

5 What Compilers Do Compilers may be distinguished in two ways –By the kind of machine code they generate –By the format of the target code they generate

6 Machine Code Generated by Compilers
Pure machine code
  –Only instructions from a particular instruction set
    –Without dependence on any software (library, OS)
  –Rare; mostly used in system implementation languages
Augmented machine code
  –Augmented with OS and runtime language support routines
    –I/O, storage allocation, mathematical functions
    –Data transfer, procedure call, and dynamic storage instructions
  –The more common case
Virtual machine code
  –Only virtual instructions
  –Virtual machine
    –Pascal P-code
    –Java bytecodes
  –Portability, program size reduction

7 Bootstrapping

8 Target Code Formats
Assembly or other source formats
  –Easy to scrutinize
  –Useful for prototyping programming language designs and cross-compilation
Relocatable binary
  –More efficient and more control over the translation process
  –External references, local instruction addresses, and data addresses are not yet bound
    –A linkage step is required
Absolute binary
  –Faster, but limited ability to interface with other code
  –Useful for exercises and prototyping, where compilation costs far exceed execution costs

9 Interpreters

10 Capabilities of interpreters
  –Programs can be easily modified as execution proceeds
    –Interactive debugging
  –Dynamic object typing can be easily supported
    –E.g. Lisp and Scheme
  –Significant degree of machine independence
Drawbacks
  –Direct interpretation of source programs can involve significant overhead

11 Syntax and Semantics
Syntax: structure
  –E.g. context-free grammars (CFGs)
    –a=b+c is legal, but b+c=a is not
Semantics: meaning
  –E.g. a=b+c is illegal if any of the variables are undeclared or if b or c is of type Boolean
  –Static semantics
  –Runtime semantics

12 Static Semantics
A set of rules that specify which syntactically legal programs are actually valid
  –E.g.: identifier declaration, type compatibility of operators and operands, proper number of parameters in procedure calls
Can be specified either formally or informally
  –E.g.: attribute grammars

13 An Example of Attribute Grammars
Production rule:
  –E -> E + T
Augmented production rule:
  –E_result -> E_v1 + T_v2
     if v1.type = numeric and v2.type = numeric
     then result.type <- numeric
     else call ERROR()
  –Verbose and tedious
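The augmented production above can be read as a small function over attributes. The following is a minimal sketch, not the book's notation: the names `Attr`, `add_rule`, and `ExprTypeError` are hypothetical, and the exception stands in for the slide's `call ERROR()`.

```python
from dataclasses import dataclass


class ExprTypeError(Exception):
    """Stands in for the slide's ERROR() call (hypothetical name)."""


@dataclass
class Attr:
    type: str  # e.g. "numeric" or "boolean"


def add_rule(v1: Attr, v2: Attr) -> Attr:
    """Semantic rule attached to E_result -> E_v1 + T_v2."""
    if v1.type == "numeric" and v2.type == "numeric":
        return Attr("numeric")  # result.type <- numeric
    raise ExprTypeError(f"cannot add {v1.type} and {v2.type}")


print(add_rule(Attr("numeric"), Attr("numeric")))
```

Every production of the grammar would need a rule of this kind, which is one way to see why the slide calls the approach verbose and tedious.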

14 Runtime Semantics
To specify what a program computes
  –Can be specified informally
    –E.g.: program states; a=1 means the state component corresponding to a is changed to 1
  –Formal approaches
    –Natural semantics: operational model
      –Given assertions before the evaluation of a construct, we can infer assertions that will hold after the construct's evaluation
    –Axiomatic semantics: relations or predicates that relate program variables
      –E.g.: for the assignment var <- exp, a predicate involving var is true after statement execution iff the predicate obtained by replacing all occurrences of var by exp is true beforehand
      –Good for deriving proofs of program correctness, but difficult to use
    –Denotational semantics: more mathematical in form
      –E.g.: E[T1+T2]m = E[T1]m + E[T2]m
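The denotational equation E[T1+T2]m = E[T1]m + E[T2]m and the informal "state component" view of a=1 can both be sketched in a few lines. This is an illustration under assumptions: `m` is modeled as a dictionary from variable names to values, expressions are nested tuples, and the function names `E` and `assign` are chosen here for the example.

```python
def E(expr, m):
    """Meaning of expr in memory m (a dict from names to values)."""
    if isinstance(expr, (int, float)):
        return expr                      # a constant denotes itself
    if isinstance(expr, str):
        return m[expr]                   # a variable denotes its value in m
    op, t1, t2 = expr
    if op == "+":
        return E(t1, m) + E(t2, m)       # E[T1+T2]m = E[T1]m + E[T2]m
    raise ValueError(f"unsupported operator {op!r}")


def assign(var, expr, m):
    """State view: return a new memory in which var's component is updated."""
    new_m = dict(m)
    new_m[var] = E(expr, m)
    return new_m


m = {"b": 2, "c": 3}
print(assign("a", ("+", "b", "c"), m))
```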

15 Difficulty in semantics: imprecise language specification
  –E.g.: (in Java)
      public static int subr(int b) {
        if (b != 0) return b+100;
      }
      public static int subr(int b) {
        if (b != 0) return b+100;
        else if (10*b==0) return 1;
      }
  –The problem of deciding whether a particular statement in a program is reachable is undecidable
In practice, a trusted reference compiler can serve as a de facto language definition
  –E.g.: Lisp

16 Organization of a Compiler Analysis Synthesis

17 The Structure of a Compiler
Tasks performed by compilers
  –Analysis of the source program
    –Syntax analysis
    –Semantic analysis
  –Synthesis of a target program that, when executed, will correctly perform the computations described by the source program
    –Code generator
    –Optimizer

18 The Scanner
Reading the input text and grouping individual characters into tokens
  –Identifiers
  –Integers
  –Reserved words
  –Delimiters
What the scanner does
  –It puts the program into a compact and uniform format
  –It eliminates unneeded information
  –It processes compiler control directives
  –It sometimes enters preliminary information into the symbol table
  –It optionally formats and lists the source program

19 Lexical Analysis (Scanning) [Aho, Lam, Sethi, Ullman]
Grouping characters into lexemes
Producing tokens
  –(token-name, attribute-value)
E.g.
  –position = initial + rate * 60
  –<id,1> <=> <id,2> <+> <id,3> <*> <60>

20 Regular expressions (Chap. 3)
  –An effective and powerful approach to describing tokens
  –A specification from which finite automata that recognize regular sets can be generated automatically
    –Scanner generator
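The scanning slides above can be sketched with Python's standard `re` module, which plays the role of the generated automaton here. This is a minimal sketch under assumptions: the token names in `TOKEN_SPEC`, the function `scan`, and the use of a list position as the symbol-table index are choices made for this example, not part of the slides.

```python
import re

# Token specification as (token-name, regular expression) pairs.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+*]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def scan(text):
    """Group characters into lexemes and return (tokens, symbol_table)."""
    symtab = []   # identifier lexemes; list position + 1 is the entry number
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "skip":
            continue                      # unneeded information is dropped
        if kind == "id":
            if lexeme not in symtab:
                symtab.append(lexeme)     # preliminary symbol-table entry
            tokens.append(("id", symtab.index(lexeme) + 1))
        elif kind == "number":
            tokens.append(("number", int(lexeme)))
        else:
            tokens.append((lexeme, None))  # operators carry no attribute
    return tokens, symtab


tokens, symtab = scan("position = initial + rate * 60")
print(tokens)
print(symtab)
```

A scanner generator would compile the same kind of specification into a table-driven finite automaton instead of relying on a regex engine at run time.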

21 The Parser
Reading tokens and grouping them into phrases according to a syntax specification such as a CFG
  –Grammars (Chap. 2 & 4)
  –Parsing (Chap. 5 & 6)
  –Parser generator
It usually builds an Abstract Syntax Tree (AST) as a concise representation of program structure
  –(Chap. 2 & 7)

22 Syntax Analysis (Parsing) [Aho, Lam, Sethi, Ullman]
Creating a tree-like intermediate representation (e.g. syntax tree) that depicts the grammatical structure of the token stream
  –E.g. (children indented under their operator)
      =
        <id,1>
        +
          <id,2>
          *
            <id,3>
            60
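A recursive-descent parser is one simple way to build such a tree from a token stream. The sketch below assumes a tiny grammar written for this example (stmt -> id '=' expr, expr -> term ('+' term)*, term -> factor ('*' factor)*) and a token format of (name, attribute) tuples; neither is prescribed by the slides.

```python
def parse_assignment(tokens):
    """Build a tuple-based syntax tree for: id '=' expr."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expect(kind):
        if peek() != kind:
            raise SyntaxError(f"expected {kind!r}, found {peek()!r}")
        return advance()

    def factor():
        if peek() in ("id", "number"):
            return advance()              # leaves are the tokens themselves
        raise SyntaxError(f"unexpected token {peek()!r}")

    def term():
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    target = expect("id")
    expect("=")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError("trailing input after statement")
    return tree


tokens = [("id", 1), ("=", None), ("id", 2), ("+", None),
          ("id", 3), ("*", None), ("number", 60)]
print(parse_assignment(tokens))
```

Because `term` is called from `expr`, multiplication binds tighter than addition, which is why `*` ends up below `+` in the tree.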

23 The Type Checker (Semantic Analysis) Checking the static semantics of each AST node –If the construct is semantically correct, the type checker decorates the AST node by adding type information to it –Otherwise, a suitable error message is issued

24 Semantic Analysis [Aho, Lam, Sethi, Ullman]
Type checking
Type conversions or coercions
E.g. (children indented under their operator)
      =
        <id,1>
        +
          <id,2>
          *
            <id,3>
            int2float
              60
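Inserting a coercion amounts to a bottom-up walk that computes a type for each node and wraps an integer operand in `int2float` when it meets a floating-point one. This is a sketch under assumptions: the three identifiers are taken to be declared as floating point (`SYMBOL_TYPES` is a hypothetical stand-in for the symbol table), and only the types `int` and `float` exist.

```python
# Assumed declarations: symbol-table entries 1..3 are floating point.
SYMBOL_TYPES = {1: "float", 2: "float", 3: "float"}


def check(node):
    """Return (decorated_node, type), inserting int2float where needed."""
    kind = node[0]
    if kind == "id":
        return node, SYMBOL_TYPES[node[1]]
    if kind == "number":
        return node, "int"
    op, left, right = node
    left, lt = check(left)
    right, rt = check(right)
    if op == "=":
        if lt == "float" and rt == "int":
            right = ("int2float", right)      # widen the assigned value
        elif lt != rt:
            raise TypeError(f"cannot assign {rt} to {lt}")
        return (op, left, right), lt
    if lt != rt:                              # one int operand, one float
        if lt == "int":
            left = ("int2float", left)
        else:
            right = ("int2float", right)
        return (op, left, right), "float"
    return (op, left, right), lt


tree = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("number", 60))))
typed_tree, result_type = check(tree)
print(typed_tree)
```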

25 Translator (Program Synthesis) Translating AST nodes into Intermediate Representation (IR) code –E.g. while loops -> two subtrees: expression, body It’s largely dictated by the semantics of the source language In simple, nonoptimizing compilers, the translator may generate target code directly More elaborate compilers such as GCC may first generate a high-level IR and then translate it into a low-level IR

26 Intermediate Code Generation [Aho, Lam, Sethi, Ullman]
Generating a low-level intermediate representation
  –It should be easy to produce
  –It should be easy to translate into the target machine
  –E.g. three-address code (in Chap. 6)
      t1 = int2float(60)
      t2 = id3 * t1
      t3 = id2 + t2
      id1 = t3
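Three-address code of this shape can be produced by a post-order walk of the typed syntax tree, allocating a fresh temporary for each operator node. The sketch below assumes the tuple-based tree format used for illustration here; `gen_tac` and its helpers are hypothetical names.

```python
def gen_tac(tree):
    """Translate an assignment tree into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def walk(node):
        kind = node[0]
        if kind == "id":
            return f"id{node[1]}"
        if kind == "number":
            return str(node[1])
        if kind == "int2float":
            arg = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} = int2float({arg})")
            return temp
        op, left, right = node
        lhs = walk(left)                  # operands first (post-order)
        rhs = walk(right)
        temp = new_temp()
        code.append(f"{temp} = {lhs} {op} {rhs}")
        return temp

    _, target, value = tree
    dest = walk(target)
    result = walk(value)
    code.append(f"{dest} = {result}")
    return code


tree = ("=", ("id", 1),
        ("+", ("id", 2), ("*", ("id", 3), ("int2float", ("number", 60)))))
for line in gen_tac(tree):
    print(line)
```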

27 Symbol Tables A mechanism that allows information to be associated with identifiers and shared among compiler phases –Identifier declaration –Identifier use –Type checking

28 Symbol Table Management [Aho, Lam, Sethi, Ullman]
To record the variable names and collect information about various attributes of each name
  –Storage, type, scope
  –Number and types of arguments, method of argument passing, and the type returned
E.g.
      Name       Type
      position   …
      initial    …
      rate       …
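A symbol table is often organized as a stack of dictionaries, one per open scope. The following sketch assumes that design; the class `SymbolTable`, its method names, and the `float` types given to the three names are illustrative choices, since the slide leaves the Type column elided.

```python
class SymbolTable:
    """Maps names to attribute records, with nested scopes."""

    def __init__(self):
        self.scopes = [{}]                 # innermost scope is last

    def open_scope(self):
        self.scopes.append({})

    def close_scope(self):
        self.scopes.pop()

    def declare(self, name, **attrs):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name} is already declared in this scope")
        scope[name] = attrs

    def lookup(self, name):
        for scope in reversed(self.scopes):  # innermost declaration wins
            if name in scope:
                return scope[name]
        raise KeyError(f"{name} is not declared")


table = SymbolTable()
for name in ("position", "initial", "rate"):
    table.declare(name, type="float")      # assumed type, for illustration
print(table.lookup("rate"))
```

The scanner, parser, and type checker can all share one such object: declarations call `declare`, uses call `lookup`, and the type checker reads the recorded attributes.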

29 The Optimizer
Analyzing and transforming the IR code generated by the translator into functionally equivalent but improved code
  –Complex
  –Optimizations may be performed in stages
Optimization can also be done after code generation
  –E.g. peephole optimization: a few instructions at a time
    –Multiplications by 1
    –Additions of 0
    –Loading a value into a register when it is already in another register
    –Replacing a sequence of instructions by a single instruction with the same effect
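The first two peephole cases listed above can be sketched as a pass over a list of instructions. This is a simplified illustration: it works on hypothetical `(dest, op, arg1, arg2)` tuples rather than real machine instructions, and its window is a single instruction, whereas a practical peephole optimizer also examines short sequences.

```python
def peephole(code):
    """Rewrite x = y * 1 and x = y + 0 into plain copies."""
    out = []
    for dest, op, a, b in code:
        if op == "*" and b == "1":
            out.append((dest, "copy", a, None))   # y * 1  ->  y
        elif op == "*" and a == "1":
            out.append((dest, "copy", b, None))   # 1 * y  ->  y
        elif op == "+" and b == "0":
            out.append((dest, "copy", a, None))   # y + 0  ->  y
        elif op == "+" and a == "0":
            out.append((dest, "copy", b, None))   # 0 + y  ->  y
        else:
            out.append((dest, op, a, b))          # leave everything else alone
    return out


code = [("t1", "*", "id3", "1"), ("t2", "+", "t1", "0"), ("t3", "+", "id2", "t2")]
for instr in peephole(code):
    print(instr)
```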

30 Code Optimization [Aho, Lam, Sethi, Ullman]
Attempts to improve the intermediate code
  –Better: faster, shorter code, or code that consumes less power
  –E.g.
      t1 = id3 * 60.0
      id1 = id2 + t1
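One way to arrive at the two-instruction result above is to fold the conversion of the constant at compile time, substitute the folded value, and merge the final copy into the instruction that computed it. The sketch below is an assumption-laden illustration, not the book's algorithm: instructions are hypothetical `(dest, op, arg1, arg2)` tuples, names starting with `t` are temporaries, and each temporary is assumed to be used only once, which is what makes the copy merge safe.

```python
def optimize(code):
    """Fold int2float on constants, merge a trailing copy, renumber temps."""
    consts = {}                      # temp -> folded constant text
    out = []
    for dest, op, a, b in code:
        a = consts.get(a, a)         # substitute known constants
        b = consts.get(b, b)
        if op == "int2float" and a.isdigit():
            consts[dest] = str(float(a))      # e.g. "60" -> "60.0"
            continue
        if op == "copy" and out and out[-1][0] == a and a.startswith("t"):
            _, prev_op, pa, pb = out.pop()    # compute straight into dest
            out.append((dest, prev_op, pa, pb))
            continue
        out.append((dest, op, a, b))

    rename = {}                      # old temp name -> compact new name
    result = []
    for dest, op, a, b in out:
        a = rename.get(a, a)
        b = rename.get(b, b)
        if dest.startswith("t"):
            rename[dest] = f"t{len(rename) + 1}"
            dest = rename[dest]
        result.append((dest, op, a, b))
    return result


code = [("t1", "int2float", "60", None),
        ("t2", "*", "id3", "t1"),
        ("t3", "+", "id2", "t2"),
        ("id1", "copy", "t3", None)]
for dest, op, a, b in optimize(code):
    print(f"{dest} = {a} {op} {b}")
```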

31 The Code Generator
Mapping the IR code generated by the translator into target machine code
  –Machine-dependent, complex
    –Register allocation
    –Code scheduling
Automatic construction of code generators has been actively studied
  –Matching a low-level IR to target-instruction templates
  –This makes it easy to retarget a compiler to a new target machine
    –E.g. GCC

32 Code Generation [Aho, Lam, Sethi, Ullman]
Mapping intermediate representation of the source program into the target language
  –Machine code: register/memory location assignments
  –E.g.
      LDF  R2, id3
      MULF R2, R2, #60.0
      LDF  R1, id2
      ADDF R1, R1, R2
      STF  id1, R1
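A naive generator for this instruction style loads the first operand into a free register, applies the operation in place, and stores the result when the destination is a variable. The sketch below is only an illustration: the two-register pool, the `(dest, op, arg1, arg2)` tuple format, and the assumption that the first operand is always a variable or temporary are all choices made here, and registers are never freed or spilled.

```python
OPCODES = {"*": "MULF", "+": "ADDF"}


def gen_target(code):
    """Map three-address tuples to LDF/MULF/ADDF/STF instructions."""
    free = ["R1", "R2"]      # tiny assumed register pool; pop() yields R2 first
    where = {}               # temporary -> register currently holding it
    asm = []

    def load(name):
        reg = free.pop()     # raises IndexError if the pool is exhausted
        asm.append(f"LDF {reg}, {name}")
        return reg

    for dest, op, a, b in code:
        reg = where[a] if a in where else load(a)
        if b in where:
            src = where[b]           # second operand already in a register
        elif b.startswith("id"):
            src = load(b)            # variable in memory
        else:
            src = f"#{b}"            # immediate constant
        asm.append(f"{OPCODES[op]} {reg}, {reg}, {src}")
        if dest.startswith("id"):
            asm.append(f"STF {dest}, {reg}")
        else:
            where[dest] = reg
    return asm


code = [("t1", "*", "id3", "60.0"), ("id1", "+", "id2", "t1")]
for line in gen_target(code):
    print(line)
```

Register numbering is arbitrary; the pool order here was chosen only so that the output lines up with the slide. A production code generator would add real register allocation and instruction scheduling, as slide 31 notes.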

33 Phases of a Compiler [Aho, Lam, Sethi, Ullman]
[Figure: compiler phases as a pipeline; the Symbol Table is shared by all phases]
  character stream
  -> Lexical Analyzer -> token stream
  -> Syntax Analyzer -> syntax tree
  -> Semantic Analyzer -> syntax tree
  -> Intermediate Code Generator -> intermediate representation
  -> Machine-Independent Code Optimization (optional) -> intermediate representation
  -> Code Generator -> target machine code
  -> Machine-Dependent Code Optimization (optional) -> target machine code

34 Compiler Writing Tools
Compiler generators (compiler compilers)
  –Scanner generator
  –Parser generator
  –Symbol table manager
  –Attribute grammar evaluator
  –Code-generation tools
Much of the effort in crafting a compiler lies in writing and debugging the semantic phases
  –Usually hand-coded

35 Programming Language and Compiler Design
Many compiler techniques arise from the need to cope with some programming language construct
The state of the art in compiler design also strongly affects programming language design
The advantages of a programming language that is easy to compile:
  –Easier to learn, read, and understand
  –Quality compilers available on a wide variety of machines
  –Better code will be generated
  –Fewer compiler bugs
  –The compiler will be smaller, cheaper, faster, more reliable, and more widely used
  –Better diagnostic messages and program development tools

36 Computer Architecture and Compiler Design
Compiler designers are responsible for making computing capability available to programmers
Problems
  –Instruction sets for some popular architectures are highly nonuniform
  –High-level programming language operations are not always easy to support
  –Essential architectural features such as hardware caches and distributed processors and memory are difficult to present to programmers in an architecturally independent manner
  –Effective use of a large number of processors has always posed challenges to application developers and compiler writers
  –For some programming languages, runtime checks for data and program integrity are dropped in favor of gains in execution speed

37 Compiler Design Considerations
Debugging (development) compilers
  –Detailing programmer errors
  –E.g. CodeCenter
  –They can often tolerate or repair minor errors (e.g. inserting a missing comma or parenthesis)
Optimizing compilers (Chap. 13 & 14)
  –Producing efficient target code at the cost of increased compiler complexity and increased compilation times
  –Optimal code, even when theoretically possible, is often infeasible in practice
  –A variety of transformations might interfere with each other
Retargetable compilers (Chap. 11 & 13)
  –Target architecture can be changed without the machine-independent components having to be rewritten
  –More difficult to write, but development costs can be shared

38 Integrated Development Environments
To integrate the program development cycle into a single framework
  –Editing, compilation, testing, debugging
Immediate feedback on syntax and semantic problems
Focus on the source program
Providing easy access to information about the program
Many of the techniques in batch compilation can be reformulated into incremental form to support IDEs
  –Parser, type checker, …
In this book, we concentrate on the translation of C, C++, Java

39 End of Chapter 1 Any Questions or Comments?

