
CS 404 Introduction to Compiler Design, Lecture 9: Semantic Analysis. Ahmed Ezzat.


1 CS 404 Introduction to Compiler Design. Lecture 9: Semantic Analysis. Ahmed Ezzat.

2 AST: Abstract Syntax Tree

3 Semantic Analysis
- Semantic analysis computes additional information related to the meaning of the program once the syntactic structure is known, i.e., it tries to understand the “meaning” of the program.
- It tries to verify that a program “makes sense.”
- After parsing (and constructing the AST) → next come semantic analysis and intermediate code generation.
- In typed languages such as C, semantic analysis involves adding information to the symbol table and performing type checking.

4 Semantic Analysis
- The information to be computed is beyond the capabilities of standard parsing techniques; therefore it is not regarded as syntax.
- As with lexical and syntax analysis, semantic analysis needs both a representation formalism and an implementation mechanism.
- As the representation formalism, this lecture presents Syntax Directed Translations.

5 Semantic Analysis Goals
- The compiler must do more than recognize whether a sentence belongs to the language.
- Find remaining errors that would make the program invalid
  - undefined variables, types
  - type errors that can be caught statically
- Figure out useful information for later phases
  - types of all expressions
  - data layout
- Terminology
  - Static checks: done by the compiler
  - Dynamic checks: done at run time

6 Semantic Analysis: Kinds of Checks
- Uniqueness checks
  - Certain names must be unique
  - Many languages require variable declarations
- Flow-of-control checks
  - Match control-flow operators with structures
  - Example: break applies to the innermost loop/switch
- Type checks
  - Check compatibility of operators and operands
- Logical checks
  - Program is syntactically and semantically correct, but does not do the “correct” thing

7 Semantic Analysis: Examples of Reported Errors
- Undeclared identifier
- Multiply declared identifier
- Index out of bounds
- Wrong number or types of arguments to a call
- Incompatible types for an operation
- Break statement outside switch/loop
- Goto with no label

8 Semantic Analysis: Examples of Reported Errors
How do these checks help compilers?
- Allocate the right amount of space for variables
- Select the right machine operations
- Proper implementation of control structures
Try compiling this code:
    #include <stdio.h>

    int main(void) {
        int i = 21, j = 42;
        printf("Hello World\n");
        printf("Hello World, N=%d\n");      /* %d has no matching argument */
        printf("Hello World\n", i, j);      /* extra arguments, no format specifiers */
        printf("Hello World, I=%d, J=%d\n", i, j);
        return 0;
    }

9 Semantic Analysis: Typical Semantic Errors
- Multiple declarations: a variable should be declared (in the same scope) at most once
- Undeclared variable: a variable should not be used before being declared
- Type mismatch: the type of the LHS of an assignment should match the type of the RHS
- Wrong arguments: methods should be called with the right number and types of arguments

10 Semantic Analysis: Type Checking and Code Generation
You can type check and generate code as part of the semantic actions, but:
- the result is difficult to read and maintain
- the compiler must analyze the program in the same order in which it is parsed
Instead, we split these two tasks (traversing the AST created by the parser):
1. For each scope in the program
  - process the declarations: add new entries to the symbol table and report any variables that are multiply declared
  - process the statements: find uses of undeclared variables, and update the "ID" nodes of the AST to point to the appropriate symbol-table entry
2. Process all of the statements in the program again
  - use the symbol-table information to determine the type of each expression, and to find type errors
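
The two passes above can be sketched in a few lines. This is only an illustrative sketch, assuming a toy AST made of dictionaries; the node layout and the names `name_analysis`, `lookup`, `type_of`, and `type_check` are hypothetical and do not come from the lecture.

```python
def lookup(scopes, name):
    """Search the scope stack from innermost to outermost."""
    for scope in reversed(scopes):
        if name in scope:
            return scope[name]
    return None

def name_analysis(stmts, scopes, errors):
    """Pass 1: process declarations and link each ID node to its symbol-table entry."""
    for s in stmts:
        if s["kind"] == "decl":
            if s["name"] in scopes[-1]:
                errors.append(f"multiply declared: {s['name']}")
            else:
                scopes[-1][s["name"]] = {"name": s["name"], "type": s["type"]}
        elif s["kind"] == "assign":
            for node in (s["target"], s["value"]):
                if node["kind"] == "id":
                    node["entry"] = lookup(scopes, node["name"])
                    if node["entry"] is None:
                        errors.append(f"undeclared: {node['name']}")
        elif s["kind"] == "block":
            scopes.append({})                  # a nested block opens a new scope
            name_analysis(s["body"], scopes, errors)
            scopes.pop()

def type_of(expr):
    """Type of an expression, or None if it was already reported as undeclared."""
    if expr["kind"] == "num":
        return "int"
    if expr["kind"] == "str":
        return "string"
    entry = expr.get("entry")
    return entry["type"] if entry else None

def type_check(stmts, errors):
    """Pass 2: use the linked symbol-table entries to find type errors."""
    for s in stmts:
        if s["kind"] == "assign":
            lhs, rhs = type_of(s["target"]), type_of(s["value"])
            if lhs and rhs and lhs != rhs:
                errors.append(f"type mismatch: {lhs} = {rhs}")
        elif s["kind"] == "block":
            type_check(s["body"], errors)

def ident(name):
    return {"kind": "id", "name": name}

program = [
    {"kind": "decl", "type": "int", "name": "i"},
    {"kind": "decl", "type": "int", "name": "i"},   # second declaration of i
    {"kind": "assign", "target": ident("i"), "value": {"kind": "num", "value": 100}},
    {"kind": "assign", "target": ident("i"), "value": {"kind": "str", "value": "abc"}},
    {"kind": "assign", "target": ident("k"), "value": {"kind": "num", "value": 1}},
]
errors = []
name_analysis(program, [{}], errors)
type_check(program, errors)
print(errors)
```

Because pass 1 stores the entry on the ID node itself, pass 2 never has to repeat a scope lookup; it only reads the types that are already attached.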

11 Semantic Analysis: Examples I
- Check uniqueness of name declarations
  - E.g., int f=2; int f() {return 1;} is not allowed in Pascal, but may be allowed in other languages
- Type checking
  - E.g., int i;
    i = 100;      okay
    i = “abc”;    not okay

12 Semantic Analysis: Examples II
- Expression well-formed
  - 7 + 3.14 → okay in C, not okay in some stricter languages
  - 6 + “abc” → not a string concatenation in C (it compiles only as pointer arithmetic), okay as concatenation in some other languages
- Same name must appear two or more times
  - Example: defining a block:
    Begin B
    End B

13 Semantic Analysis: Examples III
- Function calls
  - Types of arguments match the definition
  - Number of arguments matches the definition
  - Return type matches the definition
- Flow-of-control checks
  - “break” only within a loop
  - “return” only within a function
  - “default” or “error” statement for “switch” in C
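
A flow-of-control check of this kind can be written as a tree walk that carries its context downward. The following is a minimal sketch, assuming a toy AST of tuples; the node kinds and the function name `check_flow` are made up for illustration.

```python
def check_flow(node, in_loop=False, in_func=False, errors=None):
    """Report break/return statements that appear outside a legal context."""
    errors = [] if errors is None else errors
    kind = node[0]
    if kind == "break" and not in_loop:
        errors.append("break outside loop/switch")
    elif kind == "return" and not in_func:
        errors.append("return outside function")
    elif kind in ("while", "switch"):
        for child in node[1]:
            check_flow(child, True, in_func, errors)
    elif kind == "func":
        # A function body starts a fresh context: an enclosing loop does not count.
        for child in node[1]:
            check_flow(child, False, True, errors)
    elif kind == "block":
        for child in node[1]:
            check_flow(child, in_loop, in_func, errors)
    return errors

tree = ("block", [
    ("break",),                                   # illegal: not inside a loop
    ("func", [("while", [("break",)]), ("return",)]),
    ("return",),                                  # illegal: not inside a function
])
flow_errors = check_flow(tree)
print(flow_errors)
```

The two boolean parameters behave like inherited attributes: each node receives them from its parent rather than computing them from its children.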

14 Static vs. Dynamic Checking
- Semantic checking at compile time is called “static checking”
- Semantic checking at execution time is called “dynamic checking”
- The more checking is done at compile time, the fewer opportunities there are for errors at run time

15 Semantic Specifications
- Checking against what? Each language has a specification (spec)
- The spec gives the expected types (or other values) of a construct in its context:
  - The C spec used to be “loose”, so it left room for different implementations
  - The Java spec is very thorough

16 How to Check?
- Together with parsing
  - Syntax Directed Translation (SDT) is one technique for implementing semantic analysis
- A separate pass
  - Between parsing and code generation
  - Syntax tree → checker → syntax tree

17 Type Checking
In most programming languages there are basic types and “constructed/user-defined types,” e.g., struct in C.
- Basic: atomic types
  - E.g., boolean, integer, string
- Constructed: built from basic types
  - E.g., arrays, records, sets
- Not always clear-cut: pointers can be of either kind, depending on how they are implemented

18 Type Expressions
- A basic type is a type expression
- A type name is a type expression
- Construct type expressions from other type expressions
  - Arrays: array(I, T), contains I elements of type T
  - Products: Cartesian product T1 × T2
  - Records or structs: record(t), struct(t)
  - Pointers: pointer(t)

19 Common Type Expressions (cont.)
- Function types: TD → TR
  - Domain is the set of possible input values of the function. The type of the domain is TD.
  - Range is the set of possible output values of the function. The type of the range is TR.
  - Example: Z = X mod Y → the type of mod is (int × int) → int
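
Type expressions map naturally onto small immutable records. The sketch below is one possible encoding, not the lecture's; the class names and the helper `apply` are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Basic:
    name: str

@dataclass(frozen=True)
class Array:
    size: int
    elem: object

@dataclass(frozen=True)
class Product:
    left: object
    right: object

@dataclass(frozen=True)
class Pointer:
    target: object

@dataclass(frozen=True)
class Function:
    domain: object   # TD
    result: object   # TR

INT = Basic("int")
MOD = Function(Product(INT, INT), INT)   # mod : (int x int) -> int

def apply(fn_type, arg_type):
    """Type of applying fn_type to arg_type, or None on a type error."""
    if isinstance(fn_type, Function) and fn_type.domain == arg_type:
        return fn_type.result
    return None

print(apply(MOD, Product(INT, INT)))   # well-typed call
print(apply(MOD, INT))                 # wrong argument type
```

Frozen dataclasses compare field by field, so two separately built copies of the same type expression are equal without any extra code.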

20 Type System
- A type expression can be represented using a graph
- A type system is a collection of rules for assigning type expressions
- Type checking: verify that operands have the correct types

21 Recovery from Type Errors
- Report errors
- Type coercion/casting: implicit change of types, e.g., int i, j; float f; i = j + f;
  - In C, j is converted to float, the sum is computed as a float, and the result is “coerced” to int by truncation when it is assigned to i; no error is generated
  - In another language this may generate an error
  - Suggestion: use an explicit conversion, i = (int) f + j; (note that this truncates f before the addition; i = (int)(j + f); keeps the original meaning)
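
Where the conversion is placed can change the result. The worked example below models C's truncating float-to-int conversion with Python's `math.trunc`; the values of `j` and `f` are chosen only for illustration.

```python
import math

j, f = 1, -0.5

# Models C's  i = j + f;  (j is widened, the sum is truncated on assignment)
implicit = math.trunc(float(j) + f)

# Models C's  i = (int) f + j;  (f is truncated first, then added)
explicit = math.trunc(f) + j

print(implicit, explicit)
```

Here the sum 0.5 truncates to 0, whereas truncating -0.5 first gives 0 and then adding 1 gives 1, so the two forms are not interchangeable.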

22 A Simple Type Checker
- Uses the syntax directed translation (SDT) method we talked about
- Types are defined as synthesized attributes
- Handles arrays, pointers, statements, functions, plus the basic types char and int
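
A checker in this style computes each expression's type from the types of its subexpressions. The following is a minimal sketch covering only expressions, assuming a toy tuple AST and a deliberately strict rule that `add` accepts two ints; the name `synth` and the encoding of types are hypothetical.

```python
def synth(e, env):
    """Return the synthesized type of expression e, or 'type_error'."""
    op = e[0]
    if op == "num":
        return "int"
    if op == "char":
        return "char"
    if op == "id":
        return env.get(e[1], "type_error")
    if op == "add":
        ok = synth(e[1], env) == "int" and synth(e[2], env) == "int"
        return "int" if ok else "type_error"
    if op == "index":                      # a[i]
        arr, idx = synth(e[1], env), synth(e[2], env)
        if isinstance(arr, tuple) and arr[0] == "array" and idx == "int":
            return arr[2]
        return "type_error"
    if op == "deref":                      # *p
        t = synth(e[1], env)
        if isinstance(t, tuple) and t[0] == "pointer":
            return t[1]
        return "type_error"
    return "type_error"

env = {"a": ("array", 10, "int"), "p": ("pointer", "char"), "i": "int"}
print(synth(("add", ("index", ("id", "a"), ("id", "i")), ("num", 1)), env))
print(synth(("deref", ("id", "p")), env))
print(synth(("add", ("deref", ("id", "p")), ("num", 1)), env))
```

Every case looks only at the results for the node's children, which is what makes the type a synthesized attribute.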

23 Equivalence of Types
- Structural equivalence: same base type, same constructions (maybe a different name)
- Name equivalence:
  - Same type name → equivalent
  - No type name → equivalent only if declared together
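
The difference between the two notions shows up as soon as two names are given the same definition. This sketch assumes non-recursive type definitions and simplifies name equivalence to "same name" (it does not model unnamed types declared together); `typedefs` and the function names are invented for illustration.

```python
typedefs = {
    "Celsius": "float",
    "Fahrenheit": "float",
    "Vec": ("array", 3, "float"),
}

def expand(t):
    """Replace type names by their definitions, recursively."""
    if isinstance(t, str):
        return expand(typedefs[t]) if t in typedefs else t
    # Constructed type: keep the constructor and integer sizes, expand the rest.
    return (t[0],) + tuple(x if isinstance(x, int) else expand(x) for x in t[1:])

def structurally_equal(a, b):
    return expand(a) == expand(b)

def name_equal(a, b):
    return isinstance(a, str) and a == b

print(structurally_equal("Celsius", "Fahrenheit"))   # same structure
print(name_equal("Celsius", "Fahrenheit"))           # different names
```

Under structural equivalence a Celsius value could be assigned to a Fahrenheit variable; under name equivalence the same assignment would be rejected.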

24 Overloading and Polymorphism
- Overloading: a symbol has different meanings depending on its context
- Polymorphism: a function whose arguments may have different types in different executions
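
One way to picture the contrast: an overloaded symbol selects among several definitions using the operand types, while a polymorphic function has a single definition that works for many types. The table `OVERLOADS` and the functions `resolve` and `first` below are hypothetical examples, not taken from the lecture.

```python
# Overloading: "+" names several operations; the operand types pick one.
OVERLOADS = {
    "+": {
        ("int", "int"): "int",
        ("float", "float"): "float",
        ("string", "string"): "string",
    },
}

def resolve(op, arg_types):
    """Return the result type of the matching overload, or None if none matches."""
    return OVERLOADS.get(op, {}).get(tuple(arg_types))

# Polymorphism: one definition, usable with lists of any element type.
def first(xs):
    return xs[0]

print(resolve("+", ["int", "int"]))
print(resolve("+", ["int", "string"]))
print(first([1, 2, 3]), first(["a", "b"]))
```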

25 Attribute Grammars
- Both semantic analysis and intermediate code generation can be described in terms of annotation, or “decoration,” of a parse or syntax tree
- Attribute grammars provide a formal framework for decorating such a tree
- Attribute grammars are associated with action routines
- Let us start with the decoration of a parse tree, then consider the syntax tree

26 Syntax-Directed Translation (SDT)
- Syntax Directed Translation (SDT)
- Syntax Directed Definition (SDD)
- Implementing SDD:
  - Dependency Graph
  - S-Attributed Definition
  - L-Attributed Definition
- Translation Schemes

27 Syntax-Directed Translation (SDT): Overview
- The Principle of Syntax Directed Translation states that the meaning of an input sentence is related to its syntactic structure, i.e., to its parse tree.
- By Syntax Directed Translations we mean those formalisms for specifying translations of programming language constructs that are guided by context-free grammars.
  - We associate attributes with the grammar symbols representing the language constructs.
  - Values for attributes are computed by semantic rules associated with grammar productions.

28 Syntax-Directed Translation (SDT): Overview
- Evaluation of semantic rules may:
  - generate code;
  - insert information into the symbol table;
  - perform semantic checks;
  - issue error messages;
  - etc.
- There are two notations for attaching semantic rules:
  1. Syntax Directed Definitions. High-level specifications hiding many implementation details (also called Attribute Grammars).
  2. Translation Schemes. More implementation oriented: they indicate the order in which semantic rules are to be evaluated.

29 Syntax Directed Definition (SDD): Overview
- Syntax Directed Definitions are a generalization of context-free grammars in which:
  1. Grammar symbols have an associated set of attributes;
  2. Productions are associated with semantic rules for computing the values of attributes.
- Such a formalism generates annotated parse trees, where each node of the tree is a record with a field for each attribute (e.g., X.a indicates the attribute a of the grammar symbol X).

30 Syntax Directed Definition (SDD): Overview
- The value of an attribute of a grammar symbol at a given parse-tree node is defined by a semantic rule associated with the production used at that node.
- We distinguish between two kinds of attributes:
  1. Synthesized attributes. They are computed from the values of the attributes of the children nodes.
  2. Inherited attributes. They are computed from the values of the attributes of both the siblings and the parent nodes.

31 Syntax Directed Definition (SDD): Overview
- The process of evaluating attributes is called annotation, or DECORATION, of the parse tree
  - When a parse tree under this grammar is fully decorated, the value of the expression will be in the val attribute of the root
- The code fragments for the rules are called SEMANTIC FUNCTIONS
  - Strictly speaking, they should be cast as functions, e.g., E1.val = sum(E2.val, T.val)

32 Syntax Directed Definition (SDD): Overview - Example
- Let us consider the grammar for arithmetic expressions.
- The Syntax Directed Definition associates with each nonterminal a synthesized attribute called val.

33 Implementing Syntax Directed Definitions
- Dependency Graphs
- S-Attributed Definitions
- L-Attributed Definitions

34 Implementing Syntax Directed Definitions: Dependency Graphs
- Implementing a Syntax Directed Definition consists primarily of finding an order for the evaluation of attributes
  - Each attribute value must be available when a computation that uses it is performed.
- Dependency graphs are the most general technique for evaluating syntax directed definitions with both synthesized and inherited attributes.
- A dependency graph shows the interdependencies among the attributes of the various nodes of a parse tree.
  - There is a node for each attribute;
  - If attribute b depends on an attribute c, there is a link from the node for c to the node for b (b ← c).
- Dependency rule: if an attribute b depends on an attribute c, then we need to fire the semantic rule for c first and then the semantic rule for b.
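
Finding a valid evaluation order amounts to topologically sorting the dependency graph. Here is a minimal sketch using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the attribute names in `depends_on` are invented for a small expression tree and are only an assumption.

```python
from graphlib import TopologicalSorter

# Each attribute b maps to the attributes c it depends on (b <- c).
depends_on = {
    "E.val": ["E1.val", "T.val"],
    "E1.val": ["T1.val"],
    "T1.val": ["F1.val"],
    "T.val": ["F.val"],
    "F1.val": ["digit1.lexval"],
    "F.val": ["digit2.lexval"],
}

# static_order() yields every attribute after all the attributes it depends on.
order = list(TopologicalSorter(depends_on).static_order())
print(order)
```

If the graph contains a cycle, `static_order()` raises `graphlib.CycleError`, which corresponds to a definition whose attributes cannot be evaluated in any order.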

35 Implementing Syntax Directed Definitions: Synthesized-Attributes
- Evaluation order: semantic rules in an S-Attributed Definition can be evaluated by a bottom-up, or PostOrder, traversal of the parse tree.
- Example: the above arithmetic grammar is an example of an S-Attributed Definition. The annotated parse tree for the input 3*5+4n is:
  [figure: annotated parse tree for 3*5+4n]
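
The post-order evaluation can be tried on the same input. The grammar itself is not reproduced in this transcript, so the sketch assumes the usual textbook expression grammar (L → E n, E → E1 + T | T, T → T1 * F | F, F → ( E ) | digit); the `Node` class and the `decorate` function are hypothetical.

```python
class Node:
    def __init__(self, symbol, children=(), lexval=None):
        self.symbol = symbol
        self.children = list(children)
        self.lexval = lexval      # set by the scanner for digit tokens
        self.val = None           # synthesized attribute

def decorate(node):
    """Post-order traversal: decorate the children, then apply this node's rule."""
    for child in node.children:
        decorate(child)
    kids = node.children
    symbols = [k.symbol for k in kids]
    if node.symbol == "digit":
        node.val = node.lexval
    elif symbols == ["E", "+", "T"]:
        node.val = kids[0].val + kids[2].val
    elif symbols == ["T", "*", "F"]:
        node.val = kids[0].val * kids[2].val
    elif symbols == ["(", "E", ")"]:
        node.val = kids[1].val
    elif len(kids) == 1 or symbols == ["E", "n"]:
        node.val = kids[0].val    # unit productions and L -> E n copy the value
    return node.val

def digit(v):
    return Node("F", [Node("digit", lexval=v)])

# Parse tree for 3*5+4n, built by hand.
t_3_times_5 = Node("T", [Node("T", [digit(3)]), Node("*"), digit(5)])
e = Node("E", [Node("E", [t_3_times_5]), Node("+"), Node("T", [digit(4)])])
root = Node("L", [e, Node("n")])

print(decorate(root))
```

Since every rule reads only the `val` of children, a single bottom-up pass is enough and no dependency graph has to be built.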

36 Implementing Syntax Directed Definitions: Synthesized-Attributes
- The above attributes are called synthesized attributes: they are calculated only from the attributes of things below them in the parse tree
- Another attribute type is called an inherited attribute:
  - Inherited attributes may depend on things above or to the side of them in the parse tree
  - Tokens have only synthesized attributes, initialized by the scanner (name of an identifier, value of a constant, etc.)
  - Inherited attributes of the start symbol constitute run-time parameters of the compiler

37 Implementing Syntax Directed Definitions: Inherited-Attributes
- Inherited attributes are useful for expressing the dependence of a construct on the context in which it appears.
- Note: it is always possible to rewrite a syntax directed definition to use only synthesized attributes, but it is often more natural to use both synthesized and inherited attributes.
- Evaluation order: inherited attributes cannot be evaluated by a simple PreOrder traversal of the parse tree:
  - Unlike synthesized attributes, the order in which the inherited attributes of the children are computed is important
  - Indeed, inherited attributes of the children can depend on both left and right siblings!

38 Implementing Syntax-Directed Definition: Inherited + Synthesized-Attributes Example
- The nonterminal T has a synthesized attribute, type, determined by the token int/real in the corresponding production.
- The production D → T L is associated with the semantic rule L.in := T.type, which sets the inherited attribute L.in.
- Note: in the production L → L1, id the subscript distinguishes the two occurrences of L.

39 Implementing Syntax-Directed Definition: Attributes
- Synthesized attributes can be evaluated by a PostOrder traversal.
- Inherited attributes that do not depend on right children can be evaluated by a PreOrder traversal.
- The annotated parse tree for the input real id1, id2, id3 is:
  [figure: annotated parse tree for real id1, id2, id3]
- L.in is then inherited top-down through the tree by the other L-nodes.
- At each L-node the procedure addtype inserts the type of the identifier into the symbol table.
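
The top-down flow of L.in can be mimicked by passing it as a function argument. This is a minimal sketch, assuming the parse tree for real id1, id2, id3 is given as nested dictionaries; the tree encoding and the names `eval_D`, `eval_L`, and `symtab` are made up for illustration, while `addtype` follows the slide.

```python
symtab = {}

def addtype(name, typ):
    """Record the type of an identifier in the symbol table."""
    symtab[name] = typ

def eval_L(l, inherited_type):
    """L -> L1 , id | id : pass L.in down, then record the id's type."""
    if "L1" in l:
        eval_L(l["L1"], inherited_type)   # L1.in := L.in
    addtype(l["id"], inherited_type)

def eval_D(d):
    """D -> T L : T.type is synthesized, L.in is inherited from it."""
    t_type = d["T"]
    eval_L(d["L"], t_type)                # L.in := T.type

# Parse tree for: real id1, id2, id3 (the list is left-recursive).
tree = {"T": "real",
        "L": {"L1": {"L1": {"id": "id1"}, "id": "id2"}, "id": "id3"}}
eval_D(tree)
print(symtab)
```

The function parameter plays the role of the inherited attribute: it is fixed by the caller before the callee starts, which is exactly the order a PreOrder traversal provides.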

40 Implementing Syntax-Directed Definition: Summary
- S-Attributed Grammar (uses only synthesized attributes) is purely bottom-up: SLR(1) but not LL(1)
- LL(1) Grammar requires inherited attributes
- L-Attributed (L stands for Left) Grammar contains both synthesized and inherited attributes, but does not need a dependency graph to evaluate them

41 Translation Schemes
- Translation Schemes are more implementation oriented than syntax directed definitions, since they indicate the order in which semantic rules and attributes are to be evaluated.
- Definition: a Translation Scheme is a context-free grammar in which
  - attributes are associated with grammar symbols;
  - semantic actions are enclosed between braces { } and are inserted within the right-hand side of productions.
- Note: Yacc uses Translation Schemes.

42 Translation Schemes
- Translation Schemes deal with both synthesized and inherited attributes.
- Semantic actions are treated as terminal symbols: annotated parse trees contain semantic actions as children of the node standing for the corresponding production.
- Translation Schemes are useful for evaluating L-Attributed definitions at parsing time (even if they are a general mechanism).
  - An L-Attributed Syntax-Directed Definition can be turned into a Translation Scheme.

43 Translation Schemes: Example
Consider the Translation Scheme for the L-Attributed Definition for “type declarations”:
  D → T {L.in := T.type} L
  T → int {T.type := integer}
  T → real {T.type := real}
  L → {L1.in := L.in} L1 , id {addtype(id.entry, L.in)}
  L → id {addtype(id.entry, L.in)}

44 Translation Schemes: Example
- The parse tree with semantic actions for the input real id1, id2, id3 is:
  [figure: parse tree with embedded semantic actions]
- Traversing the parse tree in depth-first order (PostOrder) we can evaluate the attributes.
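
To make the traversal concrete, the sketch below builds that parse tree by hand with the semantic actions stored as extra children and executes each action when the depth-first walk reaches it. It is only an illustration under assumptions: the `Node` class, the helpers `copy_attr` and `addtype_action`, and the encoding of terminals as plain strings are all hypothetical.

```python
symtab = {}

class Node:
    def __init__(self, symbol):
        self.symbol, self.attrs, self.children = symbol, {}, []

def copy_attr(dst, dst_key, src, src_key):
    """Build the action {dst.dst_key := src.src_key}; it runs later, during the walk."""
    def action():
        dst.attrs[dst_key] = src.attrs[src_key]
    return action

def addtype_action(name, l_node):
    """Build the action {addtype(id.entry, L.in)} for the given L node."""
    def action():
        symtab[name] = l_node.attrs["in"]
    return action

def traverse(node):
    """Depth-first, left-to-right walk; actions fire in the position they occupy."""
    for child in node.children:
        if callable(child):
            child()              # embedded semantic action
        elif isinstance(child, Node):
            traverse(child)      # nonterminal
        # plain strings are terminals: nothing to evaluate

D, T, L, L1, L2 = (Node(s) for s in ("D", "T", "L", "L", "L"))

def set_type():
    T.attrs["type"] = "real"     # T -> real {T.type := real}

T.children = ["real", set_type]
L2.children = ["id1", addtype_action("id1", L2)]
L1.children = [copy_attr(L2, "in", L1, "in"), L2, ",", "id2", addtype_action("id2", L1)]
L.children = [copy_attr(L1, "in", L, "in"), L1, ",", "id3", addtype_action("id3", L)]
D.children = [T, copy_attr(L, "in", T, "type"), L]

traverse(D)
print(symtab)
```

If the action that sets L.in were placed after L instead of before it, the walk would reach the `addtype` actions first and fail with a `KeyError`, which shows why the position of an action inside the production matters.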

45 Translation Schemes: Summary
- When designing a Translation Scheme we must be sure that an attribute value is available when a semantic action is executed.
- When the semantic action involves synthesized attributes, the action can be put at the end of the production.
  - Example: the following production and semantic rule
      T → T1 * F        T.val := T1.val * F.val
    yield the translation scheme
      T → T1 * F {T.val := T1.val * F.val}

46 Translation Schemes: Summary
- When the semantic action involves inherited attributes of a grammar symbol, the action must be put before the symbol itself.
  - Example: the following production and semantic rule
      D → T L        L.in := T.type
    yield the translation scheme
      D → T {L.in := T.type} L

47 END

