Semantic Analysis. - 1 - Semantic Analysis v Lexically and syntactically correct programs may still contain other errors v Lexical and syntax analyses.


1 Semantic Analysis

2 Semantic Analysis
• Lexically and syntactically correct programs may still contain other errors
• Lexical and syntax analyses are not powerful enough to ensure the correct usage of variables, objects, functions, ...
• Semantic analysis: ensure that the program satisfies a set of rules regarding the usage of programming constructs (variables, objects, expressions, statements)

3 Class Problem
Classify each fragment as a lexical error, a syntax error, a semantic error, or correct.
    int a; a = 1.0;
    int a; b b = a;
    { int a; a = 1; } { a = 2; }
    in a; a = 1;
    int foo(int a) { foo = 3; }
    1int x; x = 2;
    int foo(int a) { a = 3; }

4 Categories of Semantic Analysis
• Examples of semantic rules
  » Variables must be defined before being used
  » A variable should not be defined multiple times
  » In an assignment statement, the variable and the expression must have the same type
  » The test expression of an if statement must have boolean type
• 2 major categories
  » Semantic rules regarding scopes
  » Semantic rules regarding types

5 Semantic Analysis I: Scope Analysis

6 Scope Information
• Characterizes the declaration of identifiers and the portions of the program where each identifier may be used
  » Example identifiers: variables, functions, objects, labels
• Lexical scope: textual region in the program
  » Examples: statement block, formal argument list, object body, function or method body, source file, whole program
• Scope of an identifier: the lexical scope its declaration refers to

7 Variable Scope
• Scope of variables in statement blocks:
    { int a; ...
      { int b; ... }      (inner block: scope of variable b)
      ....
    }                     (outer block: scope of variable a)
• Scope of global variables: current file
• Scope of external variables: whole program

8 Function Parameter and Label Scope
• Scope of formal arguments of functions:
    int foo(int n) { ... }      (scope of argument n)
• Scope of labels:
    void foo() { ... goto lab; ... lab: i++; ... goto lab; ... }      (scope of label lab)
  » Note: in ANSI C all labels have function scope regardless of where they are declared

9 Scope in Class Declaration
• Scope of object fields and methods:
    class A {
      public:  void f() { x = 1; } ...
      private: int x; ...
    }
  (the class body is the scope of variable x and method f)

10 Semantic Rules for Scopes
• Main rules regarding scopes:
  » Rule 1: Use each identifier only within its scope
  » Rule 2: Do not declare identifiers of the same kind with identical names more than once in the same lexical scope
• Are these legal? If not, identify the illegal portion.
    class X {
      int X;
      void X(int X) { X: ... goto X; }
      int X(int X) {
        int X;
        goto X;
        { int X; X: X = 1; }
      }
    }

11 Symbol Tables
• Semantic checks refer to properties of identifiers in the program: their scope or type
• Need an environment to store the information about identifiers = symbol table
• Each entry in the symbol table contains:
  » Name of an identifier
  » Additional info about identifier: kind, type, constant?

    NAME   KIND   TYPE            ATTRIBUTES
    foo    func   int,int → int   extern
    m      arg    int
    n      arg    int             const
    tmp    var    char            const

12 Scope Information
• How to capture the scope information in the symbol table?
• Idea:
  » There is a hierarchy of scopes in the program
  » Use a similar hierarchy of symbol tables
  » One symbol table for each scope
  » Each symbol table contains the symbols declared in that lexical scope

13 Example
    int x;
    void f(int m) {
      float x, y; ...
      { int i, j; ....; }
      { int x; l: ...; }
    }
    int g(int n) {
      char t; ... ;
    }

    Global symtab:              x var int;  f func int → void;  g func int → int
      func f symtab:            m arg int;  x var float;  y var float
        first block in f:       i var int;  j var int
        second block in f:      x var int;  l label
      func g symtab:            n arg int;  t var char

14 Identifiers with Same Name
• The hierarchical structure of symbol tables automatically solves the problem of resolving name collisions
  » E.g., identifiers with the same name and overlapping scopes
• To find which declaration of an identifier is active at a program point:
  » Start from the current scope
  » Go up the hierarchy until you find an identifier with the same name
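
The lookup procedure above can be sketched in a few lines. This is a minimal illustration, not the slides' implementation: the `SymbolTable` class, its `insert`/`lookup` methods, and the string encoding of types are all assumptions made for the example, and the redeclaration check ignores identifier kinds for brevity.

```python
class SymbolTable:
    """One table per lexical scope, linked to the enclosing scope (hypothetical API)."""

    def __init__(self, parent=None):
        self.parent = parent
        self.entries = {}  # name -> (kind, type)

    def insert(self, name, kind, type_):
        # Simplified Rule 2: a name may be declared only once per lexical scope.
        if name in self.entries:
            raise NameError(f"redeclaration of {name!r}")
        self.entries[name] = (kind, type_)

    def lookup(self, name):
        # Start in the current scope, then walk up the hierarchy.
        scope = self
        while scope is not None:
            if name in scope.entries:
                return scope.entries[name]
            scope = scope.parent
        raise NameError(f"undefined identifier {name!r}")


# Tables for the running example: global scope, function f, and f's two blocks.
global_scope = SymbolTable()
global_scope.insert("x", "var", "int")
global_scope.insert("f", "func", "int -> void")

f_scope = SymbolTable(parent=global_scope)
f_scope.insert("m", "arg", "int")
f_scope.insert("x", "var", "float")
f_scope.insert("y", "var", "float")

block1 = SymbolTable(parent=f_scope)
block1.insert("i", "var", "int")
block1.insert("j", "var", "int")

block2 = SymbolTable(parent=f_scope)
block2.insert("x", "var", "int")
block2.insert("l", "label", None)

print(block1.lookup("x"))  # finds f's float x, which shadows the global int x
print(block2.lookup("x"))  # finds the block's own int x
```

Looking up `i` from `block2` raises `NameError`, because `i` lives in a sibling block that is not on the path to the root.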

15 Class Problem
Associate each definition of x with its appropriate symbol table entry.
    int x;
    void f(int m) {
      float x, y; ...
      { int i, j; x = 1; }
      { int x; l: x = 2; }
    }
    int g(int n) {
      char t; x = 3;
    }

    Global symtab:              x var int;  f func int → void;  g func int → int
      func f symtab:            m arg int;  x var float;  y var float
        first block in f:       i var int;  j var int
        second block in f:      x var int;  l label
      func g symtab:            n arg int;  t var char

16 Catching Semantic Errors
    int x;
    void f(int m) {
      float x, y; ...
      { int i, j; x = 1; }
      { int x; l: i = 2; }
    }
    int g(int n) {
      char t; x = 3;
    }

    Global symtab:              x var int;  f func int → void;  g func int → int
      func f symtab:            m arg int;  x var float;  y var float
        first block in f:       i var int;  j var int
        second block in f:      x var int;  l label
      func g symtab:            n arg int;  t var char

• i = 2: Error! Undefined variable

17 Symbol Table Operations
• Two operations:
  » To build symbol tables, we need to insert new identifiers in the table
  » In the subsequent stages of the compiler we need to access the information from the table: use a lookup function
• Cannot build symbol tables during lexical analysis
  » Hierarchy of scopes is encoded in the syntax
• Build the symbol tables:
  » While parsing, using the semantic actions
  » After the AST is constructed

18 Forward References
• Use of an identifier within the scope of its declaration, but before it is declared
• Any compiler phase that uses the information from the symbol table must be performed after the table is constructed
• Cannot type-check and build the symbol table at the same time
• Example
    class A {
      int m() { return n(); }
      int n() { return 1; }
    }
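
A small sketch of why the table must be complete before it is used. The dictionary below is a hypothetical stand-in for the AST of class A (method name mapped to the names its body calls); both checker functions are illustrative, not part of the slides.

```python
# Hypothetical mini-AST for class A: method name -> methods called in its body.
class_a = {"m": ["n"], "n": []}


def check_single_pass(methods):
    """Declare and check in one sweep: a forward reference looks undefined."""
    declared, errors = set(), []
    for name, calls in methods.items():
        declared.add(name)
        errors += [f"{name} calls undefined {c}" for c in calls if c not in declared]
    return errors


def check_two_pass(methods):
    """Pass 1 builds the whole table; pass 2 checks uses against it."""
    declared = set(methods)  # pass 1: collect every declaration
    errors = []
    for name, calls in methods.items():  # pass 2: check each use
        errors += [f"{name} calls undefined {c}" for c in calls if c not in declared]
    return errors


print(check_single_pass(class_a))  # ['m calls undefined n']
print(check_two_pass(class_a))     # []
```

The single-pass variant reports a spurious error for `m`, because `n` has not been inserted yet when the body of `m` is checked.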

19 Semantic Analysis II: Type Checking

20 Type Information
• What are types?
  » They describe the values computed during the execution of the program
  » Essentially they are a predicate on values
    - E.g., “int x” in C means –2^31 <= x < 2^31 (assuming a 32-bit int)
• Type information: describes what kind of values correspond to different constructs: variables, statements, expressions, functions, etc.
  » variables:    int a;                   integer
  » expressions:  (a+1) == 2               boolean
  » statements:   a = 1.0;                 floating-point
  » functions:    int pow(int n, int m)    int,int → int

21 Type Checking
• Type errors: improper or inconsistent operations during program execution
• Type-safety: absence of type errors
• Type checking: set of rules which ensures the type consistency of different constructs in the program

22 How to Ensure Type-Safety
• Bind (assign) types, then check types
• Type binding: defines the type of constructs in the program (e.g., variables, functions)
  » Can be either explicit (int x) or implicit (x=1)
• Type checking: determine if the program correctly uses the type bindings
  » Consists of a set of type-checking rules

23 Type Checking
• Semantic checks to enforce the type safety of the program
• Examples
  » Unary and binary operators (e.g. +, ==, [ ]) must receive operands of the proper type
  » Functions must be invoked with the right number and type of arguments
  » Return statements must agree with the return type
  » In assignments, the assigned value must be compatible with the type of the variable on the LHS
  » Class members must be accessed appropriately

24 Four Concepts Related to Types/Languages
1. Static vs dynamic checking
  » When to check types
2. Static vs dynamic typing
  » When to define types
3. Strong vs weak typing
  » How many type errors
4. Sound type systems
  » Statically catch all type errors

25 Static vs Dynamic Checking
• Static type checking
  » Performed at compile time
• Dynamic type checking
  » Performed at run time (as the program executes)
• Examples of dynamic checking
  » Array bounds checking
  » Null pointer dereferences
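
As a small illustration of dynamic checking, Python performs both kinds of check named above only when the offending line executes. The two helper functions are hypothetical and exist only for this example.

```python
def third_element(values):
    # No compile-time check: the bounds test happens when this line runs.
    return values[2]


def name_length(obj):
    # Likewise, using None where an object is expected fails only at run time.
    return len(obj.name)


print(third_element([10, 20, 30]))  # 30

try:
    third_element([10])
except IndexError as err:
    print("bounds check failed at run time:", err)

try:
    name_length(None)
except AttributeError as err:
    print("None dereference caught at run time:", err)
```

Both functions are accepted when the module loads; the errors surface only for the particular executions that trigger them.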

26 Static vs Dynamic Typing
• Static and dynamic typing refer to type definitions (i.e., bindings of types to variables, expressions, etc.)
• Statically typed language
  » Types are defined at compile time and do not change during the execution of the program
    - C, C++, Java, Pascal
• Dynamically typed language
  » Types are defined at run time, as the program executes
    - Lisp, Smalltalk

27 Strong vs Weak Typing
• Refer to how much type consistency is enforced
• Strongly typed languages
  » Guarantee accepted programs are type-safe
• Weakly typed languages
  » Allow programs which contain type errors
• These concepts refer to run time
  » Can achieve strong typing using either static or dynamic typing

29 Soundness
• Sound type systems: can statically ensure that the program is type-safe
• Soundness implies strong typing
• Static type safety requires a conservative approximation of the values that may occur during all possible executions
  » May reject type-safe programs
  » Need to be expressive: reject as few type-safe programs as possible

30 Class Problem
Classify the following languages: C, C++, Pascal, Java, Standard ML, Modula-3, Smalltalk

                      Strong Typing      Weak Typing
    Static Typing
    Dynamic Typing

C, C++, Pascal, Java, C#, Modula-3 Standard ML, Smalltalk, Lisp Javascript, PHP, Perl 5, Objective-C

31 Why Static Checking?
• Efficient code
  » Dynamic checks slow down the program
• Guarantees that all executions will be safe
  » Dynamic checking gives safety guarantees only for the executions that are actually run
• But is conservative for sound systems
  » Needs to be expressive: reject few type-safe programs

32 Type Systems
• What are types?
  » They describe the values computed during the execution of the program
  » Essentially they are a predicate on values
    - E.g., “int x” in C means –2^31 <= x < 2^31 (assuming a 32-bit int)
• Type expressions: describe the possible types in the program
  » E.g., int, char*, array[], object, etc.
• Type system: defines types for language constructs
  » E.g., expressions, statements

33 Type Expressions
• Language type systems have basic types (aka primitive types or ground types)
  » E.g., int, char*, double
• Build type expressions using basic types:
  » Type constructors
    - Array types
    - Structure/object types
    - Pointer types
  » Type aliases
  » Function types

34 Semantic Analysis III: Static Semantics

35 Static Semantics
• Can describe the types used in a program
• How to describe type checking?
• Static semantics: formal description for the programming language
• Is to type checking:
  » As grammar is to syntax analysis
  » As regular expression is to lexical analysis
• Static semantics defines types for legal ASTs in the language

36 Type Judgments or Relations
• Static semantics = formal notation which describes type judgments:
  » E : T
  » means “E is a well-typed expression of type T”
  » E is typable if there is some type T such that E : T
• Type judgment examples:
  » 2 : int
  » true : bool
  » 2 * (3 + 4) : int
  » “Hello” : string

37 Type Judgments for Statements
• Statements may be expressions (i.e., represent values)
• Use type judgments for statements:
  » if (b) 2 else 3 : int
  » x == 10 : bool
  » b = true, y = 2 : int (the result of the comma operator is the value of the rightmost expression)
• For statements which are not expressions: use a special unit type (void or empty type)
  » S : unit
  » means “S is a well-typed statement with no result type”

38 Class Problem
What is the type of each of the following statements? Assume i* are int variables and f* are float variables.
    f1 [ 3 ]
    i = i1 [ i2]
    while (i < 10) do S1
    (i ? 0) 4.0 : 1.0

39 Deriving a Judgment
• Consider the judgment
  » if (b) 2 else 3 : int
• What do we need to decide that this is a well-typed expression of type int?
  » b must be a bool (b : bool)
  » 2 must be an int (2 : int)
  » 3 must be an int (3 : int)

40 Type Judgments
• Type judgment notation: A ⊢ E : T
  » Means “In the context A, the expression E is a well-typed expression with type T”
• Type context is a set of type bindings: id : T
  » (i.e. type context = symbol table)
  » b : bool, x : int ⊢ b : bool
  » b : bool, x : int ⊢ if (b) 2 else x : int
  » ⊢ 2 + 2 : int

41 Deriving a Judgment
• To show
  » b : bool, x : int ⊢ if (b) 2 else x : int
• Need to show
  » b : bool, x : int ⊢ b : bool
  » b : bool, x : int ⊢ 2 : int
  » b : bool, x : int ⊢ x : int

42 General Rule
• For any environment A, expression E, statements S1 and S2, the judgment:
  » A ⊢ if (E) S1 else S2 : T
• Is true if:
  » A ⊢ E : bool
  » A ⊢ S1 : T
  » A ⊢ S2 : T

43 Inference Rules

    A ⊢ E : bool    A ⊢ S1 : T    A ⊢ S2 : T         (premises)
    ------------------------------------------  (if-rule)
    A ⊢ if (E) S1 else S2 : T                        (conclusion)

• Holds for any choice of E, S1, S2, T
• Read as, “if we have established the statements in the premises listed above the line, then we may derive the conclusion below the line”

44 Why Inference Rules?
• Inference rules: compact, precise language for specifying static semantics
• Inference rules correspond directly to the recursive AST traversal that implements them
• Type checking is the attempt to prove type judgments A ⊢ E : T true by walking backward through the rules

45 Meaning of Inference Rule
• Inference rule says:
  » Given the premises are true (with some substitutions for A, E1, E2)
  » Then, the conclusion is true (with consistent substitution)

    A ⊢ E1 : int    A ⊢ E2 : int
    ----------------------------  (+)
    A ⊢ E1 + E2 : int

  (figure: AST node + with children E1 and E2, annotated with type int)

46 Proof Tree
• Expression is well-typed if there exists a type derivation for a type judgment
• Type derivation is a proof tree
• Example: if A1 = b : bool, x : int, then:

    A1 ⊢ b : bool        A1 ⊢ 2 : int    A1 ⊢ 3 : int
    --------------       ----------------------------
    A1 ⊢ !b : bool       A1 ⊢ 2 + 3 : int                 A1 ⊢ x : int
    ------------------------------------------------------------------
    b : bool, x : int ⊢ if (!b) 2 + 3 else x : int
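
The correspondence between inference rules and a recursive AST traversal can be sketched as follows. The tuple encoding of AST nodes, the string names for types, and the function `type_of` are assumptions made for this example; each branch checks the premises of one rule and returns the type in its conclusion.

```python
def type_of(env, e):
    """Return T such that env ⊢ e : T; raise TypeError if e is not typable."""
    if isinstance(e, bool):      # test bool first: in Python, bool is a subclass of int
        return "bool"
    if isinstance(e, int):
        return "int"
    if isinstance(e, str):       # identifier: its binding must be in the context
        if e not in env:
            raise TypeError(f"unbound identifier {e!r}")
        return env[e]
    op = e[0]
    if op == "!":                # premise: the operand is bool
        if type_of(env, e[1]) != "bool":
            raise TypeError("operand of ! must be bool")
        return "bool"
    if op == "+":                # (+) rule: both operands are int
        if type_of(env, e[1]) != "int" or type_of(env, e[2]) != "int":
            raise TypeError("operands of + must be int")
        return "int"
    if op == "if":               # if-rule: one premise per child node
        _, cond, s1, s2 = e
        if type_of(env, cond) != "bool":
            raise TypeError("if condition must be bool")
        t1, t2 = type_of(env, s1), type_of(env, s2)
        if t1 != t2:
            raise TypeError("if branches must have the same type")
        return t1
    raise TypeError(f"unknown construct {op!r}")


A1 = {"b": "bool", "x": "int"}
print(type_of(A1, ("if", ("!", "b"), ("+", 2, 3), "x")))  # int
```

The recursive calls made while checking the `if` node trace out the same tree as the derivation: one call per premise, bottoming out at literals and identifiers.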

47 More About Inference Rules
• No premises = axiom

    ----------------
    A ⊢ true : bool

• A goal judgment may be proved in more than one way

    A ⊢ E1 : float    A ⊢ E2 : float         A ⊢ E1 : float    A ⊢ E2 : int
    --------------------------------         ------------------------------
    A ⊢ E1 + E2 : float                      A ⊢ E1 + E2 : float

• No need to search for rules to apply: they correspond to nodes in the AST

48 Class Problem
Given the following syntax for arithmetic expressions:

    t → true | false | if t then t else t | 0 | succ t | pred t | iszero t

And the following typing rules for the language:

    true : bool        false : bool

    t1 : bool    t2 : T    t3 : T
    -----------------------------
    if t1 then t2 else t3 : T

    t1 : int           t1 : int           t1 : int
    -------------      -------------      ----------------
    succ t1 : int      pred t1 : int      iszero t1 : bool

Construct type derivations to show:
(1) if iszero 0 then 0 else pred 0 : int
(2) pred(succ(succ(pred(0)))) : int
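
One way to check answers to this problem mechanically is a direct transcription of the rules into a recursive function. This is a sketch under stated assumptions: terms are encoded as strings and tuples (a hypothetical encoding), and the axiom `0 : int` is assumed, since both derivations need it even though it does not appear in the rule list as extracted.

```python
def type_of(t):
    """Type a term of the small arithmetic language, or raise TypeError."""
    if t in ("true", "false"):       # axioms: true : bool, false : bool
        return "bool"
    if t == "0":                     # assumed axiom: 0 : int
        return "int"
    if isinstance(t, tuple):
        tag = t[0]
        if tag == "if":
            _, t1, t2, t3 = t
            if type_of(t1) != "bool":
                raise TypeError("condition must be bool")
            ty2, ty3 = type_of(t2), type_of(t3)
            if ty2 != ty3:
                raise TypeError("branches must have the same type")
            return ty2
        if tag in ("succ", "pred", "iszero"):
            if type_of(t[1]) != "int":
                raise TypeError(f"argument of {tag} must be int")
            return "bool" if tag == "iszero" else "int"
    raise TypeError(f"ill-formed term {t!r}")


# (1) if iszero 0 then 0 else pred 0
print(type_of(("if", ("iszero", "0"), "0", ("pred", "0"))))        # int
# (2) pred(succ(succ(pred(0))))
print(type_of(("pred", ("succ", ("succ", ("pred", "0"))))))        # int
```

Each successful recursive call corresponds to one node of the requested type derivation.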

49 Assignment Statements

    id : T ∈ A    A ⊢ E : T
    -----------------------  (variable-assign)
    A ⊢ id = E : T

    A ⊢ E3 : T    A ⊢ E2 : int    A ⊢ E1 : array[T]
    ------------------------------------------------  (array-assign)
    A ⊢ E1[E2] = E3 : T

50 If Statements
• If statement as an expression: its value is the value of the clause that is executed

    A ⊢ E : bool    A ⊢ S1 : T    A ⊢ S2 : T
    ------------------------------------------  (if-then-else)
    A ⊢ if (E) S1 else S2 : T

• If with no else clause: no value. Why?

    A ⊢ E : bool    A ⊢ S : T
    -------------------------  (if-then)
    A ⊢ if (E) S : unit

51 Class Problem
1. Show the inference rule for a while statement, while (E) S
2. Show the inference rule for a variable declaration with initializer, Type id = E
3. Show the inference rule for a question mark/colon operator, E1 ? S1 : S2

52 Sequence Statements
• Rule: a sequence of statements is well-typed if the first statement is well-typed, and the remaining are well-typed as well:

    A ⊢ S1 : T1    A ⊢ (S2; .... ; Sn) : Tn
    ---------------------------------------  (sequence)
    A ⊢ (S1; S2; .... ; Sn) : Tn

53 Declarations

    A ⊢ id : T [ = E ] : T1    A, id : T ⊢ (S2; .... ; Sn) : Tn
    ------------------------------------------------------------  (declaration)
    A ⊢ (id : T [ = E ]; S2; .... ; Sn) : Tn

• T1 = unit if no E
• Declarations add entries to the environment (e.g., the symbol table)
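
The sequence and declaration rules can be read as a loop that threads the environment through the statements. The sketch below assumes a hypothetical tuple encoding, `("decl", name, type, init_or_None)` and `("assign", name, expr)`, and a deliberately tiny expression checker; it is an illustration of the rules, not a full checker.

```python
def expr_type(env, e):
    """Type of a literal or identifier in context env."""
    if isinstance(e, bool):
        return "bool"
    if isinstance(e, int):
        return "int"
    if isinstance(e, float):
        return "float"
    if e in env:
        return env[e]
    raise TypeError(f"unbound identifier {e!r}")


def check_seq(env, stmts):
    """Return Tn, the type of the last statement, or raise TypeError."""
    result = "unit"
    for s in stmts:
        if s[0] == "decl":
            _, name, declared, init = s
            if init is not None and expr_type(env, init) != declared:
                raise TypeError(f"initializer of {name} is not {declared}")
            env = {**env, name: declared}  # A, id : T is the context for the rest
            result = "unit" if init is None else declared
        elif s[0] == "assign":             # variable-assign: id : T in A and E : T
            _, name, e = s
            if name not in env:
                raise TypeError(f"assignment to undeclared {name}")
            if expr_type(env, e) != env[name]:
                raise TypeError(f"type mismatch in assignment to {name}")
            result = env[name]
        else:
            raise TypeError(f"unknown statement {s[0]!r}")
    return result


print(check_seq({}, [("decl", "x", "int", 1), ("assign", "x", 2)]))  # int
```

Building a new dictionary for each declaration, rather than mutating the old one, mirrors the rule: the extended context `A, id : T` applies only to the statements that follow the declaration.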

54 Function Calls
• If expression E is a function value, it has a type T1 x T2 x ... x Tn → Tr
• Ti are argument types; Tr is the return type
• How to type-check a function call?
  » E(E1, ..., En)

    A ⊢ E : T1 x T2 x ... x Tn → Tr    A ⊢ Ei : Ti  (i ∈ 1 ... n)
    -------------------------------------------------------------  (function-call)
    A ⊢ E(E1, ..., En) : Tr
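
A sketch of the function-call rule as code. Function types are encoded here as `("fn", (T1, ..., Tn), Tr)`, which is an assumption for the example, as are the `("call", callee, args)` node shape and the `pow` binding (borrowed from the earlier type-information slide).

```python
def type_of(env, e):
    """Return the type of e in context env, or raise TypeError."""
    if isinstance(e, bool):
        return "bool"
    if isinstance(e, int):
        return "int"
    if isinstance(e, str):
        if e not in env:
            raise TypeError(f"unbound identifier {e!r}")
        return env[e]
    if e[0] == "call":
        _, callee, args = e
        ftype = type_of(env, callee)          # premise: E has a function type
        if not (isinstance(ftype, tuple) and ftype[0] == "fn"):
            raise TypeError("callee is not a function")
        _, params, ret = ftype
        if len(args) != len(params):          # right number of arguments
            raise TypeError("wrong number of arguments")
        for arg, expected in zip(args, params):
            if type_of(env, arg) != expected:  # premise: Ei : Ti for each i
                raise TypeError("argument type mismatch")
        return ret                            # conclusion: the call has type Tr
    raise TypeError(f"unknown construct {e[0]!r}")


env = {"pow": ("fn", ("int", "int"), "int"), "flag": "bool"}
print(type_of(env, ("call", "pow", [2, 3])))  # int
```

Calls with the wrong arity, such as `("call", "pow", [2])`, or with a `bool` argument are rejected with `TypeError`.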

55 Function Declarations
• Consider a function declaration of the form:
  » Tr fun (T1 a1, ..., Tn an) = E
  » Equivalent to:
    - Tr fun (T1 a1, ..., Tn an) { return E; }
• Type of function body must match the declared return type of the function, i.e., E : Tr
• But, in what type context?

56 Add Arguments to Environment
• Let A be the context surrounding the function declaration.
  » The function declaration:
    - Tr fun (T1 a1, ..., Tn an) = E
  » Is well-formed if
    - A, a1 : T1, ..., an : Tn ⊢ E : Tr
• What about recursion?
  » Need: fun : T1 x T2 x ... x Tn → Tr ∈ A

57 Class Problem
Recursive function: factorial
    int fact(int x) = if (x == 0) 1 else x * fact(x-1);
Is this well-formed?

58 Mutual Recursion
• Example
  » int f(int x) = g(x) + 1;
  » int g(int x) = f(x) – 1;
• Need an environment containing at least
    f : int → int, g : int → int
  when checking both f and g
• Two-pass approach:
  » Scan the top level of the AST, picking up all function signatures and creating an environment binding all global identifiers
  » Type-check each function individually using this global environment
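
The two-pass approach can be sketched for the f/g example. Everything about the encoding is an assumption for illustration: each function is a tuple `(name, param, param_type, return_type, body)`, function types are `((param_types), return_type)`, and the `two_pass` flag exists only to contrast the two strategies.

```python
def expr_type(env, e):
    """Type an expression built from ints, identifiers, +, - and one-argument calls."""
    if isinstance(e, int):
        return "int"
    if isinstance(e, str):
        if e not in env:
            raise TypeError(f"unbound identifier {e!r}")
        return env[e]
    op = e[0]
    if op in ("+", "-"):
        if expr_type(env, e[1]) != "int" or expr_type(env, e[2]) != "int":
            raise TypeError(f"operands of {op} must be int")
        return "int"
    if op == "call":
        _, fname, arg = e
        ftype = expr_type(env, fname)
        if not isinstance(ftype, tuple):
            raise TypeError(f"{fname} is not a function")
        params, ret = ftype
        if (expr_type(env, arg),) != params:
            raise TypeError(f"bad argument in call to {fname}")
        return ret
    raise TypeError(f"unknown construct {op!r}")


def check_program(funcs, two_pass=True):
    env = {}
    if two_pass:
        for name, _, ptype, ret, _ in funcs:      # pass 1: signatures only
            env[name] = ((ptype,), ret)
    for name, pname, ptype, ret, body in funcs:   # pass 2: check each body
        env[name] = ((ptype,), ret)               # own signature, so plain recursion works
        local = {**env, pname: ptype}             # A, a1 : T1
        if expr_type(local, body) != ret:
            raise TypeError(f"body of {name} does not have type {ret}")
    return env


funcs = [
    ("f", "x", "int", "int", ("+", ("call", "g", "x"), 1)),
    ("g", "x", "int", "int", ("-", ("call", "f", "x"), 1)),
]
print(sorted(check_program(funcs)))  # ['f', 'g']
```

With `two_pass=False` the same program is rejected: `g` is not yet bound when the body of `f` is checked.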

59 Static Semantics Summary
• Static semantics = formal specification of type-checking rules
• Concise form of static semantics: typing rules expressed as inference rules
• Expressions and statements are well-formed (or well-typed) if a typing derivation (proof tree) can be constructed using the inference rules

60 Review of Semantic Analysis
• Check errors not detected by lexical or syntax analysis
• Scope errors
  » Variables not defined
  » Multiple declarations
• Type errors
  » Assignment of values of different types
  » Invocation of functions with a different number of parameters or parameters of incorrect type
  » Incorrect use of return statements

61 Other Forms of Semantic Analysis
• One more category that we have not discussed
• Control flow errors
  » Must verify that break and continue statements are always enclosed by a while (or for) statement
  » Java: must verify that a break X statement is enclosed by a for loop with label X
  » Goto labels exist in the proper function
  » Can easily check control-flow errors by recursively traversing the AST
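
The recursive traversal for the first check might look like the sketch below. The statement encoding is hypothetical: `("while", cond, body)`, `("for", header, body)`, `("if", cond, then_body, else_body)`, `("block", body)`, `("break",)` and `("continue",)`, with bodies given as lists of statements; labeled breaks and gotos are not handled.

```python
def check_control_flow(stmts, in_loop=False):
    """Collect errors for break/continue statements not enclosed by a loop."""
    errors = []
    for s in stmts:
        kind = s[0]
        if kind in ("break", "continue"):
            if not in_loop:
                errors.append(f"{kind} outside loop")
        elif kind in ("while", "for"):
            # Entering a loop body: break and continue become legal below here.
            errors += check_control_flow(s[-1], in_loop=True)
        elif kind == "if":
            for branch in s[2:]:
                errors += check_control_flow(branch, in_loop)
        elif kind == "block":
            errors += check_control_flow(s[1], in_loop)
        # Any other statement kind cannot contain break or continue in this sketch.
    return errors


ok = [("while", "c", [("if", "d", [("break",)], [])])]
bad = [("if", "d", [("break",)], []), ("while", "c", [("continue",)])]
print(check_control_flow(ok))   # []
print(check_control_flow(bad))  # ['break outside loop']
```

The `in_loop` flag is the only state the traversal needs; it is passed down unchanged through `if` and block nodes and set to true when descending into a loop body.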

62 Where We Are...
    Source code (character stream)
      → Lexical Analysis (regular expressions)
      → token stream
      → Syntax Analysis (grammars)
      → abstract syntax tree
      → Semantic Analysis (static semantics)
      → abstract syntax tree + symbol tables, types
      → Intermediate Code Gen
      → Intermediate code

