Semantic Analysis and Symbol Tables


1 Semantic Analysis and Symbol Tables

2 Outline
How to build symbol tables
How to use them to find
  multiply-declared vars
  undeclared vars
How to perform type checking

3 The Compiler So Far
Lexical analysis
  Detects inputs with illegal tokens
    int main# (`int argc,
Parsing
  Detects inputs with ill-formed parse trees
    int x = 3 x += 2;
    int main { ; }
Semantic analysis
  Last “front end” phase
  Catches remaining errors

4 Introduction
Typical semantic errors
  multiple declarations: a variable should be declared only once in a scope
  undeclared variable: a variable should be declared before it is used
  type mismatch: type of LHS of an assignment should match (or be compatible with) type of RHS
  wrong arguments: a method should be called with the right number and types of args

5 A Simple Semantic Analyzer
Works in two phases, traversing AST
Phase 1: for each scope in program
  process the declarations (var_decl, fun_decl, param)
    add new entries to symbol table
    report variables that are multiply declared
  process the statements
    find uses of undeclared variables
    update the “ID” nodes (like “var”) of AST to point to appropriate “decl” nodes
  “use” nodes will point to “decl” nodes in tree (“declPtr” in TreeNode struct)
  No longer need symbol table
Phase 2: process all statements in program again
  use the AST “declPtr” links to determine the type of each expression, and to find type errors
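The two phases can be sketched for a single scope as follows. This is a minimal illustration, not the course's code: the `TreeNode` fields other than `declPtr`, and the functions `linkUses` and `typeOfUse`, are assumed names, and the node list is walked once in source order so that "declare before use" falls out naturally.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, much-reduced AST node: either a declaration or a use.
struct TreeNode {
    enum class Kind { Decl, Use };
    Kind kind;
    std::string name;
    std::string type;             // meaningful on Decl nodes only
    TreeNode* declPtr = nullptr;  // filled in on Use nodes by phase 1

    TreeNode(Kind k, std::string n, std::string t = "")
        : kind(k), name(std::move(n)), type(std::move(t)) {}
};

// Phase 1 for one scope: record declarations and link each use to its
// declaration. Returns the number of errors found (multiply declared
// names plus undeclared uses). The vector must not be resized afterwards,
// because declPtr points into it.
int linkUses(std::vector<TreeNode>& nodes) {
    std::unordered_map<std::string, TreeNode*> symbols;  // discarded on return
    int errors = 0;
    for (TreeNode& n : nodes) {
        if (n.kind == TreeNode::Kind::Decl) {
            if (!symbols.emplace(n.name, &n).second) ++errors;  // multiply declared
        } else {
            auto hit = symbols.find(n.name);
            if (hit == symbols.end()) ++errors;                 // undeclared
            else n.declPtr = hit->second;
        }
    }
    return errors;
}

// Phase 2: the type of a use is read through the declPtr link; no
// name-based lookup is needed any more.
std::string typeOfUse(const TreeNode& use) {
    assert(use.kind == TreeNode::Kind::Use);
    return use.declPtr ? use.declPtr->type : "<error>";
}
```

Note that the map inside `linkUses` goes away when the function returns, which mirrors the "no longer need symbol table" step.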

6 Symbol Table
Purpose
  keep track of declared names
  names of variables, functions, methods, classes, fields
Symbol table entry
  associates a name with various attributes, like
    kind of name (variable, method, class, field)
    type (int, float)
    nest level
    memory location (i.e., where will it be found at runtime)
      globals have labels
      locals use “offset” in TreeNode

7 Scoping
In most languages, the same name can be declared multiple times if
  its declarations occur in different scopes, and/or
  its declarations involve different kinds of names
MSDN on C++: an identifier (name) is a sequence of characters used to denote one of the following
  Object or variable name
  Class, structure, or union name
  Enumerated type name
  Member of a class, structure, union, or enumeration
  Function or class-member function
  typedef name
  Label name
  Macro name
  Macro parameter

8 Scoping Example
Java: can use same name for a class, field of class, a method of class, and a local variable of method
A legal Java program:
    class Test {
        int Test = 0;
        void Test () { double Test = 1; }
    }

9 Scoping: Overloading
Java and C++ can use same name for more than one method as long as the signatures are unique (signature = name + param type list)
    int add (int a, int b);
    float add (float a, float b);
    void add (int a);
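A small sketch of how a compiler might enforce unique signatures, assuming the signature is simply the name plus the parameter type list rendered as a string. `makeSignature` and `OverloadSet` are hypothetical names introduced for illustration; note that the return type plays no part in the key.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Builds a signature key from a method name and its parameter type list.
// The return type is deliberately not part of the key.
std::string makeSignature(const std::string& name,
                          const std::vector<std::string>& paramTypes) {
    assert(!name.empty());  // a method always has a name
    std::string key = name + "(";
    for (std::size_t i = 0; i < paramTypes.size(); ++i) {
        if (i > 0) key += ",";
        key += paramTypes[i];
    }
    return key + ")";
}

// Records the signatures declared in one scope (for example, one class).
class OverloadSet {
public:
    // Returns true if the signature is new, false if it repeats an earlier
    // declaration, which a compiler would report as an error.
    bool declare(const std::string& name,
                 const std::vector<std::string>& paramTypes) {
        return seen_.insert(makeSignature(name, paramTypes)).second;
    }

private:
    std::set<std::string> seen_;
};
```

With this sketch the three `add` declarations from the slide are all accepted, while a second `add(int, int)` with a different return type would be rejected.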

10 Scoping: General Rules
Scope rules of a language determine which declaration of a named object corresponds to each use of the object
  i.e., scoping rules map uses of objects to their declarations
C++ and Java use static scoping
  mapping from uses to declarations is made at compile time
C++ uses the "most closely nested" rule
  a use of variable “x” matches the declaration in the most closely enclosing scope such that the declaration precedes the use
  a deeply nested variable “x” hides “x” declared in an outer scope
Java
  a local variable in an inner scope cannot redeclare a local variable or parameter of an enclosing scope in the same method
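The "most closely nested" rule can be seen in a few lines of C++. The function name `sumOfVisibleX` is made up for this sketch; each comment states which declaration of `x` the use binds to.

```cpp
#include <cassert>

int x = 1;  // declared in the outermost (global) scope

// Adds up the value of "x" as seen at four points in nested scopes.
// Each use binds to the closest enclosing declaration that precedes it.
int sumOfVisibleX() {
    int total = x;   // no local "x" declared yet: the global, 1
    int x = 10;      // hides the global "x" from here on
    total += x;      // the local declared just above, 10
    {
        int x = 100; // hides the function-level "x" inside this block
        total += x;  // 100
    }
    total += x;      // block has ended: function-level "x" again, 10
    assert(total == 121);
    return total;
}
```

The first use shows why the rule says the declaration must precede the use: the local `x` exists later in the same scope, but the use before it still binds to the global.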

11 Scope Levels
Each function can have several scopes
  one for the parameters,
  one for the function body,
  or a single scope combining the two above (we will do this for C-)
  and possibly additional scopes in the function
    for each “for” loop
    for each nested block

12 Scope Levels Example
    void f (int k) {          // parameter
        int k = 0;            // local variable (not OK in C++ or Java)
        while (k) {
            int k = 1;        // local var in loop (not OK in Java)
        }
    }
the outermost scope includes just the name “f”, and function “f” has three nested scopes

13 Dynamic Scoping
Not all languages use static scoping
  Lisp (in its early dialects), APL, and Snobol use dynamic scoping
Dynamic scoping
  A use of a variable that has no corresponding declaration in the same function corresponds to the declaration in the most-recently-called still active function

14 Dynamic Scoping: Example
What’s the output of the following code?
    void main() { f1(); f2(); }
    void f1() { int x = 10; g(); }
    void f2() { String x = "hello"; f3(); g(); }
    void f3() { double x = 30.5; }
    void g() { print(x); }
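One way to reason about the example is to simulate dynamic scoping in a statically scoped language. The sketch below is an illustration under stated assumptions: `DynamicEnv` and `runExample` are invented names, values are stored as strings to sidestep typing, and `print(x)` is replaced by appending to a log. Each active call owns a frame of bindings, and a lookup searches from the most recent call backwards.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical run-time environment for dynamic scoping.
class DynamicEnv {
public:
    void enter() { frames_.emplace_back(); }  // a function is called
    void leave() {                            // the function returns
        assert(!frames_.empty());
        frames_.pop_back();
    }
    void declare(const std::string& name, const std::string& value) {
        assert(!frames_.empty());
        frames_.back().emplace_back(name, value);
    }
    // Search the still-active calls, most recently called first.
    std::string lookup(const std::string& name) const {
        for (auto f = frames_.rbegin(); f != frames_.rend(); ++f)
            for (const auto& binding : *f)
                if (binding.first == name) return binding.second;
        return "<undeclared>";
    }

private:
    std::vector<std::vector<std::pair<std::string, std::string>>> frames_;
};

// The program from the slide, with print(x) recorded in a log.
std::vector<std::string> runExample() {
    DynamicEnv env;
    std::vector<std::string> log;
    auto g  = [&] { env.enter(); log.push_back(env.lookup("x")); env.leave(); };
    auto f1 = [&] { env.enter(); env.declare("x", "10"); g(); env.leave(); };
    auto f3 = [&] { env.enter(); env.declare("x", "30.5"); env.leave(); };
    auto f2 = [&] { env.enter(); env.declare("x", "hello"); f3(); g(); env.leave(); };
    env.enter();  // main
    f1();
    f2();
    env.leave();
    return log;
}
```

Under this simulation the log holds "10" followed by "hello": by the time `g` runs inside `f2`, the call to `f3` has already returned, so its `x` is no longer active.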

15 Static vs Dynamic Scoping
Generally, dynamic scoping is a bad idea
  can make a program difficult to understand
  a single use of a variable can correspond to many different declarations with different types!

16 Used before declared?
Can a name be used before declaration?
  Java: a method or field name can be used before the definition appears
    not true for a variable!
  C++?

17 Used before declared? Example
    class Test {
        void f() {
            // OK
            _val = 0;
            g();
            // ERROR!
            x = 1;
            int x;
        }
        void g() {}
        int _val;
    }

18 Language Assumptions
Assume that our language
  uses static scoping
  requires that all names be declared before use
  does not allow multiple declarations of a name in the same scope
    even for different kinds of names
  does allow the same name to be declared in multiple nested scopes
    but only once per scope
  uses the same scope for a function’s parameters and for the local variables declared at the beginning of the function

19 Symbol Table Purpose
Our symbol table will be used to answer two questions
  Given a declaration of a name, is there already a declaration of the same name in the current scope
    i.e., is it multiply declared?
  Given a use of a name, to which declaration does it correspond, or is it undeclared?

20 Symbol Table Purpose (Cont’d)
Symbol table is only needed to answer those two questions, i.e.
  once all declarations have been processed to build the symbol table,
  and all uses have been processed to link each “var” or “call” node in the AST with the corresponding decl node,
  then the symbol table itself is no longer needed
    no more lookups based on name will be performed

21 Symbol Table Operations
Given the above assumptions, we will need
  Look up a name in the current scope only
    to check if it is multiply declared
  Look up a name in the current and enclosing scopes
    to check for a use of an undeclared name, and
    to link a use with the corresponding symbol table entry
  Insert a new name into the symbol table with its attributes
  Do what must be done when a new scope is entered
  Do what must be done when a scope is exited

22 Symbol Table Implementations
Two approaches
  a list of tables*
  a table of lists
For each approach, we will consider what must be done
  when entering and exiting a scope,
  when processing a declaration, and
  when processing a use
Simplification: assume each symbol-table entry includes only
  symbol name
  type
  nesting level of its declaration

23 Method 1: List of Hash Tables
The idea
  symbol table = a list (or vector) of hash tables
  one hash table for each currently visible scope
When processing a scope S (a table is added with push_back)
  back of list: declarations made in S
  towards the front: declarations made in scopes that enclose S

24 Example
    void f (int a, int b) {
        double x;
        while (...) {
            int x, y;
            ...
        }
    }
    void g () { f (); }
After processing decls inside the while loop (one table per level)
  level 0:  f: (int, int) -> void, 0
  level 1:  a: int, 1    b: int, 1    x: double, 1
  level 2:  x: int, 2    y: int, 2
Our implementation: vector<unordered_set<…> *> table;

25 List of Hash Tables: Operations
On scope entry
  increment current level number and add a new empty hash table to back of list
To process a declaration of “x”
  look up “x” in last table in list
  if it is there, then issue a "multiply declared variable" error
  otherwise, add “x” to the last table in the list

26 Operations (Cont.)
To process a use of “x”
  look up “x” starting in last table in list
  if it is not there, then look up “x” in each prior table in the list
  if it is not in any table then issue an "undeclared variable" error
On scope exit
  remove the last table from the list and decrement current level number
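The four operations of the list-of-hash-tables approach fit in a short class. This is a sketch under assumptions, not the course's implementation: `Entry` and `ScopedSymbolTable` are illustrative names, `std::unordered_map` keyed by name stands in for the hash table, and the current level is derived from the vector's size instead of being kept in a separate counter.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified entry: the name is the map key; type and nesting level are stored.
struct Entry {
    std::string type;
    int level;
};

// Method 1 sketch: a vector of hash tables, one per currently open scope.
class ScopedSymbolTable {
public:
    ScopedSymbolTable() { tables_.emplace_back(); }  // outermost scope, level 0

    // Scope entry: add a new empty table at the back.
    void enterScope() { tables_.emplace_back(); }

    // Scope exit: remove the last table.
    void exitScope() {
        assert(tables_.size() > 1);  // never pop the outermost scope
        tables_.pop_back();
    }

    int currentLevel() const { return static_cast<int>(tables_.size()) - 1; }

    // Declaration: look only in the last table.
    // Returns false for a "multiply declared variable" error.
    bool declare(const std::string& name, const std::string& type) {
        return tables_.back().emplace(name, Entry{type, currentLevel()}).second;
    }

    // Use: search from the last table towards the first.
    // Returns nullptr for an "undeclared variable" error.
    const Entry* lookup(const std::string& name) const {
        for (auto t = tables_.rbegin(); t != tables_.rend(); ++t) {
            auto hit = t->find(name);
            if (hit != t->end()) return &hit->second;
        }
        return nullptr;
    }

private:
    std::vector<std::unordered_map<std::string, Entry>> tables_;
};
```

A typical use declares `f` at level 0, calls `enterScope()` for the function body and again for the loop, and then finds that `lookup("x")` returns the innermost `int` entry until `exitScope()` uncovers the `double` one.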

27 Remember
Function names belong in the hash table for the outermost scope
  not in same table as function’s variables
In example above
  function name “f” is in the symbol table for the outermost scope
  name “f” is not in the same scope as parameters “a” and “b”, and variable “x”
  This is so that when use of name “f” in method “g” is processed, “f” is found in an enclosing scope's table

28 Running Times for Operations
Scope entry
  time to initialize a new, empty hash table
  probably proportional to the capacity of the hash table
Process a declaration
  using hashing, constant expected time (O(1))
Process a use
  using hashing, worst-case time is O(depth of nesting), when every table in the list must be examined
Scope exit
  time to remove a table from the list, which is O(1) if memory deallocation is ignored

29 Method 2: Hash Table of Lists
The idea
  when processing a scope S, the structure of the symbol table is
  [diagram: one hash table with keys x:, y:, z:, each followed by its list of entries]

30 Definition
Just one big hash table, containing an entry for each variable for which there is
  some declaration in scope S or
  in a scope that encloses S
Associated with each variable is a list of symbol-table entries
  First list item corresponds to the most closely enclosing declaration
  Other list items correspond to declarations in enclosing scopes

31 Example
    void f (int a) {
        double x;
        while (...) {
            int x, y;
            ...
        }
    }
    void g () { f (); }
After processing declarations inside the while loop
  f:  int -> void, 0
  a:  int, 1
  x:  int, 2    double, 1
  y:  int, 2

32 Nesting Level Information
Level-number attribute stored in each entry enables us to determine whether most closely enclosing declaration was made in the current scope or in an enclosing scope

33 Hash Table of Lists: Operations
On scope entry
  increment the current level number
To process a declaration of “x”
  look up “x”
  if found, get level number from first list item
  if that level number = current level then issue a "multiply declared variable" error
  otherwise, add a new item to front of list with appropriate type and current level number

34 Operations (Cont.)
To process a use of “x”
  look up “x”
  if not found, issue an "undeclared variable" error
On scope exit
  scan all lists in table, looking at first item on each list
  if that item's level number = current level number, then remove item from its list
  decrement current level number
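The hash-table-of-lists operations can be sketched the same way. `LevelEntry` and `FlatSymbolTable` are illustrative names, and one detail goes beyond the slides: a list that becomes empty on scope exit is erased from the table, so that "found in the table" keeps meaning "currently declared".

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <unordered_map>

// Simplified entry: type and nesting level; the name is the map key.
struct LevelEntry {
    std::string type;
    int level;
};

// Method 2 sketch: one hash table mapping each name to a list of entries,
// innermost declaration first.
class FlatSymbolTable {
public:
    // Scope entry: only the level counter changes.
    void enterScope() { ++level_; }

    // Scope exit: scan every list and drop a front item that was declared
    // at the current level, then decrement the level.
    void exitScope() {
        assert(level_ > 0);  // the outermost scope is never exited
        for (auto it = table_.begin(); it != table_.end();) {
            auto& entries = it->second;
            if (!entries.empty() && entries.front().level == level_)
                entries.pop_front();
            // Assumption of this sketch: empty lists are erased.
            it = entries.empty() ? table_.erase(it) : std::next(it);
        }
        --level_;
    }

    // Declaration: compare the first item's level with the current level.
    // Returns false for a "multiply declared variable" error.
    bool declare(const std::string& name, const std::string& type) {
        auto& entries = table_[name];
        if (!entries.empty() && entries.front().level == level_) return false;
        entries.push_front(LevelEntry{type, level_});
        return true;
    }

    // Use: a single hash lookup; the first list item is the closest
    // enclosing declaration. Returns nullptr for "undeclared variable".
    const LevelEntry* lookup(const std::string& name) const {
        auto hit = table_.find(name);
        return hit == table_.end() ? nullptr : &hit->second.front();
    }

private:
    int level_ = 0;
    std::unordered_map<std::string, std::list<LevelEntry>> table_;
};
```

Compared with the list-of-tables sketch, lookups here never walk the scope chain, but `exitScope` has to visit every name in the table.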

35 Running times
Scope entry
  time to increment level number, O(1)
Process a declaration
  using hashing, constant expected time (O(1))
Process a use
  using hashing, constant expected time (O(1))
Scope exit
  time proportional to the number of names in symbol table (or perhaps even size of hash table if no auxiliary information is maintained to allow iteration through non-empty hash table buckets)

36 Type Checking
Purpose of type-checking phase
  Determine type of each expression in program
  Annotate AST
  Find type errors
Type rules of a language define
  how to determine expression types, and
  what is considered to be an error
Type rules specify, for every operator (including assignment),
  what types the operands can have, and
  what is the type of the result

37 Example
Both C++ and Java allow the addition of an int and a double, and the result is of type double
However,
  C++ also allows a value of type double to be assigned to a variable of type int
  Java considers that an error
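These two rules can be written down as small functions over a toy type system. The sketch below covers only `int` and `double`; the enums and the names `typeOfAdd`, `assignmentAllowed`, and `checkSlideExample` are invented for illustration, and "allowed" means accepted without an explicit cast.

```cpp
#include <cassert>

enum class Type { Int, Double };
enum class Language { Cpp, Java };

// Type rule for "lhs + rhs": the result is double if either operand is.
Type typeOfAdd(Type lhs, Type rhs) {
    return (lhs == Type::Double || rhs == Type::Double) ? Type::Double
                                                        : Type::Int;
}

// Type rule for "target = value" without an explicit cast.
bool assignmentAllowed(Language lang, Type target, Type value) {
    if (target == value) return true;
    if (target == Type::Double && value == Type::Int) return true;  // widening
    // Remaining case: double assigned to int (narrowing).
    return lang == Language::Cpp;
}

// Usage: "int i; i = 1 + 2.5;" under each language's rules.
void checkSlideExample() {
    Type rhs = typeOfAdd(Type::Int, Type::Double);
    assert(rhs == Type::Double);
    assert(assignmentAllowed(Language::Cpp, Type::Int, rhs));
    assert(!assignmentAllowed(Language::Java, Type::Int, rhs));
}
```

A real checker would attach the computed type to each expression node in the AST; here the types are passed around directly to keep the rule itself visible.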

38 Other Type Errors
The type checker must also find
  type errors having to do with the context of expressions, e.g., the context of some expressions must be boolean
  type errors having to do with function calls
Examples of contexts that must be boolean
  condition of an if statement
  condition of a while loop
  termination condition of a for loop
Examples of call errors
  calling something that is not a function
  calling a function with wrong number of arguments
  calling a function with arguments of the wrong types
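The three call errors and the boolean-context check can be sketched as follows. `Decl`, `CallCheck`, `checkCall`, and `conditionOk` are hypothetical names; types are plain strings, and argument types must match parameter types exactly, which is a simplification since real languages also accept compatible types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical view of a declaration, as the type checker might see it
// once each use has been linked to its declaration.
struct Decl {
    bool isFunction;
    std::vector<std::string> paramTypes;  // meaningful only if isFunction
};

enum class CallCheck { Ok, NotAFunction, WrongArgCount, WrongArgType };

// Checks a call against the callee's declaration, in the order the slide
// lists the errors.
CallCheck checkCall(const Decl& callee,
                    const std::vector<std::string>& argTypes) {
    // Invariant of this sketch: non-functions carry no parameter list.
    assert(callee.isFunction || callee.paramTypes.empty());
    if (!callee.isFunction) return CallCheck::NotAFunction;
    if (argTypes.size() != callee.paramTypes.size())
        return CallCheck::WrongArgCount;
    for (std::size_t i = 0; i < argTypes.size(); ++i)
        if (argTypes[i] != callee.paramTypes[i]) return CallCheck::WrongArgType;
    return CallCheck::Ok;
}

// Context check: the condition of an if, while, or for must be boolean.
bool conditionOk(const std::string& exprType) { return exprType == "bool"; }
```

For a function declared with parameters `(int, int)`, a call with one `int` yields `WrongArgCount` and a call with `(int, double)` yields `WrongArgType`.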

